dsci560_app

App GitHub:

https://github.com/Alleria1809/dsci560_app.git

yelp_crawler.ipynb

Crawl business information and attributes from Yelp using Selenium.

EDA.ipynb

Exploratory Data Analysis steps for collected data, e.g. encoding, statistical analysis, plotting, and other visualizations.

prediction_modeling.ipynb

Use different models to predict the risk levels.

record_linkage.ipynb

Read data from the LA open dataset and the crawled Yelp data. Use the RLTK package to process the two datasets. Apply blocking and entity linking techniques to combine the data.
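As a rough illustration of blocking followed by entity linking, here is a minimal sketch in plain Python. It does not use RLTK's API; the field names (`id`, `name`, `zip`), the ZIP-code blocking key, the `difflib` similarity measure, and the 0.85 threshold are all assumptions made for this example rather than details taken from the notebook.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

Record = Dict[str, str]


def block_key(record: Record) -> str:
    """Blocking key (assumed): only records sharing a ZIP code are compared."""
    return record["zip"]


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def link(
    la_records: List[Record],
    yelp_records: List[Record],
    threshold: float = 0.85,
) -> List[Tuple[str, str, float]]:
    """Return (la_id, yelp_id, score) for the best in-block match above the threshold."""
    # Blocking: group Yelp records by key so each LA record is compared
    # only against candidates in its own block.
    blocks = defaultdict(list)
    for rec in yelp_records:
        blocks[block_key(rec)].append(rec)

    matches = []
    for la in la_records:
        candidates = blocks.get(block_key(la), [])
        best = max(
            candidates,
            key=lambda y: name_similarity(la["name"], y["name"]),
            default=None,
        )
        if best is None:
            continue
        score = name_similarity(la["name"], best["name"])
        if score >= threshold:
            matches.append((la["id"], best["id"], score))
    return matches


# Hypothetical records, for illustration only.
LA = [
    {"id": "la1", "name": "JOE'S PIZZA", "zip": "90007"},
    {"id": "la2", "name": "Sushi Place", "zip": "90012"},
]
YELP = [
    {"id": "y1", "name": "Joe's Pizza", "zip": "90007"},
    {"id": "y2", "name": "Joe's Pizza", "zip": "90012"},
    {"id": "y3", "name": "Taco Stand", "zip": "90007"},
]

print(link(LA, YELP))
```

Blocking keeps the number of pairwise comparisons small: `la1` is never compared against `y2`, even though the names are identical, because the two records fall into different blocks.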

segmentation.ipynb

Run PCA to reduce dimensionality. Run KMeans to cluster the data. Use t-SNE to generate 2-D visualizations. Apply LDA topic modeling to extract keywords from the restaurant comments in each cluster.
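To make the clustering step concrete, the following is a small plain-Python sketch of the k-means procedure (Lloyd's algorithm). It is a stand-in for illustration only: the notebook's own implementation is not reproduced here, the PCA, t-SNE, and LDA steps are not covered, and the naive "first k points" initialization and the sample points are assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def kmeans(
    points: Sequence[Point], k: int, iterations: int = 20
) -> Tuple[List[Point], List[int]]:
    """Cluster points into k groups; returns (centroids, label per point)."""
    # Naive initialization (assumed for simplicity): first k points as centroids.
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # Converged: no centroid moved.
        centroids = new_centroids
    return centroids, labels


# Hypothetical 2-D points, e.g. restaurants after dimensionality reduction.
POINTS = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids, labels = kmeans(POINTS, k=2)
print(centroids, labels)
```

In practice a library implementation with a better initialization scheme would usually be preferred; the sketch only shows the alternating assignment and update steps.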

NN.ipynb

Use the TensorFlow framework to build neural network models for multiclass classification.

recommendation.ipynb

Generate tag sets for each restaurant. Compute Jaccard similarities. Implement recommendation algorithms for both recommendation functions: recommending from input features and recommending from an input restaurant name.
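The tag-set approach can be sketched as follows. This is a minimal example under stated assumptions, not the notebook's code: the function names, the tie-breaking rule, and the sample tag sets are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity: size of intersection over size of union.

    Defined here as 0.0 when both sets are empty.
    """
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def recommend_by_features(
    tags: Dict[str, Set[str]], wanted: Set[str], k: int = 3
) -> List[Tuple[str, float]]:
    """Rank restaurants by similarity between their tags and the requested features."""
    scored = [(name, jaccard(tag_set, wanted)) for name, tag_set in tags.items()]
    # Sort by descending score, then by name so ties are deterministic.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


def recommend_by_name(
    tags: Dict[str, Set[str]], name: str, k: int = 3
) -> List[Tuple[str, float]]:
    """Rank the other restaurants by similarity to the named restaurant's tags."""
    others = {n: t for n, t in tags.items() if n != name}
    return recommend_by_features(others, tags[name], k)


# Hypothetical tag sets, for illustration only.
TAGS = {
    "Cafe A": {"coffee", "brunch", "outdoor_seating"},
    "Diner B": {"brunch", "burgers", "late_night"},
    "Bistro C": {"coffee", "brunch", "wine"},
}

print(recommend_by_name(TAGS, "Cafe A", k=2))
print(recommend_by_features(TAGS, {"burgers"}, k=1))
```

Both entry points share one scoring routine: recommending by name simply uses the named restaurant's own tag set as the query and excludes that restaurant from the results.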

Project Video

https://drive.google.com/file/d/1i-z4BUMXxMZFXgBARAiYcsB-Vs2owMNM/view?usp=sharing

Presentation

Please refer to Final_Presentation.pdf.