Projects

Samples of My Work

Data Engineering on PHI data

Confidential

  • Working with researchers and Ph.D. students to build a data retrieval system for 300 GB of Protected Health Information (PHI) and to consolidate all variables and data points in a single location.

  • Using random sampling to reduce the data set to 2% of its original size for statistical analysis, then converting the SQL queries to run on the university’s supercomputing resources to obtain results on the full data set.
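
The sampling step described above can be illustrated with a minimal sketch. It assumes simple Bernoulli sampling, where each row is kept independently with probability 0.02. The function name `sample_rows`, the fixed seed, and the synthetic record IDs are hypothetical stand-ins, not the project's actual code or data.

```python
import random


def sample_rows(rows, fraction=0.02, seed=0):
    """Keep each row independently with probability `fraction`."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    return [row for row in rows if rng.random() < fraction]


# Hypothetical stand-in for the real table: 100,000 synthetic record IDs.
records = list(range(100_000))
subset = sample_rows(records)

# The realized fraction is close to 2%, but not exactly 2%.
print(len(subset) / len(records))
```

Sampling each row independently avoids loading the full table into memory to pick rows, at the cost of a subset size that only approximates the 2% target.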

Social Media Analysis for Financial Decision-Making

Social Media Mining

  • Research project using live data sets from Reddit, Twitter, YouTube, and the S&P 500 for analysis with statistical and Natural Language Processing (NLP) techniques, including sentiment analysis and LDA topic modeling.

Predicting review ratings from review text

Yelp Dataset Challenge

  • Built a collaborative-filtering-based recommender system, as part of the project, to recommend friends on Yelp data.

  • Studied different machine learning and deep learning models for predicting review ratings from review text, achieving 65% accuracy with a two-layer LSTM-based model. Used various NLP-based feature engineering techniques to improve prediction accuracy.
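
As a rough illustration of the friend-recommendation idea in the first bullet, the sketch below ranks users by how much their reviewed businesses overlap, in the spirit of neighbourhood-based collaborative filtering. The Jaccard similarity measure, the function names, and the toy data are all assumptions made for this example. The project's actual model and features are not described here.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_friends(reviewed, friends, user, k=2):
    """Rank non-friends of `user` by overlap in reviewed businesses."""
    scores = []
    for other, items in reviewed.items():
        # Skip the user themselves and anyone who is already a friend.
        if other == user or other in friends.get(user, set()):
            continue
        score = jaccard(reviewed[user], items)
        if score > 0:
            scores.append((score, other))
    # Highest similarity first; ties broken alphabetically.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scores[:k]]


# Hypothetical toy data: user -> set of businesses they reviewed.
reviewed = {
    "ann": {"cafe", "diner", "bakery"},
    "bob": {"cafe", "diner"},
    "cat": {"bakery", "gym"},
    "dan": {"gym"},
    "eve": {"cafe", "diner", "bakery"},
}
friends = {"ann": {"eve"}}

print(recommend_friends(reviewed, friends, "ann"))  # ['bob', 'cat']
```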
