ML

All-in-One Data Cleaning App

Developed an automated data cleaning app that performs 15 of the most important data cleaning tasks, including handling missing values, correlation tests, statistical tests, and balancing datasets using sampling techniques. It reduces overall data cleaning time by roughly 30-40%.
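
As an illustration of the dataset-balancing step, here is a minimal sketch of random oversampling in plain Python. It is not the app's actual code; the function name `random_oversample` and the toy data are assumptions made for this example.

```python
import random
from collections import defaultdict


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes
    have as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy, made-up data: four rows of class 0 and one row of class 1.
rows = [[1], [2], [3], [4], [5]]
labels = [0, 0, 0, 0, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)
```

After the call, both classes have the same number of rows; the minority rows are simple duplicates, which is the main limitation of this technique.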

Visualization of Mobility Impact on New COVID-19 Cases in Tableau

Created an interactive dashboard that visualizes the impact of mobility in different areas on new COVID-19 cases. Demonstrated the results both visually in Tableau and statistically with a linear regression model, which showed that mobility in parks is associated with a high number of new cases.
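
The statistical check can be illustrated with a minimal sketch of a one-predictor least-squares fit. This is not the project's code; the function name and the numbers below are hypothetical and chosen only so the fit is easy to verify by hand.

```python
def fit_simple_linear_regression(x, y):
    """Ordinary least squares for y = slope * x + intercept
    (hypothetical helper, single predictor only)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up values: percent change in park mobility vs. new daily cases.
park_mobility_change = [-20, -10, 0, 10, 20]
new_cases = [80, 90, 100, 110, 120]
slope, intercept = fit_simple_linear_regression(park_mobility_change, new_cases)
```

A positive slope indicates that new cases rise with park mobility in the data; on its own it does not establish a cause.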

Prediction of Diabetes using Insurance Claims Dataset

Built a predictive model to estimate the risk of diabetes in patients using an insurance claims dataset. Featurized the claims data, then trained and optimized an SVC model to estimate the probability of diabetes for each patient.

Prediction of EMR Usage using HINTS Survey Form Data

Transformed data using Spark RDD operations on about 5 million reviews. Built a product recommendation system using ALS collaborative filtering, obtaining an RMSE of 0.91. The dataset contains 287,209 products with 5,074,160 reviews and ratings by 157,386 users.
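
For readers unfamiliar with the metric, RMSE is the square root of the mean squared difference between predicted and actual ratings. The sketch below is a plain-Python illustration with made-up ratings, not the Spark evaluation used in the project; the function name `rmse` is an assumption.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length rating lists."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up ratings on a 1-5 scale.
actual_ratings = [5, 3, 4, 1]
predicted_ratings = [4, 3, 5, 2]
score = rmse(actual_ratings, predicted_ratings)
```

On a 1-5 rating scale, an RMSE of 0.91 means predictions are typically off by a bit less than one star.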

Screening Tool for Chronic Kidney Disease Prediction

Created an interactive screening tool using R Shiny that predicts the risk of chronic kidney disease using a logistic regression model, attaining 97% recall. Doctors can use the tool to estimate the probability of chronic kidney disease in patients.
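
Recall is the share of truly positive patients that the model flags, which is why it matters for a screening tool. The sketch below shows the calculation in Python on made-up labels; it is an illustration only, since the tool itself was built in R Shiny.

```python
def recall(y_true, y_pred):
    """Recall = true positives / (true positives + false negatives)."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if true_pos + false_neg == 0:
        return 0.0  # no positive cases to recall
    return true_pos / (true_pos + false_neg)


# Made-up labels: 1 = has the disease, 0 = does not.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
screening_recall = recall(y_true, y_pred)
```

Here three of the four positive cases are caught, so recall is 0.75; the false positive in the last position does not affect it.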

Twitter Sentiment Analysis using Logistic Regression

Analyzed Twitter sentiment by implementing logistic regression from scratch. Applied the model to a dataset of 1 million tweets to determine the sentiment of each tweet and classify tweets accordingly.
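
A minimal sketch of what "from scratch" can look like: logistic regression trained with batch gradient descent in plain Python. This is not the project's implementation; the feature encoding (counts of positive and negative words), the hyperparameters, and all names are assumptions for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(features, weights, bias):
    """Probability that a sample belongs to the positive class."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train_logistic_regression(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_samples, n_features = len(X), len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in zip(X, y):
            error = predict_proba(features, weights, bias) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        # Average the gradients over the batch and take one step.
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


# Made-up features per tweet: [positive word count, negative word count].
X = [[3, 0], [2, 1], [0, 3], [1, 2], [4, 1], [0, 2]]
y = [1, 1, 0, 0, 1, 0]  # 1 = positive sentiment, 0 = negative
weights, bias = train_logistic_regression(X, y)
```

A tweet is then classified as positive when `predict_proba` returns a value above 0.5. At the scale of 1 million tweets, a vectorized or mini-batch variant would be the more practical choice.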

Walmart Sales Forecasting using Time Series Analysis

Predicted department-wise sales for 45 Walmart stores, modeling the effects of markdowns on holidays using ARIMA and Holt-Winters models. Historical sales data was provided for each of the 45 stores, and each store contains a number of departments.