BD

Amazon Product Recommendation System

Transformed roughly 5 million reviews using Spark RDD operations. Built a product recommendation system using ALS collaborative filtering, achieving an RMSE of 0.91. The data set contains 5,074,160 reviews and ratings of 287,209 products by 157,386 users.
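For readers unfamiliar with the metric, the RMSE quoted above is the square root of the mean squared difference between predicted and actual ratings. The snippet below is a minimal sketch of that calculation only, not the project's Spark code; the function name `rmse` and the sample ratings are made up for illustration.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each prediction error, average the squares, then take the root.
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical predicted vs. actual star ratings (illustrative values only).
predicted_ratings = [4.2, 3.1, 5.0, 1.8]
actual_ratings = [5, 3, 4, 2]
score = rmse(predicted_ratings, actual_ratings)
```

A lower value means the predicted ratings sit closer to the ratings users actually gave.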

Find Amazon's Highest-Potential Customers Using Spark Operations

Used Apache Spark operations to identify the customers who are active on Amazon. This helps Amazon, which is running an A/B test on potential target users, verify whether the customers on its existing list are active users.

Mapper Reducer Implementation from Scratch

Developed a mapper-reducer function that replicates the functionality of the mapper-reducer used in big data applications. Added multithreading so that the data is processed in parallel, speeding up the function.
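The general shape of such a mapper-reducer can be sketched as three phases: map, shuffle (group by key), and reduce. The code below is a minimal illustration, not the project's actual implementation; the names `map_reduce`, `word_count_mapper`, and `word_count_reducer`, and the use of a thread pool from the standard library, are assumptions made for the example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_reduce(records, mapper, reducer, workers=4):
    """Map records in parallel, group values by key, then reduce each group."""
    # Map phase: each record is mapped independently on a thread pool.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(mapper, records))

    # Shuffle phase: collect every emitted value under its key.
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)

    # Reduce phase: each key's values are reduced independently, again in parallel.
    keys = list(groups)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        reduced = pool.map(lambda k: reducer(k, groups[k]), keys)
        return dict(zip(keys, reduced))


def word_count_mapper(line):
    # Emit a (word, 1) pair for every word in the line.
    return [(word.lower(), 1) for word in line.split()]


def word_count_reducer(word, counts):
    # Sum all the 1s emitted for this word.
    return sum(counts)


lines = ["spark map reduce", "map reduce from scratch", "reduce"]
counts = map_reduce(lines, word_count_mapper, word_count_reducer)
```

Note that in standard CPython, threads mainly help when the mapper or reducer waits on I/O; for CPU-bound work a process pool is usually the better fit.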

Twitter Sentiment Analysis using Logistic Regression

Analyzed Twitter sentiment by implementing logistic regression from scratch. Applied the model to a Twitter data set of 1 million tweets to predict the sentiment of each tweet, which helps in classifying tweets.
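A from-scratch logistic regression of this kind can be sketched with nothing beyond the standard library. The example below is an illustrative assumption rather than the project's code: it uses binary bag-of-words features, batch gradient descent, and a tiny made-up set of four tweets, and the names `featurize`, `train`, and `predict` are hypothetical.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def featurize(text, vocab):
    # Binary bag-of-words vector: 1.0 if the vocabulary word occurs in the text.
    words = set(text.lower().split())
    return [1.0 if w in words else 0.0 for w in vocab]


def train(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Gradient of the log loss w.r.t. the linear score is (prediction - label).
            error = sigmoid(sum(w * x for w, x in zip(weights, xi)) + bias) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += error * xj
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(text, vocab, weights, bias):
    # Probability that the text carries positive sentiment.
    x = featurize(text, vocab)
    return sigmoid(sum(w * xj for w, xj in zip(weights, x)) + bias)


# Made-up toy data: 1 = positive sentiment, 0 = negative sentiment.
tweets = ["love this phone", "great service love it",
          "hate the delay", "terrible service hate it"]
labels = [1, 1, 0, 0]
vocab = sorted({w for t in tweets for w in t.lower().split()})
X = [featurize(t, vocab) for t in tweets]
weights, bias = train(X, labels)
```

Calling `predict(tweet, vocab, weights, bias)` returns a probability; thresholding it at 0.5 gives the positive or negative class.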