An interactive, multi-dimensional graph showing feature ratings and their popularity for a given product. The size of each circle represents the reliability of the data, and the color represents the sentiment. The Y-axis represents the normalized hidden feature rating. The X-axis represents the aspect rank, i.e., how often people discussed that aspect in the reviews. Hovering over the circles on the original graph reveals the inset pop-ups shown here.

Fooreviews is a product review analysis site. It takes an Amazon product URL and applies a machine learning model to discover hidden product features. The user can see individually rated product features (up to 50 in some cases), each with a corresponding sentiment, a reliability score for the prediction, and a word cloud representing the predicted sentiment.

The idea for this project came from my struggle with Amazon product reviews. Often a product would be rated five stars, or in the high fours, yet the actual product was nothing like the glowing reviews. Fake reviews are rampant. There are tools out there like ReviewMeta, but they didn't quite address my concerns: I didn't only want to know whether the reviews were fake, I also wanted to know what the product was good at. My hypothesis was that, given enough legitimate reviews, there must be a general consensus on which aspects of the product were good and which were bad. In addition, I predicted that reviews submitted over the lifetime of the product would be more reliable than ones submitted right after purchase.

I set out to find hidden product features like so: I would scrape all the reviews of the product, run semi-supervised clustering, label the clusters manually, and then perform analytics on the reviews making up each cluster. I used several machine learning tools and techniques to accomplish this. The reviews needed a lot of NLP processing, so I used spaCy for that. Then I used Gensim's implementation of Latent Dirichlet Allocation (LDA) to perform topic discovery. The topic discovery was done on a bag of words, generating a cluster of ten words for each topic. LDA cannot infer how many topics there are; the number has to be chosen up front, which is where the process needed supervision (and what made the product unscalable!).

Once I had the topics, I needed to find the reviews and individual sentences relevant to each topic in my corpus. I used doc2vec, an extension of word2vec to whole documents, to perform word and sentence association. I queried the doc2vec model with the cluster of words from the topic discovery phase, which gave me all the sentences from the reviews that were semantically related to the topic. The output was good but not perfect, so there was another round of supervision where I manually flagged each output as relevant or not.

Hidden product features discovered for Acer Aspire Laptop after topic modeling

After the ML process, I used NumPy for the calculations and to generate the data for the website. The following images show the result:
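As a rough illustration of the kind of NumPy post-processing involved, the chart's normalized feature rating (Y-axis) and aspect rank (X-axis) could be derived like this. The raw scores and mention counts below are invented, and min-max normalization is an assumption about how the ratings were scaled:

```python
import numpy as np

# Hypothetical raw per-feature sentiment scores aggregated from sentences.
raw = np.array([3.1, 7.4, 5.0, 9.2])

# Min-max normalize to [0, 1] for the chart's Y-axis.
normalized = (raw - raw.min()) / (raw.max() - raw.min())

# Aspect rank: order features by how often reviewers mentioned them.
mentions = np.array([120, 45, 300, 80])
rank = mentions.argsort()[::-1].argsort() + 1  # 1 = most discussed

print(normalized)
print(rank)
```

The double `argsort` is a common NumPy idiom for converting counts into 1-based ranks without an explicit sort-and-lookup loop.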


Analyzing a single product was costly. Analyzing an entirely new domain of products was time-intensive. In fact, I only focused on three domains: appliances, consumer electronics, and travel bags. For every new domain, I scraped the reviews of tens of thousands of products on Amazon to train the base ML model. This model was used throughout the ML process to make predictions on any new product within that domain. The training required a lot of manual fine-tuning and iteration.

Fooreviews product search page

Although I analyzed several products in each domain, it wasn't really feasible to scale the system or make it real-time. The ML process demanded a lot of GPU processing and was too slow for consumer use.


Copyright © 2023 Biz Melesse. All Rights Reserved