🧠 Keywords
NLP, PySpark, AWS, Big Data, Sentiment Analysis, TextBlob, Data Visualization, text analytics, dashboards, cloud tools
🧩 Problem
Understanding public sentiment on AI helps track trends, but analyzing thousands of tweets manually isn't scalable.
This project processes a dataset of 10K+ tweets at scale to extract sentiment insights from social media conversations about AI.
What I Did
- Processed a dataset of 10K+ pre-collected tweets using PySpark on Databricks to clean and structure the text for analysis.
- Applied TextBlob to assign sentiment labels (positive, neutral, negative) to tweets, since the data had no pre-existing annotations.
- Exported the processed data to Amazon S3 and queried it with Amazon Athena for flexible, scalable analysis.
- Designed an interactive dashboard in Amazon QuickSight to visualize sentiment distribution, trends over time, and keyword patterns.
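
The cleaning and labeling steps above can be sketched roughly as follows. This is a minimal illustration, not the exact project code: the regex cleaning rules, the polarity threshold of 0.0, the `text` column name, and the helper names (`clean_tweet`, `label_from_polarity`, `label_tweet`, `add_sentiment_column`) are all assumptions made for the example. TextBlob and PySpark are imported lazily so the pure-Python helpers can be tried without either library installed.

```python
import re
from typing import Optional

# Assumed cleaning rules: drop URLs and @mentions, keep hashtag words.
_URL = re.compile(r"https?://\S+")
_MENTION = re.compile(r"@\w+")
_WHITESPACE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Strip URLs, mentions, and '#' symbols, then collapse whitespace."""
    text = _URL.sub(" ", text)
    text = _MENTION.sub(" ", text)
    text = text.replace("#", " ")
    return _WHITESPACE.sub(" ", text).strip()


def label_from_polarity(polarity: float, threshold: float = 0.0) -> str:
    """Map a polarity score in [-1, 1] to a three-way sentiment label.

    The threshold is an assumption; a small positive value (e.g. 0.05)
    widens the neutral band.
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def label_tweet(text: Optional[str]) -> Optional[str]:
    """Clean one tweet and label it using TextBlob's polarity score."""
    if text is None:
        return None
    from textblob import TextBlob  # lazy import: only needed for scoring

    polarity = TextBlob(clean_tweet(text)).sentiment.polarity
    return label_from_polarity(polarity)


def add_sentiment_column(df, text_col: str = "text"):
    """Add a 'sentiment' column to a Spark DataFrame via a Python UDF.

    Assumes the tweet text lives in `text_col` (hypothetical column name).
    """
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    label_udf = F.udf(label_tweet, StringType())
    return df.withColumn("sentiment", label_udf(F.col(text_col)))
```

On Databricks, something like `add_sentiment_column(tweets_df)` would produce the labeled DataFrame that is then written out to S3. A plain Python UDF is the simplest way to call TextBlob from Spark; for much larger datasets a pandas UDF would likely reduce serialization overhead.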
Outcome
- Built an end-to-end cloud-based NLP pipeline for sentiment analysis using big data tools.
- Analyzed 10K+ tweets to uncover public sentiment trends around AI.
- Gained hands-on experience with PySpark, AWS services, and visual storytelling with real-world social media data.
🛠️ Tech Stack: