Music Album Review Sentiment Analysis

A machine learning system that classifies music album reviews into 6 sentiment categories using a Bi-Directional LSTM, a Gated Recurrent Unit (GRU), and traditional ML models.

Python TensorFlow Keras Scikit-learn Pandas NLTK Matplotlib

Project Overview

This sentiment analysis system processes music album reviews and classifies them into 6 sentiment categories (star ratings from 0.5 to 5) using natural language processing techniques.

The project involved extensive data preprocessing, exploratory analysis, and implementation of both traditional machine learning models (Logistic Regression, SVM, Random Forest) and deep learning approaches (Bi-Directional LSTM). The system handles class imbalance through oversampling and achieves strong performance metrics.

We chose the Bi-Directional LSTM over the GRU even though the GRU achieved slightly higher accuracy, because the Bi-Directional LSTM performs better on ambiguous sentences thanks to its dual (forward and backward) traversal of the text.

Key Innovation: The combination of traditional ML models with a Bi-Directional LSTM architecture provides robust sentiment classification, with the LSTM model particularly effective at capturing contextual relationships in review text.

Technical Highlights

Data Processing
  • 78,162 music album reviews
  • Handled class imbalance with oversampling
  • Text cleaning and tokenization (see the sketch after this section)
Model Architecture
  • Bi-Directional LSTM with embedding layer
  • Global max pooling
  • Dense layers with dropout
Evaluation
  • Accuracy, Precision, Recall metrics
  • Confusion matrix analysis
  • Comparison with traditional ML models
Technologies
  • TensorFlow/Keras for deep learning
  • Scikit-learn for traditional ML
  • Pandas/Numpy for data manipulation
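
The text cleaning and tokenization step listed under Data Processing is not shown in the report. A minimal sketch of what it could look like, assuming NLTK stopword removal and simple regex cleaning (the project's exact steps may differ):

import re
import nltk
from nltk.corpus import stopwords

nltk.download('stopwords', quiet=True)
STOPWORDS = set(stopwords.words('english'))

def clean_review(text):
    """Lowercase, keep letters only, and drop English stopwords."""
    text = re.sub(r'[^a-z\s]', ' ', text.lower())
    return ' '.join(w for w in text.split() if w not in STOPWORDS)

# df is the raw reviews frame (hypothetical name, as loaded from the dataset)
df['Review'] = df['Review'].apply(clean_review)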

Model Performance

  • Accuracy: 86.4% (Bi-Directional LSTM)
  • Precision: 85.2% (weighted average)
  • Recall: 86.4% (weighted average)

Model Comparison

The Bi-Directional LSTM outperformed traditional machine learning models:

  • Logistic Regression: 78.3% accuracy
  • Random Forest: 81.6% accuracy
  • SVM: 79.8% accuracy
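
None of the baseline training code appears in the report. A sketch of how the Logistic Regression baseline could be reproduced, assuming TF-IDF features and default hyperparameters (both assumptions, not the project's confirmed setup):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# df_balanced comes from the oversampling snippet shown later
X_train, X_test, y_train, y_test = train_test_split(
    df_balanced['Review'], df_balanced['Rating_Class'],
    test_size=0.2, random_state=42, stratify=df_balanced['Rating_Class'])

vec = TfidfVectorizer(max_features=30000)  # mirror the 30k vocabulary cap
clf = LogisticRegression(max_iter=1000)
clf.fit(vec.fit_transform(X_train), y_train)
print(accuracy_score(y_test, clf.predict(vec.transform(X_test))))
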
Confusion Matrix
Figure 2: Confusion matrix showing classification performance across 6 sentiment classes
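
A matrix like the one in Figure 2 can be produced with scikit-learn and Seaborn along these lines; y_true and y_pred are illustrative names for the held-out labels and model predictions:

import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

# y_true / y_pred: integer class labels (0-5) for the test set
cm = confusion_matrix(y_true, y_pred)
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=list(range(6)), yticklabels=list(range(6)))
plt.xlabel('Predicted class')
plt.ylabel('True class')
plt.title('Confusion matrix (6 sentiment classes)')
plt.show()
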
📊 Project Details
  • Status: Completed
  • Dataset: 78,162 reviews
  • Classes: 6 sentiment levels
  • Framework: TensorFlow 2.x
Technology Stack

  • Core Libraries: TensorFlow, Keras, Scikit-learn, Pandas, Numpy
  • NLP: NLTK, Tokenizer, Word Embeddings
  • Visualization: Matplotlib, Seaborn, Plotly

Dataset Information
Original Distribution
  • 5-star: 29,520 (37.8%)
  • 4-star: 39,045 (49.9%)
  • 3-star: 4,430 (5.7%)
  • 2-star: 4,245 (5.4%)
  • 1-star: 525 (0.7%)
  • 0.5-star: 397 (0.5%)
After Oversampling

All classes were balanced to 39,045 samples each (the majority-class count); the significant imbalance in the original distribution made this oversampling necessary for model training.

Data Visualizations

Rating Distribution

Original rating distribution showed extreme imbalance with most reviews being 4 or 5 stars. Oversampling was applied to create balanced classes for training.

Review Length Distribution

Review length analysis showed most reviews under 600 words. The 95th percentile was used to determine optimal padding length for the LSTM model.
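
The padding length follows directly from this distribution. A small sketch of the 95th-percentile computation described above (the column name matches the preprocessing code below):

import numpy as np

# Word count per review; the 95th percentile becomes the padding length
lengths = df['Review'].str.split().str.len()
max_len = int(np.percentile(lengths, 95))  # ≈ 612 words on this dataset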

Model Architecture

The Bi-Directional LSTM architecture with embedding layer, global max pooling, and dense layers with dropout achieved the best performance.

Technical Challenges & Solutions

Challenge: The original dataset had extreme class imbalance, with 49.9% 4-star and 37.8% 5-star reviews, while the other classes were underrepresented.

Solution: Implemented oversampling to balance all classes to 39,045 samples each, ensuring the model wouldn't be biased toward majority classes.

Result: Model learned to recognize patterns in all sentiment classes equally.

Challenge: Review lengths varied from a few words to several thousand, making a consistent model input size challenging.

Solution: Analyzed length distribution and used the 95th percentile (612 words) as max length for padding/truncation.

Result: Balanced information retention with computational efficiency.

Challenge: It was unclear whether traditional ML or deep learning would perform better for this task.

Solution: Implemented and compared Logistic Regression, SVM, Random Forest, and Bi-Directional LSTM models.

Result: Bi-Directional LSTM achieved superior performance (86.4% accuracy vs 78-82% for traditional models).

Key Code Implementation

Bi-Directional LSTM Model Architecture
import tensorflow as tf

max_len = 612  # 95th-percentile review length used for padding/truncation

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=30000,
                              output_dim=128,
                              input_length=max_len),
    tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(64, return_sequences=True)),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(6, activation='softmax')  # one output per class
])
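
The report does not include the training configuration. A plausible compile step, assuming integer class labels and the Adam optimizer (both assumptions):

# Assumed setup: integer labels (0-5), default Adam learning rate
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
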
Data Preprocessing
# Handle class imbalance through oversampling
import pandas as pd
from sklearn.utils import resample

# Split by sentiment class (0-5), then resample every class up to the
# majority-class count (39,045) with replacement.
dfs = [df[df['Rating_Class'] == i] for i in range(6)]
max_count = max(len(d) for d in dfs)
dfs_balanced = [resample(d, replace=True,
                         n_samples=max_count,
                         random_state=42)
                for d in dfs]
df_balanced = pd.concat(dfs_balanced)
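
Because the resampled class blocks are concatenated in class order, shuffling the frame before splitting avoids single-class batches; a one-line follow-up not shown in the original code:

# Shuffle the class-ordered rows before any train/test split
df_balanced = df_balanced.sample(frac=1, random_state=42).reset_index(drop=True)
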
Text Tokenization
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Keep the 30,000 most frequent words; out-of-vocabulary words map to <OOV>
tokenizer = Tokenizer(num_words=30000, oov_token='<OOV>')
tokenizer.fit_on_texts(df_balanced['Review'])
sequences = tokenizer.texts_to_sequences(df_balanced['Review'])
padded = pad_sequences(sequences, maxlen=max_len,
                       padding='post', truncating='post')
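
Putting the pieces together, training could then proceed along these lines; the split ratio, epoch count, and batch size are assumptions rather than the project's reported settings:

from sklearn.model_selection import train_test_split

# Pair the padded sequences with their integer class labels (0-5)
X_train, X_test, y_train, y_test = train_test_split(
    padded, df_balanced['Rating_Class'].values,
    test_size=0.2, random_state=42)

model.fit(X_train, y_train,
          validation_data=(X_test, y_test),
          epochs=5, batch_size=64)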