Sentiment Analyzer

The Sentiment Analyzer skill guides you through implementing sentiment analysis systems that identify the emotional tone and opinion in text. It covers the range from simple positive/negative classification to aspect-based sentiment and emotion detection.

Sentiment analysis is deceptively complex. Sarcasm, context, domain-specific language, and cultural nuances all defeat simple approaches. This skill helps you choose the right technique for your accuracy requirements, whether that is a fast rule-based system, a fine-tuned classifier, or LLM-based analysis.

Whether you are analyzing customer reviews, social media mentions, support tickets, or survey responses, the goal is for your sentiment analysis to reflect what your users actually said.

Core Workflows

Workflow 1: Choose Sentiment Analysis Approach

1. Define requirements:
   - Granularity: Binary, ternary, or continuous?
   - Aspects: Overall or aspect-based?
   - Emotions: Sentiment or specific emotions?
   - Languages: Single or multilingual?
   - Volume: Batch or real-time?
2. Evaluate options:

| Approach | Speed | Accuracy | Customizable | Best For |
|---|---|---|---|---|
| Rule-based (VADER) | Very fast | Moderate | Limited | Social media, quick analysis |
| Pre-trained (RoBERTa) | Fast | Good | Fine-tunable | General text |
| Fine-tuned | Fast | Best | Requires data | Domain-specific |
| LLM (GPT-4, Claude) | Slow | Excellent | Prompt-based | Nuanced, complex |

3. Select based on tradeoffs
4. Plan implementation

Workflow 2: Implement Sentiment Pipeline

1. Preprocess text:

```python
# The helper functions called here are placeholders you supply yourself.
def preprocess_for_sentiment(text):
    # Preserve sentiment-relevant features
    text = normalize_unicode(text)

    # Handle social media conventions
    text = expand_contractions(text)    # don't -> do not
    text = normalize_elongation(text)   # loooove -> love
    text = handle_negation(text)        # Mark negation scope

    # Preserve but normalize emoji/emoticons
    text = convert_emoji_to_text(text)  # :) -> [HAPPY]

    return text
```

2. Analyze sentiment:

```python
from transformers import pipeline
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer


class SentimentAnalyzer:
    def __init__(self, model_type="transformer"):
        self.model_type = model_type
        if model_type == "transformer":
            self.model = pipeline(
                "sentiment-analysis",
                model="cardiffnlp/twitter-roberta-base-sentiment",
            )
        elif model_type == "vader":
            self.model = SentimentIntensityAnalyzer()
        else:
            raise ValueError(f"Unknown model_type: {model_type}")

    def analyze(self, text):
        preprocessed = preprocess_for_sentiment(text)
        if self.model_type == "vader":
            # VADER is not callable; it exposes polarity_scores(), whose
            # "compound" value lies in [-1, 1]. The 0.05 cutoffs are the
            # conventional ones; abs(compound) is only a rough confidence proxy.
            compound = self.model.polarity_scores(preprocessed)["compound"]
            label = "positive" if compound >= 0.05 else "negative" if compound <= -0.05 else "neutral"
            return {"text": text, "sentiment": label, "confidence": abs(compound)}
        # A transformers pipeline returns a list with one dict per input
        result = self.model(preprocessed)[0]
        return {"text": text, "sentiment": result["label"], "confidence": result["score"]}
```

3. Aggregate for insights:
   - Overall sentiment distribution
   - Sentiment over time
   - Sentiment by segment/topic
4. Validate results

Workflow 3: Aspect-Based Sentiment Analysis

1. Identify aspects to track:
   - Product features (price, quality, service)
   - Experience dimensions (speed, accuracy, friendliness)
   - Custom aspects for your domain
2. Extract aspects from text:

```python
def extract_aspects(text, aspect_list):
    # Find mentions of known aspects (case-insensitive substring match)
    lowered = text.lower()
    found_aspects = []
    for aspect in aspect_list:
        if aspect.lower() in lowered:
            found_aspects.append(aspect)

    # Also extract using NER or LLM for unknown aspects
    extracted = extract_noun_phrases(text)

    return found_aspects + extracted
```

3. Analyze sentiment per aspect:

```python
def aspect_sentiment(text, aspects):
    results = {}
    for aspect in aspects:
        # Extract sentences mentioning aspect
        relevant = extract_aspect_context(text, aspect)

        # Analyze sentiment of relevant text
        if relevant:
            sentiment = analyze_sentiment(relevant)
            results[aspect] = sentiment

    return results
```

4. Aggregate aspect sentiments across documents
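The aggregation step for aspect sentiments could look something like the sketch below. It assumes each document has already been reduced to a mapping from aspect to a signed score in [-1, 1]; that shape, the function name `aggregate_aspect_sentiment`, and the three summary fields are illustrative assumptions, not something this skill prescribes.

```python
from collections import defaultdict
from statistics import mean


def aggregate_aspect_sentiment(per_document):
    """Summarize aspect scores across documents.

    per_document: list of {aspect: score} dicts, score assumed in [-1, 1].
    """
    scores = defaultdict(list)
    for doc in per_document:
        for aspect, score in doc.items():
            scores[aspect].append(score)

    summary = {}
    for aspect, values in scores.items():
        summary[aspect] = {
            "mean": mean(values),
            "mentions": len(values),
            # Fraction of mentions with a negative score
            "share_negative": sum(v < 0 for v in values) / len(values),
        }
    return summary


# Hypothetical per-document results
docs = [
    {"price": -0.5, "quality": 0.75},
    {"price": -0.25},
    {"quality": 0.25, "service": 1.0},
]
summary = aggregate_aspect_sentiment(docs)
```

Reporting the mention count next to the mean helps avoid over-reading an aspect that only one or two documents mention.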
Quick Reference

| Action | Command/Trigger |
|---|---|
| Analyze sentiment | "Analyze sentiment of [text]" |
| Choose approach | "Best sentiment analysis for [use case]" |
| Aspect-based | "Sentiment by feature for [reviews]" |
| Detect emotions | "Detect emotions in [text]" |
| Handle sarcasm | "How to handle sarcasm in sentiment" |
| Aggregate results | "Summarize sentiment trends" |
Best Practices

Preserve Sentiment Signals: Don't preprocess away important cues
- Keep punctuation (!! vs .)
- Preserve capitalization patterns
- Keep emoji/emoticons (convert to text)
- Handle negation explicitly

Match Model to Domain: Pre-trained models have domain bias
- Twitter models behave differently from product review models
- Fine-tune or select domain-appropriate models
- Test on your actual data before deploying

Handle Negation Properly: "Not bad" isn't negative
- Rule-based: Mark negation scope
- Neural models: Usually handle automatically
- Test negation cases explicitly
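As a sketch of the rule-based option, one common convention is to prefix every token that follows a negation word with `NOT_` until the next punctuation mark, so that "bad" and "not bad" become different features. The function name `mark_negation_scope`, the negator list, and the tokenizer regex below are illustrative assumptions, not part of this skill's pipeline.

```python
import re

# Assumed, deliberately small negator list; extend for your domain.
NEGATORS = {"not", "no", "never", "cannot"}
CLAUSE_END = re.compile(r"[.,;:!?]")


def mark_negation_scope(text):
    """Prefix tokens inside a negation scope with NOT_.

    The scope opens at a negator (or any token ending in n't) and
    closes at the next punctuation mark.
    """
    tokens = re.findall(r"[\w']+|[.,;:!?]", text.lower())
    out = []
    negating = False
    for tok in tokens:
        if CLAUSE_END.fullmatch(tok):
            negating = False          # punctuation closes the scope
            out.append(tok)
        elif tok in NEGATORS or tok.endswith("n't"):
            negating = True           # negator opens the scope
            out.append(tok)
        elif negating:
            out.append("NOT_" + tok)
        else:
            out.append(tok)
    return " ".join(out)


marked = mark_negation_scope("The food was not bad, but service was slow.")
```

Cases like this one ("not bad", "don't like", negation that ends at a comma) also make useful explicit test inputs for whichever model you deploy.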
Consider Context: Sentiment depends on context
- "Cheap" is positive for budget items, negative for luxury
- Use aspect-based analysis for nuance
- Include surrounding context when possible

Validate with Humans: Machine sentiment != human sentiment
- Sample and manually verify results
- Calculate agreement metrics
- Iterate on disagreements
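For the agreement metric, Cohen's kappa is one common choice because it corrects raw agreement for the agreement expected by chance. The sketch below is a minimal pure-Python version, assuming two equal-length lists of labels (one from a human annotator, one from the model); the function name and the sample labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two equal-length label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and the same length")

    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement from each rater's label frequencies
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


human = ["pos", "pos", "neg", "neg"]
model = ["pos", "neg", "neg", "neg"]
kappa = cohen_kappa(human, model)  # 0.5
```

Here raw agreement is 0.75, but chance agreement is 0.5, so kappa comes out at 0.5. A real validation sample should be much larger than four items.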
Report Uncertainty: Not all text has clear sentiment
- Neutral is a valid class
- Low confidence predictions should be flagged
- Consider abstaining on ambiguous cases

Advanced Techniques

LLM-Based Nuanced Sentiment

Use language models for complex analysis:

```python
import json


def llm_sentiment_analysis(text, aspects=None):
    # Optional extra instruction, built separately for readability
    aspect_line = (
        "Also rate sentiment specifically for these aspects: " + ", ".join(aspects)
        if aspects else ""
    )
    prompt = f"""Analyze the sentiment of the following text.

Text: "{text}"

Provide:
1. Overall sentiment (positive/negative/neutral/mixed)
2. Confidence (0-1)
3. Key positive aspects mentioned
4. Key negative aspects mentioned
5. Notable emotional tones (joy, frustration, surprise, etc.)

{aspect_line}

Respond in JSON format."""

    response = llm.complete(prompt)  # llm is a placeholder for your LLM client
    return json.loads(response)
```

Emotion Detection

Beyond positive/negative to specific emotions:

```python
from transformers import pipeline

# Multi-label emotion classification
emotion_classifier = pipeline(
    "text-classification",
    model="SamLowe/roberta-base-go_emotions",
    top_k=None,
)


def detect_emotions(text):
    # Pass a list so the result holds one list of label scores per input
    results = emotion_classifier([text])[0]

    # Filter to significant emotions
    significant = [r for r in results if r["score"] > 0.1]

    return sorted(significant, key=lambda x: x["score"], reverse=True)

# Example output:
# [{"label": "admiration", "score": 0.45},
#  {"label": "joy", "score": 0.32},
#  {"label": "gratitude", "score": 0.28}]
```

Comparative Sentiment

Detect sentiment comparisons:

```python
def comparative_sentiment(text):
    """Detect "A is better than B" patterns."""
    prompt = f"""Analyze this text for comparative sentiment.

Text: "{text}"

If the text compares entities, identify:
1. Entity A (the preferred/better one)
2. Entity B (the less preferred/worse one)
3. Dimension of comparison (price, quality, etc.)
4. Strength of preference (slight, moderate, strong)

If no comparison, respond with: {{"comparison": false}}

Respond in JSON."""

    return llm.complete(prompt)
```

Temporal Sentiment Tracking

Analyze sentiment over time:

```python
import pandas as pd


def sentiment_timeline(documents, time_field, window="D"):
    """Track sentiment trends over time.

    window is a pandas frequency alias, such as "D" for daily buckets.
    """
    # Analyze each document
    results = []
    for doc in documents:
        sentiment = analyze_sentiment(doc["text"])
        results.append({
            "timestamp": doc[time_field],
            "sentiment": sentiment["score"],
            "text": doc["text"],
        })

    # Aggregate by time window
    df = pd.DataFrame(results)
    df["timestamp"] = pd.to_datetime(df["timestamp"])  # accept string timestamps
    df["window"] = df["timestamp"].dt.floor(window)

    trends = df.groupby("window").agg({
        "sentiment": ["mean", "std", "count"],
        "text": lambda x: list(x)[:3],  # Sample texts
    })

    return trends
```

Sarcasm Detection

Handle sarcasm before sentiment analysis:

```python
import re


def detect_sarcasm(text):
    """Detect potential sarcasm indicators."""
    lowered = text.lower()
    indicators = {
        "exaggeration": bool(re.search(r"\b(best|worst|ever|always|never)\b", lowered)),
        "air_quotes": '"' in text,
        "ellipsis": "..." in text,
        "positive_negative_mix": has_mixed_signals(text),
        "hashtags": "#sarcasm" in lowered or "#not" in lowered,
    }

    # Use model for detection (sarcasm_model is a placeholder)
    sarcasm_score = sarcasm_model.predict(text)

    return {
        "is_sarcastic": sarcasm_score > 0.5,
        "confidence": sarcasm_score,
        "indicators": indicators,
    }


def sentiment_with_sarcasm(text):
    sarcasm = detect_sarcasm(text)
    base_sentiment = analyze_sentiment(text)

    if sarcasm["is_sarcastic"] and sarcasm["confidence"] > 0.7:
        # Flip sentiment
        return flip_sentiment(base_sentiment)

    return base_sentiment
```

Common Pitfalls to Avoid

- Using generic models on domain-specific text
- Preprocessing away sentiment-relevant features (emoji, punctuation)
- Ignoring negation handling
- Treating neutral as absence of opinion rather than explicit neutrality
- Not validating model outputs against human judgment
- Assuming sarcasm doesn't exist in your data
- Over-weighting extreme sentiments in aggregation
- Reporting sentiment without confidence/uncertainty
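One way to guard against the last pitfall is to abstain below a confidence threshold and report the abstentions as their own category. The sketch below assumes prediction dictionaries with `sentiment` and `confidence` keys, as returned by `SentimentAnalyzer.analyze` in Workflow 2. The function names, the `needs_review` field, the `"uncertain"` label, and the 0.6 threshold are all illustrative assumptions to tune against your own validation data.

```python
from collections import Counter


def apply_abstention(prediction, threshold=0.6):
    """Return a copy of the prediction, relabeled "uncertain" if confidence is low."""
    flagged = dict(prediction)
    flagged["needs_review"] = prediction["confidence"] < threshold
    if flagged["needs_review"]:
        flagged["sentiment"] = "uncertain"
    return flagged


def sentiment_distribution(predictions, threshold=0.6):
    """Share of each label, with abstentions reported explicitly."""
    decided = [apply_abstention(p, threshold) for p in predictions]
    counts = Counter(d["sentiment"] for d in decided)
    total = len(decided)
    return {label: count / total for label, count in counts.items()}


# Hypothetical predictions
predictions = [
    {"text": "Love it", "sentiment": "positive", "confidence": 0.9},
    {"text": "Meh, I guess", "sentiment": "negative", "confidence": 0.55},
    {"text": "Broke in a week", "sentiment": "negative", "confidence": 0.8},
    {"text": "Well, that happened", "sentiment": "positive", "confidence": 0.3},
]
distribution = sentiment_distribution(predictions)
```

With these inputs, half of the predictions fall below the threshold, and the report shows that directly instead of silently counting them as positive or negative.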