Survey Analyzer

Comprehensive survey data analysis with Likert scales, cross-tabs, and sentiment analysis.

Features

- Likert Scale Analysis: Agreement scale scoring and visualization
- Cross-Tabulation: Relationship analysis between categorical variables
- Frequency Analysis: Response distributions and percentages
- Sentiment Scoring: Text response sentiment analysis
- Open-Ended Analysis: Theme extraction from text responses
- Statistical Tests: Chi-square, correlations, significance testing
- Visualizations: Bar charts, heatmaps, word clouds, distribution plots
- Report Generation: Comprehensive PDF/HTML reports

Quick Start

    from survey_analyzer import SurveyAnalyzer

    analyzer = SurveyAnalyzer()

    # Load survey data
    analyzer.load_csv('survey_responses.csv')

    # Analyze Likert scale question
    results = analyzer.likert_analysis('satisfaction', scale_type='agreement')
    print(f"Mean score: {results['mean_score']:.2f}")

    # Cross-tabulation
    crosstab = analyzer.crosstab('age_group', 'product_preference')
    print(crosstab)

    # Generate report
    analyzer.generate_report('survey_report.pdf')

CLI Usage

    # Analyze Likert scale
    python survey_analyzer.py --data survey.csv --likert satisfaction --output results.pdf

    # Cross-tabulation
    python survey_analyzer.py --data survey.csv --crosstab age_group product --output crosstab.png

    # Sentiment analysis
    python survey_analyzer.py --data survey.csv --sentiment comments --output sentiment.html

    # Full report
    python survey_analyzer.py --data survey.csv --report --output full_report.pdf

API Reference

SurveyAnalyzer Class

    class SurveyAnalyzer:
        def __init__(self)

        # Data Loading
        def load_csv(self, filepath, **kwargs) -> 'SurveyAnalyzer'
        def load_data(self, data: pd.DataFrame) -> 'SurveyAnalyzer'

        # Likert Scale Analysis
        def likert_analysis(self, column, scale_type='agreement') -> Dict
        def likert_comparison(self, columns: List[str]) -> pd.DataFrame
        def plot_likert(self, column, output, scale_type='agreement') -> str

        # Frequency Analysis
        def frequency_table(self, column) -> pd.DataFrame
        def multiple_choice(self, column, delimiter=',') -> pd.DataFrame
        def plot_frequencies(self, column, output, top_n=None) -> str

        # Cross-Tabulation
        def crosstab(self, row_var, col_var, normalize=None) -> pd.DataFrame
        def chi_square_test(self, row_var, col_var) -> Dict
        def plot_crosstab(self, row_var, col_var, output) -> str

        # Sentiment Analysis
        def sentiment_analysis(self, column) -> pd.DataFrame
        def sentiment_summary(self, column) -> Dict
        def plot_sentiment(self, column, output) -> str

        # Open-Ended Analysis
        def word_frequency(self, column, top_n=20) -> pd.DataFrame
        def word_cloud(self, column, output) -> str
        def extract_themes(self, column, n_themes=5) -> List[str]

        # Statistics
        def satisfaction_score(self, columns: List[str]) -> Dict
        def response_rate(self) -> Dict
        def demographics_summary(self, columns: List[str]) -> pd.DataFrame

        # Reporting
        def generate_report(self, output, format='pdf') -> str
        def summary(self) -> str

Likert Scale Analysis

Standard Scales

    # 5-point agreement scale
    analyzer.likert_analysis('satisfaction', scale_type='agreement')
    # 1=Strongly Disagree, 2=Disagree, 3=Neutral, 4=Agree, 5=Strongly Agree

    # 5-point frequency scale
    analyzer.likert_analysis('usage', scale_type='frequency')
    # 1=Never, 2=Rarely, 3=Sometimes, 4=Often, 5=Always

    # Custom scale
    analyzer.likert_analysis('rating', scale_type='custom',
                             labels=['Poor', 'Fair', 'Good', 'Excellent'])

Results

    results = analyzer.likert_analysis('satisfaction')
    # {
    #     'mean_score': 4.07,
    #     'median': 4,
    #     'mode': 4,
    #     'distribution': {1: 2, 2: 5, 3: 15, 4: 40, 5: 38},
    #     'percentages': {1: 2%, 2: 5%, 3: 15%, 4: 40%, 5: 38%},
    #     'top_2_box': 78%,    # % Agree + Strongly Agree
    #     'bottom_2_box': 7%   # % Disagree + Strongly Disagree
    # }
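
The fields in the result above can be derived from raw responses with the standard library alone. The following is a minimal sketch of that calculation, not the analyzer's actual implementation: the helper name `likert_summary` is hypothetical, and it assumes integer-coded responses on a scale of at least four points, with the top-2 box defined as the two highest points.

```python
from collections import Counter
from statistics import mean, median, multimode


def likert_summary(responses, scale_max=5):
    """Summarize integer Likert responses coded 1..scale_max (hypothetical helper)."""
    counts = Counter(responses)
    total = len(responses)
    distribution = {point: counts.get(point, 0) for point in range(1, scale_max + 1)}
    percentages = {point: 100 * n / total for point, n in distribution.items()}
    return {
        'mean_score': round(mean(responses), 2),
        'median': median(responses),
        'mode': multimode(responses)[0],  # first of the most common values
        'distribution': distribution,
        'percentages': percentages,
        # Top-2 box: share choosing the two highest scale points
        'top_2_box': percentages[scale_max] + percentages[scale_max - 1],
        # Bottom-2 box: share choosing the two lowest scale points
        'bottom_2_box': percentages[1] + percentages[2],
    }


# Rebuild 100 raw responses from the example distribution {1: 2, 2: 5, 3: 15, 4: 40, 5: 38}
responses = [1] * 2 + [2] * 5 + [3] * 15 + [4] * 40 + [5] * 38
summary = likert_summary(responses)
print(summary['mean_score'], summary['median'], summary['mode'])
print(summary['top_2_box'], summary['bottom_2_box'])
```

The sketch does not guard against an empty response list; a real implementation would need to decide how to report that case.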

Visualization

    # Stacked bar chart
    analyzer.plot_likert('satisfaction', 'likert_chart.png')

    # Compare multiple questions
    analyzer.likert_comparison(['quality', 'value', 'service'])
    analyzer.plot_likert_comparison(['quality', 'value', 'service'], 'comparison.png')

Frequency Analysis

Single Choice

    freq = analyzer.frequency_table('age_group')
    #        Count  Percentage
    # 18-24     45       22.5%
    # 25-34     78       39.0%
    # 35-44     52       26.0%
    # 45+       25       12.5%

    # Plot
    analyzer.plot_frequencies('age_group', 'age_distribution.png')

Multiple Choice

For questions allowing multiple selections:

    # Data format: "Option A, Option B, Option C"
    results = analyzer.multiple_choice('features_liked', delimiter=',')
    #             Count  Percentage
    # Price         120         60%
    # Quality        95       47.5%
    # Design         80         40%
    # Durability     70         35%
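
Counting multi-select answers means splitting each cell on the delimiter before tallying. The sketch below illustrates one way to do that with the standard library. The helper `multiple_choice_counts` is hypothetical, and it assumes percentages are computed against the number of respondents who answered rather than the number of selections, which is why they can add up to more than 100%.

```python
from collections import Counter


def multiple_choice_counts(responses, delimiter=','):
    """Count each option across delimited multi-select answers (hypothetical helper)."""
    counts = Counter()
    answered = 0
    for raw in responses:
        # De-duplicate within one answer so a respondent counts once per option
        options = {part.strip() for part in raw.split(delimiter) if part.strip()}
        if options:
            answered += 1
            counts.update(options)
    # Percentages use respondents as the base, not total selections
    return [(option, n, 100 * n / answered) for option, n in counts.most_common()]


answers = ['Price, Quality', 'Price', 'Design, Price, Quality', 'Durability', '']
table = multiple_choice_counts(answers)
for option, n, pct in table:
    print(f'{option:<12}{n:>3}{pct:>8.1f}%')
```

Blank answers are skipped entirely, so they affect neither the counts nor the percentage base.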

    analyzer.plot_frequencies('features_liked', 'features.png', top_n=10)

Cross-Tabulation

Basic Cross-Tab

    crosstab = analyzer.crosstab('age_group', 'satisfaction')
    #        Satisfied  Neutral  Dissatisfied
    # 18-24         30       10             5
    # 25-34         60       15             3
    # 35-44         40        8             4
    # 45+           18        5             2

    # With percentages
    crosstab_pct = analyzer.crosstab('age_group', 'satisfaction', normalize='index')
    # Row percentages
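
To make the effect of `normalize='index'` concrete, here is a small standard-library sketch of a cross-tab with optional row normalization. The function `cross_tabulate` and the sample records are hypothetical. It returns row shares as fractions between 0 and 1, which is how `pandas.crosstab` behaves with `normalize='index'`; multiply by 100 for percentages.

```python
from collections import Counter


def cross_tabulate(records, row_key, col_key, normalize=None):
    """Count co-occurrences of two categorical fields (hypothetical helper).

    With normalize='index', each cell becomes its share of the row total.
    """
    counts = Counter((rec[row_key], rec[col_key]) for rec in records)
    row_labels = sorted({row for row, _ in counts})
    col_labels = sorted({col for _, col in counts})
    table = {}
    for row in row_labels:
        cells = {col: counts.get((row, col), 0) for col in col_labels}
        if normalize == 'index':
            row_total = sum(cells.values())
            cells = {col: n / row_total for col, n in cells.items()}
        table[row] = cells
    return table


records = [
    {'age_group': '18-24', 'satisfaction': 'Satisfied'},
    {'age_group': '18-24', 'satisfaction': 'Neutral'},
    {'age_group': '25-34', 'satisfaction': 'Satisfied'},
    {'age_group': '25-34', 'satisfaction': 'Satisfied'},
    {'age_group': '25-34', 'satisfaction': 'Dissatisfied'},
    {'age_group': '25-34', 'satisfaction': 'Satisfied'},
]
counts = cross_tabulate(records, 'age_group', 'satisfaction')
shares = cross_tabulate(records, 'age_group', 'satisfaction', normalize='index')
print(counts['25-34'])
print(shares['25-34'])
```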

Statistical Testing

    result = analyzer.chi_square_test('age_group', 'satisfaction')
    # {
    #     'statistic': 12.45,
    #     'p_value': 0.014,
    #     'significant': True,
    #     'interpretation': 'There is a significant relationship between
    #                        age_group and satisfaction (p=0.014)'
    # }
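
For readers who want to see what the test statistic measures, the sketch below computes Pearson's chi-square statistic and its degrees of freedom from a table of observed counts. It is an illustration only: the helper name `chi_square_statistic` is hypothetical, no continuity correction is applied, and the p-value is left out because it needs the chi-square distribution.

```python
def chi_square_statistic(observed):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    `observed` is a list of rows of counts (hypothetical helper).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if the two variables were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof


stat, dof = chi_square_statistic([[10, 20], [20, 10]])
print(f'chi2={stat:.2f}, dof={dof}')  # chi2=6.67, dof=1
```

In practice `scipy.stats.chi2_contingency` returns the statistic, p-value, degrees of freedom and expected counts in one call. It applies Yates' continuity correction to 2x2 tables by default, so its statistic for the table above would be smaller than the uncorrected value printed here.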

Visualization

    # Heatmap
    analyzer.plot_crosstab('age_group', 'satisfaction', 'crosstab_heatmap.png')

Sentiment Analysis

Analyze open-ended text responses:

    # Analyze all comments
    sentiment_df = analyzer.sentiment_analysis('comments')
    #    comment              polarity  sentiment
    # 0  "Great product!"          0.8  Positive
    # 1  "Could be better"         0.1  Neutral
    # 2  "Very disappointed"      -0.6  Negative

    # Summary
    summary = analyzer.sentiment_summary('comments')
    # {
    #     'positive': 65%,
    #     'neutral': 20%,
    #     'negative': 15%,
    #     'avg_polarity': 0.35
    # }
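
The mapping from a polarity score to a label, and from labels to the summary percentages, might look like the sketch below. It assumes polarity values in the range -1 to 1, such as those TextBlob exposes as `TextBlob(text).sentiment.polarity`. The ±0.1 cut-off and both helper names are assumptions, chosen only so that a polarity of 0.1 is labelled Neutral as in the example table. The analyzer's real thresholds are not documented here.

```python
from collections import Counter
from statistics import mean


def label_polarity(polarity, threshold=0.1):
    """Map a polarity in [-1, 1] to a label (threshold is an assumption)."""
    if polarity > threshold:
        return 'Positive'
    if polarity < -threshold:
        return 'Negative'
    return 'Neutral'


def summarize_polarities(polarities, threshold=0.1):
    """Percentage of each label plus the average polarity (hypothetical helper)."""
    labels = Counter(label_polarity(p, threshold) for p in polarities)
    total = len(polarities)
    return {
        'positive': 100 * labels['Positive'] / total,
        'neutral': 100 * labels['Neutral'] / total,
        'negative': 100 * labels['Negative'] / total,
        'avg_polarity': round(mean(polarities), 2),
    }


polarities = [0.8, 0.1, -0.6, 0.5]
print([label_polarity(p) for p in polarities])
print(summarize_polarities(polarities))
```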

    # Visualize
    analyzer.plot_sentiment('comments', 'sentiment_distribution.png')

Open-Ended Analysis

Word Frequency

    words = analyzer.word_frequency('comments', top_n=20)
    #    Word     Frequency
    # 0  great           45
    # 1  quality         38
    # 2  price           32
    # ...

Word Cloud

    analyzer.word_cloud('comments', 'wordcloud.png')

Theme Extraction

    themes = analyzer.extract_themes('feedback', n_themes=5)
    # ['product quality', 'customer service', 'pricing',
    #  'delivery speed', 'user experience']
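
A word-frequency table like the one in the Word Frequency example can be approximated with a tokenizer and a counter. The sketch below is a simplified illustration: the tiny stop-word set, the regular expression and the helper name `top_words` are all assumptions, and a real analysis would normally use a fuller stop-word list.

```python
import re
from collections import Counter

# Deliberately small stop-word set for illustration only
STOPWORDS = {'the', 'a', 'an', 'and', 'is', 'was', 'but', 'it', 'of', 'to', 'for', 'too'}


def top_words(texts, top_n=20, stopwords=STOPWORDS):
    """Return the top_n most frequent non-stop-words as (word, count) pairs."""
    counts = Counter()
    for text in texts:
        # Lowercase, then keep runs of letters and apostrophes as tokens
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in stopwords and len(t) > 1)
    return counts.most_common(top_n)


comments = [
    'Great product, great price!',
    'The quality is great.',
    'Price was too high for the quality',
]
words = top_words(comments, top_n=3)
print(words)
```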

Satisfaction Metrics

Net Promoter Score (NPS)

    nps = analyzer.nps_score('recommendation')
    # 0-10 scale
    # {
    #     'promoters': 65%,   # 9-10
    #     'passives': 25%,    # 7-8
    #     'detractors': 10%,  # 0-6
    #     'nps': 55
    # }
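
NPS is the percentage of promoters minus the percentage of detractors, using the score bands shown in the comments above. The following sketch works through that arithmetic on made-up ratings. The function name `net_promoter_score` is hypothetical and is not the analyzer's own method.

```python
def net_promoter_score(ratings):
    """Compute NPS from 0-10 ratings: % promoters (9-10) minus % detractors (0-6)."""
    total = len(ratings)
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    passives = total - promoters - detractors  # ratings of 7 or 8
    promoter_pct = 100 * promoters / total
    detractor_pct = 100 * detractors / total
    return {
        'promoters': promoter_pct,
        'passives': 100 * passives / total,
        'detractors': detractor_pct,
        'nps': round(promoter_pct - detractor_pct),
    }


# 20 made-up ratings: 13 promoters, 5 passives, 2 detractors
ratings = [10] * 8 + [9] * 5 + [8] * 3 + [7] * 2 + [6, 3]
nps = net_promoter_score(ratings)
print(nps)  # {'promoters': 65.0, 'passives': 25.0, 'detractors': 10.0, 'nps': 55}
```

The score ranges from -100 (all detractors) to 100 (all promoters); passives affect it only by enlarging the base.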

Overall Satisfaction

    satisfaction = analyzer.satisfaction_score([
        'product_quality', 'customer_service', 'value_for_money', 'ease_of_use'
    ])
    # {
    #     'overall_score': 4.3,
    #     'category_scores': {...},
    #     'satisfaction_rate': 86%  # % scoring 4-5
    # }
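
One plausible way to arrive at figures of this shape is sketched below. The aggregation rules are assumptions, since the analyzer's exact definitions are not given here: the overall score is taken as the mean of all ratings across the listed columns, and the satisfaction rate as the share of individual ratings that are 4 or 5. The helper `overall_satisfaction` and the sample rows are hypothetical.

```python
from statistics import mean


def overall_satisfaction(rows, columns):
    """Aggregate 1-5 ratings across several columns (hypothetical helper)."""
    category_scores = {col: round(mean([row[col] for row in rows]), 2) for col in columns}
    all_ratings = [row[col] for row in rows for col in columns]
    satisfied = sum(1 for rating in all_ratings if rating >= 4)
    return {
        'overall_score': round(mean(all_ratings), 2),
        'category_scores': category_scores,
        # Share of individual ratings that are 4 or 5
        'satisfaction_rate': 100 * satisfied / len(all_ratings),
    }


rows = [
    {'product_quality': 5, 'customer_service': 4},
    {'product_quality': 4, 'customer_service': 3},
    {'product_quality': 5, 'customer_service': 2},
    {'product_quality': 4, 'customer_service': 5},
]
score = overall_satisfaction(rows, ['product_quality', 'customer_service'])
print(score)
```

A per-respondent definition (for example, the share of respondents whose average rating is at least 4) would give different numbers, so the choice should be stated alongside any reported rate.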

Demographics Analysis

    demographics = analyzer.demographics_summary([
        'age_group', 'gender', 'location', 'income_range'
    ])
    # Returns frequency tables for each demographic variable

Response Rate Analysis

    response_rate = analyzer.response_rate()
    # {
    #     'total_respondents': 200,
    #     'completion_rate': 85%,
    #     'average_time': '5m 30s',
    #     'dropout_points': {
    #         'question_5': 8%,
    #         'question_12': 5%
    #     }
    # }

Report Generation

Comprehensive Report

    analyzer.generate_report('survey_report.pdf', format='pdf')

Report includes:

- Executive summary
- Response rate and demographics
- Question-by-question analysis
- Likert scale visualizations
- Cross-tabulations
- Sentiment analysis
- Key findings and recommendations

Custom Report Sections

    analyzer.set_report_sections([
        'executive_summary',
        'demographics',
        'likert_questions',
        'cross_tabs',
        'sentiment',
        'recommendations'
    ])

Advanced Features

Filter by Segment

    # Analyze subset of responses
    analyzer.filter('age_group', '25-34')
    results = analyzer.likert_analysis('satisfaction')
    analyzer.clear_filter()

Compare Segments

    comparison = analyzer.compare_segments(
        segment_col='age_group',
        metric_col='satisfaction'
    )
    # Shows how different segments scored the metric
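
Conceptually, a segment comparison groups responses by the segment column and summarizes the metric within each group. The sketch below shows that idea with per-segment counts and means. The helper `segment_means`, its output shape and the sample rows are assumptions for illustration; they do not describe what `compare_segments` actually returns.

```python
from collections import defaultdict
from statistics import mean


def segment_means(rows, segment_col, metric_col):
    """Group a numeric metric by segment; report count and mean (hypothetical helper)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[segment_col]].append(row[metric_col])
    return {
        segment: {'n': len(values), 'mean': round(mean(values), 2)}
        for segment, values in sorted(groups.items())
    }


rows = [
    {'age_group': '18-24', 'satisfaction': 4},
    {'age_group': '18-24', 'satisfaction': 5},
    {'age_group': '25-34', 'satisfaction': 3},
    {'age_group': '25-34', 'satisfaction': 4},
    {'age_group': '25-34', 'satisfaction': 5},
    {'age_group': '35-44', 'satisfaction': 2},
]
by_segment = segment_means(rows, 'age_group', 'satisfaction')
for segment, stats in by_segment.items():
    print(segment, stats)
```

Reporting the group size next to each mean makes it easier to spot segments, like the single-response one here, that are too small to compare meaningfully.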

Trend Analysis

For longitudinal surveys:

    trends = analyzer.trend_analysis(
        metric='satisfaction',
        time_col='survey_date',
        period='month'
    )
    analyzer.plot_trends(trends, 'satisfaction_trend.png')

Dependencies

    pandas>=2.0.0
    numpy>=1.24.0
    scipy>=1.10.0
    textblob>=0.17.0
    matplotlib>=3.7.0
    seaborn>=0.12.0
    wordcloud>=1.9.0
    reportlab>=4.0.0