Survey Analyzer

Comprehensive survey data analysis with Likert scales, cross-tabs, and sentiment analysis.

Features

Likert Scale Analysis

Agreement scale scoring and visualization

Cross-Tabulation

Relationship analysis between categorical variables

Frequency Analysis

Response distributions and percentages

Sentiment Scoring

Text response sentiment analysis

Open-Ended Analysis

Theme extraction from text responses

Statistical Tests

Chi-square, correlations, significance testing

Visualizations

Bar charts, heatmaps, word clouds, distribution plots

Report Generation

Comprehensive PDF/HTML reports

Quick Start

    from survey_analyzer import SurveyAnalyzer

    analyzer = SurveyAnalyzer()

    # Load survey data
    analyzer.load_csv('survey_responses.csv')

    # Analyze Likert scale question
    results = analyzer.likert_analysis('satisfaction', scale_type='agreement')
    print(f"Mean score: {results['mean_score']:.2f}")

    # Cross-tabulation
    crosstab = analyzer.crosstab('age_group', 'product_preference')
    print(crosstab)

    # Generate report
    analyzer.generate_report('survey_report.pdf')

CLI Usage

    # Analyze Likert scale
    python survey_analyzer.py --data survey.csv --likert satisfaction --output results.pdf

    # Cross-tabulation
    python survey_analyzer.py --data survey.csv --crosstab age_group product --output crosstab.png

    # Sentiment analysis
    python survey_analyzer.py --data survey.csv --sentiment comments --output sentiment.html

    # Full report
    python survey_analyzer.py --data survey.csv --report --output full_report.pdf

API Reference

SurveyAnalyzer Class

    class SurveyAnalyzer:
        def __init__(self)

        # Data Loading
        def load_csv(self, filepath, **kwargs) -> 'SurveyAnalyzer'
        def load_data(self, data: pd.DataFrame) -> 'SurveyAnalyzer'

        # Likert Scale Analysis
        def likert_analysis(self, column, scale_type='agreement') -> Dict
        def likert_comparison(self, columns: List[str]) -> pd.DataFrame
        def plot_likert(self, column, output, scale_type='agreement') -> str

        # Frequency Analysis
        def frequency_table(self, column) -> pd.DataFrame
        def multiple_choice(self, column, delimiter=',') -> pd.DataFrame
        def plot_frequencies(self, column, output, top_n=None) -> str

        # Cross-Tabulation
        def crosstab(self, row_var, col_var, normalize=None) -> pd.DataFrame
        def chi_square_test(self, row_var, col_var) -> Dict
        def plot_crosstab(self, row_var, col_var, output) -> str

        # Sentiment Analysis
        def sentiment_analysis(self, column) -> pd.DataFrame
        def sentiment_summary(self, column) -> Dict
        def plot_sentiment(self, column, output) -> str

        # Open-Ended Analysis
        def word_frequency(self, column, top_n=20) -> pd.DataFrame
        def word_cloud(self, column, output) -> str
        def extract_themes(self, column, n_themes=5) -> List[str]

        # Statistics
        def satisfaction_score(self, columns: List[str]) -> Dict
        def response_rate(self) -> Dict
        def demographics_summary(self, columns: List[str]) -> pd.DataFrame

        # Reporting
        def generate_report(self, output, format='pdf') -> str
        def summary(self) -> str

Likert Scale Analysis

Standard Scales

    # 5-point agreement scale
    analyzer.likert_analysis('satisfaction', scale_type='agreement')
    # 1=Strongly Disagree, 2=Disagree, 3=Neutral, 4=Agree, 5=Strongly Agree

    # 5-point frequency scale
    analyzer.likert_analysis('usage', scale_type='frequency')
    # 1=Never, 2=Rarely, 3=Sometimes, 4=Often, 5=Always

    # Custom scale
    analyzer.likert_analysis('rating', scale_type='custom',
                             labels=['Poor', 'Fair', 'Good', 'Excellent'])

Results

    results = analyzer.likert_analysis('satisfaction')
    # {
    #     'mean_score': 4.07,
    #     'median': 4,
    #     'mode': 4,
    #     'distribution': {1: 2, 2: 5, 3: 15, 4: 40, 5: 38},
    #     'percentages': {1: 2%, 2: 5%, 3: 15%, 4: 40%, 5: 38%},
    #     'top_2_box': 78%,     # % Agree + Strongly Agree
    #     'bottom_2_box': 7%    # % Disagree + Strongly Disagree
    # }
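To make the result fields concrete, here is a minimal standard-library sketch that recomputes them from a list of integer-coded responses. The function name `likert_summary` and its `scale_max` parameter are hypothetical; this is not the library's implementation, only one plausible way the fields could be derived.

```python
from collections import Counter
from statistics import mean, median, mode


def likert_summary(responses, scale_max=5):
    """Summarize integer Likert responses coded 1..scale_max (hypothetical helper)."""
    counts = Counter(responses)
    n = len(responses)
    distribution = {k: counts.get(k, 0) for k in range(1, scale_max + 1)}
    percentages = {k: 100 * v / n for k, v in distribution.items()}
    return {
        "mean_score": round(mean(responses), 2),
        "median": median(responses),
        "mode": mode(responses),
        "distribution": distribution,
        "percentages": percentages,
        # Top-2 box: share of the two highest codes (Agree + Strongly Agree).
        "top_2_box": percentages[scale_max] + percentages[scale_max - 1],
        # Bottom-2 box: share of the two lowest codes (Disagree + Strongly Disagree).
        "bottom_2_box": percentages[1] + percentages[2],
    }


# Rebuild 100 individual responses from the distribution {1: 2, 2: 5, 3: 15, 4: 40, 5: 38}.
responses = [1] * 2 + [2] * 5 + [3] * 15 + [4] * 40 + [5] * 38
summary = likert_summary(responses)
print(summary)
```

Top-2 and bottom-2 box scores are reported alongside the mean because the mean of an ordinal scale can hide a polarized distribution.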

Visualization

    # Stacked bar chart
    analyzer.plot_likert('satisfaction', 'likert_chart.png')

    # Compare multiple questions
    analyzer.likert_comparison(['quality', 'value', 'service'])
    analyzer.plot_likert_comparison(['quality', 'value', 'service'], 'comparison.png')

Frequency Analysis

Single Choice

    freq = analyzer.frequency_table('age_group')

    #        Count  Percentage
    # 18-24     45       22.5%
    # 25-34     78       39.0%
    # 35-44     52       26.0%
    # 45+       25       12.5%

    # Plot
    analyzer.plot_frequencies('age_group', 'age_distribution.png')

Multiple Choice

For questions allowing multiple selections:

    # Data format: "Option A, Option B, Option C"
    results = analyzer.multiple_choice('features_liked', delimiter=',')
    #             Count  Percentage
    # Price         120       60.0%
    # Quality        95       47.5%
    # Design         80       40.0%
    # Durability     70       35.0%
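The percentages in a multiple-choice table add up to more than 100% because each respondent can pick several options. A minimal sketch of that counting logic, assuming percentages are taken over respondents who answered (the helper name `multiple_choice_counts` and the sample data are hypothetical, and the library may handle blanks differently):

```python
from collections import Counter


def multiple_choice_counts(responses, delimiter=","):
    """Count options in delimiter-separated answers (hypothetical helper).

    Returns (option, count, percent_of_respondents) tuples, most frequent first.
    """
    counts = Counter()
    answered = 0
    for cell in responses:
        # Split the cell, trim whitespace, and drop empty fragments and duplicates.
        options = {opt.strip() for opt in cell.split(delimiter) if opt.strip()}
        if options:
            answered += 1
            counts.update(options)
    return [(opt, n, 100 * n / answered) for opt, n in counts.most_common()]


responses = ["Price, Quality", "Price", "Design, Price, Quality", "Durability"]
table = multiple_choice_counts(responses)
for option, count, pct in table:
    print(f"{option:<12}{count:>5}{pct:>8.1f}%")
```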

    analyzer.plot_frequencies('features_liked', 'features.png', top_n=10)

Cross-Tabulation

Basic Cross-Tab

    crosstab = analyzer.crosstab('age_group', 'satisfaction')

    #        Satisfied  Neutral  Dissatisfied
    # 18-24         30       10             5
    # 25-34         60       15             3
    # 35-44         40        8             4
    # 45+           18        5             2

    # With percentages
    crosstab_pct = analyzer.crosstab('age_group', 'satisfaction', normalize='index')
    # Row percentages
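Row percentages divide each cell by its row total, so every age group sums to 100%. The sketch below applies that to the counts shown above, assuming `normalize='index'` follows the same row-wise convention as `pandas.crosstab`. The helper `row_percentages` is hypothetical and uses plain dictionaries rather than a DataFrame.

```python
def row_percentages(table):
    """Convert a {row: {column: count}} table to row percentages (hypothetical helper)."""
    result = {}
    for row_label, row in table.items():
        total = sum(row.values())
        result[row_label] = {col: round(100 * n / total, 1) for col, n in row.items()}
    return result


counts = {
    "18-24": {"Satisfied": 30, "Neutral": 10, "Dissatisfied": 5},
    "25-34": {"Satisfied": 60, "Neutral": 15, "Dissatisfied": 3},
    "35-44": {"Satisfied": 40, "Neutral": 8, "Dissatisfied": 4},
    "45+": {"Satisfied": 18, "Neutral": 5, "Dissatisfied": 2},
}
pct = row_percentages(counts)
for age_group, row in pct.items():
    print(age_group, row)
```

Row percentages are usually the right choice when the row variable is the grouping of interest (here, comparing satisfaction across age groups of different sizes).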

Statistical Testing

    result = analyzer.chi_square_test('age_group', 'satisfaction')
    # {
    #     'statistic': 12.45,
    #     'p_value': 0.014,
    #     'significant': True,
    #     'interpretation': 'There is a significant relationship between
    #                        age_group and satisfaction (p=0.014)'
    # }
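For readers who want to see what the test computes, here is a minimal sketch of the Pearson chi-square statistic for a contingency table, written without dependencies. The function `chi_square_statistic` is hypothetical, and its output for the sample counts need not match the illustrative figures above. It stops short of the p-value; `scipy.stats.chi2_contingency` (scipy is listed in the dependencies) returns the statistic, p-value, degrees of freedom and expected counts in one call.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a list-of-rows table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the row and column variables.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Rows: 18-24, 25-34, 35-44, 45+; columns: Satisfied, Neutral, Dissatisfied.
observed = [[30, 10, 5], [60, 15, 3], [40, 8, 4], [18, 5, 2]]
stat, dof = chi_square_statistic(observed)
print(f"chi2 = {stat:.2f}, dof = {dof}")
```

The approximation behind the test is weaker when expected counts are small (a common rule of thumb is below 5), which is worth checking for sparse categories.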

Visualization

    # Heatmap
    analyzer.plot_crosstab('age_group', 'satisfaction', 'crosstab_heatmap.png')

Sentiment Analysis

Analyze open-ended text responses:

    # Analyze all comments
    sentiment_df = analyzer.sentiment_analysis('comments')
    #    comment              polarity  sentiment
    # 0  "Great product!"          0.8  Positive
    # 1  "Could be better"         0.1  Neutral
    # 2  "Very disappointed"      -0.6  Negative

    # Summary
    summary = analyzer.sentiment_summary('comments')
    # {
    #     'positive': 65%,
    #     'neutral': 20%,
    #     'negative': 15%,
    #     'avg_polarity': 0.35
    # }
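The summary can be read as a roll-up of per-comment polarity scores in the range -1 to 1, such as those TextBlob exposes as `TextBlob(text).sentiment.polarity`. The sketch below shows one way to bucket scores into labels and aggregate them. The cut-off of 0.2 and the names `label_polarity` and `summarize_polarities` are assumptions for illustration; the thresholds the library actually uses are not documented here.

```python
def label_polarity(polarity, threshold=0.2):
    """Map a polarity in [-1, 1] to a label; the threshold is an assumed cut-off."""
    if polarity > threshold:
        return "Positive"
    if polarity < -threshold:
        return "Negative"
    return "Neutral"


def summarize_polarities(polarities, threshold=0.2):
    """Share of each label (in percent) plus the average polarity."""
    labels = [label_polarity(p, threshold) for p in polarities]
    n = len(labels)
    return {
        "positive": 100 * labels.count("Positive") / n,
        "neutral": 100 * labels.count("Neutral") / n,
        "negative": 100 * labels.count("Negative") / n,
        "avg_polarity": sum(polarities) / n,
    }


# Polarity scores taken from the three example comments above.
polarities = [0.8, 0.1, -0.6]
labels = [label_polarity(p) for p in polarities]
sentiment = summarize_polarities(polarities)
print(labels, sentiment)
```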

    # Visualize
    analyzer.plot_sentiment('comments', 'sentiment_distribution.png')

Open-Ended Analysis

Word Frequency

    words = analyzer.word_frequency('comments', top_n=20)

    #    Word     Frequency
    # 0  great           45
    # 1  quality         38
    # 2  price           32
    # ...
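A word-frequency table of this kind typically comes from lowercasing the text, splitting it into words, dropping stopwords, and counting. The following is a minimal sketch of those steps using only the standard library. The tiny `STOPWORDS` set, the tokenizing regex, and the sample comments are assumptions for illustration, not the library's actual tokenizer or stopword list.

```python
import re
from collections import Counter

# Deliberately small, hypothetical stopword list; real lists are much longer.
STOPWORDS = {"the", "a", "is", "and", "but", "was", "very", "it", "to", "of"}


def word_frequency(comments, top_n=20):
    """Return the top_n (word, count) pairs across all comments."""
    counts = Counter()
    for comment in comments:
        # Lowercase, then keep runs of letters and apostrophes as words.
        words = re.findall(r"[a-z']+", comment.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)


comments = [
    "Great product, great price!",
    "The quality is great",
    "Price was high but quality is good",
]
top_words = word_frequency(comments, top_n=3)
print(top_words)
```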

Word Cloud

    analyzer.word_cloud('comments', 'wordcloud.png')

Theme Extraction

    themes = analyzer.extract_themes('feedback', n_themes=5)
    # ['product quality', 'customer service', 'pricing',
    #  'delivery speed', 'user experience']

Satisfaction Metrics

Net Promoter Score (NPS)

    nps = analyzer.nps_score('recommendation')
    # 0-10 scale
    # {
    #     'promoters': 65%,    # 9-10
    #     'passives': 25%,     # 7-8
    #     'detractors': 10%,   # 0-6
    #     'nps': 55
    # }
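NPS is the percentage of promoters minus the percentage of detractors; passives count toward the total but not the score. A minimal sketch of that arithmetic, using the 9-10 / 7-8 / 0-6 bands listed above (the standalone function `nps_from_ratings` and the sample ratings are hypothetical):

```python
def nps_from_ratings(ratings):
    """Compute NPS components from 0-10 ratings (hypothetical helper)."""
    n = len(ratings)
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    passives = n - promoters - detractors
    promoter_pct = 100 * promoters / n
    detractor_pct = 100 * detractors / n
    return {
        "promoters": promoter_pct,
        "passives": 100 * passives / n,
        "detractors": detractor_pct,
        # NPS ranges from -100 (all detractors) to +100 (all promoters).
        "nps": round(promoter_pct - detractor_pct),
    }


# 20 sample ratings: 13 promoters, 5 passives, 2 detractors.
ratings = [10] * 8 + [9] * 5 + [8] * 3 + [7] * 2 + [6] + [3]
nps = nps_from_ratings(ratings)
print(nps)
```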

Overall Satisfaction

    satisfaction = analyzer.satisfaction_score(['product_quality', 'customer_service',
                                                'value_for_money', 'ease_of_use'])
    # {
    #     'overall_score': 4.3,
    #     'category_scores': {...},
    #     'satisfaction_rate': 86%    # % scoring 4-5
    # }

Demographics Analysis

    demographics = analyzer.demographics_summary(['age_group', 'gender', 'location', 'income_range'])
    # Returns frequency tables for each demographic variable

Response Rate Analysis

    response_rate = analyzer.response_rate()

    # {
    #     'total_respondents': 200,
    #     'completion_rate': 85%,
    #     'average_time': '5m 30s',
    #     'dropout_points': {
    #         'question_5': 8%,
    #         'question_12': 5%
    #     }
    # }

Report Generation

Comprehensive Report

    analyzer.generate_report('survey_report.pdf', format='pdf')

Report includes:

- Executive summary
- Response rate and demographics
- Question-by-question analysis
- Likert scale visualizations
- Cross-tabulations
- Sentiment analysis
- Key findings and recommendations

Custom Report Sections

    analyzer.set_report_sections([
        'executive_summary',
        'demographics',
        'likert_questions',
        'cross_tabs',
        'sentiment',
        'recommendations'
    ])

Advanced Features

Filter by Segment

    # Analyze subset of responses
    analyzer.filter('age_group', '25-34')
    results = analyzer.likert_analysis('satisfaction')
    analyzer.clear_filter()

Compare Segments

    comparison = analyzer.compare_segments(segment_col='age_group', metric_col='satisfaction')
    # Shows how different segments scored the metric
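The return shape of `compare_segments` is not documented here, so the following is only a sketch of the underlying idea: group the metric by segment and report a per-segment count and mean. The function `segment_means`, the record layout, and the sample rows are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def segment_means(records, segment_col, metric_col):
    """Group records by segment and report n and mean of the metric (hypothetical helper)."""
    groups = defaultdict(list)
    for record in records:
        groups[record[segment_col]].append(record[metric_col])
    return {
        segment: {"n": len(values), "mean": round(mean(values), 2)}
        for segment, values in sorted(groups.items())
    }


records = [
    {"age_group": "18-24", "satisfaction": 4},
    {"age_group": "18-24", "satisfaction": 5},
    {"age_group": "25-34", "satisfaction": 3},
    {"age_group": "25-34", "satisfaction": 4},
    {"age_group": "25-34", "satisfaction": 5},
]
by_segment = segment_means(records, "age_group", "satisfaction")
print(by_segment)
```

Reporting `n` next to each mean makes it easier to spot segments whose score rests on only a handful of responses.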

Trend Analysis

For longitudinal surveys:

    trends = analyzer.trend_analysis(metric='satisfaction', time_col='survey_date', period='month')
    analyzer.plot_trends(trends, 'satisfaction_trend.png')

Dependencies

    pandas>=2.0.0
    numpy>=1.24.0
    scipy>=1.10.0
    textblob>=0.17.0
    matplotlib>=3.7.0
    seaborn>=0.12.0
    wordcloud>=1.9.0
    reportlab>=4.0.0
