audio-analyzer

Installs: 61
Rank: #12243

Installation

npx skills add https://github.com/dkyazzentwatwa/chatgpt-skills --skill audio-analyzer

Audio Analyzer

A toolkit for analyzing audio files. It extracts tempo, musical key, frequency content, and loudness metrics, and generates waveform, spectrogram, chromagram, and beat-grid visualizations.

Quick Start

from scripts.audio_analyzer import AudioAnalyzer

Analyze an audio file

analyzer = AudioAnalyzer("song.mp3")
analyzer.analyze()

Get all analysis results

results = analyzer.get_results()
print(f"BPM: {results['tempo']['bpm']}")
print(f"Key: {results['key']['key']} {results['key']['mode']}")

Generate visualizations

analyzer.plot_waveform("waveform.png")
analyzer.plot_spectrogram("spectrogram.png")

Full report

analyzer.save_report("analysis_report.json")

Features

- Tempo/BPM Detection: Accurate beat tracking with confidence score
- Key Detection: Musical key and mode (major/minor) identification
- Frequency Analysis: Spectrum, dominant frequencies, frequency bands
- Loudness Metrics: RMS, peak, LUFS, dynamic range
- Waveform Visualization: Multi-channel waveform plots
- Spectrogram: Time-frequency visualization with customization
- Chromagram: Pitch class visualization for harmonic analysis
- Beat Grid: Visual beat markers overlaid on waveform
- Export Formats: JSON report, PNG/SVG visualizations

API Reference

Initialization

From file

analyzer = AudioAnalyzer("audio.mp3")

With custom sample rate

analyzer = AudioAnalyzer("audio.wav", sr=44100)

Analysis Methods

Run full analysis

analyzer.analyze()

Individual analyses

analyzer.analyze_tempo()      # BPM and beat positions
analyzer.analyze_key()        # Musical key detection
analyzer.analyze_loudness()   # RMS, peak, LUFS
analyzer.analyze_frequency()  # Spectrum analysis
analyzer.analyze_dynamics()   # Dynamic range

Results Access

Get all results as dict

results = analyzer.get_results()

Individual results

tempo = analyzer.get_tempo()        # {'bpm': 120, 'confidence': 0.85, 'beats': [...]}
key = analyzer.get_key()            # {'key': 'C', 'mode': 'major', 'confidence': 0.72}
loudness = analyzer.get_loudness()  # {'rms_db': -14.2, 'peak_db': -0.5, 'lufs': -14.0}
freq = analyzer.get_frequency()     # {'dominant_freq': 440, 'spectrum': [...]}

Visualization Methods

Waveform

analyzer.plot_waveform(
    output="waveform.png",
    figsize=(12, 4),
    color="#1f77b4",
    show_rms=True
)

Spectrogram

analyzer.plot_spectrogram(
    output="spectrogram.png",
    figsize=(12, 6),
    cmap="magma",        # viridis, plasma, inferno, magma
    freq_scale="log",    # linear, log, mel
    max_freq=8000        # Hz
)

Chromagram (pitch classes)

analyzer.plot_chromagram(
    output="chromagram.png",
    figsize=(12, 4)
)

Onset strength / beat grid

analyzer.plot_beats(
    output="beats.png",
    figsize=(12, 4),
    show_strength=True
)

Combined dashboard

analyzer.plot_dashboard(
    output="dashboard.png",
    figsize=(14, 10)
)

Export

JSON report with all analysis

analyzer.save_report("report.json")

Summary text

summary = analyzer.get_summary()
print(summary)

Analysis Details

Tempo Detection

Uses a beat-tracking algorithm to detect:

- BPM: Beats per minute (tempo)
- Beat positions: Timestamps of detected beats
- Confidence: Reliability score (0-1)

tempo = analyzer.get_tempo()
# {
#     'bpm': 128.0,
#     'confidence': 0.89,
#     'beats': [0.0, 0.469, 0.938, 1.406, ...],  # seconds
#     'beat_count': 256
# }
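The relationship between the beat timestamps and the BPM figure can be checked by hand. The sketch below is not the analyzer's own method (the source does not say how it derives BPM); it is a minimal illustration that assumes tempo can be approximated from the median spacing between consecutive beats. The helper name `bpm_from_beats` is hypothetical.

```python
from statistics import median


def bpm_from_beats(beat_times):
    """Estimate tempo from beat timestamps (in seconds).

    Uses the median inter-beat interval, which is less sensitive
    to a single mistimed beat than the mean.
    """
    if len(beat_times) < 2:
        raise ValueError("need at least two beats to estimate tempo")
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    return 60.0 / median(intervals)


# First few beat positions from the example result above
beats = [0.0, 0.469, 0.938, 1.406]
estimated_bpm = bpm_from_beats(beats)
print(f"{estimated_bpm:.1f} BPM")
```

With beats spaced about 0.469 s apart, the estimate lands close to the reported 128 BPM.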

Key Detection

Analyzes harmonic content to identify:

- Key: Root note (C, C#, D, etc.)
- Mode: Major or minor
- Confidence: Detection confidence
- Key profile: Correlation with each key

key = analyzer.get_key()
# {
#     'key': 'A',
#     'mode': 'minor',
#     'confidence': 0.76,
#     'profile':
# }
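"Correlation with each key" suggests a profile-matching approach. The source does not name the algorithm, so the following is only a sketch of one common technique, the Krumhansl-Schmuckler method: a 12-bin chroma vector is correlated against the Krumhansl-Kessler major and minor profiles rotated to each of the 12 tonics, and the best match wins. The chroma values and the helper names (`pearson`, `rotate`, `estimate_key`) are illustrative assumptions.

```python
from math import sqrt

NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Krumhansl-Kessler key profiles, indexed from the tonic
MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rotate(profile, tonic):
    """Shift a tonic-based profile so its root sits at pitch class `tonic`."""
    return [profile[(i - tonic) % 12] for i in range(12)]


def estimate_key(chroma):
    """Return (key, mode, correlation) for the best match among 24 keys."""
    best = None
    for tonic in range(12):
        for mode, profile in (("major", MAJOR), ("minor", MINOR)):
            r = pearson(chroma, rotate(profile, tonic))
            if best is None or r > best[2]:
                best = (NOTES[tonic], mode, r)
    return best


# Hypothetical time-averaged chroma vector, ordered C, C#, ..., B
chroma = [0.8, 0.1, 0.3, 0.1, 0.7, 0.2, 0.1, 0.4, 0.1, 1.0, 0.1, 0.3]
key, mode, r = estimate_key(chroma)
print(key, mode, round(r, 2))
```

In this sketch the winning correlation plays the role of a confidence value, and the 24 correlations together would form a key profile. Whether the analyzer defines its `confidence` and `profile` fields this way is not stated in the source.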

Loudness Metrics

Comprehensive loudness analysis:

- RMS dB: Root mean square level
- Peak dB: Maximum sample level
- LUFS: Integrated loudness (broadcast standard)
- Dynamic Range: Difference between loud and quiet sections

loudness = analyzer.get_loudness()
# {
#     'rms_db': -14.2,
#     'peak_db': -0.3,
#     'lufs': -14.0,
#     'dynamic_range_db': 12.5,
#     'crest_factor': 8.2
# }
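The RMS and peak figures follow directly from the sample values. The sketch below computes them in dB relative to full scale for samples in the range [-1, 1]. It is an illustration, not the analyzer's code: the function name `level_metrics` is hypothetical, and expressing crest factor as peak minus RMS in dB is an assumption, since the source does not define the unit of `crest_factor`. LUFS is left out because it additionally requires frequency weighting and gating.

```python
import math


def level_metrics(samples, floor=1e-12):
    """Return RMS level, peak level, and crest factor, all in dB.

    `samples` are floats in [-1, 1]; `floor` avoids log10(0) on silence.
    """
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    peak = max(abs(s) for s in samples)
    rms_db = 20 * math.log10(max(rms, floor))
    peak_db = 20 * math.log10(max(peak, floor))
    return {
        "rms_db": rms_db,
        "peak_db": peak_db,
        "crest_factor_db": peak_db - rms_db,
    }


# One second of a 440 Hz sine at half of full scale, sampled at 44.1 kHz
sr = 44100
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]
metrics = level_metrics(tone)
print({name: round(value, 2) for name, value in metrics.items()})
```

For a sine wave the crest factor is about 3 dB, so this test tone should report roughly -9 dB RMS against a -6 dB peak.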

Frequency Analysis

Spectrum analysis including:

- Dominant frequency: Strongest frequency component
- Frequency bands: Energy in bass, mid, treble
- Spectral centroid: "Brightness" of audio
- Spectral rolloff: Frequency below which 85% of the spectral energy lies

freq = analyzer.get_frequency()
# {
#     'dominant_freq': 440.0,
#     'spectral_centroid': 2150.3,
#     'spectral_rolloff': 4200.5,
#     'bands': {
#         'sub_bass': -28.5,   # 20-60 Hz
#         'bass': -18.2,       # 60-250 Hz
#         'low_mid': -12.1,    # 250-500 Hz
#         'mid': -10.8,        # 500-2000 Hz
#         'high_mid': -14.3,   # 2000-4000 Hz
#         'high': -22.1        # 4000-20000 Hz
#     }
# }
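To make the three spectral measures concrete, here is a small, dependency-free sketch that computes them from a naive DFT of a short signal. It is illustrative only: the analyzer presumably relies on librosa for this, the function names are hypothetical, and computing rolloff on power (squared magnitude) is an assumption about what "energy" means here.

```python
import cmath
import math


def magnitude_spectrum(samples, sr):
    """Naive DFT: return (frequencies in Hz, magnitudes) for bins 0..N/2."""
    n = len(samples)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        freqs.append(k * sr / n)
        mags.append(abs(acc))
    return freqs, mags


def spectral_features(samples, sr, rolloff=0.85):
    """Dominant frequency, spectral centroid, and rolloff frequency."""
    freqs, mags = magnitude_spectrum(samples, sr)
    # Dominant frequency: the bin with the largest magnitude
    dominant = freqs[max(range(len(mags)), key=mags.__getitem__)]
    # Centroid: magnitude-weighted mean frequency
    centroid = sum(f * m for f, m in zip(freqs, mags)) / sum(mags)
    # Rolloff: lowest frequency where cumulative power reaches the threshold
    power = [m * m for m in mags]
    threshold = rolloff * sum(power)
    running, rolloff_freq = 0.0, freqs[-1]
    for f, p in zip(freqs, power):
        running += p
        if running >= threshold:
            rolloff_freq = f
            break
    return {
        "dominant_freq": dominant,
        "spectral_centroid": centroid,
        "spectral_rolloff": rolloff_freq,
    }


# 256 samples of a 1 kHz sine at 8 kHz: the tone falls exactly on a DFT bin
sr = 8000
tone = [math.sin(2 * math.pi * 1000 * i / sr) for i in range(256)]
features = spectral_features(tone, sr)
print(features)
```

For a pure tone all three values coincide near the tone's frequency. On real music the centroid and rolloff sit well above the dominant frequency, as in the example result above.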

CLI Usage

Full analysis with all visualizations

python audio_analyzer.py --input song.mp3 --output-dir ./analysis/

Just tempo and key

python audio_analyzer.py --input song.mp3 --analyze tempo key --output report.json

Generate specific visualization

python audio_analyzer.py --input song.mp3 --plot spectrogram --output spec.png

Dashboard view

python audio_analyzer.py --input song.mp3 --dashboard --output dashboard.png

Batch analyze directory

python audio_analyzer.py --input-dir ./songs/ --output-dir ./reports/
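For readers who want to wrap the analyzer in their own script, the flags used in the commands above could be parsed along these lines. This is a sketch built on Python's standard `argparse`, not the script's actual parser: the defaults are taken from the CLI Arguments table, and treating `--input` and `--input-dir` as mutually exclusive is an assumption.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (a sketch, not the real CLI)."""
    parser = argparse.ArgumentParser(
        prog="audio_analyzer.py", description="Analyze audio files"
    )
    # Assumption: exactly one of --input / --input-dir must be given
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--input", help="Input audio file")
    source.add_argument("--input-dir", help="Directory of audio files")
    parser.add_argument("--output", help="Output file path")
    parser.add_argument("--output-dir", default=".", help="Output directory")
    parser.add_argument(
        "--analyze", nargs="+", default=["all"],
        choices=["tempo", "key", "loudness", "frequency", "all"],
        help="Analysis types to run",
    )
    parser.add_argument(
        "--plot",
        choices=["waveform", "spectrogram", "chromagram", "beats", "dashboard"],
        help="Plot type to generate",
    )
    parser.add_argument("--dashboard", action="store_true", help="Dashboard view")
    parser.add_argument("--format", choices=["json", "txt"], default="json")
    parser.add_argument("--sr", type=int, default=22050, help="Sample rate")
    return parser


args = build_parser().parse_args(
    ["--input", "song.mp3", "--analyze", "tempo", "key", "--output", "report.json"]
)
print(args.analyze, args.sr, args.format)
```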

CLI Arguments

Argument      Description                                                      Default
--input       Input audio file                                                 Required
--input-dir   Directory of audio files                                         -
--output      Output file path                                                 -
--output-dir  Output directory                                                 .
--analyze     Analysis types: tempo, key, loudness, frequency, all             all
--plot        Plot type: waveform, spectrogram, chromagram, beats, dashboard   -
--format      Output format: json, txt                                         json
--sr          Sample rate for analysis                                         22050

Examples

Song Analysis

analyzer = AudioAnalyzer("track.mp3")
analyzer.analyze()

print(f"Tempo: {analyzer.get_tempo()['bpm']:.1f} BPM")
print(f"Key: {analyzer.get_key()['key']} {analyzer.get_key()['mode']}")
print(f"Loudness: {analyzer.get_loudness()['lufs']:.1f} LUFS")

analyzer.plot_dashboard("track_analysis.png")

Podcast Quality Check

analyzer = AudioAnalyzer("podcast.mp3")
analyzer.analyze_loudness()

loudness = analyzer.get_loudness()
if loudness['lufs'] > -16:
    print("Warning: Audio may be too loud for podcast standards")
elif loudness['lufs'] < -20:
    print("Warning: Audio may be too quiet")
else:
    print("Loudness is within podcast standards (-16 to -20 LUFS)")

Batch Analysis

import os
from scripts.audio_analyzer import AudioAnalyzer

results = []
for filename in os.listdir("./songs"):
    if filename.endswith(('.mp3', '.wav', '.flac')):
        analyzer = AudioAnalyzer(f"./songs/{filename}")
        analyzer.analyze()
        results.append({
            'file': filename,
            'bpm': analyzer.get_tempo()['bpm'],
            'key': f"{analyzer.get_key()['key']} {analyzer.get_key()['mode']}",
            'lufs': analyzer.get_loudness()['lufs']
        })

Sort by BPM for DJ set

results.sort(key=lambda x: x['bpm'])

Supported Formats

Input formats (via librosa/soundfile):

- MP3
- WAV
- FLAC
- OGG
- M4A/AAC
- AIFF

Output formats:

- JSON (analysis report)
- PNG (visualizations)
- SVG (visualizations)
- TXT (summary)

Dependencies

librosa>=0.10.0
soundfile>=0.12.0
matplotlib>=3.7.0
numpy>=1.24.0
scipy>=1.10.0

Limitations

- Key detection works best with melodic content (less accurate for drums/percussion)
- BPM detection may struggle with free-tempo or complex time signatures
- Very short clips (<5 seconds) may have reduced accuracy
- LUFS calculation is simplified (not full ITU-R BS.1770-4)
