tooluniverse-gwas-study-explorer


Install

npx skills add https://github.com/mims-harvard/tooluniverse --skill tooluniverse-gwas-study-explorer
GWAS Study Deep Dive & Meta-Analysis
Compare GWAS studies, perform meta-analyses, and assess replication across cohorts
Overview
The GWAS Study Deep Dive & Meta-Analysis skill enables comprehensive comparison of genome-wide association studies (GWAS) for the same trait, meta-analysis of genetic loci across studies, and systematic assessment of replication and study quality. It integrates data from the NHGRI-EBI GWAS Catalog and Open Targets Genetics to provide a complete picture of the genetic architecture of complex traits.
Key Capabilities
Study Comparison: Compare all GWAS studies for a trait, assessing sample sizes, ancestries, and platforms
Meta-Analysis: Aggregate effect sizes across studies and calculate heterogeneity statistics
Replication Assessment: Identify replicated vs novel findings across discovery and replication cohorts
Quality Evaluation: Assess statistical power, ancestry diversity, and data availability
Use Cases
1. Comprehensive Trait Analysis
Scenario: "I want to understand all available GWAS data for type 2 diabetes"
Workflow:
Search for all T2D studies in GWAS Catalog
Filter by sample size and ancestry
Extract top associations from each study
Identify consistently replicated loci
Assess ancestry-specific effects
Outcome: Complete landscape of T2D genetics with replicated findings and population-specific signals
2. Locus-Specific Meta-Analysis
Scenario: "Is the TCF7L2 association with T2D consistent across all studies?"
Workflow:
Retrieve all TCF7L2 (rs7903146) associations for T2D
Calculate combined effect size and p-value
Assess heterogeneity (I² statistic)
Generate forest plot data
Interpret heterogeneity level
Outcome: Quantitative assessment of effect size consistency with heterogeneity interpretation
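The forest plot step in this workflow can be sketched as follows. This is a minimal illustration, not the skill's own implementation: it assumes each study reports a log odds ratio and its standard error, and the function name `forest_rows` and the input values are hypothetical.

```python
import math

def forest_rows(studies, z=1.96):
    """Convert (label, log-OR, SE) tuples into (label, OR, lower, upper) rows.

    z=1.96 gives an approximate 95% confidence interval.
    """
    rows = []
    for label, beta, se in studies:
        rows.append((
            label,
            math.exp(beta),           # odds ratio
            math.exp(beta - z * se),  # lower confidence bound
            math.exp(beta + z * se),  # upper confidence bound
        ))
    return rows

# Hypothetical per-study estimates, for illustration only.
example = [("Study A", 0.30, 0.05), ("Study B", 0.35, 0.08)]
for label, odds_ratio, lower, upper in forest_rows(example):
    print(f"{label}: OR={odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

Each row can be passed to any plotting library; the interval is computed on the log scale and then exponentiated, so it is asymmetric around the odds ratio.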
3. Replication Analysis
Scenario: "Which findings from the discovery cohort replicated in the independent sample?"
Workflow:
Get top hits from discovery study
Check for presence and significance in replication study
Assess direction consistency
Calculate replication rate
Separate replicated findings from failed replications
Outcome: Systematic replication report with success rates and failed findings
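The replication workflow above can be sketched as a small function. It is an illustration under stated assumptions: results are supplied as dictionaries mapping a SNP ID to `(beta, p_value)`, a hit counts as replicated when the effect direction matches and p < 0.05 in the replication cohort, and the name `replication_report` and the example values are hypothetical.

```python
def replication_report(discovery, replication, alpha=0.05):
    """Classify discovery hits as replicated, failed, or untested.

    discovery, replication: dict mapping SNP ID -> (beta, p_value).
    """
    replicated, failed, untested = [], [], []
    for snp, (d_beta, _d_p) in discovery.items():
        if snp not in replication:
            untested.append(snp)  # SNP absent from the replication study
            continue
        r_beta, r_p = replication[snp]
        same_direction = d_beta * r_beta > 0
        if same_direction and r_p < alpha:
            replicated.append(snp)
        else:
            failed.append(snp)
    tested = len(replicated) + len(failed)
    rate = len(replicated) / tested if tested else float("nan")
    return {
        "replicated": replicated,
        "failed": failed,
        "untested": untested,
        "rate": rate,
    }

# Hypothetical inputs, for illustration only.
discovery = {"snp1": (0.30, 1e-12), "snp2": (0.10, 3e-9), "snp3": (-0.08, 4e-8)}
replication = {"snp1": (0.25, 1e-4), "snp2": (-0.02, 0.60)}
print(replication_report(discovery, replication))
```

SNPs missing from the replication study are reported separately rather than counted as failures, since absence usually reflects coverage, not evidence against the association.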
4. Multi-Ancestry Comparison
Scenario: "Are T2D loci consistent across European and East Asian populations?"
Workflow:
Filter studies by ancestry
Compare top associations between populations
Identify shared vs population-specific loci
Assess allele frequency differences
Evaluate transferability of genetic risk scores
Outcome: Ancestry-specific genetic architecture with transferability assessment
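The shared vs population-specific comparison reduces to set operations once each population's significant loci are known. The sketch below assumes loci have already been mapped to common labels; `compare_loci` and the locus names are hypothetical placeholders, not real results.

```python
def compare_loci(loci_a, loci_b):
    """Split two collections of locus labels into shared and specific sets."""
    a, b = set(loci_a), set(loci_b)
    return {
        "shared": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }

# Placeholder locus labels, for illustration only.
european = ["LOCUS_1", "LOCUS_2", "LOCUS_3"]
east_asian = ["LOCUS_2", "LOCUS_3", "LOCUS_4"]
print(compare_loci(european, east_asian))
```

A locus that appears in only one list is not necessarily population-specific: it may be undetected in the other population because of lower allele frequency or a smaller sample, which is why the workflow also checks allele frequencies.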
Statistical Methods
Meta-Analysis Approach
This skill implements standard GWAS meta-analysis methods:
Fixed-Effects Model:
Used when heterogeneity is low (I² < 25%)
Weights studies by inverse variance
Assumes true effect size is the same across studies
Random-Effects Model (recommended when I² > 50%):
Accounts for between-study variation
More conservative than fixed-effects
Better for diverse ancestries or methodologies
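The two models can be sketched in a few lines. The fixed-effects estimate is the standard inverse-variance weighted mean. For the random-effects estimate, the source does not say which between-study variance estimator the skill uses, so the DerSimonian-Laird estimator below is an assumption. Function names and input values are illustrative.

```python
import math

def fixed_effects(betas, ses):
    """Inverse-variance weighted pooled effect and its standard error."""
    weights = [1.0 / se**2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, math.sqrt(1.0 / total)

def random_effects_dl(betas, ses):
    """Random-effects pooled effect using the DerSimonian-Laird tau^2 (assumed)."""
    weights = [1.0 / se**2 for se in ses]
    beta_fe, _ = fixed_effects(betas, ses)
    q = sum(w * (b - beta_fe) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    c = sum(weights) - sum(w**2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (se**2 + tau2) for se in ses]
    total = sum(re_weights)
    beta = sum(w * b for w, b in zip(re_weights, betas)) / total
    return beta, math.sqrt(1.0 / total), tau2

def two_sided_p(beta, se):
    """Two-sided p-value from a normal approximation to beta / se."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))

# Hypothetical log odds ratios and standard errors from three studies.
betas, ses = [0.30, 0.35, 0.10], [0.05, 0.08, 0.06]
print("fixed: ", fixed_effects(betas, ses))
print("random:", random_effects_dl(betas, ses)[:2])
```

When the estimated between-study variance is zero the two models give the same answer; otherwise the random-effects standard error is larger, which is why that model is described as more conservative.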
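Heterogeneity Assessment:
The I² statistic measures the percentage of total variation in effect estimates across studies that is due to between-study heterogeneity rather than chance:
I² = max(0, (Q - df) / Q) × 100%
where Q = Cochran's Q statistic
df = degrees of freedom (n_studies - 1)
Negative values of (Q - df) are set to zero, so I² ranges from 0% to 100%.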
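As a worked sketch of the calculation, the function below computes Cochran's Q around the inverse-variance weighted mean and derives I² from it, truncating at zero. The function name and the example inputs are illustrative.

```python
def cochran_q_and_i2(betas, ses):
    """Return (Q, df, I2 in percent) for per-study effects and standard errors."""
    weights = [1.0 / se**2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I2 is truncated at zero when Q falls below its degrees of freedom.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2

# Two hypothetical studies with clearly different effects.
q, df, i2 = cochran_q_and_i2([0.1, 0.5], [0.1, 0.1])
print(f"Q={q:.2f}, df={df}, I2={i2:.1f}%")
```

With only a handful of studies, Q has low power and I² is imprecise, so the value is best read alongside the per-study estimates rather than on its own.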
Interpretation Guidelines:
I² < 25%: Low heterogeneity → fixed-effects appropriate
I² = 25-50%: Moderate heterogeneity → investigate sources
I² = 50-75%: Substantial heterogeneity → random-effects preferred
I² > 75%: Considerable heterogeneity → meta-analysis may not be appropriate
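These guidelines translate directly into a lookup. The bands share their endpoints at 50% and 75%, so the sketch assumes that exactly 50% counts as substantial and exactly 75% still counts as substantial. The function name is hypothetical.

```python
def interpret_i2(i2_percent):
    """Map an I2 value (in percent) to a (level, suggested action) pair."""
    if i2_percent < 25:
        return "low", "fixed-effects appropriate"
    if i2_percent < 50:
        return "moderate", "investigate sources"
    if i2_percent <= 75:  # boundary handling is an assumption
        return "substantial", "random-effects preferred"
    return "considerable", "meta-analysis may not be appropriate"

for value in (10.0, 40.0, 60.0, 87.5):
    print(value, interpret_i2(value))
```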
Sources of Heterogeneity
Common reasons for high I²:
Ancestry differences: Different allele frequencies and LD structure
Phenotype heterogeneity: Trait definition varies across studies
Platform differences: Imputation quality and coverage
Winner's curse: Discovery studies overestimate effect sizes
Cohort characteristics: Age, sex, environmental factors
Recommendations:
Perform subgroup analysis by ancestry
Use meta-regression to investigate sources
Consider excluding outlier studies
Apply genomic control correction
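Genomic control, the last recommendation, can be sketched as follows. The inflation factor λ is the median observed chi-square statistic divided by the median of a 1-df chi-square distribution (about 0.455), and standard errors are multiplied by √λ when λ > 1. Function names are hypothetical, and in practice λ is estimated from genome-wide statistics rather than a handful of values.

```python
import math
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 df, about 0.455.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2

def genomic_inflation(z_scores):
    """Genomic inflation factor lambda from per-SNP z-scores."""
    return median([z * z for z in z_scores]) / CHI2_1DF_MEDIAN

def gc_corrected_se(se, lam):
    """Inflate a standard error by sqrt(lambda); leave it unchanged if lambda <= 1."""
    return se * math.sqrt(max(1.0, lam))

# Hypothetical z-scores, for illustration only.
z_scores = [0.2, -0.9, 1.4, 0.7, -0.5, 2.1, -1.1]
lam = genomic_inflation(z_scores)
print(f"lambda={lam:.3f}, corrected SE={gc_corrected_se(0.05, lam):.4f}")
```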
Study Quality Assessment
Quality Metrics
The skill evaluates studies based on:
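1. Sample Size:
Power to detect associations (as a rough guide, 80% power for OR = 1.2 requires n > 10,000; the exact figure depends on allele frequency, case-control ratio, and significance threshold)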
Precision of effect size estimates
Ability to detect modest effects
2. Ancestry Diversity:
Single-ancestry vs multi-ancestry
Population stratification control
Transferability of findings
3. Data Availability:
Summary statistics available for meta-analysis
Individual-level data vs summary-level
Imputation quality scores
4. Genotyping Quality:
Platform density and coverage
Imputation reference panel
Quality control measures
5. Statistical Rigor:
Genome-wide significance threshold (p < 5×10⁻⁸)
Multiple testing correction
Replication in independent cohort
Quality Tiers
Tier 1 (High Quality):
n ≥ 50,000
Summary statistics available
Multi-ancestry or large single-ancestry
Imputed to high-quality reference
Independent replication
Tier 2 (Moderate Quality):
n ≥ 10,000
Standard GWAS platform
Adequate power for common variants
Some data availability
Tier 3 (Limited):
n < 10,000
Limited power
May miss modest effects
Use with caution
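The tiers can be approximated with a simple rule. The source lists the criteria but not how the skill combines them, so this sketch is a simplification: it uses only sample size, summary statistics availability, and independent replication, and it drops a study to Tier 2 when either of the latter two is missing. The function name and parameters are hypothetical.

```python
def assign_tier(n, has_summary_stats, has_replication):
    """Assign a quality tier (1 = high, 2 = moderate, 3 = limited).

    Simplified rule: ancestry and imputation criteria are not modeled here.
    """
    if n >= 50_000 and has_summary_stats and has_replication:
        return 1
    if n >= 10_000:
        return 2
    return 3

print(assign_tier(120_000, True, True))   # large, well-documented study
print(assign_tier(120_000, False, True))  # large but no summary statistics
print(assign_tier(4_000, True, False))    # small study
```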
Best Practices
Before Meta-Analysis
Check phenotype consistency: Ensure studies measure the same trait
Verify ancestry overlap: High heterogeneity expected if ancestries differ
Harmonize alleles: Align effect alleles across studies
Quality control: Exclude low-quality studies or associations
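Allele harmonization is the step most likely to corrupt a meta-analysis silently, so a sketch may help. The function below aligns one study's effect to a reference allele pair, flipping the sign when the effect and other alleles are swapped and handling strand flips by complementing. It assumes single-nucleotide alleles in uppercase A/C/G/T; the function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def harmonize_beta(beta, effect, other, ref_effect, ref_other):
    """Return beta expressed for ref_effect, or None if it cannot be aligned.

    Palindromic SNPs (A/T, C/G) return None: their strand cannot be
    resolved from alleles alone and needs allele-frequency information.
    """
    if COMPLEMENT[effect] == other:
        return None  # palindromic, strand ambiguous
    if (effect, other) == (ref_effect, ref_other):
        return beta
    if (effect, other) == (ref_other, ref_effect):
        return -beta  # effect allele swapped
    flipped = (COMPLEMENT[effect], COMPLEMENT[other])
    if flipped == (ref_effect, ref_other):
        return beta  # same alleles on the opposite strand
    if flipped == (ref_other, ref_effect):
        return -beta  # opposite strand and swapped
    return None  # alleles do not match the reference

print(harmonize_beta(0.3, "T", "C", "T", "C"))  # already aligned
print(harmonize_beta(0.3, "C", "T", "T", "C"))  # swapped, sign flips
print(harmonize_beta(0.3, "A", "G", "T", "C"))  # strand flip, sign kept
```

Returning `None` rather than guessing lets the caller exclude or separately review variants that cannot be aligned with confidence.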
Interpreting Results
Genome-wide significance: p < 5×10⁻⁸ (Bonferroni for ~1M independent tests)
Replication threshold: p < 0.05 in independent cohort
Direction consistency: Effect should be same direction across studies
Heterogeneity: I² > 50% suggests caution in interpretation
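The genome-wide threshold follows from a Bonferroni correction: 0.05 divided by roughly one million independent tests gives 5×10⁻⁸. A small sketch of that arithmetic, with hypothetical function names, also separates genome-wide from merely nominal significance.

```python
def bonferroni_threshold(alpha=0.05, n_tests=1_000_000):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

def classify_p(p, alpha=0.05, n_tests=1_000_000):
    """Label a p-value as genome-wide significant, nominal, or not significant."""
    if p < bonferroni_threshold(alpha, n_tests):
        return "genome-wide significant"
    if p < alpha:
        return "nominal"
    return "not significant"

print(bonferroni_threshold())
print(classify_p(3e-9), classify_p(0.01), classify_p(0.2))
```

The nominal category corresponds to the replication threshold above: p < 0.05 is acceptable evidence only for a pre-specified variant tested in an independent cohort, not for a genome-wide scan.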
Common Pitfalls
Don't:
Meta-analyze without checking heterogeneity
Ignore ancestry differences
Over-interpret nominal p-values
Assume replication failure means false positive
Do:
Always report I² statistic
Perform sensitivity analyses
Consider ancestry-stratified analysis
Account for winner's curse in discovery studies
Limitations & Caveats
Data Limitations
Incomplete Overlap: Studies may analyze different SNPs
Cohort Overlap: Some cohorts participate in multiple studies (inflates significance)
Publication Bias: Significant findings more likely to be published
Winner's Curse: Discovery studies overestimate effect sizes
Imputation Quality: Varies across studies and populations
Statistical Limitations
Heterogeneity: High I² may preclude meaningful meta-analysis
Sample Size Differences: Large studies dominate fixed-effects models
Allele Frequency Differences: Variant frequencies differ across ancestries, so power and estimated effects for the same variant can differ
Linkage Disequilibrium: Fine-mapping needed to identify causal variants
Gene-Environment Interactions: Not captured in standard meta-analysis
Interpretation Guidelines
When I² > 75%:
Meta-analysis results should be interpreted with extreme caution
Investigate sources of heterogeneity systematically
Consider ancestry-specific or subgroup analyses
Descriptive comparison may be more appropriate than meta-analysis
When Studies Conflict:
Check for methodological differences
Verify phenotype definitions match
Investigate population stratification
Consider conditional analysis
Scientific References
Key Publications
GWAS Best Practices:
Visscher et al. (2017). "10 Years of GWAS Discovery: Biology, Function, and Translation"
American Journal of Human Genetics
101(1): 5-22
PMID: 28686856
DOI: 10.1016/j.ajhg.2017.06.005
Meta-Analysis Methods:
Evangelou & Ioannidis (2013). "Meta-analysis methods for genome-wide association studies and beyond"
Nature Reviews Genetics
14: 379-389
PMID: 23657481
Heterogeneity Interpretation:
Higgins et al. (2003). "Measuring inconsistency in meta-analyses"
BMJ
327: 557-560
PMID: 12958120
Multi-Ancestry GWAS:
Peterson et al. (2019). "Genome-wide Association Studies in Ancestrally Diverse Populations: Opportunities, Methods, Pitfalls, and Recommendations"
Cell
179(3): 589-603
PMID: 31607513
Replication Standards:
Chanock et al. (2007). "Replicating genotype-phenotype associations"
Nature
447: 655-660
PMID: 17554299
Tools Used
GWAS Catalog API
gwas_search_studies: Find studies by trait
gwas_get_study_by_id: Get detailed study metadata
gwas_get_associations_for_study: Retrieve study associations
gwas_get_associations_for_snp: Get SNP associations across studies
gwas_search_associations: Search associations by trait
Open Targets Genetics GraphQL API
OpenTargets_search_gwas_studies_by_disease: Disease-based study search
OpenTargets_get_gwas_study: Detailed study information with LD populations
OpenTargets_get_variant_credible_sets: Fine-mapped loci for variant
OpenTargets_get_study_credible_sets: All credible sets for study
OpenTargets_get_variant_info: Variant annotation and allele frequencies
Glossary
Association: Statistical relationship between a genetic variant and a trait
Credible Set: Set of variants likely to contain the causal variant (from fine-mapping)
Effect Size: Magnitude of genetic association (beta coefficient or odds ratio)
Fine-Mapping: Statistical method to identify causal variants within a locus
Genome-Wide Significance: p < 5×10⁻⁸, accounting for ~1M independent tests
Heterogeneity (I²): Percentage of total variation across studies due to between-study differences rather than chance
L2G (Locus-to-Gene): Score predicting which gene is affected by a GWAS locus
LD (Linkage Disequilibrium): Non-random association of alleles at different loci
Meta-Analysis: Statistical combination of results from multiple studies
Replication: Independent confirmation of an association in a new cohort
Summary Statistics: Per-SNP statistics (p-value, beta, SE) from GWAS
Winner's Curse: Overestimation of effect size in discovery studies
Next Steps
After running this skill, consider:
Fine-Mapping: Use credible sets from Open Targets to identify causal variants
Functional Follow-Up: Investigate biological mechanisms of replicated loci
Genetic Risk Scores: Calculate polygenic risk scores using validated loci
Drug Target Identification: Use L2G scores to prioritize therapeutic targets
Cross-Trait Analysis: Look for pleiotropy with related traits
Version History
v1.0 (2026-02-13): Initial release with study comparison, meta-analysis, and replication assessment
Created by: ToolUniverse GWAS Analysis Team
Last Updated: 2026-02-13
License: Open source (MIT)