tooluniverse-gwas-snp-interpretation

安装量: 106
排名: #7964

安装

npx skills add https://github.com/mims-harvard/tooluniverse --skill tooluniverse-gwas-snp-interpretation
GWAS SNP Interpretation Skill
Overview
Interpret genetic variants (SNPs) from GWAS studies by aggregating evidence from multiple sources to provide comprehensive clinical and biological context.
Use Cases:
"Interpret rs7903146" (TCF7L2 diabetes variant)
"What diseases is rs429358 associated with?" (APOE Alzheimer's variant)
"Clinical significance of rs1801133" (MTHFR variant)
"Is rs12913832 in any fine-mapped loci?" (Eye color variant)
What It Does
The skill provides a comprehensive interpretation of SNPs by:
SNP Annotation
Retrieves basic variant information including genomic coordinates, alleles, functional consequence, and mapped genes
Association Discovery
Finds all GWAS trait/disease associations with statistical significance
Fine-Mapping Evidence
Identifies credible sets the variant belongs to (fine-mapped causal loci)
Gene Mapping
Uses Locus-to-Gene (L2G) predictions to identify likely causal genes
Clinical Summary
Aggregates evidence into actionable clinical significance
Workflow
User Input: rs7903146
[1] SNP Lookup
→ Get location, consequence, MAF
→ gwas_get_snp_by_id
[2] Association Search
→ Find all trait/disease associations
→ gwas_get_associations_for_snp
[3] Fine-Mapping (Optional)
→ Get credible set membership
→ OpenTargets_get_variant_credible_sets
[4] Gene Predictions
→ Extract L2G scores for causal genes
→ (embedded in credible sets)
[5] Clinical Summary
→ Aggregate evidence
→ Identify key traits and genes
Output: Comprehensive Interpretation Report
Data Sources
GWAS Catalog (EMBL-EBI)
SNP annotations
Functional consequences, mapped genes, population frequencies
Associations
P-values, effect sizes, study metadata
Coverage
350,000+ publications, 670,000+ associations
Open Targets Genetics
Fine-mapping
Statistical credible sets from SuSiE, FINEMAP methods
L2G predictions
Machine learning-based gene prioritization
Colocalization
QTL evidence for causal genes
Coverage
UK Biobank, FinnGen, and other large cohorts
Input Parameters
Required
rs_id
(str): dbSNP rs identifier
Format: "rs" + number (e.g., "rs7903146")
Must be valid rsID in GWAS Catalog
Optional
include_credible_sets
(bool, default=True): Query fine-mapping data
True: Complete interpretation (slower, ~10-30s)
False: Fast associations only (~2-5s)
p_threshold
(float, default=5e-8): Genome-wide significance threshold
max_associations
(int, default=100): Maximum associations to retrieve
Output Format
Returns
SNPInterpretationReport
containing:
1. SNP Basic Info
{
'rs_id'
:
'rs7903146'
,
'chromosome'
:
'10'
,
'position'
:
112998590
,
'ref_allele'
:
'C'
,
'alt_allele'
:
'T'
,
'consequence'
:
'intron_variant'
,
'mapped_genes'
:
[
'TCF7L2'
]
,
'maf'
:
0.293
}
2. Trait Associations
[
{
'trait'
:
'Type 2 diabetes'
,
'p_value'
:
1.2e-128
,
'beta'
:
'0.28 unit increase'
,
'study_id'
:
'GCST010555'
,
'pubmed_id'
:
'33536258'
,
'effect_allele'
:
'T'
}
,
.
.
.
]
3. Credible Sets (Fine-Mapping)
[
{
'study_id'
:
'GCST90476118'
,
'trait'
:
'Renal failure'
,
'finemapping_method'
:
'SuSiE-inf'
,
'p_value'
:
3.5e-42
,
'predicted_genes'
:
[
{
'gene'
:
'TCF7L2'
,
'score'
:
0.863
}
]
,
'region'
:
'10:112950000-113050000'
}
,
.
.
.
]
4. Clinical Significance
Genome-wide significant associations with 100 traits/diseases:
- Type 2 diabetes
- Diabetic retinopathy
- HbA1c levels
...
Identified in 20 fine-mapped loci.
Predicted causal genes: TCF7L2
Example Usage
See
QUICK_START.md
for platform-specific examples.
Tools Used
GWAS Catalog Tools
gwas_get_snp_by_id
Get SNP annotation
gwas_get_associations_for_snp
Get all trait associations
Open Targets Tools
OpenTargets_get_variant_info
Get variant details with population frequencies
OpenTargets_get_variant_credible_sets
Get fine-mapping credible sets with L2G
Interpretation Guide
P-value Significance Levels
p < 5e-8
Genome-wide significant (strong evidence)
p < 5e-6
Suggestive (moderate evidence)
p < 0.05
Nominal (weak evidence)
L2G Score Interpretation
> 0.5
High confidence causal gene
0.1-0.5
Moderate confidence
< 0.1
Low confidence
Clinical Actionability
High
Multiple genome-wide significant associations + in credible sets + high L2G scores
Moderate
Genome-wide significant associations but limited fine-mapping
Low
Suggestive associations or limited replication
Limitations
Variant ID Conversion
OpenTargets requires chr_pos_ref_alt format, which may need allele lookup
Population Specificity
Associations may vary by ancestry
Effect Sizes
Beta values are study-dependent (different phenotype scales)
Causality
Associations don't prove causation; fine-mapping improves confidence
Currency
Data reflects published GWAS; latest studies may not be included
Best Practices
Use Full Interpretation
Enable
include_credible_sets=True
for clinical decisions
Check Multiple Variants
Look at other variants in the same locus
Validate Populations
Consider ancestry-specific effect sizes
Review Publications
Check original studies for context
Integrate Evidence
Combine with functional data, eQTLs, pQTLs
Technical Notes
Performance
Fast mode
(no credible sets): 2-5 seconds
Full mode
(with credible sets): 10-30 seconds
Bottleneck
OpenTargets GraphQL API rate limits Error Handling Invalid rs_id: Returns error message No associations: Returns empty list with note API failures: Graceful degradation (returns partial results)
返回排行榜