tooluniverse-target-research

Install

npx skills add https://github.com/mims-harvard/tooluniverse --skill tooluniverse-target-research
Comprehensive Target Intelligence Gatherer
Gather complete target intelligence by exploring 9 parallel research paths. Supports targets identified by gene symbol, UniProt accession, Ensembl ID, or gene name.
KEY PRINCIPLES:
- Report-first approach - Create report file FIRST, then populate progressively
- Tool parameter verification - Verify params via `get_tool_info` before calling unfamiliar tools
- Evidence grading - Grade all claims by evidence strength (T1-T4). See `EVIDENCE_GRADING.md`
- Citation requirements - Every fact must have inline source attribution
- Mandatory completeness - All sections must exist with data minimums or explicit "No data" notes
- Disambiguation first - Resolve all identifiers before research
- Negative results documented - "No drugs found" is data; empty sections are failures
- Collision-aware literature search - Detect and filter naming collisions
- English-first queries - Always use English terms in tool calls, even if the user writes in another language. Translate gene names, disease names, and search terms to English. Only try original-language terms as a fallback if English returns no results. Respond in the user's language.
When to Use This Skill
Apply when users:
- Ask about a drug target, protein, or gene
- Need target validation or assessment
- Request druggability analysis
- Want comprehensive target profiling
- Ask "what do we know about [target]?"
- Need target-disease associations
- Request safety profile for a target
When NOT to use
Simple protein lookup, drug-only queries, disease-centric queries, sequence retrieval, structure download: use specialized skills instead.

Phase 0: Tool Parameter Verification (CRITICAL)
BEFORE calling ANY tool for the first time, verify its parameters:

    tool_info = tu.tools.get_tool_info(tool_name="Reactome_map_uniprot_to_pathways")
    # Reveals: takes id not uniprot_id

Known Parameter Corrections
| Tool | WRONG Parameter | CORRECT Parameter |
|---|---|---|
| `Reactome_map_uniprot_to_pathways` | `uniprot_id` | `id` |
| `ensembl_get_xrefs` | `gene_id` | `id` |
| `GTEx_get_median_gene_expression` | `gencode_id` only | `gencode_id` + `operation="median"` |
| `OpenTargets_*` | `ensemblID` | `ensemblId` (camelCase) |
| `STRING_get_protein_interactions` | single ID | `protein_ids` (list), `species` |
| `intact_get_interactions` | gene symbol | `identifier` (UniProt accession) |
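The rename rows of the table above can be encoded as a small lookup that is applied before a call is issued. This is only a sketch: `PARAM_RENAMES` and `apply_known_corrections` are hypothetical names that are not part of ToolUniverse, and the STRING and IntAct rows are left out because they change the value format and not just the parameter name. `get_tool_info` remains the authoritative check.

```python
from fnmatch import fnmatchcase

# Hypothetical lookup built from the rename rows of the corrections table.
# Keys are tool-name patterns; values map wrong -> correct parameter names.
PARAM_RENAMES = {
    "Reactome_map_uniprot_to_pathways": {"uniprot_id": "id"},
    "ensembl_get_xrefs": {"gene_id": "id"},
    "OpenTargets_*": {"ensemblID": "ensemblId"},
}


def apply_known_corrections(tool_name: str, params: dict) -> dict:
    """Return a copy of params with known-wrong parameter names renamed."""
    fixed = dict(params)
    for pattern, renames in PARAM_RENAMES.items():
        if not fnmatchcase(tool_name, pattern):
            continue
        for wrong, correct in renames.items():
            if wrong in fixed and correct not in fixed:
                fixed[correct] = fixed.pop(wrong)
    # GTEx needs operation="median" alongside gencode_id (see table).
    if tool_name == "GTEx_get_median_gene_expression":
        fixed.setdefault("operation", "median")
    return fixed
```

The function copies its input, so the caller's original arguments stay available for the failure log if the corrected call still fails.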
GTEx Versioned ID Fallback (CRITICAL)
GTEx often requires versioned Ensembl IDs. If `ENSG00000123456` returns empty, try `ENSG00000123456.{version}`, taking the version from `ensembl_lookup_gene`.
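A minimal sketch of that fallback follows. The two callables are injected stand-ins (assumptions) for `GTEx_get_median_gene_expression` and `ensembl_lookup_gene`; the shape of their real responses is not specified here, so the sketch only assumes that an empty result is falsy and that the version is available as an integer.

```python
from typing import Callable, Optional, Tuple


def gtex_with_version_fallback(
    ensembl_id: str,
    query_gtex: Callable[[str], list],
    lookup_version: Callable[[str], Optional[int]],
) -> Tuple[str, list]:
    """Query GTEx with the bare ID first, then retry with ID.version.

    Returns the ID that was finally used and the rows it produced.
    """
    rows = query_gtex(ensembl_id)
    if rows:
        return ensembl_id, rows
    version = lookup_version(ensembl_id)
    if version is None:
        # Nothing more to try; the caller documents the gap in the report.
        return ensembl_id, []
    versioned_id = f"{ensembl_id}.{version}"
    return versioned_id, query_gtex(versioned_id)
```

Returning the ID that was actually used lets the report cite the exact query that produced the expression data.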
Critical Workflow Requirements
1. Report-First Approach (MANDATORY)
DO NOT show the search process or tool outputs to the user. Instead:
- Create the report file FIRST (`[TARGET]_target_report.md`) with all section headers and `[Researching...]` placeholders. See `REPORT_FORMAT.md` for the template.
- Progressively update each section as data is retrieved.
- Methodology in appendix only - create a separate `[TARGET]_methods_appendix.md` if requested.
2. Evidence Grading (MANDATORY)
Grade every claim by evidence strength using T1-T4 tiers. See `EVIDENCE_GRADING.md` for tier definitions, required locations, and citation format.
Core Strategy: 9 Research Paths
Target Query (e.g., "EGFR" or "P00533")
|
+- IDENTIFIER RESOLUTION (always first)
| +- Check if GPCR -> GPCRdb_get_protein
|
+- PATH 0: Open Targets Foundation (ALWAYS FIRST - fills gaps in all other paths)
|
+- PATH 1: Core Identity (names, IDs, sequence, organism)
| +- InterProScan_scan_sequence for novel domain prediction
+- PATH 2: Structure & Domains (3D structure, domains, binding sites)
| +- If GPCR: GPCRdb_get_structures (active/inactive states)
+- PATH 3: Function & Pathways (GO terms, pathways, biological role)
+- PATH 4: Protein Interactions (PPI network, complexes)
+- PATH 5: Expression Profile (tissue expression, single-cell)
+- PATH 6: Variants & Disease (mutations, clinical significance)
| +- DisGeNET_search_gene for curated gene-disease associations
+- PATH 7: Drug Interactions (known drugs, druggability, safety)
| +- Pharos_get_target for TDL classification (Tclin/Tchem/Tbio/Tdark)
| +- BindingDB_get_ligands_by_uniprot for known ligands
| +- PubChem_search_assays_by_target_gene for HTS data
| +- If GPCR: GPCRdb_get_ligands (curated agonists/antagonists)
| +- DepMap_get_gene_dependencies for target essentiality
+- PATH 8: Literature & Research (publications, trends)
For detailed code implementations of each path, see `IMPLEMENTATION.md`.
Identifier Resolution (Phase 1)
Resolve ALL identifiers before any research path. Required IDs:
- UniProt accession (for protein data, structure, interactions)
- Ensembl gene ID + versioned ID (for Open Targets, GTEx)
- Gene symbol (for DGIdb, gnomAD, literature)
- Entrez gene ID (for KEGG, MyGene)
- ChEMBL target ID (for bioactivity)
- Synonyms/full name (for collision-aware literature search)

After resolution, check if the target is a GPCR via `GPCRdb_get_protein`. See `IMPLEMENTATION.md` for resolution and GPCR detection code.
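One way to enforce "resolve everything first" is to collect the required IDs in a single record and refuse to start the research paths while any are missing. The sketch below is illustrative: `ResolvedTarget` and its field names are hypothetical and do not come from `IMPLEMENTATION.md`.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class ResolvedTarget:
    """Hypothetical container for the identifiers listed above."""
    uniprot_accession: Optional[str] = None
    ensembl_gene_id: Optional[str] = None
    ensembl_versioned_id: Optional[str] = None
    gene_symbol: Optional[str] = None
    entrez_gene_id: Optional[str] = None
    chembl_target_id: Optional[str] = None
    full_name: Optional[str] = None
    synonyms: List[str] = field(default_factory=list)

    def missing(self) -> List[str]:
        """Names of identifiers that are still unresolved.

        Synonyms are not checked: a target may legitimately have none.
        """
        return [
            f.name
            for f in fields(self)
            if f.name != "synonyms" and getattr(self, f.name) is None
        ]


target = ResolvedTarget(uniprot_accession="P00533", gene_symbol="EGFR")
unresolved = target.missing()  # the IDs still to look up before PATH 0
```

Anything left in `unresolved` after the lookup attempts belongs in the report as an explicit gap, in line with the negative-results principle.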
PATH 0: Open Targets Foundation (ALWAYS FIRST)
Populates baseline data for Sections 5, 6, 8, 9, 10, 11 before specialized queries.
| Endpoint | Report Section | Data Type |
|---|---|---|
| `OpenTargets_get_diseases_phenotypes_by_target_ensemblId` | 8 | Diseases/phenotypes |
| `OpenTargets_get_target_tractability_by_ensemblId` | 9 | Druggability assessment |
| `OpenTargets_get_target_safety_profile_by_ensemblId` | 10 | Safety liabilities |
| `OpenTargets_get_target_interactions_by_ensemblId` | 6 | PPI network |
| `OpenTargets_get_target_gene_ontology_by_ensemblId` | 5 | GO annotations |
| `OpenTargets_get_publications_by_target_ensemblId` | 11 | Literature |
| `OpenTargets_get_biological_mouse_models_by_ensemblId` | 8/10 | Mouse KO phenotypes |
| `OpenTargets_get_chemical_probes_by_target_ensemblId` | 9 | Chemical probes |
| `OpenTargets_get_associated_drugs_by_target_ensemblId` | 9 | Known drugs |
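The endpoint-to-section mapping above lends itself to a simple driver loop. The following is a sketch under stated assumptions: `run_tool` is a hypothetical callable that wraps whatever mechanism actually executes a tool, and the endpoint names are copied from the table as written. Failures are collected and never dropped, so they can be documented in the report.

```python
from typing import Any, Callable, Dict, List, Tuple

# Endpoint -> report sections, copied from the PATH 0 table.
PATH0_ENDPOINTS: Dict[str, List[int]] = {
    "OpenTargets_get_diseases_phenotypes_by_target_ensemblId": [8],
    "OpenTargets_get_target_tractability_by_ensemblId": [9],
    "OpenTargets_get_target_safety_profile_by_ensemblId": [10],
    "OpenTargets_get_target_interactions_by_ensemblId": [6],
    "OpenTargets_get_target_gene_ontology_by_ensemblId": [5],
    "OpenTargets_get_publications_by_target_ensemblId": [11],
    "OpenTargets_get_biological_mouse_models_by_ensemblId": [8, 10],
    "OpenTargets_get_chemical_probes_by_target_ensemblId": [9],
    "OpenTargets_get_associated_drugs_by_target_ensemblId": [9],
}


def run_path0(
    ensembl_id: str,
    run_tool: Callable[[str, dict], Any],  # hypothetical tool runner
) -> Tuple[Dict[int, Dict[str, Any]], Dict[str, str]]:
    """Call every PATH 0 endpoint and group the results by report section."""
    by_section: Dict[int, Dict[str, Any]] = {}
    failures: Dict[str, str] = {}
    for tool_name, sections in PATH0_ENDPOINTS.items():
        try:
            result = run_tool(tool_name, {"ensemblId": ensembl_id})
        except Exception as exc:  # record the failure, never skip silently
            failures[tool_name] = str(exc)
            continue
        for section in sections:
            by_section.setdefault(section, {})[tool_name] = result
    return by_section, failures
```

Grouping by section number means each later path can check what PATH 0 already supplied before issuing its own specialized queries.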
PATH 1: Core Identity
Tools: `UniProt_get_entry_by_accession`, `UniProt_get_function_by_accession`, `UniProt_get_recommended_name_by_accession`, `UniProt_get_alternative_names_by_accession`, `UniProt_get_subcellular_location_by_accession`, `MyGene_get_gene_annotation`
Populates: Sections 2 (Identifiers), 3 (Basic Information)
PATH 2: Structure & Domains
Use the 3-step structure search chain, followed by an AlphaFold check (do NOT rely solely on PDB text search):
1. UniProt PDB cross-references (most reliable)
2. Sequence-based PDB search (catches missing annotations)
3. Domain-based search (for multi-domain proteins)
4. AlphaFold (always check)

Tools: `UniProt_get_entry_by_accession` (PDB xrefs), `get_protein_metadata_by_pdb_id`, `PDB_search_similar_structures`, `alphafold_get_prediction`, `InterPro_get_protein_domains`, `UniProt_get_ptm_processing_by_accession`
GPCR targets: Also query `GPCRdb_get_structures` for active/inactive state data.
Populates: Section 4 (Structural Biology)
See `IMPLEMENTATION.md` for the 3-step chain code.
PATH 3: Function & Pathways
Tools: `GO_get_annotations_for_gene`, `Reactome_map_uniprot_to_pathways`, `kegg_get_gene_info`, `WikiPathways_search`, `enrichr_gene_enrichment_analysis`
Populates: Section 5 (Function & Pathways)
PATH 4: Protein Interactions
Tools: `STRING_get_protein_interactions`, `intact_get_interactions`, `intact_get_complex_details`, `BioGRID_get_interactions`, `HPA_get_protein_interactions_by_gene`
Minimum: 20 interactors OR documented explanation.
Populates: Section 6 (Protein-Protein Interactions)
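The 20-interactor minimum is easiest to check after merging partners across sources. The sketch below assumes each source has already been reduced to a list of partner gene symbols (the real tool responses are richer and are not modeled here); `merge_interactors` is a hypothetical helper name.

```python
from typing import Dict, Iterable, Set, Tuple

MIN_INTERACTORS = 20  # minimum stated for Section 6


def merge_interactors(
    target_symbol: str,
    per_source: Dict[str, Iterable[str]],
) -> Tuple[Set[str], bool]:
    """Deduplicate partners across sources and test the minimum.

    Symbols are upper-cased before comparison, which assumes the inputs
    are gene symbols rather than accessions.
    """
    merged: Set[str] = set()
    for partners in per_source.values():
        merged.update(symbol.upper() for symbol in partners)
    merged.discard(target_symbol.upper())  # drop self-interactions
    return merged, len(merged) >= MIN_INTERACTORS
```

When the second return value is false, the report needs the documented explanation instead of a longer table.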
PATH 5: Expression Profile
GTEx with versioned ID fallback + HPA as backup. For comprehensive HPA data, also query cell line expression comparison.
Tools: `GTEx_get_median_gene_expression`, `HPA_get_rna_expression_by_source`, `HPA_get_comprehensive_gene_details_by_ensembl_id`, `HPA_get_subcellular_location`, `HPA_get_cancer_prognostics_by_gene`, `HPA_get_comparative_expression_by_gene_and_cellline`, `CELLxGENE_get_expression_data`
Populates: Section 7 (Expression Profile)
See `IMPLEMENTATION.md` for GTEx fallback and HPA extended expression code.
PATH 6: Variants & Disease
Separate SNVs from CNVs in ClinVar results. Integrate DisGeNET for curated gene-disease association scores.
Tools: `gnomad_get_gene_constraints`, `clinvar_search_variants`, `OpenTargets_get_diseases_phenotypes_by_target_ensembl`, `DisGeNET_search_gene`, `civic_get_variants_by_gene`, `cBioPortal_get_mutations`
Required: All 4 constraint scores (pLI, LOEUF, missense Z, pRec).
Populates: Section 8 (Genetic Variation & Disease)
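The "all 4 constraint scores" requirement can be checked mechanically so that a missing score becomes an explicit note and not a silent omission. In this sketch the dictionary keys are assumptions chosen for illustration; the field names actually returned by `gnomad_get_gene_constraints` may differ and would need mapping first.

```python
from typing import Any, Dict

# Assumed key names for the four required scores (not the tool's schema).
REQUIRED_CONSTRAINTS = ("pLI", "LOEUF", "missense_z", "pRec")


def audit_constraints(scores: Dict[str, Any]) -> Dict[str, Any]:
    """Map each required score to its value or an explicit 'No data' note."""
    return {
        name: scores[name] if scores.get(name) is not None else "No data"
        for name in REQUIRED_CONSTRAINTS
    }
```

The `is not None` test keeps a legitimate score of 0.0 from being reported as missing.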
PATH 7: Druggability & Target Validation
Comprehensive druggability assessment including TDL classification, binding data, screening data, and essentiality.
Tools: `OpenTargets_get_target_tractability_by_ensemblID`, `DGIdb_get_gene_druggability`, `DGIdb_get_drug_gene_interactions`, `ChEMBL_search_targets`, `ChEMBL_get_target_activities`, `Pharos_get_target`, `BindingDB_get_ligands_by_uniprot`, `PubChem_search_assays_by_target_gene`, `DepMap_get_gene_dependencies`, `OpenTargets_get_target_safety_profile_by_ensemblID`, `OpenTargets_get_biological_mouse_models_by_ensemblID`
GPCR targets: Also query `GPCRdb_get_ligands`.
Populates: Sections 9 (Druggability), 10 (Safety), 12 (Competitive Landscape)
Key Data Sources for Druggability
| Source | What It Provides |
|---|---|
| Pharos TDL | Tclin/Tchem/Tbio/Tdark classification |
| BindingDB | Experimental Ki/IC50/Kd values |
| PubChem BioAssay | HTS screening hits and dose-response |
| DepMap | CRISPR essentiality across cancer cell lines |
| ChEMBL | Bioactivity records and compound counts |

See `IMPLEMENTATION.md` for detailed code and `REFERENCE.md` for complete tool parameter tables.
PATH 8: Literature & Research (Collision-Aware)
1. Detect collisions - Check if gene symbol has non-biological meanings
2. Build seed queries - Symbol in title with bio context, full name, UniProt accession
3. Apply collision filter - Add NOT terms for off-topic meanings
4. Expand via citations - For sparse targets (<30 papers), use citation network
5. Classify by evidence tier - T1-T4 based on title/abstract keywords
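Steps 2 and 3 above can be sketched as a query builder. This is an illustration and not the code from `IMPLEMENTATION.md`: `build_seed_queries` is a hypothetical helper, the default context terms ("gene", "protein") are an assumption, and the output uses PubMed-style `[Title]` tags and boolean operators, which other search tools may not accept unchanged.

```python
from typing import List, Sequence


def build_seed_queries(
    symbol: str,
    full_name: str,
    uniprot_accession: str,
    collision_terms: Sequence[str] = (),
    context_terms: Sequence[str] = ("gene", "protein"),
) -> List[str]:
    """Build seed queries, then append NOT terms for off-topic meanings."""
    context = " OR ".join(context_terms)
    queries = [
        f'"{symbol}"[Title] AND ({context})',  # symbol in title + bio context
        f'"{full_name}"',                      # full protein/gene name
        f'"{uniprot_accession}"',              # accession as a literal
    ]
    if collision_terms:
        exclusion = " OR ".join(f'"{term}"' for term in collision_terms)
        queries = [f"({query}) NOT ({exclusion})" for query in queries]
    return queries
```

Applying the exclusion to every seed query keeps the filter consistent when results from the three queries are merged.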
Tools: `PubMed_search_articles`, `PubMed_get_related`, `EuropePMC_search_articles`, `EuropePMC_get_citations`, `PubTator3_LiteratureSearch`, `OpenTargets_get_publications_by_target_ensemblID`
Populates: Section 11 (Literature & Research Landscape)
See `IMPLEMENTATION.md` for collision-aware search code.

Retry Logic & Fallback Chains
| Primary Tool | Fallback 1 | Fallback 2 |
|---|---|---|
| `ChEMBL_get_target_activities` | `GtoPdb_get_target_ligands` | OpenTargets drugs |
| `intact_get_interactions` | `STRING_get_protein_interactions` | OpenTargets interactions |
| `GO_get_annotations_for_gene` | OpenTargets GO | MyGene GO |
| `GTEx_get_median_gene_expression` | `HPA_get_rna_expression` | Document as unavailable |
| `gnomad_get_gene_constraints` | OpenTargets constraint | - |
| `DGIdb_get_drug_gene_interactions` | OpenTargets drugs | GtoPdb |

NEVER silently skip failed tools. Always document failures and fallbacks in the report.

Completeness Audit (REQUIRED before finalizing)
Run the checklist in `EVIDENCE_GRADING.md` before finalizing any report:
- Data minimums met for PPIs, expression, diseases, constraints, druggability
- Negative results documented explicitly
- T1-T4 grades in Executive Summary, Disease Associations, Key Papers, Recommendations
- Every data point has source attribution

Report Template
Create `[TARGET]_target_report.md` with all 15 sections initialized. See `REPORT_FORMAT.md` for the full template with section headers, table formats, and completeness checklist. Initial file structure:

    ## 1. Executive Summary
    ## 2. Target Identifiers
    ## 3. Basic Information
    ## 4. Structural Biology
    ## 5. Function & Pathways
    ## 6. Protein-Protein Interactions
    ## 7. Expression Profile
    ## 8. Genetic Variation & Disease
    ## 9. Druggability & Pharmacology
    ## 10. Safety Profile
    ## 11. Literature & Research
    ## 12. Competitive Landscape
    ## 13. Summary & Recommendations
    ## 14. Data Sources & Methodology
    ## 15. Data Gaps & Limitations
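A minimal sketch of the report-first workflow, assuming only what is stated above: the file name pattern, the 15 section titles, and the `[Researching...]` placeholder. The helper names (`render_skeleton`, `create_report`, `fill_section`) are hypothetical, and the full template in `REPORT_FORMAT.md` contains table formats that this sketch does not reproduce.

```python
from pathlib import Path

SECTIONS = [
    "Executive Summary", "Target Identifiers", "Basic Information",
    "Structural Biology", "Function & Pathways",
    "Protein-Protein Interactions", "Expression Profile",
    "Genetic Variation & Disease", "Druggability & Pharmacology",
    "Safety Profile", "Literature & Research", "Competitive Landscape",
    "Summary & Recommendations", "Data Sources & Methodology",
    "Data Gaps & Limitations",
]
PLACEHOLDER = "[Researching...]"


def render_skeleton() -> str:
    """All 15 section headers, each followed by a placeholder."""
    blocks = [
        f"## {number}. {title}\n\n{PLACEHOLDER}\n"
        for number, title in enumerate(SECTIONS, start=1)
    ]
    return "\n".join(blocks)


def create_report(target: str, directory: str = ".") -> Path:
    """Write the skeleton to [TARGET]_target_report.md and return its path."""
    path = Path(directory) / f"{target}_target_report.md"
    path.write_text(render_skeleton(), encoding="utf-8")
    return path


def fill_section(report: str, number: int, body: str) -> str:
    """Replace the placeholder of one section with retrieved content."""
    start = report.find(f"## {number}. ")
    if start == -1:
        raise ValueError(f"section {number} not found")
    # Limit the search to this section so a later placeholder is not used.
    end = report.find("\n## ", start + 1)
    end = len(report) if end == -1 else end
    slot = report.find(PLACEHOLDER, start, end)
    if slot == -1:
        raise ValueError(f"section {number} has no placeholder left")
    return report[:slot] + body + report[slot + len(PLACEHOLDER):]
```

Any placeholder still present at the end of a run marks a section that needs either data or an explicit "No data" note before the completeness audit can pass.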

Quick Reference: Tool Parameters
| Tool | Parameter | Notes |
|---|---|---|
| `Reactome_map_uniprot_to_pathways` | `id` | NOT `uniprot_id` |
| `ensembl_get_xrefs` | `id` | NOT `gene_id` |
| `GTEx_get_median_gene_expression` | `gencode_id`, `operation` | Try versioned ID if empty |
| `OpenTargets_*` | `ensemblId` | camelCase, not `ensemblID` |
| `STRING_get_protein_interactions` | `protein_ids`, `species` | List format for IDs |
| `intact_get_interactions` | `identifier` | UniProt accession |

Reference Files
| File | Contents |
|---|---|
| `IMPLEMENTATION.md` | Detailed code for identifier resolution, GPCR detection, each PATH implementation, retry logic |
| `EVIDENCE_GRADING.md` | T1-T4 tier definitions, citation format, completeness audit checklist, data minimums |
| `REPORT_FORMAT.md` | Full report template with all 15 sections, table formats, section-specific guidance |
| `REFERENCE.md` | Complete tool reference (225+ tools) organized by category with parameters |
| `EXAMPLES.md` | Worked examples: EGFR full profile, KRAS druggability, target comparison, CDK4 validation, Alzheimer's targets |
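The rule from "Retry Logic & Fallback Chains" (try the primary tool, then each fallback, and never skip a failure silently) can be sketched as a small runner. `run_with_fallbacks` and the `call` argument are hypothetical; `call` stands in for whatever executes a named tool with arguments already adapted to it. The actual retry code lives in `IMPLEMENTATION.md` and may differ.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple


def run_with_fallbacks(
    chain: Sequence[str],
    call: Callable[[str], Any],
) -> Tuple[Optional[str], Any, List[str]]:
    """Try each tool in order; return (tool_used, result, log).

    Every attempt is logged so failures and fallbacks can be written
    into the report.
    """
    log: List[str] = []
    for tool_name in chain:
        try:
            result = call(tool_name)
        except Exception as exc:
            log.append(f"{tool_name}: failed ({exc})")
            continue
        if result:
            log.append(f"{tool_name}: ok")
            return tool_name, result, log
        log.append(f"{tool_name}: empty result")
    return None, None, log
```

When the first return value is `None`, the whole chain was exhausted and the log becomes the "Document as unavailable" entry for that section.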
