Comprehensive metabolomics research skill that identifies metabolites, analyzes studies, and searches metabolomics databases. Generates structured research reports with annotated metabolite information, study details, and database statistics.
Use Case
Use this skill when asked to:
Identify or annotate metabolites (HMDB IDs, chemical properties, pathways)
Retrieve metabolomics study information from MetaboLights or Metabolomics Workbench
Search for metabolomics studies by keywords or disease
Analyze metabolite profiles or datasets
Generate comprehensive metabolomics research reports
Example queries:
"What is the HMDB ID and pathway information for glucose?"
"Get study details for MTBLS1"
"Find metabolomics studies related to diabetes"
"Analyze these metabolites: glucose, lactate, pyruvate"
Databases Covered
Primary metabolite databases:
HMDB (Human Metabolome Database): 220,000+ metabolites with structures, pathways, and biological roles
MetaboLights: Public metabolomics repository with thousands of studies
Metabolomics Workbench: NIH Common Fund metabolomics data repository
PubChem: Chemical properties and bioactivity data (fallback)
Research Workflow
The skill executes a 4-phase analysis pipeline:
Phase 1: Metabolite Identification & Annotation
For each metabolite in the input list:
Search HMDB by metabolite name
Retrieve HMDB ID, chemical formula, molecular weight
Get detailed metabolite information (description, pathways)
Fallback to PubChem for CID and chemical properties if HMDB unavailable
Phase 2: Study Details Retrieval
For provided study IDs:
Detect database type (MTBLS = MetaboLights, ST = Metabolomics Workbench)
Retrieve study metadata (title, description, organism, status)
Extract experimental design and data availability
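The prefix check in the first step above can be sketched as follows. This is a minimal illustration, not the skill's actual code: the function name `detect_study_database` and the returned labels are assumptions, and only the MTBLS and ST prefixes come from the text.

```python
from typing import Optional


def detect_study_database(study_id: str) -> Optional[str]:
    """Guess the source repository from a study ID prefix.

    Hypothetical helper: the name and the returned labels are illustrative.
    """
    normalized = study_id.strip().upper()
    if normalized.startswith("MTBLS"):
        return "MetaboLights"
    if normalized.startswith("ST"):
        return "Metabolomics Workbench"
    # Unknown prefix: the caller can report an error and move on.
    return None
```

Returning `None` instead of raising lets the pipeline log the unrecognized ID and continue with the remaining studies.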
Phase 3: Study Search
For keyword searches:
Search MetaboLights studies by query term
Return matching study IDs with preview information
Consistent formatting (bold labels, tables for results)
Database overview:
Available databases and statistics
Recent studies sample
Integration information
Error handling:
Graceful error messages for unavailable data
Fallback strategies documented in output
"N/A" for missing fields (not blank)
Implementation Notes
SOAP Tool Handling
HMDB tools are SOAP-based and require special parameter handling:
HMDB_search: Requires operation="search" parameter
HMDB_get_metabolite: Requires operation="get_metabolite" parameter
Do not use endpoint or method parameters (not applicable to SOAP)
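One way to make the `operation` requirement hard to forget is to build the argument dictionary through a small helper. The sketch below is an assumption about how a caller might do this; `build_hmdb_arguments` and `HMDB_OPERATIONS` are hypothetical names, and only the tool names, the `operation` values, and the `endpoint`/`method` restriction come from the notes above.

```python
from typing import Any, Dict

# Fixed operation value per HMDB tool, as listed in the notes above.
HMDB_OPERATIONS = {
    "HMDB_search": "search",
    "HMDB_get_metabolite": "get_metabolite",
}


def build_hmdb_arguments(tool_name: str, **params: Any) -> Dict[str, Any]:
    """Return the arguments for an HMDB SOAP tool call (hypothetical helper)."""
    if tool_name not in HMDB_OPERATIONS:
        raise ValueError(f"Not a known HMDB SOAP tool: {tool_name}")
    # Reject REST-style keys, which the notes say do not apply to SOAP tools.
    forbidden = {"endpoint", "method"}.intersection(params)
    if forbidden:
        raise ValueError(f"Not applicable to SOAP tools: {sorted(forbidden)}")
    return {"operation": HMDB_OPERATIONS[tool_name], **params}
```

For example, `build_hmdb_arguments("HMDB_search", query="glucose")` yields a dictionary that already carries `operation="search"`.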
Response Format Variations
Tools return different response formats - handle all three:
Standard format: {status: "success", data: [...], metadata: {...}}
Direct list: [...] (e.g., metabolights_list_studies)
Direct dict: {field1: ..., field2: ...} (e.g., some detail endpoints)
Always check response type with isinstance() before accessing fields.
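The three shapes can be normalized in one place with `isinstance()` checks. The following is a sketch under stated assumptions: `unwrap_response` is a hypothetical name, and treating a dict as the standard envelope only when it has both `status` and `data` keys is a heuristic chosen here (a direct study dict may carry its own `status` field).

```python
from typing import Any, Tuple


def unwrap_response(response: Any) -> Tuple[Any, bool]:
    """Return (payload, ok) for any of the three response shapes."""
    if isinstance(response, list):
        # Direct list, e.g. a plain list of study IDs.
        return response, True
    if isinstance(response, dict):
        if "status" in response and "data" in response:
            # Standard envelope: the payload lives under "data".
            return response["data"], response["status"] == "success"
        # Direct dict: the response itself is the payload.
        return response, True
    # Anything else (None, str, ...) is treated as a failed call.
    return None, False
```

Callers can then branch on the `ok` flag once instead of repeating type checks at every access.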
Fallback Strategy
Follow this hierarchy for robustness:
Primary source: Try main database first (HMDB for metabolites, MetaboLights for studies)
Fallback source: Use alternative database if primary fails (PubChem for chemical properties)
Default behavior: Show error message with context, continue with remaining phases
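The hierarchy above amounts to trying an ordered list of sources and collecting error context along the way. A minimal sketch, assuming each source is wrapped in a zero-argument callable; `try_sources` and the `Source` alias are hypothetical names, not part of the skill.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple

# A source is a (label, fetch) pair; fetch takes no arguments.
Source = Tuple[str, Callable[[], Any]]


def try_sources(sources: Sequence[Source]) -> Tuple[Optional[str], Any, List[str]]:
    """Try each source in order; return (label, result, errors)."""
    errors: List[str] = []
    for label, fetch in sources:
        try:
            result = fetch()
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{label}: {exc}")
            continue
        if result:
            return label, result, errors
        # An empty result counts as a miss, so the next source is tried.
        errors.append(f"{label}: no results")
    # Every source failed: the caller reports the errors and continues.
    return None, None, errors
```

In this setting the callables would wrap the HMDB call first and the PubChem call second; the returned `errors` list supplies the context for the message shown in the default case.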
Progressive Report Writing
Write report incrementally to avoid memory issues:
Create output file early in pipeline
Append sections as each phase completes
Flush to disk regularly for long analyses
Return file path for user access
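These four steps can be sketched with the standard library alone. The class name `ReportWriter`, its method names, and the `## ` heading prefix are illustrative assumptions; the skill's own writer may be organized differently.

```python
import os
from pathlib import Path


class ReportWriter:
    """Append-only report file (hypothetical helper)."""

    def __init__(self, path: str) -> None:
        self.path = Path(path)
        # Create (or truncate) the output file early in the pipeline.
        self.path.write_text("", encoding="utf-8")

    def append_section(self, title: str, body: str) -> None:
        # Open in append mode so earlier sections are never held in memory.
        with self.path.open("a", encoding="utf-8") as handle:
            handle.write(f"## {title}\n\n{body}\n\n")
            handle.flush()
            # Ask the OS to write the data to disk before returning.
            os.fsync(handle.fileno())
```

Each phase calls `append_section` once when it finishes, and `writer.path` is what gets returned to the user at the end.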
Tool Discovery
The skill automatically discovers and uses these tools from ToolUniverse:
HMDB Tools:
HMDB_search: Search metabolites by name
HMDB_get_metabolite: Get detailed metabolite information
MetaboLights Tools:
metabolights_list_studies: List available studies
metabolights_search_studies: Search studies by keyword
metabolights_get_study: Get study details by ID
Metabolomics Workbench Tools:
MetabolomicsWorkbench_get_study: Get study information
MetabolomicsWorkbench_search_compound_by_name: Search compounds
PubChem Tools:
PubChem_get_CID_by_compound_name: Get PubChem CID
PubChem_get_compound_properties_by_CID: Get chemical properties
No manual tool configuration required - all tools loaded automatically.
Common Issues
Issue: HMDB returns "Error querying HMDB: 0"
Cause: The HMDB search returned no results, or accessing the first result raised an index error
Solution: This is expected for uncommon metabolites; the PubChem fallback will be attempted
Issue: Study details show "N/A" for all fields
Cause: Study ID not found or API unavailable
Solution: Verify the study ID format (MTBLS or ST) and check whether the study is public
Issue: Tool not found errors
Cause: Missing API keys for some databases
Solution: Check .env.template and add any required API keys to the .env file (most metabolomics tools work without keys)
Issue: Large metabolite lists cause slow execution
Cause: Pipeline queries each metabolite individually
Solution: Reports limit to first 10 metabolites; consider batching for >20 metabolites
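Batching a long metabolite list can be as simple as slicing it into fixed-size chunks and running the pipeline once per chunk. A minimal sketch; the helper name `batched` and the batch size are assumptions, not values prescribed by the skill.

```python
from typing import Iterator, List, Sequence


def batched(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` items (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])
```

For instance, `batched(metabolites, 10)` keeps each run within the 10-metabolite report limit noted above.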
Tool Parameter Reference
HMDB Tools (SOAP)
| Tool | Required Parameters | Optional Parameters | Response Format | Notes |
| --- | --- | --- | --- | --- |
| HMDB_search | operation="search", query | - | {status, data: []} | SOAP tool - operation required |
| HMDB_get_metabolite | operation="get_metabolite", hmdb_id | - | {status, data: {}} | SOAP tool - operation required |
MetaboLights Tools (REST)
| Tool | Required Parameters | Optional Parameters | Response Format | Notes |
| --- | --- | --- | --- | --- |
| metabolights_list_studies | - | size (default: 10) | {status, data: []} or [...] | May return direct list |
| metabolights_search_studies | query | - | {status, data: []} | Returns study IDs |
| metabolights_get_study | study_id | - | {status, data: {}} | Full study metadata |
Metabolomics Workbench Tools (REST)
| Tool | Required Parameters | Optional Parameters | Response Format | Notes |
| --- | --- | --- | --- | --- |
| MetabolomicsWorkbench_get_study | study_id | output_item (default: "summary") | {status, data: {}} | Data may be text |
| MetabolomicsWorkbench_search_compound_by_name | compound_name | - | {status, data: {}} | Compound information |
PubChem Tools (REST)
| Tool | Required Parameters | Optional Parameters | Response Format | Notes |
| --- | --- | --- | --- | --- |
| PubChem_get_CID_by_compound_name | compound_name | - | {status, data: {cid}} | Returns CID |
| PubChem_get_compound_properties_by_CID | cid | - | {status, data: {}} | Chemical properties |
Important: All parameter names and requirements apply to both Python SDK and MCP implementations.
Summary
The Metabolomics Research skill provides comprehensive metabolomics analysis through a 4-phase pipeline that:
Identifies metabolites using HMDB (primary) and PubChem (fallback) databases
Retrieves study details from MetaboLights and Metabolomics Workbench repositories
Searches studies by keywords across metabolomics databases
Generates structured reports with all findings in readable markdown format
Key Features:
✅ 100% test coverage with working pipeline
✅ Handles SOAP tools correctly (HMDB requires operation parameter)
✅ Implements fallback strategies (HMDB → PubChem)
✅ Graceful error handling (continues if one phase fails)
✅ Progressive report writing (memory-efficient)
✅ Implementation-agnostic documentation (works with Python SDK and MCP)
Best for:
Metabolite annotation and pathway analysis
Study discovery and data retrieval
Comprehensive metabolomics research reports
Multi-database metabolomics queries
Limitations:
HMDB may not have all metabolites (fallback to PubChem)
Some studies require authentication or are not public
Large metabolite lists (>10) auto-limited in reports
API rate limits may affect large-scale queries
Quick Start
See QUICK_START.md for:
Python SDK implementation with code examples
MCP integration instructions
Step-by-step tutorial for common workflows
Advanced usage patterns