hypogenic

Installs: 132
Rank: #6522

Install

npx skills add https://github.com/davila7/claude-code-templates --skill hypogenic

Hypogenic Overview

Hypogenic provides automated hypothesis generation and testing using large language models to accelerate scientific discovery. The framework supports three approaches: HypoGeniC (data-driven hypothesis generation), HypoRefine (synergistic literature and data integration), and Union methods (mechanistic combination of literature and data-driven hypotheses).

Quick Start

Get started with Hypogenic in minutes:

Install the package

uv pip install hypogenic

Clone example datasets

git clone https://github.com/ChicagoHAI/HypoGeniC-datasets.git ./data

Run basic hypothesis generation

hypogenic_generation --config ./data/your_task/config.yaml --method hypogenic --num_hypotheses 20

Run inference on generated hypotheses

hypogenic_inference --config ./data/your_task/config.yaml --hypotheses output/hypotheses.json

Or use Python API:

from hypogenic import BaseTask

Create task with your configuration

task = BaseTask(config_path="./data/your_task/config.yaml")

Generate hypotheses

task.generate_hypotheses(method="hypogenic", num_hypotheses=20)

Run inference

results = task.inference(hypothesis_bank="./output/hypotheses.json")

When to Use This Skill

Use this skill when working on:

- Generating scientific hypotheses from observational datasets
- Testing multiple competing hypotheses systematically
- Combining literature insights with empirical patterns
- Accelerating research discovery through automated hypothesis ideation
- Domains requiring hypothesis-driven analysis: deception detection, AI-generated content identification, mental health indicators, predictive modeling, or other empirical research

Key Features

Automated Hypothesis Generation

- Generate 10-20+ testable hypotheses from data in minutes
- Iterative refinement based on validation performance
- Support for both API-based (OpenAI, Anthropic) and local LLMs

Literature Integration

- Extract insights from research papers via PDF processing
- Combine theoretical foundations with empirical patterns
- Systematic literature-to-hypothesis pipeline with GROBID

Performance Optimization

- Redis caching reduces API costs for repeated experiments
- Parallel processing for large-scale hypothesis testing
- Adaptive refinement focuses on challenging examples

Flexible Configuration

- Template-based prompt engineering with variable injection
- Custom label extraction for domain-specific tasks
- Modular architecture for easy extension

Proven Results

- 8.97% improvement over few-shot baselines
- 15.75% improvement over literature-only approaches
- 80-84% hypothesis diversity (non-redundant insights)
- Human evaluators report significant decision-making improvements

Core Capabilities

1. HypoGeniC: Data-Driven Hypothesis Generation

Generate hypotheses solely from observational data through iterative refinement.

Process:

1. Initialize with a small data subset to generate candidate hypotheses
2. Iteratively refine hypotheses based on performance
3. Replace poorly-performing hypotheses with new ones from challenging examples

Best for: Exploratory research without existing literature, pattern discovery in novel datasets
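The iterative loop above can be sketched in a few lines. This is a toy illustration only: the keyword-rule "hypotheses", the scoring function, and the way new candidates are mined from hard examples are stand-ins for the LLM-driven steps hypogenic actually performs.

```python
def score(hypothesis, data):
    """Fraction of (text, label) examples the keyword rule classifies correctly."""
    keyword, label = hypothesis
    hits = sum(1 for text, y in data if (keyword in text) == (y == label))
    return hits / len(data)

def refine(bank, data, rounds=3, keep=0.5):
    """Keep top performers each round; regrow the bank from hard examples."""
    for _ in range(rounds):
        bank.sort(key=lambda h: score(h, data), reverse=True)
        survivors = bank[: max(1, int(len(bank) * keep))]
        # "Challenging" examples: ones every surviving hypothesis gets wrong.
        hard = [(t, y) for t, y in data
                if all(score(h, [(t, y)]) == 0 for h in survivors)]
        # Mine naive replacement rules from the hard examples.
        new = [(t.split()[0], y) for t, y in hard][: len(bank) - len(survivors)]
        bank = survivors + new
    return bank

data = [("spam offer now", "spam"), ("meeting at noon", "ham"),
        ("free spam offer", "spam"), ("lunch at noon", "ham")]
bank = refine([("offer", "spam"), ("noon", "ham"), ("xyz", "spam")], data)
print(score(bank[0], data))  # → 1.0
```

In the real framework the scoring is done by running LLM inference with each hypothesis on the validation split, but the keep-and-replace dynamic is the same.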

2. HypoRefine: Literature and Data Integration

Synergistically combine existing literature with empirical data through an agentic framework.

Process:

1. Extract insights from relevant research papers (typically 10 papers)
2. Generate theory-grounded hypotheses from literature
3. Generate data-driven hypotheses from observational patterns
4. Refine both hypothesis banks through iterative improvement

Best for: Research with established theoretical foundations, validating or extending existing theories

3. Union Methods

Mechanistically combine literature-only hypotheses with framework outputs.

Variants:

- Literature ∪ HypoGeniC: Combines literature hypotheses with data-driven generation
- Literature ∪ HypoRefine: Combines literature hypotheses with integrated approach

Best for: Comprehensive hypothesis coverage, eliminating redundancy while maintaining diverse perspectives
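The mechanistic union can be illustrated with a small sketch: concatenate two hypothesis banks and drop duplicates after light normalization. The normalization below is a stand-in for illustration; hypogenic's own redundancy handling may differ.

```python
def union_banks(literature_bank, data_bank):
    """Merge two hypothesis lists, dropping case/whitespace duplicates."""
    seen, merged = set(), []
    for hyp in literature_bank + data_bank:
        key = " ".join(hyp.lower().split())  # normalize whitespace and case
        if key not in seen:
            seen.add(key)
            merged.append(hyp)
    return merged

lit = ["Deceptive reviews use fewer first-person pronouns."]
data = ["Deceptive reviews use fewer  first-person pronouns.",
        "Deceptive reviews contain less spatial detail."]
print(len(union_banks(lit, data)))  # → 2
```

The first data-driven hypothesis collapses into the literature one after normalization, so the union keeps diverse perspectives without verbatim repeats.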

Installation

Install via pip:

uv pip install hypogenic

Optional dependencies:

- Redis server (port 6832): Enables caching of LLM responses to significantly reduce API costs during iterative hypothesis generation
- s2orc-doc2json: Required for processing literature PDFs in HypoRefine workflows
- GROBID: Required for PDF preprocessing (see Literature Processing section)

Clone example datasets:

For HypoGeniC examples

git clone https://github.com/ChicagoHAI/HypoGeniC-datasets.git ./data

For HypoRefine/Union examples

git clone https://github.com/ChicagoHAI/Hypothesis-agent-datasets.git ./data

Dataset Format

Datasets must follow HuggingFace datasets format with specific naming conventions:

Required files:

- _train.json: Training data
- _val.json: Validation data
- _test.json: Test data

Required keys in JSON:

- text_features_1 through text_features_n: Lists of strings containing feature values
- label: List of strings containing ground truth labels

Example (headline click prediction):

{
  "headline_1": [
    "What Up, Comet? You Just Got PROBED",
    "Scientists Made a Breakthrough in Quantum Computing"
  ],
  "headline_2": [
    "Scientists Everywhere Were Holding Their Breath Today. Here's Why.",
    "New Quantum Computer Achieves Milestone"
  ],
  "label": [
    "Headline 2 has more clicks than Headline 1",
    "Headline 1 has more clicks than Headline 2"
  ]
}
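A short sanity check for this format can catch mistakes before a run; the helper below is a standalone illustration, not part of the hypogenic API.

```python
def check_dataset(data: dict) -> int:
    """Return the number of examples, or raise if the format is invalid."""
    assert "label" in data, "missing required 'label' key"
    lengths = {key: len(values) for key, values in data.items()}
    assert len(set(lengths.values())) == 1, f"unequal list lengths: {lengths}"
    return lengths["label"]

example = {
    "headline_1": ["What Up, Comet? You Just Got PROBED",
                   "Scientists Made a Breakthrough in Quantum Computing"],
    "headline_2": ["Scientists Everywhere Were Holding Their Breath Today. Here's Why.",
                   "New Quantum Computer Achieves Milestone"],
    "label": ["Headline 2 has more clicks than Headline 1",
              "Headline 1 has more clicks than Headline 2"],
}
print(check_dataset(example))  # → 2
```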

Important notes:

- All lists must have the same length
- Label format must match your extract_label() function output format
- Feature keys can be customized to match your domain (e.g., review_text, post_content, etc.)

Configuration

Each task requires a config.yaml file specifying:

Required elements:

- Dataset paths (train/val/test)
- Prompt templates for:
  - Observations generation
  - Batched hypothesis generation
  - Hypothesis inference
  - Relevance checking
  - Adaptive methods (for HypoRefine)

Template capabilities:

- Dataset placeholders for dynamic variable injection (e.g., ${text_features_1}, ${num_hypotheses})
- Custom label extraction functions for domain-specific parsing
- Role-based prompt structure (system, user, assistant roles)
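The ${...} placeholder syntax matches Python's string.Template convention, which makes it easy to prototype a template before wiring it into a config file; the template text below is invented for illustration.

```python
from string import Template

# Prototype an observations-style template with the same ${...} placeholders.
template = Template(
    "Feature 1: ${text_features_1}\n"
    "Observation: ${label}\n"
    "Propose ${num_hypotheses} hypotheses."
)
filled = template.substitute(
    text_features_1="What Up, Comet? You Just Got PROBED",
    label="Headline 1 has more clicks",
    num_hypotheses=20,
)
print(filled)
```

Whether hypogenic uses string.Template internally is an assumption; the point is that the placeholder grammar in config.yaml behaves like this substitution.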

Configuration structure:

task_name: your_task_name

train_data_path: ./your_task_train.json
val_data_path: ./your_task_val.json
test_data_path: ./your_task_test.json

prompt_templates:
  # Extra keys for reusable prompt components
  observations: |
    Feature 1: ${text_features_1}
    Feature 2: ${text_features_2}
    Observation: ${label}

  # Required templates
  batched_generation:
    system: "Your system prompt here"
    user: "Your user prompt with ${num_hypotheses} placeholder"

  inference:
    system: "Your inference system prompt"
    user: "Your inference user prompt"

  # Optional templates for advanced features
  few_shot_baseline: {...}
  is_relevant: {...}
  adaptive_inference: {...}
  adaptive_selection: {...}

Refer to references/config_template.yaml for a complete example configuration.

Literature Processing (HypoRefine/Union Methods)

To use literature-based hypothesis generation, you must preprocess PDF papers:

Step 1: Setup GROBID (first time only)

bash ./modules/setup_grobid.sh

Step 2: Add PDF files

Place research papers in literature/YOUR_TASK_NAME/raw/

Step 3: Process PDFs

Start GROBID service

bash ./modules/run_grobid.sh

Process PDFs for your task

cd examples
python pdf_preprocess.py --task_name YOUR_TASK_NAME

This converts PDFs to structured format for hypothesis extraction. Automated literature search will be supported in future releases.

CLI Usage

Hypothesis Generation

hypogenic_generation --help

Key parameters:

- Task configuration file path
- Model selection (API-based or local)
- Generation method (HypoGeniC, HypoRefine, or Union)
- Number of hypotheses to generate
- Output directory for hypothesis banks

Hypothesis Inference

hypogenic_inference --help

Key parameters:

- Task configuration file path
- Hypothesis bank file path
- Test dataset path
- Inference method (default or multi-hypothesis)
- Output file for results

Python API Usage

For programmatic control and custom workflows, use Hypogenic directly in your Python code:

Basic HypoGeniC Generation

from hypogenic import BaseTask

Clone example datasets first

git clone https://github.com/ChicagoHAI/HypoGeniC-datasets.git ./data

Load your task with custom extract_label function

task = BaseTask(
    config_path="./data/your_task/config.yaml",
    extract_label=lambda text: extract_your_label(text)
)

Generate hypotheses

task.generate_hypotheses(
    method="hypogenic",
    num_hypotheses=20,
    output_path="./output/hypotheses.json"
)

Run inference

results = task.inference(
    hypothesis_bank="./output/hypotheses.json",
    test_data="./data/your_task/your_task_test.json"
)

HypoRefine/Union Methods

For literature-integrated approaches

git clone https://github.com/ChicagoHAI/Hypothesis-agent-datasets.git ./data

Generate with HypoRefine

task.generate_hypotheses(
    method="hyporefine",
    num_hypotheses=15,
    literature_path="./literature/your_task/",
    output_path="./output/"
)

This generates 3 hypothesis banks:

- HypoRefine (integrated approach)

- Literature-only hypotheses

- Literature∪HypoRefine (union)

Multi-Hypothesis Inference

from examples.multi_hyp_inference import run_multi_hypothesis_inference

Test multiple hypotheses simultaneously

results = run_multi_hypothesis_inference(
    config_path="./data/your_task/config.yaml",
    hypothesis_bank="./output/hypotheses.json",
    test_data="./data/your_task/your_task_test.json"
)

Custom Label Extraction

The extract_label() function is critical for parsing LLM outputs. Implement it based on your task:

import re

def extract_label(llm_output: str) -> str:
    """Extract predicted label from LLM inference text.

    Default behavior: searches for 'final answer:\s+(.*)' pattern.
    Customize for your domain-specific output format.
    """
    match = re.search(r'final answer:\s+(.*)', llm_output, re.IGNORECASE)
    if match:
        return match.group(1).strip()
    return llm_output.strip()

Important: Extracted labels must match the format of label values in your dataset for correct accuracy calculation.
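Because label extraction falls back to the raw output when the pattern is missing, it pays to test it against sample LLM outputs before running inference. A self-contained check mirroring the default pattern:

```python
import re

def extract_label(llm_output: str) -> str:
    """Default-style extraction: look for 'final answer: ...' or fall back."""
    match = re.search(r'final answer:\s+(.*)', llm_output, re.IGNORECASE)
    return match.group(1).strip() if match else llm_output.strip()

# The extracted string must exactly match a dataset 'label' value.
assert extract_label("Reasoning...\nFinal answer: Headline 2 has more clicks than Headline 1") \
    == "Headline 2 has more clicks than Headline 1"
# Outputs without the marker fall through unchanged — accuracy will silently
# drop if such strings never match your label format.
assert extract_label("no marker here") == "no marker here"
```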

Workflow Examples

Example 1: Data-Driven Hypothesis Generation (HypoGeniC)

Scenario: Detecting AI-generated content without prior theoretical framework

Steps:

1. Prepare dataset with text samples and labels (human vs. AI-generated)
2. Create config.yaml with appropriate prompt templates
3. Run hypothesis generation:

   hypogenic_generation --config config.yaml --method hypogenic --num_hypotheses 20

4. Run inference on test set:

   hypogenic_inference --config config.yaml --hypotheses output/hypotheses.json --test_data data/test.json

5. Analyze results for patterns like formality, grammatical precision, and tone differences

Example 2: Literature-Informed Hypothesis Testing (HypoRefine)

Scenario: Deception detection in hotel reviews building on existing research

Steps:

1. Collect 10 relevant papers on linguistic deception cues
2. Prepare dataset with genuine and fraudulent reviews
3. Configure config.yaml with literature processing and data generation templates
4. Run HypoRefine:

   hypogenic_generation --config config.yaml --method hyporefine --papers papers/ --num_hypotheses 15

5. Test hypotheses examining pronoun frequency, detail specificity, and other linguistic patterns
6. Compare literature-based and data-driven hypothesis performance

Example 3: Comprehensive Hypothesis Coverage (Union Method)

Scenario: Mental stress detection maximizing hypothesis diversity

Steps:

1. Generate literature hypotheses from mental health research papers
2. Generate data-driven hypotheses from social media posts
3. Run Union method to combine and deduplicate:

   hypogenic_generation --config config.yaml --method union --literature_hypotheses lit_hyp.json

4. Inference captures both theoretical constructs (posting behavior changes) and data patterns (emotional language shifts)

Performance Optimization

Caching: Enable Redis caching to reduce API costs and computation time for repeated LLM calls

Parallel Processing: Leverage multiple workers for large-scale hypothesis generation and testing

Adaptive Refinement: Use challenging examples to iteratively improve hypothesis quality
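The caching idea reduces to memoizing LLM calls on a hash of their inputs, so a repeated experiment never pays for the same call twice. The sketch below uses an in-process dict in place of the Redis server and a stubbed call_llm; both are illustrative stand-ins, not hypogenic internals.

```python
import hashlib
import json

_cache = {}
calls = {"n": 0}  # counts how many "real" LLM calls were made

def call_llm(prompt: str, model: str) -> str:
    """Stand-in for a real API call."""
    calls["n"] += 1
    return f"response to: {prompt}"

def cached_call(prompt: str, model: str) -> str:
    """Return a cached response when the exact (prompt, model) was seen before."""
    key = hashlib.sha256(json.dumps([prompt, model]).encode()).hexdigest()
    if key not in _cache:
        _cache[key] = call_llm(prompt, model)
    return _cache[key]

cached_call("Generate 20 hypotheses.", "gpt-4o")
cached_call("Generate 20 hypotheses.", "gpt-4o")  # served from cache
print(calls["n"])  # → 1
```

Swapping the dict for a Redis client gives persistence across runs, which is what makes iterative hypothesis generation cheap to re-run.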

Expected Outcomes

Research using hypogenic has demonstrated:

- 14.19% accuracy improvement in AI-content detection tasks
- 7.44% accuracy improvement in deception detection tasks
- 80-84% of hypothesis pairs offering distinct, non-redundant insights
- High helpfulness ratings from human evaluators across multiple research domains

Troubleshooting

Issue: Generated hypotheses are too generic
Solution: Refine prompt templates in config.yaml to request more specific, testable hypotheses

Issue: Poor inference performance
Solution: Ensure dataset has sufficient training examples, adjust hypothesis generation parameters, or increase number of hypotheses

Issue: Label extraction failures
Solution: Implement custom extract_label() function for domain-specific output parsing

Issue: GROBID PDF processing fails
Solution: Ensure GROBID service is running (bash ./modules/run_grobid.sh) and PDFs are valid research papers

Creating Custom Tasks

To add a new task or dataset to Hypogenic:

Step 1: Prepare Your Dataset

Create three JSON files following the required format:

- your_task_train.json
- your_task_val.json
- your_task_test.json

Each file must have keys for text features (text_features_1, etc.) and label.

Step 2: Create config.yaml

Define your task configuration with:

- Task name and dataset paths
- Prompt templates for observations, generation, inference
- Any extra keys for reusable prompt components
- Placeholder variables (e.g., ${text_features_1}, ${num_hypotheses})

Step 3: Implement extract_label Function

Create a custom label extraction function that parses LLM outputs for your domain:

from hypogenic import BaseTask

import re

def extract_my_label(llm_output: str) -> str:
    """Custom label extraction for your task.

    Must return labels in same format as dataset 'label' field.
    """
    # Example: Extract from specific format
    if "Final prediction:" in llm_output:
        return llm_output.split("Final prediction:")[-1].strip()

    # Fallback to default pattern
    match = re.search(r'final answer:\s+(.*)', llm_output, re.IGNORECASE)
    return match.group(1).strip() if match else llm_output.strip()

Use your custom task

task = BaseTask(
    config_path="./your_task/config.yaml",
    extract_label=extract_my_label
)

Step 4: (Optional) Process Literature

For HypoRefine/Union methods:

1. Create literature/your_task_name/raw/ directory
2. Add relevant research paper PDFs
3. Run GROBID preprocessing
4. Process with pdf_preprocess.py

Step 5: Generate and Test

Run hypothesis generation and inference using CLI or Python API:

CLI approach

hypogenic_generation --config your_task/config.yaml --method hypogenic --num_hypotheses 20
hypogenic_inference --config your_task/config.yaml --hypotheses output/hypotheses.json

Or use Python API (see Python API Usage section)

Repository Structure

Understanding the repository layout:

hypothesis-generation/
├── hypogenic/                  # Core package code
├── hypogenic_cmd/              # CLI entry points
├── hypothesis_agent/           # HypoRefine agent framework
├── literature/                 # Literature processing utilities
├── modules/                    # GROBID and preprocessing modules
├── examples/                   # Example scripts
│   ├── generation.py           # Basic HypoGeniC generation
│   ├── union_generation.py     # HypoRefine/Union generation
│   ├── inference.py            # Single hypothesis inference
│   ├── multi_hyp_inference.py  # Multiple hypothesis inference
│   └── pdf_preprocess.py       # Literature PDF processing
├── data/                       # Example datasets (clone separately)
├── tests/                      # Unit tests
└── IO_prompting/               # Prompt templates and experiments

Key directories:

- hypogenic/: Main package with BaseTask and generation logic
- examples/: Reference implementations for common workflows
- literature/: Tools for PDF processing and literature extraction
- modules/: External tool integrations (GROBID, etc.)

Related Publications

HypoBench (2025)

Liu, H., Huang, S., Hu, J., Zhou, Y., & Tan, C. (2025). HypoBench: Towards Systematic and Principled Benchmarking for Hypothesis Generation. arXiv preprint arXiv:2504.11524.

Paper: https://arxiv.org/abs/2504.11524
Description: Benchmarking framework for systematic evaluation of hypothesis generation methods

BibTeX:

@misc{liu2025hypobenchsystematicprincipledbenchmarking,
  title={HypoBench: Towards Systematic and Principled Benchmarking for Hypothesis Generation},
  author={Haokun Liu and Sicong Huang and Jingyu Hu and Yangqiaoyu Zhou and Chenhao Tan},
  year={2025},
  eprint={2504.11524},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2504.11524},
}

Literature Meets Data (2024)

Liu, H., Zhou, Y., Li, M., Yuan, C., & Tan, C. (2024). Literature Meets Data: A Synergistic Approach to Hypothesis Generation. arXiv preprint arXiv:2410.17309.

Paper: https://arxiv.org/abs/2410.17309
Code: https://github.com/ChicagoHAI/hypothesis-generation
Description: Introduces HypoRefine and demonstrates synergistic combination of literature-based and data-driven hypothesis generation

BibTeX:

@misc{liu2024literaturemeetsdatasynergistic,
  title={Literature Meets Data: A Synergistic Approach to Hypothesis Generation},
  author={Haokun Liu and Yangqiaoyu Zhou and Mingxuan Li and Chenfei Yuan and Chenhao Tan},
  year={2024},
  eprint={2410.17309},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2410.17309},
}

Hypothesis Generation with Large Language Models (2024)

Zhou, Y., Liu, H., Srivastava, T., Mei, H., & Tan, C. (2024). Hypothesis Generation with Large Language Models. In Proceedings of EMNLP Workshop of NLP for Science.

Paper: https://aclanthology.org/2024.nlp4science-1.10/
Description: Original HypoGeniC framework for data-driven hypothesis generation

BibTeX:

@inproceedings{zhou2024hypothesisgenerationlargelanguage,
  title={Hypothesis Generation with Large Language Models},
  author={Yangqiaoyu Zhou and Haokun Liu and Tejes Srivastava and Hongyuan Mei and Chenhao Tan},
  booktitle={Proceedings of EMNLP Workshop of NLP for Science},
  year={2024},
  url={https://aclanthology.org/2024.nlp4science-1.10/},
}

Additional Resources

Official Links

- GitHub Repository: https://github.com/ChicagoHAI/hypothesis-generation
- PyPI Package: https://pypi.org/project/hypogenic/
- License: MIT License
- Issues & Support: https://github.com/ChicagoHAI/hypothesis-generation/issues

Example Datasets

Clone these repositories for ready-to-use examples:

HypoGeniC examples (data-driven only)

git clone https://github.com/ChicagoHAI/HypoGeniC-datasets.git ./data

HypoRefine/Union examples (literature + data)

git clone https://github.com/ChicagoHAI/Hypothesis-agent-datasets.git ./data

Community & Contributions

- Contributors: 7+ active contributors
- Stars: 89+ on GitHub
- Topics: research-tool, interpretability, hypothesis-generation, scientific-discovery, llm-application

For contributions or questions, visit the GitHub repository and check the issues page.

Local Resources

references/

config_template.yaml - Complete example configuration file with all required prompt templates and parameters. This includes:

- Full YAML structure for task configuration
- Example prompt templates for all methods
- Placeholder variable documentation
- Role-based prompt examples

scripts/

Scripts directory is available for:

- Custom data preparation utilities
- Format conversion tools
- Analysis and evaluation scripts
- Integration with external tools

assets/

Assets directory is available for:

- Example datasets and templates
- Sample hypothesis banks
- Visualization outputs
- Documentation supplements
