for building production-ready prompts across standard tasks, RAG workflows, agent orchestration, structured outputs, hidden reasoning, and multi-step planning. All content is operational, not theoretical: the focus is on patterns, checklists, and copy-paste templates.
## Quick Start (60 seconds)
1. Pick a pattern from the decision tree (structured output, extractor, RAG, tools/agent, rewrite, classification).
2. Add evals: 10–20 cases while iterating, 50–200 before release, plus adversarial injection cases.
## Model Notes (2026)
This skill includes Claude Code + Codex CLI optimizations:
- **Action directives**: frame for implementation, not suggestions
- **Parallel tool execution**: independent tool calls can run simultaneously
- **Long-horizon task management**: state tracking, incremental progress, context compaction resilience
- **Positive framing**: describe desired behavior rather than prohibitions
- **Style matching**: prompt formatting influences output style
- **Domain-specific patterns**: specialized guidance for frontend, research, and agentic coding
- **Style-adversarial resilience**: stress-test refusals with poetic/role-play rewrites; normalize or decline stylized harmful asks before tool use
Prefer “brief justification” over requesting chain-of-thought. When using private reasoning patterns, instruct: think internally; output only the final answer.
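As a concrete illustration, a private-reasoning classification prompt might be assembled like this (the exact wording and helper are illustrative, not a canonical template):

```python
# Sketch of a hidden-reasoning prompt builder (wording is illustrative).
# The model is told to reason privately and emit only the final label.
def build_hidden_cot_prompt(task: str, labels: list[str]) -> str:
    return (
        f"{task}\n\n"
        "Think through the problem internally. Do not reveal your "
        "reasoning or intermediate steps.\n"
        f"Output only the final answer, exactly one of: {', '.join(labels)}."
    )

prompt = build_hidden_cot_prompt(
    "Classify the sentiment of the user's message.",
    ["positive", "negative", "neutral"],
)
```

The key property is that the instruction constrains the *output channel*, not the reasoning itself.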
## Quick Reference

| Task | Pattern to Use | Key Components | When to Use |
|---|---|---|---|
| Machine-parseable output | Structured Output | JSON schema, "JSON-only" directive, no prose | API integrations, data extraction |
| Field extraction | Deterministic Extractor | Exact schema, missing -> null, no transformations | Form data, invoice parsing |
| Use retrieved context | RAG Workflow | Context relevance check, chunk citations, explicit missing info | Knowledge bases, documentation search |
| Internal reasoning | Hidden Chain-of-Thought | Internal reasoning, final answer only | Classification, complex decisions |
| Tool-using agent | Tool/Agent Planner | Plan-then-act, one tool per turn | Multi-step workflows, API calls |
| Text transformation | Rewrite + Constrain | Style rules, meaning preservation, format spec | Content adaptation, summarization |
| Classification | Decision Tree | Ordered branches, mutually exclusive, JSON result | Routing, categorization, triage |
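The "missing -> null, no transformations" contract of the Deterministic Extractor row can be enforced with a post-validation step. A minimal stdlib sketch (the field names are hypothetical):

```python
import json

# Minimal post-validation for a deterministic extractor (field names are
# hypothetical). Missing fields become None (JSON null); extra fields are
# rejected so the output stays schema-exact.
SCHEMA_FIELDS = ["invoice_number", "total", "due_date"]

def validate_extraction(raw: str) -> dict:
    data = json.loads(raw)  # raises if the model emitted prose, not JSON
    extra = set(data) - set(SCHEMA_FIELDS)
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    # Normalize: every schema field present, absent ones mapped to null.
    return {field: data.get(field) for field in SCHEMA_FIELDS}

result = validate_extraction('{"invoice_number": "INV-42", "total": 99.5}')
```

Keeping the validator outside the prompt makes the schema the source of truth even when the model drifts.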
## Decision Tree: Choosing the Right Pattern

```
User needs: [Prompt Type]
|-- Output must be machine-readable?
|   |-- Extract specific fields only? -> Deterministic Extractor Pattern
```
## Context Engineering
True expertise in prompting extends beyond writing instructions to shaping the entire context in which the model operates. Context engineering encompasses:
- **Conversation history**: what prior turns inform the current response
- **Retrieved context (RAG)**: external knowledge injected into the prompt
- **Structured inputs**: JSON schemas, system/user message separation
- **Tool outputs**: results from previous tool calls that shape next steps
### Context Engineering vs Prompt Engineering

| Aspect | Prompt Engineering | Context Engineering |
|---|---|---|
| Focus | Instruction text | Full input pipeline |
| Scope | Single prompt | RAG + history + tools |
| Optimization | Word choice, structure | Information architecture |
| Goal | Clear instructions | Optimal context window |
### Key Context Engineering Patterns
**1. Context prioritization.** Place the most relevant information first; models attend more strongly to early context.

**2. Context compression.** Summarize history, truncate tool outputs, and select only the most relevant RAG chunks.

**3. Context separation.** Use clear delimiters (e.g., XML-style tags such as `<instructions>`, `<context>`, `<examples>`) to separate instruction types.

**4. Dynamic context.** Adjust context to task complexity: simple tasks need less context, complex tasks need more.
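Patterns 1 and 2 can be sketched as a small packing routine (relevance scores and the character-based budget are stand-ins; a real system would use a tokenizer and retriever scores):

```python
# Sketch of context prioritization + compression: rank retrieved chunks
# by relevance and pack the most relevant first, truncating bulky tool
# output to a hard budget. Scoring/counting are simplified stand-ins.
def build_context(chunks: list[tuple[float, str]], tool_output: str,
                  max_chars: int = 2000, tool_budget: int = 500) -> str:
    # 1. Context prioritization: most relevant chunks first.
    ranked = [text for score, text in sorted(chunks, reverse=True)]
    # 2. Context compression: truncate bulky tool output.
    tool_part = tool_output[:tool_budget]
    parts, used = [tool_part], len(tool_part)
    for text in ranked:
        if used + len(text) > max_chars:
            break  # drop lower-relevance chunks once the budget is spent
        parts.append(text)
        used += len(text)
    return "\n\n".join(parts)

ctx = build_context(
    chunks=[(0.2, "low-relevance chunk"), (0.9, "high-relevance chunk")],
    tool_output="tool result " * 100,
)
```

The same skeleton extends to pattern 4 by making `max_chars` a function of task complexity.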
## Core Concepts vs Implementation Practices
### Core Concepts (Vendor-Agnostic)
- **Prompt contract**: inputs, allowed tools, output schema, max tokens, and refusal rules.
- **Context engineering**: conversation history, RAG context, tool outputs, and structured inputs shape model behavior.
- **Determinism controls**: temperature/top_p, constrained decoding/structured outputs, and strict formatting.
- **Cost & latency budgets**: prompt length and max output drive tokens and tail latency; enforce hard limits and measure p95/p99.
- **Evaluation**: golden sets + regression gates + A/B + post-deploy monitoring.
- **Security**: prompt injection, data exfiltration, and tool misuse are primary threats (OWASP LLM Top 10: https://owasp.org/www-project-top-10-for-large-language-model-applications/).
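A golden set with a regression gate can be as small as this (the model call is stubbed; the cases, baseline, and scoring are illustrative):

```python
# Sketch of a golden-set regression gate: run cases through the model
# (stubbed here), score exact-match accuracy, and fail the gate if it
# drops below the previously recorded baseline.
GOLDEN_SET = [
    {"input": "2+2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
]

def call_model(prompt: str) -> str:
    # Stub standing in for a real model/API call.
    return {"2+2": "4", "capital of France": "Paris"}[prompt]

def run_gate(baseline: float) -> bool:
    passed = sum(call_model(c["input"]) == c["expected"] for c in GOLDEN_SET)
    accuracy = passed / len(GOLDEN_SET)
    return accuracy >= baseline  # False => block the merge (regression)

gate_ok = run_gate(baseline=0.95)
```

Wiring this into CI turns the golden set into a merge gate rather than an occasional spot check.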
### Implementation Practices (Model/Platform-Specific)
- Use model-specific structured output features when available; keep a schema validator as the source of truth.
- Align tracing/metrics with the OpenTelemetry GenAI semantic conventions (https://opentelemetry.io/docs/specs/semconv/gen-ai/).
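For the tracing point, span attributes would follow the GenAI semantic convention names; a minimal sketch (the conventions are still marked experimental, so names may change, and the model id and token counts here are made up):

```python
# Example span attributes per the OpenTelemetry GenAI semantic
# conventions (experimental; attribute names may evolve). Values are
# illustrative placeholders, not real measurements.
span_attributes = {
    "gen_ai.system": "anthropic",
    "gen_ai.request.model": "claude-sonnet-4",   # hypothetical model id
    "gen_ai.request.temperature": 0.0,
    "gen_ai.usage.input_tokens": 1200,
    "gen_ai.usage.output_tokens": 150,
}
```

Recording usage tokens per span is what makes the cost and p95/p99 latency budgets above enforceable from telemetry.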
## Do / Avoid
### Do
- Keep prompts small and modular; centralize shared fragments (policies, schemas, style).
- Add a prompt eval harness and block merges on regressions.
- Prefer "brief justification" over requesting chain-of-thought; treat hidden reasoning as model-internal.
### Avoid
- Prompt sprawl (many near-duplicates with no owner or tests).
- Brittle multi-step chains without intermediate validation.
- Mixing policy and product copy in the same prompt (harder to audit and update).
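Centralized shared fragments can be sketched as a simple composition layer (the fragment texts and names are illustrative):

```python
# Sketch of centralized shared fragments: policies, schemas, and style
# live in one place and are composed into task prompts, so a policy edit
# propagates everywhere. Fragment contents are illustrative.
FRAGMENTS = {
    "policy": "Refuse requests for personal data.",
    "style": "Answer concisely in plain English.",
    "schema": 'Respond as JSON: {"answer": string}.',
}

def compose(task: str, *fragment_names: str) -> str:
    parts = [FRAGMENTS[name] for name in fragment_names]
    return "\n\n".join([task, *parts])

summarizer = compose("Summarize the input text.", "style", "schema")
classifier = compose("Classify the input text.", "policy", "schema")
```

Because every prompt is built from the same named fragments, near-duplicate sprawl is avoided and each fragment can carry its own tests and owner.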
## Navigation: Core Patterns
- **Core Patterns**: 7 production-grade prompt patterns
  - Structured Output (JSON), Deterministic Extractor, RAG Workflow
  - Hidden Chain-of-Thought, Tool/Agent Planner, Rewrite + Constrain, Decision Tree
  - Each pattern includes a structure template and validation checklist
## Navigation: Best Practices
- **Best Practices (Core)**: foundation rules for production-grade prompts
  - System instruction design, output contract specification, action directives
  - Context handling, error recovery, positive framing, style matching, style-adversarial red teaming
  - Anti-patterns, Claude 4+ specific optimizations
- **Production Guidelines**: deployment and operational guidance
  - Evaluation & testing (Prompt CI/CD), model parameters, few-shot selection
  - Safety & guardrails, conversation memory, context compaction resilience
  - Answer engineering, decomposition, multilingual/multimodal, benchmarking
  - CI/CD tools (2026): Promptfoo, DeepEval integration patterns
  - Security (2026): PromptGuard 4-layer defense, Microsoft Prompt Shields, taint tracking
- **Quality Checklists**: validation checklists before deployment
  - Prompt QA, JSON validation, agent workflow checks
  - RAG workflow, safety & security, performance optimization
  - Testing coverage, anti-patterns, quality score rubric
- **Domain-Specific Patterns**: Claude 4+ optimized patterns for specialized domains
  - Frontend/visual code: creativity encouragement, design variations, micro-interactions
  - Research tasks: success criteria, verification, hypothesis tracking
  - Agentic coding: no-speculation rule, principled implementation, investigation patterns
  - Cross-domain best practices and quality modifiers
## Navigation: Specialized Patterns
- **RAG Patterns**: retrieval-augmented generation workflows
  - Context grounding, chunk citation, missing information handling
- **Agent and Tool Patterns**: tool use and agent orchestration
  - Plan-then-act workflows, tool calling, multi-step reasoning, generate-verify-revise chains
  - Multi-agent orchestration (2026): centralized, handoff, and federated patterns; plan-and-execute (90% cost reduction)
- **Extraction Patterns**: deterministic field extraction
  - Schema-based extraction, null handling, no hallucinations
- **Reasoning Patterns (Hidden CoT)**: internal reasoning without visible output
  - Hidden reasoning, final answer only, classification workflows
  - Extended Thinking API (Claude 4+): budget management, think tool, multishot patterns
- **Additional Patterns**: extended prompt engineering techniques
  - Advanced patterns, edge cases, optimization strategies
- **Prompt Testing & CI/CD**: automated prompt evaluation pipelines
  - Promptfoo, DeepEval integration, regression detection, A/B testing, quality gates
- **Multimodal Prompt Patterns**: vision, audio, and document input patterns
  - Image description, OCR+LLM, bounding box prompts, Whisper conditioning, video frame analysis
- **Prompt Security & Defense**: securing LLM applications against adversarial attacks
  - Injection detection (PromptGuard, Prompt Shields), defense-in-depth, taint tracking, red team testing
## Navigation: Templates
Templates are copy-paste ready and organized by complexity:
### Quick Templates
- **Quick Template**: fast, minimal prompt structure
### Standard Templates
- **Standard Template**: production-grade operational prompt
- **Agent Template**: tool-using agent with planning
- **RAG Template**: retrieval-augmented generation
- **Chain-of-Thought Template**: hidden reasoning pattern
- **JSON Extractor Template**: deterministic field extraction
- **Prompt Evaluation Template**: regression tests, A/B testing, rollout gates
## External Resources
External references are listed in `data/sources.json`:
- Official documentation (OpenAI, Anthropic, Google)
- LLM frameworks (LangChain, LlamaIndex)
- Vector databases (Pinecone, Weaviate, FAISS)
- Evaluation tools (OpenAI Evals, HELM)
- Safety guides and standards
- RAG and retrieval resources
## Freshness Rule (2026)
When asked for the "latest" prompting recommendations, prefer provider docs and standards from `data/sources.json`. If web search is unavailable, state the constraint and avoid overconfident "current best" claims.