Python Skills for LlamaFarm

Shared Python best practices and code review checklists for all Python components in the LlamaFarm monorepo.

Applicable Components Component Path Python Key Dependencies Server server/ 3.12+ FastAPI, Celery, Pydantic, structlog RAG rag/ 3.11+ LlamaIndex, ChromaDB, Celery Universal Runtime runtimes/universal/ 3.11+ PyTorch, transformers, FastAPI Config config/ 3.11+ Pydantic, JSONSchema Common common/ 3.10+ HuggingFace Hub Quick Reference Topic File Key Points Patterns patterns.md Dataclasses, Pydantic, comprehensions, imports Async async.md async/await, asyncio, concurrent execution Typing typing.md Type hints, generics, protocols, Pydantic Testing testing.md Pytest fixtures, mocking, async tests Errors error-handling.md Custom exceptions, logging, context managers Security security.md Path traversal, injection, secrets, deserialization Code Style

LlamaFarm uses ruff with shared configuration in ruff.toml:

line-length = 88 target-version = "py311" select = ["E", "F", "I", "B", "UP", "SIM"]

Key rules:

E, F: Core pyflakes and pycodestyle I: Import sorting (isort) B: Bugbear (common pitfalls) UP: Upgrade syntax to modern Python SIM: Simplify code patterns Architecture Patterns Settings with pydantic-settings from pydantic_settings import BaseSettings

class Settings(BaseSettings, env_file=".env"): LOG_LEVEL: str = "INFO" HOST: str = "0.0.0.0" PORT: int = 8000

settings = Settings() # Singleton at module level

Structured Logging with structlog from core.logging import FastAPIStructLogger # Server from core.logging import RAGStructLogger # RAG from core.logging import UniversalRuntimeLogger # Runtime

logger = FastAPIStructLogger(name) logger.info("Operation completed", extra={"count": 10, "duration_ms": 150})

Abstract Base Classes for Extensibility from abc import ABC, abstractmethod

class Component(ABC): def init(self, name: str, config: dict[str, Any] | None = None): self.name = name or self.class.name self.config = config or {}

@abstractmethod
def process(self, documents: list[Document]) -> ProcessingResult:
    pass

Dataclasses for Internal Data from dataclasses import dataclass, field

@dataclass class Document: content: str metadata: dict[str, Any] = field(default_factory=dict) id: str = field(default_factory=lambda: str(uuid.uuid4()))

Pydantic Models for API Boundaries from pydantic import BaseModel, Field, ConfigDict

class EmbeddingRequest(BaseModel): model: str input: str | list[str] encoding_format: Literal["float", "base64"] | None = "float"

model_config = ConfigDict(str_strip_whitespace=True)

Directory Structure

Each Python component follows this structure:

component/ ├── pyproject.toml # UV-managed dependencies ├── core/ # Core functionality │ ├── init.py │ ├── settings.py # Pydantic Settings │ └── logging.py # structlog setup ├── services/ # Business logic (server) ├── models/ # ML models (runtime) ├── tasks/ # Celery tasks (rag) ├── utils/ # Utility functions └── tests/ ├── conftest.py # Shared fixtures └── test_*.py

Review Checklist Summary

When reviewing Python code in LlamaFarm:

Patterns (Medium priority)

Modern Python syntax (3.10+ type hints) Dataclass vs Pydantic used appropriately No mutable default arguments

Async (High priority)

No blocking calls in async functions Proper asyncio.Lock usage Cancellation handled correctly

Typing (Medium priority)

Complete return type hints Generic types parameterized Pydantic v2 patterns

Testing (Medium priority)

Fixtures properly scoped Async tests use pytest-asyncio Mocks cleaned up

Errors (High priority)

Custom exceptions with context Structured logging with extra dict Proper exception chaining

Security (Critical priority)

Path traversal prevention Input sanitization Safe deserialization

See individual topic files for detailed checklists with grep patterns.

python-skills

安装