model-deployment

Installs: 54
Rank: #13747

Install

npx skills add https://github.com/secondsky/claude-skills --skill model-deployment

ML Model Deployment

Deploy trained models to production with proper serving and monitoring.

Deployment Options

| Method    | Use Case               | Latency  |
|-----------|------------------------|----------|
| REST API  | Web services           | Medium   |
| Batch     | Large-scale processing | N/A      |
| Streaming | Real-time              | Low      |
| Edge      | On-device              | Very low |

FastAPI Model Server

    from fastapi import FastAPI
    from pydantic import BaseModel
    import joblib
    import numpy as np

    app = FastAPI()
    model = joblib.load('model.pkl')

    class PredictionRequest(BaseModel):
        features: list[float]

    class PredictionResponse(BaseModel):
        prediction: float
        probability: float

    @app.get('/health')
    def health():
        return {'status': 'healthy'}

    @app.post('/predict', response_model=PredictionResponse)
    def predict(request: PredictionRequest):
        # Reshape the flat feature list into a single-row 2D array
        features = np.array(request.features).reshape(1, -1)
        prediction = model.predict(features)[0]
        probability = model.predict_proba(features)[0].max()
        # Cast NumPy scalars to plain floats for the response model
        return PredictionResponse(prediction=float(prediction), probability=float(probability))

Docker Deployment

    FROM python:3.11-slim

    WORKDIR /app
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt

    COPY model.pkl .
    COPY app.py .

    EXPOSE 8000
    CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

Model Monitoring

    from datetime import datetime

    class ModelMonitor:
        def __init__(self):
            self.predictions = []
            self.latencies = []

        def log_prediction(self, input_data, prediction, latency):
            self.predictions.append({
                'input': input_data,
                'prediction': prediction,
                'latency': latency,
                'timestamp': datetime.now()
            })

        def detect_drift(self, reference_distribution):
            # Compare current predictions to reference
            pass

Deployment Checklist

- Model validated on test set
- API endpoints documented
- Health check endpoint
- Authentication configured
- Logging and monitoring setup
- Model versioning in place
- Rollback procedure documented

Quick Start: Deploy Model in 6 Steps

1. Save trained model

    import joblib
    joblib.dump(model, 'model.pkl')

2. Create FastAPI app (see references/fastapi-production-server.md)

Write app.py with /predict and /health endpoints.

3. Create Dockerfile

    cat > Dockerfile << 'EOF'
    FROM python:3.11-slim
    WORKDIR /app
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt
    COPY app.py model.pkl ./
    CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
    EOF

4. Build and test locally

    docker build -t model-api:v1.0.0 .
    docker run -p 8000:8000 model-api:v1.0.0

5. Push to registry

    docker tag model-api:v1.0.0 registry.example.com/model-api:v1.0.0
    docker push registry.example.com/model-api:v1.0.0

6. Deploy to Kubernetes

    kubectl apply -f deployment.yaml
    kubectl rollout status deployment/model-api

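Once the container from step 4 is running, a short smoke test can confirm that both endpoints respond before the image is pushed. The following is a minimal sketch using only the standard library. It assumes the service listens on http://localhost:8000 and accepts the `{"features": [...]}` body defined by PredictionRequest; the function names `build_predict_request` and `smoke_test` are hypothetical.

```python
import json
import urllib.request


def build_predict_request(base_url: str, features: list[float]) -> urllib.request.Request:
    """Build a POST request for /predict matching the PredictionRequest schema."""
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def smoke_test(base_url: str, features: list[float], timeout: float = 5.0) -> dict:
    """Call /health, then /predict, and return the decoded prediction payload."""
    health_url = f"{base_url.rstrip('/')}/health"
    with urllib.request.urlopen(health_url, timeout=timeout) as resp:
        health = json.load(resp)
    if health.get("status") != "healthy":
        raise RuntimeError(f"Unexpected health response: {health}")

    request = build_predict_request(base_url, features)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

With the container running, `smoke_test("http://localhost:8000", [0.1, 0.2, 0.3])` would return the prediction payload; the number of features must match whatever the trained model expects.
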
Known Issues Prevention
1. No Health Checks = Downtime
Problem
Load balancer sends traffic to unhealthy pods, causing 503 errors.
Solution
Implement both liveness and readiness probes:

app.py:

    from fastapi import HTTPException

    # Liveness: is the service alive?
    @app.get("/health")
    async def health():
        return {"status": "healthy"}

    # Readiness: can it handle traffic?
    # model_store is assumed to be defined elsewhere in app.py
    @app.get("/ready")
    async def ready():
        try:
            _ = model_store.model  # Verify the model is loaded
            return {"status": "ready"}
        except Exception:
            raise HTTPException(503, "Not ready")


deployment.yaml:

    livenessProbe:
      httpGet:
        path: /health
        port: 8000
      initialDelaySeconds: 30
    readinessProbe:
      httpGet:
        path: /ready
        port: 8000
      initialDelaySeconds: 5

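The readiness handler reads `model_store.model`, but the excerpt does not show how `model_store` is defined. One possible shape, offered as a sketch rather than the skill's actual implementation, is a small holder whose `model` property raises until the model has been loaded. The class name `ModelStore` and its `loader` parameter are assumptions made for illustration.

```python
import threading
from typing import Any, Callable, Optional


class ModelStore:
    """Holds the loaded model; reading `.model` before load() raises."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader          # zero-argument callable that returns the model
        self._model: Optional[Any] = None
        self._lock = threading.Lock()  # guards against concurrent load() calls

    def load(self) -> None:
        """Run the loader and keep its result for later requests."""
        with self._lock:
            self._model = self._loader()

    @property
    def model(self) -> Any:
        """Return the model, or raise so /ready can answer 503."""
        if self._model is None:
            raise RuntimeError("Model is not loaded yet")
        return self._model
```

In app.py this could be created as `model_store = ModelStore(lambda: joblib.load("model.pkl"))` with `model_store.load()` called at startup, so that /ready returns 503 until loading has finished.
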
2. Model Not Found Errors in Container
Problem
FileNotFoundError: model.pkl when the container starts.
Solution
Verify that the model file is copied in the Dockerfile and that the path matches what the code expects:

    # ❌ Wrong: model copied to a directory the code does not read
    # (the code expects /app/model.pkl)
    COPY model.pkl /app/models/

    # ✅ Correct: consistent paths
    COPY model.pkl /models/model.pkl
    ENV MODEL_PATH=/models/model.pkl

In Python:

    import os

    model_path = os.getenv("MODEL_PATH", "/models/model.pkl")

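Reading MODEL_PATH is only half of the fix; checking the path at startup turns a late, confusing failure into an immediate one with a clear message. The helper below is a sketch under that assumption. The name `resolve_model_path` is hypothetical, and the default mirrors the path used above.

```python
import os
from pathlib import Path


def resolve_model_path(default: str = "/models/model.pkl") -> Path:
    """Return the model path from MODEL_PATH, failing fast if the file is missing."""
    path = Path(os.getenv("MODEL_PATH", default))
    if not path.is_file():
        raise FileNotFoundError(
            f"Model file not found at {path}. "
            "Check the Dockerfile COPY destination and the MODEL_PATH variable."
        )
    return path
```

Calling this once before `joblib.load(...)` makes a mismatched COPY destination show up in the container logs as soon as the process starts.
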
3. Unhandled Input Validation = 500 Errors
Problem
Invalid inputs crash API with unhandled exceptions.
Solution
Use Pydantic for automatic validation:

    from typing import List

    import numpy as np
    from pydantic import BaseModel, Field, validator

    # Pydantic v1-style API; v2 renames validator to field_validator
    # and min_items/max_items to min_length/max_length.
    class PredictionRequest(BaseModel):
        features: List[float] = Field(..., min_items=1, max_items=100)

        @validator('features')
        def validate_finite(cls, v):
            # Reject NaN and infinite values
            if not all(np.isfinite(val) for val in v):
                raise ValueError("All features must be finite")
            return v

    # FastAPI auto-validates and returns 422 for invalid requests
    @app.post("/predict")
    async def predict(request: PredictionRequest):
        # Request is guaranteed valid here
        pass

4. No Drift Monitoring = Silent Degradation
Problem
Model performance degrades over time, no one notices until users complain.
Solution
Implement drift detection (see references/model-monitoring-drift.md):

    import time

    monitor = ModelMonitor(reference_data=training_data, drift_threshold=0.1)

    @app.post("/predict")
    async def predict(request: PredictionRequest):
        features = np.array(request.features).reshape(1, -1)
        start = time.perf_counter()
        prediction = model.predict(features)
        latency = time.perf_counter() - start
        monitor.log_prediction(features, prediction, latency)

        # Alert if drift detected
        if monitor.should_retrain():
            alert_manager.send_alert("Model drift detected - retrain recommended")
        # Convert the NumPy array so the response is JSON-serializable
        return prediction.tolist()

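How `should_retrain()` decides is left to the referenced file, which is not reproduced here. As an illustration of one common approach, the sketch below computes a two-sample Kolmogorov-Smirnov statistic in plain Python and compares it with a threshold. Treating the `drift_threshold` of 0.1 as a cut-off on the KS statistic (rather than on a p-value) is an assumption, and the function names are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(reference: Sequence[float], current: Sequence[float]) -> float:
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    if not reference or not current:
        raise ValueError("Both samples must be non-empty")
    ref, cur = sorted(reference), sorted(current)
    max_gap = 0.0
    # The largest gap between two step functions occurs at an observed value.
    for value in ref + cur:
        ref_cdf = bisect_right(ref, value) / len(ref)
        cur_cdf = bisect_right(cur, value) / len(cur)
        max_gap = max(max_gap, abs(ref_cdf - cur_cdf))
    return max_gap


def drift_detected(
    reference: Sequence[float], current: Sequence[float], threshold: float = 0.1
) -> bool:
    """Flag drift when the KS statistic exceeds the threshold."""
    return ks_statistic(reference, current) > threshold
```

The statistic ranges from 0 (identical empirical distributions) to 1 (no overlap). It can be applied per feature or to the model's output scores, using the training data as the reference sample.
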
5. Missing Resource Limits = OOM Kills
Problem
Pod killed by Kubernetes OOMKiller, service goes down.
Solution
Set memory/CPU limits and requests:

    resources:
      requests:
        memory: "512Mi"  # Guaranteed
        cpu: "500m"
      limits:
        memory: "1Gi"  # Max allowed
        cpu: "1000m"

Monitor actual usage:

    kubectl top pods

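Both the manifest and `kubectl top pods` express memory with suffixes such as Mi and Gi, and CPU in millicores such as 500m, so comparing observed usage with a limit means converting both to plain numbers first. The sketch below does that for the common suffixes only; the helper names are hypothetical, and less common quantity forms (for example exponent notation) are not handled.

```python
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
_DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def parse_memory(quantity: str) -> int:
    """Convert a Kubernetes memory quantity such as '512Mi' to bytes."""
    for suffix, factor in {**_BINARY, **_DECIMAL}.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)  # no suffix: already a byte count


def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity such as '500m' or '1' to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def memory_usage_ratio(used: str, limit: str) -> float:
    """Fraction of the memory limit currently in use."""
    return parse_memory(used) / parse_memory(limit)
```

A pod whose ratio stays close to 1.0 under normal load is a candidate for an OOM kill, which suggests raising the limit or reducing the model's memory footprint.
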
6. No Rollback Plan = Stuck on Bad Deploy
Problem
New model version has bugs, no way to revert quickly.
Solution
Tag images with versions and keep the previous deployment:

    # Deploy with version tag
    kubectl set image deployment/model-api model-api=registry/model-api:v1.2.0

    # If issues, rollback to previous
    kubectl rollout undo deployment/model-api

    # Or specify version
    kubectl set image deployment/model-api model-api=registry/model-api:v1.1.0

7. Synchronous Prediction = Slow Batch Processing
Problem
Processing 10,000 predictions one-by-one takes hours.
Solution
Implement batch endpoint:

    @app.post("/predict/batch")
    async def predict_batch(request: BatchPredictionRequest):
        # Process all at once (vectorized)
        features = np.array(request.instances)
        predictions = model.predict(features)  # Much faster!
        return {"predictions": predictions.tolist()}

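Vectorized prediction removes the per-request overhead, but a single very large batch can still exhaust memory. One way to bound this, sketched here as an assumption rather than as part of the skill, is to split the instances into fixed-size chunks and run the vectorized call once per chunk. The names `iter_chunks` and `predict_in_chunks` and the default chunk size of 1000 are illustrative.

```python
from typing import Callable, Iterator, Sequence


def iter_chunks(instances: Sequence, chunk_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most chunk_size instances."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(instances), chunk_size):
        yield instances[start:start + chunk_size]


def predict_in_chunks(
    predict_fn: Callable[[Sequence], Sequence],
    instances: Sequence,
    chunk_size: int = 1000,
) -> list:
    """Run predict_fn once per chunk and concatenate the results in order."""
    predictions: list = []
    for chunk in iter_chunks(instances, chunk_size):
        predictions.extend(predict_fn(chunk))
    return predictions
```

Inside the batch endpoint, `predict_fn` could be `lambda chunk: model.predict(np.array(chunk)).tolist()`, which keeps each call vectorized while capping the size of the array built per call.
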
8. No CI/CD Validation = Deploy Bad Models
Problem
Deploying model that fails basic tests, breaking production.
Solution
Validate in CI pipeline (see references/cicd-ml-models.md):

.github/workflows/deploy.yml:

    - name: Validate model performance
      run: |
        python scripts/validate_model.py \
          --model model.pkl \
          --test-data test.csv \
          --min-accuracy 0.85  # Fail if below threshold

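The workflow step calls scripts/validate_model.py, whose contents are not shown. The sketch below illustrates what such a script might look like, using the three flags from the workflow. It assumes a classifier saved with joblib and a CSV test set whose target column is named `label` with numeric values; both are assumptions, as are the helper names.

```python
import argparse
import csv
from typing import Optional, Sequence


def accuracy(y_true: Sequence, y_pred: Sequence) -> float:
    """Fraction of predictions that equal the true label."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("Label and prediction lists must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def load_test_data(path: str, label_column: str = "label"):
    """Read a CSV into (features, labels); label_column is an assumed name."""
    with open(path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    labels = [float(row.pop(label_column)) for row in rows]
    features = [[float(value) for value in row.values()] for row in rows]
    return features, labels


def main(argv: Optional[Sequence[str]] = None) -> int:
    parser = argparse.ArgumentParser(description="Fail CI if accuracy is too low")
    parser.add_argument("--model", required=True)
    parser.add_argument("--test-data", required=True)
    parser.add_argument("--min-accuracy", type=float, required=True)
    args = parser.parse_args(argv)

    import joblib  # imported here so the helpers above need only the standard library

    model = joblib.load(args.model)
    features, labels = load_test_data(args.test_data)
    score = accuracy(labels, [float(p) for p in model.predict(features)])
    print(f"accuracy={score:.4f} (minimum {args.min_accuracy})")
    return 0 if score >= args.min_accuracy else 1  # non-zero exit fails the step
```

As a script, the file would end with `raise SystemExit(main())` so that a score below the threshold produces a non-zero exit code and stops the pipeline before deployment.
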
Best Practices
- Version everything: models (semantic versioning), Docker images, deployments
- Monitor continuously: latency, error rate, drift, resource usage
- Test before deploy: unit tests, integration tests, performance benchmarks
- Deploy gradually: canary (10%), then full rollout
- Plan for rollback: keep the previous version, document the procedure
- Log predictions: enable debugging and drift detection
- Set resource limits: prevent OOM kills and resource contention
- Use health checks: enable proper load balancing
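The canary practice above sends roughly 10% of traffic to the new version before a full rollout. In Kubernetes this split is usually configured at the ingress or service-mesh layer, but the underlying idea can be illustrated at the application level. The sketch below, with the hypothetical name `routes_to_canary`, hashes a request or user ID into one of 100 buckets so that the same ID always lands on the same version.

```python
import hashlib


def routes_to_canary(request_id: str, canary_percent: int = 10) -> bool:
    """Deterministically assign about canary_percent% of IDs to the canary."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket in 0..99
    return bucket < canary_percent
```

Because the assignment depends only on the ID, a given user sees consistent behavior during the rollout, and raising `canary_percent` only moves additional IDs to the new version without reshuffling existing ones.
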
When to Load References
Load reference files for detailed implementations:
FastAPI Production Server
Load references/fastapi-production-server.md for a complete production-ready FastAPI implementation with error handling, validation (Pydantic models), logging, health/readiness probes, batch predictions, model versioning, middleware, exception handlers, and performance optimizations (caching, async).
Model Monitoring & Drift
Load references/model-monitoring-drift.md for a ModelMonitor implementation with KS-test drift detection, Jensen-Shannon divergence, Prometheus metrics integration, alert configuration (Slack, email), a continuous monitoring service, and dashboard endpoints.
Containerization & Deployment
Load references/containerization-deployment.md for multi-stage Dockerfiles, model versioning in containers, Docker Compose setup, A/B testing with Nginx, Kubernetes deployments (rolling update, blue-green, canary), GitHub Actions CI/CD, and deployment checklists.
CI/CD for ML Models
Load references/cicd-ml-models.md for a complete GitHub Actions pipeline with model validation, data validation, automated testing, security scanning, performance benchmarks, automated rollback, and deployment strategies.