# MLOps Fundamentals: The Lifecycle of Machine Learning
Estimated time to read: 3 minutes
Machine Learning Operations (MLOps) is the application of DevOps principles to the machine learning lifecycle. While traditional DevOps focuses on code and binary versions, MLOps must handle Code + Data + Models.
This guide covers the foundational stages of a production-grade MLOps environment.
## 1. MLOps vs. Traditional DevOps
| Feature | DevOps | MLOps |
|---|---|---|
| Primary Unit | Code / Binary | Code + Data + Model |
| Success Metric | Uptime, Latency, Throughput | Accuracy, Precision, Drift |
| Testing | Unit & Integration tests | Model validation, Data quality tests |
| Monitoring | System health | Prediction drift, Data drift |
## 2. The MLOps Lifecycle
```mermaid
graph LR
    A[Data Prep] --> B[Model Training]
    B --> C[Model Validation]
    C --> D[Model Registry]
    D --> E[Deployment / Serving]
    E --> F[Monitoring]
    F --> G{Drift Detected?}
    G -- Yes --> A
    G -- No --> F
```

### Stage 1: Data Preparation & Feature Engineering
ML models are only as good as the data they are trained on. This stage involves cleaning, transforming, and tracking data versions.

- Feature Store: A centralized repository for sharing and managing features across training and serving. See Feature Engineering, DevOps & Observability.
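The two ideas above can be sketched in a few lines: fingerprinting a dataset so a training run can record exactly which data version it saw, and defining one feature transform that both training and serving call (the consistency guarantee a feature store provides). The field names and bucketing logic are purely illustrative.

```python
import hashlib
import json

def dataset_fingerprint(rows: list[dict]) -> str:
    """Hash the serialized rows so a training run can record exactly
    which data version it was trained on."""
    payload = json.dumps(rows, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()[:12]

def engineer_features(row: dict) -> dict:
    """One shared transform used by BOTH training and serving paths --
    the consistency guarantee a feature store provides."""
    return {
        "amount_digits": len(str(int(row["amount"]))),  # crude magnitude bucket
        "is_weekend": int(row["day_of_week"] in (5, 6)),
    }

rows = [
    {"amount": 120, "day_of_week": 5},
    {"amount": 80_000, "day_of_week": 2},
]
version = dataset_fingerprint(rows)          # e.g. record this with the run
features = [engineer_features(r) for r in rows]
```

In production the shared transform would live in the feature store itself rather than in application code, but the principle is the same: train-time and serve-time features must come from one definition.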
### Stage 2: Model Training & Experiment Tracking
Instead of just committing code, you track experiments: hyper-parameters, data versions, and training results.

- Tools: MLflow, Weights & Biases, Kubeflow.
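A minimal sketch of the record an experiment tracker keeps, using only the standard library (real trackers such as MLflow or Weights & Biases persist the same fields plus model artifacts; the parameter names here are illustrative):

```python
import time
import uuid

def log_run(params: dict, metrics: dict, data_version: str) -> dict:
    """Record everything needed to reproduce a training run:
    hyper-parameters, results, and the exact data version used."""
    return {
        "run_id": uuid.uuid4().hex[:8],
        "timestamp": time.time(),
        "params": params,              # hyper-parameters
        "metrics": metrics,            # training results
        "data_version": data_version,  # which data the model saw
    }

run = log_run(
    params={"learning_rate": 0.01, "max_depth": 6},
    metrics={"val_accuracy": 0.93},
    data_version="a3f9c2d1e4b5",
)
```

The key point is that the data version is logged alongside the hyper-parameters: without it, the run cannot be reproduced no matter how carefully the code was versioned.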
### Stage 3: Model Registry
A versioned store for your trained models. You never deploy a "loose" file from a developer's laptop. You deploy a versioned artefact from the registry.
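A toy registry makes the "no loose files" rule concrete: every artifact gets a version number and a stage, and deployment only ever pulls the current production version. This is a simplified sketch; real registries (e.g., MLflow Model Registry) add access control, lineage, and artifact storage.

```python
from dataclasses import dataclass, field

@dataclass
class ModelRegistry:
    """Minimal registry: deployments pull a named, numbered version,
    never a file off a laptop."""
    _versions: dict = field(default_factory=dict)

    def register(self, name: str, artifact: bytes, metrics: dict) -> int:
        versions = self._versions.setdefault(name, [])
        versions.append({"artifact": artifact, "metrics": metrics,
                         "stage": "staging"})
        return len(versions)  # 1-based version number

    def promote(self, name: str, version: int) -> None:
        self._versions[name][version - 1]["stage"] = "production"

    def production_model(self, name: str):
        # Latest version marked production, or None if nothing is live.
        for entry in reversed(self._versions.get(name, [])):
            if entry["stage"] == "production":
                return entry["artifact"]
        return None

registry = ModelRegistry()
v = registry.register("churn", b"model-bytes-v1", {"auc": 0.88})
registry.promote("churn", v)
```

Note the staging-then-promote flow: registration alone does not make a model deployable, which is exactly the gate a registry is meant to enforce.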
### Stage 4: Deployment & Serving
Models can be deployed as:

- Batch Prediction: Scoring millions of rows on a schedule (e.g., in BigQuery ML).
- Real-time Serving: An API endpoint that returns a prediction in milliseconds (e.g., via Vertex AI or Seldon).
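The two serving modes wrap the same model differently: one request at a time when latency matters, a whole table at once when throughput matters. The stand-in "model" below is a hard-coded linear score with made-up feature names, just to show the shape of each path.

```python
def predict(features: dict) -> float:
    """Stand-in model: a fixed linear score over two illustrative features."""
    return 0.7 * features["tenure_years"] + 0.3 * features["is_active"]

# Real-time serving: one request in, one prediction out; latency matters.
def handle_request(payload: dict) -> dict:
    return {"score": predict(payload)}

# Batch prediction: score a whole table on a schedule; throughput matters.
def score_batch(table: list[dict]) -> list[float]:
    return [predict(row) for row in table]

online = handle_request({"tenure_years": 2.0, "is_active": 1})
offline = score_batch([
    {"tenure_years": 2.0, "is_active": 1},
    {"tenure_years": 0.5, "is_active": 0},
])
```

Because both paths call the same `predict`, a model promoted in the registry serves identical scores whether it runs behind an API or inside a nightly batch job.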
### Stage 5: Monitoring & Retraining
You must monitor for Drift:

- Data Drift: The input data distribution changes (e.g., a new sensor version starts sending different values).
- Model Drift: The model's performance degrades in production because the real-world relationship between inputs and outputs has changed.
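Data drift can be quantified without labels by comparing the live feature distribution to the training-time reference. One common statistic is the Population Stability Index (PSI); a frequent rule of thumb treats PSI above roughly 0.2 as meaningful drift. A minimal sketch, with a synthetic "new sensor reads higher" shift:

```python
import math

def psi(reference: list[float], live: list[float], bins: int = 5) -> float:
    """Population Stability Index between a reference (training-time)
    distribution and the live distribution of one feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0

    def bin_fractions(values):
        counts = [0] * bins
        for v in values:
            idx = max(min(int((v - lo) / width), bins - 1), 0)
            counts[idx] += 1
        return [max(c / len(values), 1e-6) for c in counts]  # avoid log(0)

    ref, cur = bin_fractions(reference), bin_fractions(live)
    return sum((r - c) * math.log(r / c) for r, c in zip(ref, cur))

reference = [float(i % 10) for i in range(100)]       # stable training data
shifted = [float(i % 10) + 4.0 for i in range(100)]   # sensor now reads higher
```

An identical distribution scores 0, while the shifted one trips the 0.2 alarm; in a real pipeline this check runs per feature on a schedule and feeds the "Drift Detected?" decision in the lifecycle diagram.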
## 3. MLOps Maturity Levels
The OpsAtScale Data Maturity Assessment maps these capabilities.
### 🔴 Level 0: Manual (No MLOps)
- Manual data prep and training.
- Models passed as files (e.g., `.pkl`) via email or Slack.
- No monitoring of performance.
### 🟡 Level 1: Automated Pipeline
- Code to train is versioned.
- Automated pipeline triggers training when new data arrives.
- Model validation gates before deployment.
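The validation gate in the last bullet is simple to express: deployment proceeds only if every metric clears its minimum. A sketch with illustrative thresholds:

```python
def validation_gate(metrics: dict, thresholds: dict):
    """Block deployment unless every metric clears its minimum.
    Returns (passed, list of human-readable failures)."""
    failures = [
        f"{name}: {metrics.get(name, 0.0):.3f} < {minimum}"
        for name, minimum in thresholds.items()
        if metrics.get(name, 0.0) < minimum
    ]
    return (not failures, failures)

ok, reasons = validation_gate(
    metrics={"accuracy": 0.91, "precision": 0.78},
    thresholds={"accuracy": 0.90, "precision": 0.80},
)
```

In a Level 1 pipeline this check runs automatically after training, and a failed gate stops the model from ever reaching the registry's production stage.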
### 🔵 Level 2: CI/CD/CT (Continuous Training)
- CT: The system automatically retrains the model when performance drifts or new data is available.
- A/B Testing: New models are compared against "champion" models before full rollout. See A/B Testing with BigQuery ML.
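A champion/challenger rollout has two moving parts: deterministic routing of a small traffic slice to the challenger, and a promotion rule that demands a clear win. A minimal sketch (the 10% share and 0.01 margin are illustrative choices, not fixed conventions):

```python
def route(request_id: int, challenger_share: float = 0.1) -> str:
    """Deterministically send a fixed slice of traffic to the challenger,
    so routing is repeatable per request."""
    return "challenger" if (request_id % 100) < challenger_share * 100 else "champion"

def promote_challenger(champion_metric: float, challenger_metric: float,
                       margin: float = 0.01) -> bool:
    """Promote only if the challenger beats the champion by a clear margin."""
    return challenger_metric >= champion_metric + margin

routes = [route(i) for i in range(100)]
```

A production A/B system would additionally check statistical significance on the challenger's slice before promoting; the margin here is a crude stand-in for that test.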
## 4. Summary Checklist
- Versioning: Are you versioning your data samples as well as your code?
- Registry: Are you using a formal Model Registry for all deployments?
- Monitoring: Do you have alerts for when prediction accuracy drops below a threshold?
- Drift: Do you monitor the statistical distribution of your input features?
- Reproducibility: Can you rebuild a model from 6 months ago with the exact same data and code?