TQUSI0682_5573 - Full Stack MLOps Engineer (Databricks / ML Applications)

Job Type: Contract

Work Mode: Hybrid (2 days in office)

MLOps Engineer - CI/CD for ML Models


Position Overview

We are seeking an MLOps Engineer to build and maintain CI/CD pipelines for machine learning models and scripts. This role bridges the gap between data science and production engineering, ensuring ML models are deployed reliably, monitored effectively, and updated seamlessly in production environments.


Key Responsibilities


  1. Build and deploy ML applications on Databricks (end-to-end)
  2. Develop CI/CD pipelines for ML workflows and data pipelines
  3. Work with Databricks (Delta Lake, notebooks, jobs, workflows)
  4. Build APIs (Python/FastAPI) to serve ML models
  5. Containerize and deploy applications using Docker & Kubernetes
  6. Implement monitoring, logging, and model performance tracking
  7. Collaborate with data scientists to productionize models


Required Qualifications

Technical Skills

Programming & Scripting:

  • Python (advanced) - Primary language for ML and automation
  • Bash/Shell scripting for automation
  • YAML for configuration management
  • Understanding of software engineering best practices

CI/CD Tools:

  • GitHub Actions, GitLab CI/CD, or Jenkins - Building automated pipelines
  • Experience with pipeline-as-code concepts
  • Automated testing frameworks (pytest, unittest)

Containerization & Orchestration:

  • Docker - Container creation and management (required)
  • Kubernetes - Container orchestration (intermediate level)
  • Docker Compose for local development
  • Container registries (Docker Hub, ECR, ACR, GCR)

Cloud Platforms:

  • Experience with at least one of AWS, Azure, or GCP
  • Cloud ML services (SageMaker, Azure ML, Vertex AI)
  • Cloud storage (S3, Blob Storage, GCS)
  • Compute services (EC2, VMs, Cloud Run)

MLOps Tools:

  • MLflow - Experiment tracking and model registry
  • DVC (Data Version Control) - Data and model versioning
  • Weights & Biases, Neptune.ai, or similar (nice to have)

Infrastructure as Code:

  • Terraform or CloudFormation/ARM templates
  • Experience managing infrastructure through code
  • Understanding of state management

Version Control:

  • Git (advanced) - Branching strategies, merge workflows
  • GitHub/GitLab/Bitbucket repository management

ML Knowledge

Understanding of ML Workflows:

  • Familiarity with ML model training and inference
  • Understanding of model formats (pickle, ONNX, SavedModel, TorchScript)
  • Knowledge of ML frameworks (scikit-learn, TensorFlow, PyTorch) - building models is not required, but you must understand how these frameworks work
  • Awareness of ML lifecycle (training, validation, deployment, monitoring)

Model Serving:

  • FastAPI or Flask - Building REST APIs for model serving
  • TensorFlow Serving, TorchServe, or ONNX Runtime (nice to have)
  • Understanding of model optimization (quantization, pruning)

Monitoring & Observability

Monitoring Tools:

  • Prometheus & Grafana - Metrics and dashboards
  • ELK Stack (Elasticsearch, Logstash, Kibana) or similar for logging
  • Cloud monitoring (CloudWatch, Azure Monitor, Google Cloud Monitoring, formerly Stackdriver)

ML-Specific Monitoring:

  • Model drift detection (Evidently AI, Arize, WhyLabs)
  • Data quality monitoring
  • Performance metrics tracking


DevOps & Software Engineering

Best Practices:

  • Agile/Scrum methodologies
  • Code review processes
  • Documentation standards
  • Security best practices for ML systems

Testing:

  • Unit testing, integration testing
  • Test-driven development (TDD) concepts
  • Data validation and schema testing


Experience Requirements

  • 3-5+ years in DevOps, MLOps, or software engineering
  • 1-2+ years specifically working with ML model deployment and CI/CD
  • Proven track record of building and maintaining production ML systems
  • Experience with cloud platforms and containerization
  • Hands-on experience with CI/CD pipeline development
