Machine Learning Engineer
September 2021 – April 2024 · Gurgaon, India
Productionized DS, Data Engineering, and BI services/pipelines on GCP (GKE, KServe, Triton Inference Server, PostgreSQL, Airflow and Vertex AI), implementing platform best practices, monitoring, and cost optimization.
ML Platform
Serving
Migrated 30+ DS workloads to GKE for better scaling, lower latency, and easier deployment management (rollouts, canary deployments, monitoring, alerts). Built robust CI/CD pipelines with Cloud Build for automated deployment of microservices and batch services on GKE, Cloud Functions, and Cloud Run.
Inference & Request-Response Store
EdgeDB on GKE, backed by a provisioned PostgreSQL instance and exposed through FastAPI endpoints. The API is used by 10+ DS services for real-time caching and retrieval of inferences at 5 ms latency, and by batch processing jobs with traffic coming from a queue (Pub/Sub) or an orchestrator (Airflow). Integrated CI/CD pipelines for automated deployments.
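The caching behaviour described above can be illustrated with a minimal sketch. It assumes inferences are keyed by a hash of the service name and request payload. The names `request_key`, `InferenceStore`, and `get_or_compute` are hypothetical, and a plain dict stands in for the EdgeDB/PostgreSQL backend, so this is not the production API.

```python
import hashlib
import json
from typing import Any, Callable, Dict


def request_key(service: str, payload: Dict[str, Any]) -> str:
    """Build a deterministic key from the service name and request payload."""
    # Sorting keys makes the hash independent of dict insertion order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{service}:{canonical}".encode("utf-8")).hexdigest()


class InferenceStore:
    """In-memory stand-in (assumption) for the persistent request-response store."""

    def __init__(self) -> None:
        self._rows: Dict[str, Any] = {}

    def get_or_compute(
        self,
        service: str,
        payload: Dict[str, Any],
        predict: Callable[[Dict[str, Any]], Any],
    ) -> Any:
        """Return the stored inference if present, else compute and store it."""
        key = request_key(service, payload)
        if key in self._rows:
            return self._rows[key]
        result = predict(payload)
        self._rows[key] = result
        return result
```

A repeated request with the same payload is served from the store, so the model is called only once per distinct input.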
Feature Store
Feature server on GKE built with FastAPI, using BigQuery as the offline store and Redis as the online store. Runtime features were previously loaded from CSVs and required frequent updates. Automated materialization pipelines and GCS-triggered updates now keep the online and offline stores in sync for 10+ DS services with <10 ms latency. CI/CD enabled for robust rollouts.
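As a rough illustration of the materialization step, the sketch below copies rows from an offline source into a key-value online store and reads them back by entity. It assumes one row per entity with an `entity_id` column. `OnlineStore` and `materialize` are hypothetical names, and a dict stands in for Redis; the offline rows would come from BigQuery in the setup described above.

```python
from typing import Dict, Iterable, List, Optional


class OnlineStore:
    """Dict-backed stand-in (assumption) for a key-value online store such as Redis."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[str, float]] = {}

    def write(self, entity_id: str, features: Dict[str, float]) -> None:
        # Overwrite the entity's row so the latest materialization wins.
        self._data[entity_id] = dict(features)

    def read(self, entity_id: str, names: List[str]) -> Dict[str, Optional[float]]:
        # Missing entities or features come back as None rather than raising.
        row = self._data.get(entity_id, {})
        return {name: row.get(name) for name in names}


def materialize(offline_rows: Iterable[Dict[str, object]], online: OnlineStore) -> int:
    """Copy offline rows into the online store; return the number of rows written."""
    count = 0
    for row in offline_rows:
        entity_id = str(row["entity_id"])
        features = {k: float(v) for k, v in row.items() if k != "entity_id"}
        online.write(entity_id, features)
        count += 1
    return count
```

Returning `None` for unknown entities lets the serving layer decide on defaults instead of failing the request.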
Architecture ↗
Customer Interaction Analytics
Airflow-orchestrated NLP pipeline for sales audio transcription, information extraction, summarization, and Q&A using GPT. Leverages Google Cloud Functions, Cloud Storage, KServe InferenceService, Triton, MongoDB and EdgeDB.
Orchestration Platform
Airflow on GKE: 70+ diverse git-synced DAGs running 1000+ times daily, covering 3800+ task executions in support of the DS and BI teams. Tasks include Cloud Function triggering, Pub/Sub messaging, EdgeDB integrations, Snowflake-to-Google Sheets sync, and GCP service integration. Implemented RBAC to enhance team visibility and automated log cleanup to keep the system streamlined and well maintained.
Data Flow Pipeline
Syncs data for 2500 Sheets, at roughly 15,000 jobs/day, between Snowflake and Google Sheets using Airflow, Pub/Sub, EdgeDB, and GKE.
NBFC DS Migration
Led the migration from AWS to GCP, decomposing a monolith into scalable GKE microservices. Decoupled DB operations and reduced latency from 3 seconds to 503 ms (99.99%), eliminating timeouts entirely (previously 8-9%). Streamlined data flow via Pub/Sub, enabling EdgeDB updates and an hourly Snowflake sync. Set up CI/CD pipelines for seamless deployment.
Architecture ↗
Triton Python SDK
Developed a pip-installable, reusable Python wheel package for seamless integration with Triton Inference Server, supporting high-performance model inference over both HTTP and gRPC in synchronous and asynchronous modes, with built-in logging, error handling, and flexible configuration.
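The shape of such a wrapper might look like the sketch below, which shows one client offering sync and async calls with logging, retries, and a small config object. Everything here is hypothetical (`ClientConfig`, `ModelClient`, `InferenceError`, the `Transport` callable), and the transport is injected rather than calling the real Triton client library, so this is not the package's actual interface.

```python
import asyncio
import logging
from dataclasses import dataclass
from typing import Any, Callable, Dict

logger = logging.getLogger("triton_client_sketch")

# A transport sends (model_name, inputs) to the server and returns outputs.
# In a real package this would wrap an HTTP or gRPC client (assumption).
Transport = Callable[[str, Dict[str, Any]], Dict[str, Any]]


@dataclass
class ClientConfig:
    url: str = "localhost:8000"   # placeholder address
    protocol: str = "http"        # "http" or "grpc"
    retries: int = 2              # extra attempts after the first failure


class InferenceError(RuntimeError):
    """Raised when every attempt to reach the server has failed."""


class ModelClient:
    def __init__(self, config: ClientConfig, transport: Transport) -> None:
        if config.protocol not in ("http", "grpc"):
            raise ValueError(f"unsupported protocol: {config.protocol}")
        self.config = config
        self._transport = transport

    def infer(self, model: str, inputs: Dict[str, Any]) -> Dict[str, Any]:
        """Synchronous inference with logged retries."""
        last_error: Exception = InferenceError("no attempt made")
        for attempt in range(1, self.config.retries + 2):
            try:
                return self._transport(model, inputs)
            except Exception as exc:  # log, then retry until attempts run out
                last_error = exc
                logger.warning("attempt %d for %s failed: %s", attempt, model, exc)
        raise InferenceError(f"inference failed for model {model!r}") from last_error

    async def infer_async(self, model: str, inputs: Dict[str, Any]) -> Dict[str, Any]:
        """Asynchronous inference: runs the blocking call in a worker thread."""
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self.infer, model, inputs)
```

Injecting the transport keeps the retry and logging logic independent of the wire protocol, which is one way a single package could serve both HTTP and gRPC callers.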
Data Platform R&D
Experimenting with deploying Doris, StarRocks, and associated open-source tools on GKE to build a scalable, high-performance data platform: S3-compatible object storage via MinIO/GCS, access control and data retention policies, and Kubernetes cluster monitoring and alerting through Grafana dashboards backed by Prometheus.
Audio Embeddings Search
Experimented with MinIO and Milvus integration to extract audio embeddings from MinIO, enabling efficient search for similar content, with Redis for metadata storage.
Architecture ↗