CARS24

Machine Learning Engineer

September 2021 – April 2024 · Gurgaon, India

Productionized DS, Data Engineering, and BI services/pipelines on GCP (GKE, KServe, Triton Inference Server, PostgreSQL, Airflow and Vertex AI), implementing platform best practices, monitoring, and cost optimization.

★ Rookie Award Q3 2022

ML Platform

Serving

Migrated 30+ DS workloads to GKE for better scaling, lower latency, and easier deployment management (rollouts, canary deployments, monitoring, alerts). Built robust CI/CD pipelines with Cloud Build for automated deployment of microservices and batch services on GKE, Cloud Functions, and Cloud Run.

Inference & Request-Response Store

EdgeDB on GKE, backed by a provisioned PostgreSQL instance and exposed through FastAPI endpoints. The API is used by 10+ DS services for real-time caching and retrieval of inferences at 5ms latency, and by batch processing jobs fed from a queue (Pub/Sub) or an orchestrator (Airflow). Integrated CI/CD pipelines for automated deployments.
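
The caching pattern behind such a store can be sketched as follows. This is a minimal in-memory illustration, not the production implementation: the `InferenceStore` class, the `request_key` helper, and the dict standing in for the database are all hypothetical, and it assumes a hash of the canonicalized request payload is used as the lookup key.

```python
import hashlib
import json
from typing import Any, Callable, Dict


def request_key(service: str, payload: Dict[str, Any]) -> str:
    """Build a deterministic key from the service name and request payload."""
    # Sorting keys makes the hash independent of field order in the request.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{service}:{canonical}".encode("utf-8")).hexdigest()


class InferenceStore:
    """Hypothetical request-response store; a dict stands in for the database."""

    def __init__(self) -> None:
        self._rows: Dict[str, Any] = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(
        self,
        service: str,
        payload: Dict[str, Any],
        predict: Callable[[Dict[str, Any]], Any],
    ) -> Any:
        """Return the stored inference for this request, computing it on a miss."""
        key = request_key(service, payload)
        if key in self._rows:
            self.hits += 1
            return self._rows[key]
        self.misses += 1
        result = predict(payload)
        self._rows[key] = result
        return result
```

Keying on the canonical payload means two requests with the same fields in a different order resolve to the same stored inference.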

Feature Store

Feature server on GKE using FastAPI, with BigQuery as the offline store and Redis as the online store. Runtime features were previously loaded from CSVs and required frequent updates. Automated materialization pipelines and GCS-triggered updates now keep the online and offline stores in sync for 10+ DS services at <10ms latency. CI/CD enabled for robust rollouts.
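
Materialization, in the sense used here, copies the latest feature values per entity from the offline store into the online store. The sketch below illustrates the idea with plain Python containers; the row layout, the function names, and the dict standing in for Redis are assumptions made for illustration, not the actual pipeline.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple

# Offline row: (entity_id, event_timestamp, features).
# A list of these stands in for an offline table.
OfflineRow = Tuple[str, int, Dict[str, Any]]


def materialize(
    offline_rows: Iterable[OfflineRow],
    online: Dict[str, Dict[str, Any]],
) -> int:
    """Write the most recent feature values per entity into the online store."""
    latest: Dict[str, OfflineRow] = {}
    for row in offline_rows:
        entity_id, ts, _ = row
        # Keep only the newest row seen so far for each entity.
        if entity_id not in latest or ts > latest[entity_id][1]:
            latest[entity_id] = row
    for entity_id, (_, _, features) in latest.items():
        online[entity_id] = dict(features)
    return len(latest)


def get_online_features(
    online: Dict[str, Dict[str, Any]],
    entity_id: str,
    names: List[str],
) -> Dict[str, Optional[Any]]:
    """Look up the requested features; unknown entities or features yield None."""
    row = online.get(entity_id, {})
    return {name: row.get(name) for name in names}
```

Serving then becomes a single key lookup in the online store, which is what keeps read latency low.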

Architecture ↗

Customer Interaction Analytics

Airflow-orchestrated NLP pipeline for sales audio transcription, information extraction, summarization, and Q&A using GPT. Leverages Google Cloud Functions, Cloud Storage, KServe InferenceService, Triton, MongoDB and EdgeDB.

Orchestration Platform

Airflow on GKE: 70+ git-synced DAGs running 1000+ times daily, covering 3800+ task executions in support of the DS and BI teams. Tasks include triggering Cloud Functions, Pub/Sub messaging, EdgeDB integrations, Snowflake-to-Google Sheets sync, and other GCP service integrations. Implemented RBAC to enhance team visibility and established automated log cleanup to keep the system well maintained.
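
Automated log cleanup of this kind usually comes down to deleting log files older than a retention window. A minimal sketch, assuming logs are `*.log` files under one directory tree and that retention is decided by file modification time (the function name, the pattern, and the 30-day value in the usage note are illustrative assumptions):

```python
import time
from pathlib import Path
from typing import List, Optional, Union


def cleanup_old_logs(
    log_dir: Union[str, Path],
    max_age_days: float,
    now: Optional[float] = None,
) -> List[Path]:
    """Delete *.log files older than max_age_days and return the removed paths."""
    current = time.time() if now is None else now
    cutoff = current - max_age_days * 86400  # seconds per day
    removed: List[Path] = []
    # Collect matches first so the tree is not modified while it is being walked.
    for path in list(Path(log_dir).rglob("*.log")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

A scheduled task could call `cleanup_old_logs("/path/to/logs", 30)` once a day; the path here is a placeholder.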

Data Flow Pipeline

Syncs data for 2,500 Sheets between Snowflake and Google Sheets at roughly 15,000 jobs/day, using Airflow, Pub/Sub, EdgeDB, and GKE.

NBFC DS Migration

Led migration from AWS to GCP, decomposing a monolith into scalable GKE microservices. Decoupled DB operations, reducing latency from 3 seconds to 503ms (99.99%) and eliminating timeouts entirely (previously 8-9%). Streamlined data flow via Pub/Sub, enabling EdgeDB updates and hourly Snowflake sync. Set up CI/CD pipelines for seamless deployment.
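
Decoupling DB operations from the request path generally means the handler publishes a record and returns, while a separate consumer performs the write. The sketch below shows that shape with a standard-library queue and a worker thread standing in for Pub/Sub and its subscriber; `AsyncWriter`, `handle_request`, and the toy scoring rule are hypothetical.

```python
import queue
import threading
from typing import Any, Dict, List


class AsyncWriter:
    """Buffers records on a queue and writes them from a background thread."""

    def __init__(self, sink: List[Dict[str, Any]]) -> None:
        self._queue: "queue.Queue[Dict[str, Any]]" = queue.Queue()
        self._sink = sink  # stand-in for the database
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def publish(self, record: Dict[str, Any]) -> None:
        """Enqueue a record without waiting for the write to happen."""
        self._queue.put(record)

    def _drain(self) -> None:
        while True:
            record = self._queue.get()
            try:
                self._sink.append(record)
            finally:
                self._queue.task_done()

    def flush(self) -> None:
        """Block until every published record has been written."""
        self._queue.join()


def handle_request(payload: Dict[str, Any], writer: AsyncWriter) -> Dict[str, Any]:
    """Compute the response and hand persistence off to the writer."""
    result = {"score": payload["income"] / 1000}  # toy model, for illustration only
    writer.publish({"request": payload, "response": result})
    return result
```

The caller gets its response as soon as the model returns, so a slow write no longer counts against request latency.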

Architecture ↗

Triton Python SDK

Developed a reusable, pip-installable Python wheel for integrating with Triton Inference Server. Supports HTTP and gRPC requests in both synchronous and asynchronous modes, with built-in logging, error handling, and flexible configuration.
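
For context on what such a client wraps: Triton's HTTP endpoint follows the KServe v2 inference protocol, where a request is a JSON body posted to `/v2/models/<name>/infer`. The sketch below only builds the URL and body with the standard library; it is not the SDK described above, and the helper names, the FP32 datatype choice, and the 2-D input assumption are illustrative.

```python
from typing import Any, Dict, List, Optional, Sequence


def infer_url(base_url: str, model_name: str, model_version: Optional[str] = None) -> str:
    """Build the v2 inference URL, optionally pinned to a model version."""
    path = f"/v2/models/{model_name}"
    if model_version:
        path += f"/versions/{model_version}"
    return base_url.rstrip("/") + path + "/infer"


def build_infer_request(
    input_name: str,
    data: Sequence[Sequence[float]],
    output_names: Sequence[str],
) -> Dict[str, Any]:
    """Build a v2 request body for a single 2-D FP32 input tensor."""
    rows = len(data)
    cols = len(data[0]) if rows else 0
    if any(len(row) != cols for row in data):
        raise ValueError("all rows must have the same length")
    # Tensor data is sent flattened in row-major order alongside its shape.
    flat: List[float] = [float(v) for row in data for v in row]
    return {
        "inputs": [
            {"name": input_name, "shape": [rows, cols], "datatype": "FP32", "data": flat}
        ],
        "outputs": [{"name": name} for name in output_names],
    }
```

A wrapper package would typically add transport (HTTP or gRPC), retries, and logging around these two steps.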

Data Platform R&D

Experimenting with deploying Doris, StarRocks, and associated open-source tools on GKE to build a scalable, high-performance data platform. Work includes S3-compatible object storage via MinIO/GCS, access control and data retention policies, and Kubernetes cluster monitoring and alerting through Grafana dashboards backed by Prometheus.

Audio Embeddings Search

Experimented with MinIO and Milvus integration to extract audio embeddings from MinIO, enabling efficient search for similar content, with Redis for metadata storage.
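
The core operation in an embedding search is ranking stored vectors by similarity to a query vector. A brute-force sketch using cosine similarity is shown below; a vector database replaces this linear scan with an approximate index, and the function names and the dict-based index here are illustrative assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k stored items most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]
```

Each key in the index would correspond to an audio object, with any extra metadata kept in a separate store.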

Architecture ↗
Published Blogs
Technologies
GCP · GKE · KServe · Triton Inference Server · PostgreSQL · Airflow · Vertex AI · EdgeDB · FastAPI · Feast · Redis · BigQuery · Cloud Build · Pub/Sub · Snowflake · MongoDB · Doris · StarRocks · MinIO · Milvus · Grafana · Prometheus
View GCP Architecture Overview ↗