03.2024 - 08.2024 | Candy.AI | Remote

AI/ML Lead (Contract)

Single-handedly scaled AI capabilities from nascent to millions of users. Architected a hyperscale Kubernetes infrastructure on GCP. Designed a state-of-the-art Inference Platform with up to 400% efficiency improvement. Pioneered "SuperBooga," a broker-based solution for high-throughput generative inference.

5 direct reports | AI | Infrastructure

Key wins

Scaled from thousands to millions of users

Built from scratch: pure IaC

Cost-efficient GKE GPU fleets

Technologies

GCP | GKE | Terraform | Helm | GitOps | K6 | PyTorch | Stable Diffusion | LLM

Responsibilities & achievements

01 Single-handedly revolutionized a startup's AI capabilities, scaling from a nascent product to millions of users:
02 Architected and implemented a robust, hyperscale Kubernetes infrastructure on GCP, evolving from basic, toil-heavy instances to a fully orchestrated, cloud-native environment supporting multi-GPU and diverse generative workloads
03 Leveraged Terraform, Helm, and GitOps to create a comprehensive Infrastructure as Code (IaC) solution
04 Achieved seamless scaling from thousands to millions of users while maintaining 99.999% uptime
05 Designed and implemented a state-of-the-art Inference Platform, dramatically reducing operational costs and improving model-serving efficiency by up to 400%
06 Pioneered "SuperBooga," a low-frequency, low-latency, high-throughput broker-based solution that enables high-pressure generative inference without system collapse for both LLM and Stable Diffusion (SD) workloads
07 Containerized AI workflows and inference, ensuring consistent deployment across dev and prod environments
08 Spearheaded a culture of rigorous load testing (using K6) and observability
09 Developed custom stress-testing solutions for LLM and Stable Diffusion workloads
10 Transformed "star-trek-fantasy" JIRA tickets into tangible pytorch.bin models
11 Reduced costs by 200%+ while meeting a 99.999% SLA