AI Engineer · production AI systems · Chennai, India

Rishabh Kumar

I build AI systems that run in production — and don't fail silently.

Production RAG, autonomous email and voice agents, multi-agent orchestration. From backend inference to the deploy gate. The kind of latency-sensitive AI that has to be right when a real customer is on the other end.

Work with me → · Read the writing

focus: production RAG · voice agents · multi-agent systems · evals & harnesses

Systems I've built & run · 6 shipped

Milo

in production

Connectivity CX · AI Engineer

Autonomous email assistant for automotive dealerships. Reads inbound leads and replies to customers with business-hours logic, follow-up scheduling, and a full audit trail. Multi-tenant data handling with strict per-tenant isolation and sub-500ms hybrid retrieval.

P50 < 500ms

Python · FastAPI · GPT-4o · PostgreSQL

Hotelzify Voice AI

shipped

Hotelzify · AI Engineer

Real-time voice support: several specialized agents under a supervisor, with streaming speech-to-text and text-to-speech over phone lines. Indexed 1M+ records at 98% retrieval accuracy.

98% retrieval · 300ms P99

Python · LangGraph · Realtime Voice · NLP

Remalt ↗

live

Solo · visual AI workflow builder

A drag-and-drop canvas with 13 node types for composing multi-step AI pipelines without code. Multi-LLM routing, row-level-secure auth, usage-based billing, and a browser capture extension.

200+ active users

Next.js · Multi-LLM · Supabase

CogniDB ↗

open source

Open source · NL → SQL

Open-source library for querying databases in plain English across MySQL, Postgres, and MongoDB through one interface. Semantic caching cuts inference cost ~40%; a validation pass guards against SQL injection.

200+ stars · −40% cost

Python · NL → SQL · NLP

Marki ↗

shipped

Marki (LA) · Full-Stack ML

Recommendation engine with transformer embeddings — +28% engagement, +15% average order value. Real-time inference API serving 1M+ predictions a day at 45ms P50, with an automated training-to-deploy pipeline on multiple GPUs.

1M+ preds/day @ 45ms

Python · FastAPI · Recsys · MLOps

OneTicket ↗

open source

Smart India Hackathon

Multilingual event-booking assistant on GPT-4o with live database tools (search, book, cancel), web search, and SMS. Placed top 5 of 800+ projects at the Smart India Hackathon.

top 5 / 800+ projects

GPT-4o · LangGraph · TypeScript

Writing: how production AI fails · 2 pieces

Jun 2026

A wrong character in a URL killed our LLM tracing for 4 weeks

The health check was green the whole time. A story about why existence checks aren't arrival checks.

5 min

Open source

Ship-Safe Harness — wrap test, eval, and deploy gates around any AI codebase

A free checklist + generator skills that build the missing gates around your codebase. Distilled from the harness I run in production.

repo ↗

Shipped an AI feature from a working demo?

I help teams build the eval and harness layer that catches regressions before users do — and I take on a few hands-on builds a quarter. If your evals pass but you're not sure what fails silently, let's talk.

rishabh.vaaiv@gmail.com → · DM @rispectrum