Open to opportunities

Kapil Raghav

Python Backend Developer · GenAI Systems · Distributed Architecture

Backend engineer with 3.5+ years at Cognizant building high-concurrency FastAPI services, RAG pipelines, and real-time distributed systems. Passionate about clean architecture, semantic search, and systems that actually scale under load.

3+ Years at Cognizant

90% Search Time Saved

$2K LLM Cost Cut / Month

99.5% API Uptime

Career Timeline

Professional Experience

Three roles. One company. Steady growth from intern to backend owner.

current-role.sh

$ kapil --status

oracle-dev.py

$ kapil --history

internship.bin

$ kapil --origin

By The Numbers

Impact Metrics

Actual numbers from actual production systems.

90% Search Time Reduced

AskSOP GenAI assistant cut manual document lookup time for 500+ internal users across departments.

99.5% API Uptime

RESTful FastAPI backend for document ingestion and retrieval workflows maintained rock-solid availability.

20% Fewer LLM Calls

Semantic Redis caching cut redundant LLM inference — saving ~$2,000/month in inference costs.
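
A minimal sketch of how a semantic cache can skip redundant LLM calls, not the production code. Everything here is an assumption for illustration: `toy_embed` stands in for a real embedding model, an in-memory list stands in for Redis, and the names `SemanticCache` and `answer_with_cache` are hypothetical.

```python
import math
from typing import Callable, List, Optional, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def toy_embed(text: str) -> Vector:
    """Hypothetical stand-in for an embedding model: letter counts."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


class SemanticCache:
    """Returns a stored answer when a new query is similar enough to an old one."""

    def __init__(self, embed: Callable[[str], Vector], threshold: float = 0.9):
        self.embed = embed
        self.threshold = threshold
        # In-memory stand-in for Redis: (query embedding, cached answer) pairs.
        self.entries: List[Tuple[Vector, str]] = []

    def get(self, query: str) -> Optional[str]:
        vec = self.embed(query)
        best_score, best_answer = 0.0, None
        for stored_vec, answer in self.entries:
            score = cosine(vec, stored_vec)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None

    def put(self, query: str, answer: str) -> None:
        self.entries.append((self.embed(query), answer))


def answer_with_cache(cache: SemanticCache, query: str,
                      call_llm: Callable[[str], str]) -> str:
    """Call the LLM only when no sufficiently similar query is cached."""
    cached = cache.get(query)
    if cached is not None:
        return cached
    answer = call_llm(query)
    cache.put(query, answer)
    return answer
```

The similarity threshold is the main tuning knob in a design like this: set it too low and users get answers to questions they did not ask, set it too high and the cache rarely hits.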

99% Response Accuracy

Pydantic validation + response guardrails achieved 99% accuracy and eliminated hallucinations completely.

1000+ Req/sec Throughput

Distributed URL Shortener handles 1000+ req/sec with sub-50ms latency, 10K concurrent users in load tests.

80% DB Queries Reduced

Multi-tier Redis + PostgreSQL smart cache invalidation slashed unnecessary database hits by 80%.
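
One common way to get this kind of reduction is a cache-aside read path with invalidation on write. The sketch below assumes that pattern; the source does not spell out what "smart" invalidation means. Plain dictionaries stand in for Redis and PostgreSQL, and `CachedUrlStore` is a hypothetical name.

```python
from typing import Dict, Optional


class CachedUrlStore:
    """Cache-aside lookup: try the cache first, fall back to the database."""

    def __init__(self) -> None:
        self.cache: Dict[str, str] = {}     # stand-in for Redis
        self.database: Dict[str, str] = {}  # stand-in for PostgreSQL
        self.db_reads = 0                   # counts how often the database is hit

    def get(self, code: str) -> Optional[str]:
        url = self.cache.get(code)
        if url is not None:
            return url                      # cache hit: no database query
        self.db_reads += 1
        url = self.database.get(code)
        if url is not None:
            self.cache[code] = url          # populate the cache for next time
        return url

    def update(self, code: str, url: str) -> None:
        # Write to the source of truth, then invalidate the cached copy so
        # the next read repopulates it with the fresh value.
        self.database[code] = url
        self.cache.pop(code, None)
```

Invalidating instead of overwriting the cached value keeps the write path simple and avoids caching entries that are never read again.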

What I Build

Featured Projects

⚡

Featured · In Progress

StreamSync

YouTube-Scale Video Processing Pipeline · FastAPI · Celery · Redis · PostgreSQL · Docker

A production-grade async video processing pipeline engineered for YouTube-scale workloads. Videos are uploaded through a non-blocking FastAPI endpoint, dispatched to Celery workers via a Redis broker, and transcoded in parallel at multiple resolutions. Thumbnails are generated along the way, and every job is tracked in real time, all without blocking the API layer.

🎬

Async Upload Queue

Non-blocking video intake via FastAPI + instant Celery task dispatch with Redis as the message broker.

⚙️

Parallel Transcoding

Multi-resolution processing (360p → 4K) across distributed worker pools with retry logic and failure isolation.
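
A rough sketch of per-resolution retries with failure isolation. It uses a thread pool as a stand-in for the Celery worker pools, and the `transcode` callable is a hypothetical placeholder for the real FFmpeg step. The intermediate resolutions in `RESOLUTIONS` are assumed; the project only states the 360p to 4K range.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Assumed ladder: only the 360p and 4K endpoints come from the project description.
RESOLUTIONS = ["360p", "720p", "1080p", "4K"]


def transcode_with_retry(transcode: Callable[[str, str], None],
                         video_id: str, resolution: str,
                         max_attempts: int = 3) -> str:
    """Try one resolution up to max_attempts times; never raise to the caller."""
    last_error = ""
    for _ in range(max_attempts):
        try:
            transcode(video_id, resolution)
            return "transcoded"
        except Exception as exc:  # isolate the failure to this resolution
            last_error = str(exc)
    return "failed: " + last_error


def transcode_all(transcode: Callable[[str, str], None], video_id: str,
                  resolutions: List[str], max_workers: int = 4) -> Dict[str, str]:
    """Run every resolution in parallel and report a status for each."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            res: pool.submit(transcode_with_retry, transcode, video_id, res)
            for res in resolutions
        }
        return {res: fut.result() for res, fut in futures.items()}
```

Because each resolution reports its own outcome, a 4K job that keeps failing does not stop the 360p rendition from being published.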

📊

Real-Time Status Tracking

Live pipeline states — queued, processing, transcoded, failed — stored in PostgreSQL, queryable via REST.
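
The four states can be modelled as a small state machine. This is an illustrative sketch: the state names come from the description above, but the allowed transitions (including retrying a failed job) are assumptions, and a dictionary stands in for the PostgreSQL table.

```python
from enum import Enum
from typing import Dict, Set


class Status(str, Enum):
    QUEUED = "queued"
    PROCESSING = "processing"
    TRANSCODED = "transcoded"
    FAILED = "failed"


# Assumed transition rules; the real pipeline may allow others.
ALLOWED: Dict[Status, Set[Status]] = {
    Status.QUEUED: {Status.PROCESSING, Status.FAILED},
    Status.PROCESSING: {Status.TRANSCODED, Status.FAILED},
    Status.TRANSCODED: set(),            # terminal state
    Status.FAILED: {Status.QUEUED},      # assumption: failed jobs can be re-queued
}


class InvalidTransition(Exception):
    """Raised when a job is moved to a state it cannot reach from its current one."""


class JobTracker:
    """In-memory stand-in for the status table."""

    def __init__(self) -> None:
        self.jobs: Dict[str, Status] = {}

    def create(self, job_id: str) -> None:
        self.jobs[job_id] = Status.QUEUED

    def advance(self, job_id: str, new_status: Status) -> None:
        current = self.jobs[job_id]
        if new_status not in ALLOWED[current]:
            raise InvalidTransition(f"{current.value} -> {new_status.value}")
        self.jobs[job_id] = new_status

    def status(self, job_id: str) -> str:
        return self.jobs[job_id].value
```

Rejecting illegal transitions at write time means a late or duplicated worker message cannot move a finished job back to "processing".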

🐳

Containerized Stack

Full Docker Compose orchestration with health checks, graceful shutdown, and production-ready service config.

FastAPI · Celery · Redis · PostgreSQL · Docker · FFmpeg · System Design · Async

Production · Cognizant

🧠 AskSOP — Enterprise RAG Assistant

End-to-end GenAI document assistant for 500+ internal pharma users. Semantic search over 100k+ SOPs with compliance guardrails, multi-tenant Azure AD auth, and semantic Redis caching.

FastAPI + AWS Bedrock (Claude 3 Haiku) two-stage retrieval pipeline
Multi-threaded Python ETL: delta-sync SharePoint → AWS OpenSearch
JWT + RBAC + Azure AD across 10+ client sites (multi-tenant)
Pydantic validation schemas — 99% accuracy, zero hallucinations
Redis semantic caching — 20% fewer LLM calls, $2K saved monthly
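
The delta-sync step in the list above can be sketched roughly as follows. This is a simplified illustration, not the actual ETL: dictionaries of document ID to version marker stand in for SharePoint and OpenSearch, and `index_document` is a hypothetical callable that would fetch, chunk, and index one document.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def find_changed(source: Dict[str, str], indexed: Dict[str, str]) -> List[str]:
    """IDs that are new in the source or whose version marker differs."""
    return sorted(doc_id for doc_id, version in source.items()
                  if indexed.get(doc_id) != version)


def find_deleted(source: Dict[str, str], indexed: Dict[str, str]) -> List[str]:
    """IDs still in the index but no longer present in the source."""
    return sorted(doc_id for doc_id in indexed if doc_id not in source)


def delta_sync(source: Dict[str, str], indexed: Dict[str, str],
               index_document: Callable[[str], None],
               max_workers: int = 8) -> List[str]:
    """Re-index only changed documents, in parallel, and drop deleted ones."""
    changed = find_changed(source, indexed)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() forces completion and re-raises any worker exception here,
        # before the index state below is updated.
        list(pool.map(index_document, changed))
    for doc_id in changed:
        indexed[doc_id] = source[doc_id]
    for doc_id in find_deleted(source, indexed):
        del indexed[doc_id]
    return changed
```

Threads fit this workload because the work per document is mostly network I/O, so the interpreter lock is not the bottleneck.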

FastAPI · AWS Bedrock · OpenSearch · LangChain · Redis · Azure AD · DynamoDB

Open Source

🔗 Distributed URL Shortener Service

High-performance URL shortening service built for scale — 1000+ req/sec, sub-50ms response, 99.9% uptime under 10K concurrent users in load tests.

Base62 encoding with multi-tier Redis + PostgreSQL smart cache invalidation
Token bucket rate-limiting middleware to prevent API abuse
Docker Compose stack with health checks and graceful shutdown
80% reduction in unnecessary DB queries via caching strategy
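
Two of the building blocks listed above, Base62 encoding and token bucket rate limiting, are small enough to sketch in plain Python. This is an illustrative version under stated assumptions, not the project's code: the alphabet order, the class name `TokenBucket`, and the injectable `clock` parameter are all choices made for the example.

```python
import string
import time
from typing import Callable

# Assumed alphabet order: digits, lowercase, uppercase (62 characters).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62_encode(number: int) -> str:
    """Turn a non-negative integer (e.g. a database row ID) into a short code."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def base62_decode(code: str) -> int:
    """Inverse of base62_encode."""
    number = 0
    for ch in code:
        number = number * 62 + ALPHABET.index(ch)
    return number


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_per_second`."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self) -> bool:
        """Consume one token if available; return False when the caller is over the limit."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

In a middleware, one bucket per client key (IP address or API token) would be looked up on each request, and a `False` from `allow()` would map to an HTTP 429 response.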

FastAPI · Redis · PostgreSQL · Docker

Toolbox

Tech Stack

Languages

Backend

GenAI & ML

Databases & Caching

Cloud & DevOps

Core Competencies

Always Sharpening

DSA & LeetCode

Algorithms are My Hobby

Actively grinding LeetCode with a focus on patterns that appear in Google, Meta, and top-tier backend interviews. Not just solving problems — understanding the why behind every approach so it maps back to real backend decisions: faster APIs, smarter caches, leaner queries.

Arrays / Sliding Window · Trees & Graphs · Dynamic Programming · Heaps / Priority Queue · Binary Search · Two Pointers · Backtracking · Concurrency Patterns

Practice Distribution

Easy — Pattern warmup ✅ Solid

Medium — Primary grind (Google patterns) 🔥 Active

Hard — Graph / DP deep dives 💪 Building

Core belief: Backend engineers who understand data structures write faster APIs, smarter caches, and more elegant queries. DSA isn't just interview prep — it's the intuition behind good engineering.

Beyond The Code

The Convincer

🎧

I Don't Just Build Systems. I Sell Them.

True Story

College days. Tight budget. I had just discovered boAt earphones: incredible bass, plus a 1-year warranty with home pickup for claims. I was genuinely in love with the product. So naturally, I did what any backend engineer would do: I ran a conversion campaign. I'd invite friends and family to "just listen to this one song", picked specifically for its heavy bass drop. The conversion rate? Suspiciously high. Same colour. Multiple invoices. I also made sure everyone knew Amazon Prime would deliver by the next day, and told them to keep their invoices handy.

One year later, when my earphones showed issues, I claimed a fresh warranty year using a friend's purchase invoice. Fallback planned. Edge case handled. Invoice-as-backup-strategy deployed.

I don't just build robust systems — I advocate for them, help people understand why they're worth investing in, and always think about the failure modes before they hit production. Whether it's a distributed cache, a warranty claim, or convincing a team to adopt async architecture — same energy. Same precision.