Open to opportunities · Greater Seattle

YANPENG QI

Software with a thinking layer.

Software engineer building production AI systems — RAG pipelines, multi-agent orchestration, LLM integrations at scale.

View work · Talk to my AI →

Years building software

AI projects shipped

100M+

Requests served

Engineer, end-to-end

About

Engineering for the AI era.

Featured

Admitly. Your AI study-abroad consultant.

Hybrid search · pgvector · Claude · LLM-as-judge. End-to-end.

Read case study →

Specialty

Production AI Core.

Retrieval, orchestration, reliability. The boring parts that make the magic real.

How I think about it →

Background

Full-stack with depth.

Java + Spring on the JVM. TypeScript + Next.js on the edge. Python where the models live.

See CV →

AI Chat

Try it.

Ask my AI anything about my work.

Open chat →

Writing

Notes.

Open source

Repos.

A public trail of small, sharp tools and experiments.

View on GitHub →

Featured Projects

Things I've built with AI at the core.

2026

AI Reliability Copilot

Incident Response Copilot for SRE Triage

Next.js · TypeScript · AI SDK · Prompt Evals · Incident Response

Case study · Live demo

Live · 2026

Book Traveler / Shuzhongren

AI-Driven Interactive Wuxia Fiction

Next.js · TypeScript · Claude · DeepSeek · Gemini

Case study · Live demo

Live · 2025

Admitly

AI-Powered Graduate Admission Copilot

Next.js · Claude · RAG · pgvector · Multi-Agent

Case study · Live demo

2024

AI Financial Intelligence Agent

Multi-Tool Reasoning Agent for Investment Signal Generation

Python · Claude · Function Calling · Embeddings · ChromaDB

Case study

Try it now

Ask my AI anything about my work.

Grounded in my real resume and projects. It answers only from that data.

chat.tsx · Powered by Gemini 2.5 Flash + RAG

you

How does the RAG pipeline in Admitly work?

Hybrid search: BM25 for keyword precision, pgvector for dense-embedding similarity, and Reciprocal Rank Fusion (RRF) to merge the two rankings. It reaches 92% factual precision on the eval set.

Open chat →
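
For readers curious about the fusion step named in the sample answer above, here is a minimal sketch of Reciprocal Rank Fusion. It is an illustration under stated assumptions, not Admitly's actual code: the function name, the document IDs, and the constant k = 60 (a commonly used default) are all hypothetical.

```typescript
// Reciprocal Rank Fusion (RRF): each document scores the sum of
// 1 / (k + rank) over every ranking it appears in, then results are
// sorted by that combined score. All names here are hypothetical.
type DocId = string;

function reciprocalRankFusion(rankings: DocId[][], k = 60): DocId[] {
  const scores = new Map<DocId, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // index is 0-based, so the 1-based rank is index + 1
      const contribution = 1 / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }
  // Highest fused score first
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Illustrative inputs: one keyword (BM25-style) ranking, one vector ranking.
const keywordHits: DocId[] = ["doc-a", "doc-b", "doc-c"];
const vectorHits: DocId[] = ["doc-b", "doc-d", "doc-a"];

const fused = reciprocalRankFusion([keywordHits, vectorHits]);
console.log(fused);
```

Documents that rank well in both lists ("doc-b", "doc-a") rise above documents that appear in only one, which is the property that makes rank-based fusion convenient when keyword and vector scores are not directly comparable.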

Reach

Built on every surface.

Server, edge, browser, mobile — same care, same standard.

api.admitly.com

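// app/api/chat/route.ts
// retrieve, claude, and stream are app-level helpers defined elsewhere.
export async function POST(req: Request) {
  // Read the user's question from the JSON request body
  const { query } = await req.json();
  const ctx = await retrieve(query);
  const ans = await claude.complete(ctx);
  return stream(ans);
}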

Eval run · 03:24

92.4%

factual precision · n=240

Throughput

−54%

LLM cost vs. baseline