Open to opportunities · Greater Seattle
YANPENG QI

Software with a thinking layer.

Software engineer building production AI systems — RAG pipelines, multi-agent orchestration, LLM integrations at scale.

View work · Talk to my AI →
7+ years building software
12 AI projects shipped
100M+ requests served
1 engineer, end-to-end
About

Engineering for the AI era.

Featured

Admitly. Your AI study-abroad consultant.

Hybrid search · pgvector · Claude · LLM-as-judge. End-to-end.

Read case study →
Specialty

Production AI Core.

Retrieval, orchestration, reliability. The boring parts that make the magic real.

How I think about it →
Background

Full-stack with depth.

Java + Spring on the JVM. TypeScript + Next.js on the edge. Python where the models live.

See CV →
Featured Projects

Things I've built with AI at the core.

2026

AI Reliability Copilot

Incident Response Copilot for SRE Triage

Next.js · TypeScript · AI SDK · Prompt Evals · Incident Response
Live
2026

Book Traveler / Shuzhongren

AI-Driven Interactive Wuxia Fiction

Next.js · TypeScript · Claude · DeepSeek · Gemini
Live
2025

Admitly

AI-Powered Graduate Admission Copilot

Next.js · Claude · RAG · pgvector · Multi-Agent
2024

AI Financial Intelligence Agent

Multi-Tool Reasoning Agent for Investment Signal Generation

Python · Claude · Function Calling · Embeddings · ChromaDB
Try it now

Ask my AI anything about my work.

Grounded in my real resume and projects. It answers only from that data.

chat.tsx · Powered by Gemini 2.5 Flash + RAG
you
How does the RAG pipeline in Admitly work?
ai
Hybrid search: BM25 for keyword precision, pgvector for dense-embedding similarity, and Reciprocal Rank Fusion (RRF) to merge the two rankings. It achieves 92% factual precision on the eval set.
Open chat
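The RRF step mentioned in the answer above can be illustrated with a minimal sketch. It assumes each retriever (BM25 and pgvector) returns document IDs in ranked order. The function name `reciprocalRankFusion`, the sample IDs, and the constant `k = 60` (a commonly used default) are illustrative assumptions, not taken from Admitly's code.

```typescript
// Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank of d).
// Hypothetical helper for illustration only; not Admitly's actual implementation.
type DocId = string;

function reciprocalRankFusion(rankings: DocId[][], k: number = 60): DocId[] {
  const scores = new Map<DocId, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Made-up result lists standing in for a keyword ranking and a vector ranking.
const bm25Hits: DocId[] = ["doc-a", "doc-b", "doc-c"];
const vectorHits: DocId[] = ["doc-b", "doc-d", "doc-a"];
const fused = reciprocalRankFusion([bm25Hits, vectorHits]);
console.log(fused); // doc-b, doc-a, doc-d, doc-c
```

Documents that appear in both lists (here `doc-a` and `doc-b`) rise to the top because their reciprocal-rank contributions add up. RRF uses only rank positions, so BM25 scores and vector distances never need to be normalized against each other.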
Reach

Built on every surface.

Server, edge, browser, mobile — same care, same standard.

api.admitly.com
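// app/api/chat/route.ts
export async function POST(req: Request) {
  const { query } = await req.json();     // user question from the request body
  const ctx = await retrieve(query);      // fetch grounding context for the question
  const ans = await claude.complete(ctx); // generate an answer from that context
  return stream(ans);                     // stream the answer back to the client
}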
Eval run · 03:24
92.4% factual precision · n=240
Throughput
−54% LLM cost vs. baseline