Qichang Zheng - Profile

About me

I'm an experienced AI Scientist at Devz AI, US, working to develop AI applications. I enjoy researching cutting-edge LLMs & techniques and exploring their practical applications.

My job is to research AI models and techniques, then build them into applications. I generate datasets to fine-tune models on our business tasks, create knowledge bases for Retrieval-Augmented Generation (RAG), develop tools for models to use, apply Chain-of-Thought (CoT) and few-shot prompting to improve model performance, integrate applications with our front-end and back-end pipelines, and ultimately equip our products with advanced AI capabilities.

What I'm doing

AI Applications Research & Development

Design modern, high-quality AI applications (RAG, agents, skills, tool calling, etc.).

LLM Deployment & Fine-tuning

Improve model performance on specific tasks (LoRA, rsLoRA, Ollama, etc.).

Cloud Computing

Deploy services on elastic cloud servers and publish APIs for cross-functional teams.

Software Engineering

Develop advanced services and integrate them into corporate products.

Resume

Education

University of Chicago
2022 — 2024

Program: Master's in Computer Science

GPA: 3.9

Core courses: Machine Learning, Natural Language Processing, Software Engineering, Data Science, Databases, etc.

Honor: Phoenix Scholarship ($64,340)

Xi'an Jiaotong-Liverpool University
2018 — 2022

Program: BSc in Economics and Finance

GPA: 3.93 (Top 2%)

Core courses: Quantitative Finance, Econometrics, Calculus, Microeconomics, Macroeconomics, Financial Management, Corporate Finance, etc.

Honors: Excellence Academy Award (2021, ¥10,000), Excellent Student Scholarship (2020, ¥10,000)

Experience

Devz AI, AI Scientist
08/2024 — Present

•    Autonomous Agent Architecture & Orchestration: Architected an enterprise-grade agentic system based on the Tool-Calling and ReAct paradigms. Orchestrated the end-to-end automation lifecycle, from intent recognition and task planning to execution and validation, while ensuring data privacy and high concurrency. The system contributed to $6M in annual orders.

•    Devi Multi-Agent Collaborative Framework: Engineered a project management agent with an MCP toolset to automate project creation (parsing PRDs into deliverables), tracking (analyzing status), and management (dynamic reprioritization). Built a talent matching system to auto-assign tasks based on expertise, automating the full "Create-Schedule-Manage" lifecycle.

•    Self-Healing Workflow: Designed a closed-loop CoT agent for operations. Upon incident triggers, the agent retrieves historical cases, generates fixes, and iteratively self-refines solutions based on error logs while generating verification scripts. Improved solution accuracy to 96% and reduced response time to under 2 minutes.

•    General-Purpose Web Agent: Solved dynamic environment challenges by developing a Selenium-based Web Agent. Combined LLM intent recognition with HTML DOM tree parsing and recursive interaction to manipulate unstructured pages. Introduced Vision-Language Model (VLM) verification, achieving a 63% end-to-end resolution rate.

•    Action Space Caching System: Constructed a "Script Alignment + Variable Filling + Multi-path Caching" mechanism. Enabled the generalization of single scripts to multiple incidents and achieved a 20x execution speedup by reusing historical paths.

•    LLM Post-Training: Led the Supervised Fine-Tuning (SFT) of vertical Ops LLMs. Utilized Unsloth to accelerate distributed training, achieving a 55% relative improvement in domain-specific technical accuracy compared to GPT-4.

•    Inference Acceleration & Industrialization: Deployed distributed inference services via vLLM. Achieved a generation speed of 200+ tokens/s with 100+ concurrent requests, reducing agent interaction costs by 80% to support high-frequency tool calling.

•    Advanced RAG System: Built a multi-path retrieval system mapping hierarchical data to multiple vector stores. Implemented dimensional expansion, AI summarization, reranking, and non-linear scoring, boosting the Top-3 hit rate from 65% to 88%.

Prudential Financial, Machine Learning Engineer
09/2023 — 03/2024

• High-Performance RAG Stock Predictor: Deployed an agent-based stock prediction system in Docker. Improved processing speed by 65% and reduced costs by 75% compared to GPT-4, achieving 62% prediction accuracy.

• OneAPI Aggregation Platform: Developed a unified API gateway with cost/account management, integrating OpenAI, Claude, and Gemini. Enabled universal model switching via a single Base URL, reducing manual configuration time by 85%.

Shannon Investment, NLP Engineer
07/2023 — 02/2024

• Enterprise Sentiment Analysis System: Deployed and benchmarked 20+ LLMs on sentiment tasks. Designed a sentiment analysis pipeline that integrates Kafka for real-time processing (News-Analysis-Factor Generation), reducing latency by 80%.

• Fine-tuning & Prompt Engineering: Led LLM fine-tuning using P-Tuning v2 and (Q)LoRA. Established a prompt management system, improving F1 score by 20% in sentiment tasks and achieving a backtested Sharpe Ratio of 3.