About me

I'm an experienced AI Scientist at Devz AI, US, working to develop AI applications. I enjoy researching cutting-edge LLMs & techniques and exploring their practical applications.

My job is to research AI models and techniques, then build them into applications. I generate datasets to fine-tune models on our business tasks, create knowledge bases for Retrieval-Augmented Generation (RAG), develop tools for models to use, apply Chain-of-Thought (CoT) and few-shot prompting to improve model performance, integrate applications with our front-end and back-end pipelines, and ultimately empower our products with advanced AI capabilities.

What I'm doing

  • AI Applications Research & Development

    Design modern, high-quality AI applications (RAG, agents, skills, tool calling, etc.).

  • LLM Deployment & Fine-tuning

    Improve model performance on specific tasks (LoRA, rsLoRA, Ollama, etc.).

  • Cloud Computing

    Deploy services on elastic cloud servers and publish APIs for cross-functional teams.

  • Software Engineering

    Develop advanced services and integrate them into corporate products.

Resume

Education

  1. University of Chicago

    2022 — 2024

    Program: Master in Computer Science

    GPA: 3.9

    Core courses: Machine Learning, Natural Language Processing, Software Engineering, Data Science, Databases, etc.

    Honor: Phoenix Scholarship ($64,340)

  2. Xi'an Jiaotong-Liverpool University

    2018 — 2022

    Program: BSc in Economics and Finance

    GPA: 3.93 (Top 2%)

    Core courses: Quantitative Finance, Econometrics, Calculus, Microeconomics, Macroeconomics, Financial Management, Corporate Finance, etc.

    Honors: Excellence Academy Award (2021, ¥10,000), Excellent Student Scholarship (2020, ¥10,000)

Experience

  1. Devz AI, AI Scientist

    08/2024 — Present

    •    Autonomous Agent Architecture & Orchestration: Architected an enterprise-grade agentic system based on Tool-Calling and ReAct paradigms. Orchestrated the end-to-end automation lifecycle—from intent recognition and task planning to execution and validation—ensuring data privacy and high concurrency, contributing to $6M in annual orders.


    •    Devi Multi-Agent Collaborative Framework: Engineered a project management agent with an MCP toolset to automate project creation (parsing PRDs into deliverables), tracking (analyzing status), and management (dynamic reprioritization). Built a talent matching system to auto-assign tasks based on expertise, automating the full "Create-Schedule-Manage" lifecycle.


    •    Self-Healing Workflow: Designed a closed-loop CoT agent for operations. Upon incident triggers, the agent retrieves historical cases, generates fixes, and iteratively self-refines solutions based on error logs while generating verification scripts. Improved solution accuracy to 96% and reduced response time to under 2 minutes.


    •    General-Purpose Web Agent: Solved dynamic environment challenges by developing a Selenium-based Web Agent. Combined LLM intent recognition with HTML DOM tree parsing and recursive interaction to manipulate unstructured pages. Introduced Vision-Language Model (VLM) verification, achieving a 63% end-to-end resolution rate.


    •    Action Space Caching System: Constructed a "Script Alignment + Variable Filling + Multi-path Caching" mechanism. Enabled the generalization of single scripts to multiple incidents and achieved a 20x execution speedup by reusing historical paths.


    •    LLM Post-Training: Led the Supervised Fine-Tuning (SFT) of vertical Ops LLMs. Utilized Unsloth to accelerate distributed training, achieving a 55% relative improvement in domain-specific technical accuracy compared to GPT-4.


    •    Inference Acceleration & Industrialization: Deployed distributed inference services via vLLM. Achieved 200+ tokens/s generation speed with 100+ concurrent requests, reducing agent interaction costs by 80% to support high-frequency tool calling.


    •    Advanced RAG System: Built a multi-path retrieval system mapping hierarchical data to multiple vector stores. Implemented dimensional expansion, AI summarization, Reranking, and non-linear scoring, boosting the Top-3 hit rate from 65% to 88%.

  2. Prudential Financial, Machine Learning Engineer

    09/2023 — 03/2024

    •    High-Performance RAG Stock Predictor: Deployed an agent-based stock prediction system in Docker. Improved processing speed by 65% and reduced costs by 75% compared to GPT-4, achieving 62% prediction accuracy.


    •    OneAPI Aggregation Platform: Developed a unified API gateway with cost/account management, integrating OpenAI, Claude, and Gemini. Enabled universal model switching via a single Base URL, reducing manual configuration time by 85%.

  3. Shannon Investment, NLP Engineer

    07/2023 — 02/2024

    •    Enterprise Sentiment Analysis System: Deployed and benchmarked 20+ LLMs on sentiment tasks. Designed a sentiment analysis pipeline integrating Kafka for real-time processing (News-Analysis-Factor Generation) and reduced latency by 80%.


    •    Fine-tuning & Prompt Engineering: Led LLM fine-tuning using P-Tuning v2 and (Q)LoRA. Established a prompt management system, improving F1 score by 20% in sentiment tasks and achieving a backtested Sharpe Ratio of 3.

My skills

  • AI Applications & Agents
    100%
  • LLM deployment, fine-tuning & optimization
    100%
  • Programming
    100%
  • ETL pipelines
    90%
  • Cloud Computing & Containerization
    80%

Blog

Contact