About me

I'm an experienced AI Scientist at Devz AI (US), where I develop AI applications. I enjoy researching cutting-edge LLMs and techniques and exploring their practical applications.

My job is to research AI models and techniques, then build them into applications. I have generated datasets to fine-tune models on our business tasks, built knowledge bases for Retrieval-Augmented Generation (RAG), developed tools for models to use, applied Chain-of-Thought (CoT) prompting and few-shot examples to improve model performance, integrated applications with our front-end and back-end pipelines, and ultimately empowered our products with advanced AI capabilities.
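In a minimal form, that RAG-plus-prompting workflow can be sketched as follows. This is an illustrative toy, not the production system: the documents and few-shot pairs are made up, keyword-overlap retrieval stands in for a real vector store, and the assembled prompt would normally be sent to an LLM.

```python
def retrieve(query, docs, k=2):
    """Rank documents by naive keyword overlap with the query (toy stand-in for a vector store)."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_prompt(query, docs, few_shots):
    """Assemble a RAG prompt: retrieved context + few-shot examples + a CoT instruction."""
    context = "\n".join(retrieve(query, docs))
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in few_shots)
    return (f"Context:\n{context}\n\n{shots}\n\nQ: {query}\n"
            "Think step by step before answering.\nA:")

# Hypothetical knowledge-base snippets and one few-shot example:
docs = ["Restart the cluster from the admin console.",
        "Billing questions go to the finance portal."]
shots = [("How do I reset a password?", "Use the self-service reset page.")]
prompt = build_prompt("How do I restart the cluster?", docs, shots)
```

The same skeleton scales by swapping the retriever for embedding search and the template for a model-specific chat format.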

What I'm doing

  • AI Applications Research & Development

    Modern, high-quality AI application design (RAG, agents, tool-calling, etc.).

  • LLM Deployment & Fine-tuning

    Improve models' performance on specific tasks (LoRA, rsLoRA, Ollama, etc.).

  • Cloud Computing

    Deploy services on elastic cloud servers and publish APIs for cross-functional teams.

  • Software Engineering

    Develop advanced services and integrate them into corporate products.

Resume

Education

  1. University of Chicago

    2022 — 2024

    Program: Master in Computer Science

    GPA: 3.9

    Core courses: Machine Learning, Natural Language Processing, Software Engineering, Data Science, Databases, etc.

    Honor: Phoenix Scholarship ($64,340)

  2. Xi'an Jiaotong-Liverpool University

    2018 — 2022

    Program: BSc in Economics and Finance

    GPA: 3.93 (Top 2%)

    Core courses: Quantitative Finance, Econometrics, Calculus, Microeconomics, Macroeconomics, Financial Management, Corporate Finance, etc.

    Honor: Excellence Academy Award (2021, ¥10,000), Excellent Student Scholarship (2020, ¥10,000)

Experience

  1. Devz AI, AI Scientist

    08/2024 — Present

    •    Developed and deployed LLM applications on AWS with RAG to automate IT operations by collecting QA data, integrating multi-source knowledge from internal databases and external sources (Google, Perplexity), and evaluating multiple models' solutions.


    •    Fine-tuned LLaMA models (70B, 11B) with optimization techniques (FlashAttention, Unsloth, DeepSpeed, quantization) and regularization methods (dropout, rsLoRA, DoRA) on specific business tasks; deployed with vLLM and Ollama to achieve a 20% improvement on knowledge tests, a 50% higher success rate, and 80% lower cost compared to GPT-4o and o1.


    •    Built a web agent system with RAG and CoT to enable end-to-end automated incident resolution by recursively interacting with the browser (e.g., adding a user to a group, creating a computing notebook, configuring/restarting a computing cluster) and validating the task result. Developed a caching system that stores steps with editable parameters for faster execution and customization. Continuously improved the web agent to cover more tasks on different platforms (e.g., PagerDuty, Databricks, Okta) to serve diverse customers. Successfully automated 90% of engineer workload and contributed to $10M in new orders.

  2. Prudential Financial, Machine Learning Engineer

    09/2023 — 03/2024

    •    Boosted speed by 50% and cut costs by 75% compared to GPT-4 on stock topics by deploying an advanced RAG LLM architecture in Docker, and achieved 62% accuracy in stock prediction. Automated keyword extraction, knowledge indexing (MongoDB), and requests for stock data and news with a customized ETL pipeline and RESTful API (Flask).


    •    Deployed a OneAPI transit service platform with account and cost management to integrate multiple LLM APIs (demos: OpenAI, Claude, Gemini, Llama 3, etc.), allowing developers to call different LLMs with a universal base URL and secret key.
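The unified-gateway pattern behind that transit service can be sketched as follows. The gateway URL, API key, and model names are placeholders, and the request is only constructed (never sent), so this is a shape illustration rather than the actual deployment:

```python
import json
import urllib.request

GATEWAY_URL = "https://llm-gateway.example.com/v1"  # hypothetical transit-service endpoint
API_KEY = "sk-..."  # one secret key covering every upstream provider

def chat_request(model, messages):
    """Build an OpenAI-compatible request against the single gateway URL.

    The gateway routes by the `model` field, so callers switch providers
    by changing one string instead of swapping SDKs or credentials.
    """
    payload = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{GATEWAY_URL}/chat/completions",
        data=payload,
        headers={"Authorization": f"Bearer {API_KEY}",
                 "Content-Type": "application/json"},
        method="POST",
    )

# The call shape is identical for any provider behind the gateway:
req = chat_request("claude-3-5-sonnet", [{"role": "user", "content": "Hi"}])
```

Centralizing the credential and base URL is also what makes per-account cost accounting possible: every call passes through one choke point.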

  3. Shannon Investment, NLP Engineer

    07/2023 — 02/2024

    •    Designed a public-opinion analysis system that integrates 20+ LLMs with a mounted internal knowledge base for stakeholders; enabled fine-tuning, multi-task evaluation, and deployment of new LLMs; encapsulated all services as APIs for the front-end team.


    •    Developed APIs and ETL pipelines (Kafka) to link LLMs with the internal knowledge base (MongoDB, Elasticsearch), providing complementary data for RAG in LLM conversations; built LangChain pipelines for multi-turn LLM conversations.


    •    Improved LLM performance by fine-tuning with P-Tuning v2 and (Q)LoRA and by creating a prompt management system that matches LLMs and tasks with customized prompt templates. Boosted the F1 score in sentiment analysis by 20%, which yielded better sentiment factors and a Sharpe ratio of 3, and sped up data processing 10x on trillion-scale data.
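A prompt management system of that kind can be sketched as a small template registry that matches a (model family, task) pair to a template, with a task-level default fallback. The model names and templates below are illustrative, not the production ones:

```python
# Minimal prompt registry: map (model family, task) to a template string,
# falling back to a task-level default when no model-specific entry exists.
TEMPLATES = {
    ("chatglm", "sentiment"): "[Round 1] Classify the sentiment of: {text}\nAnswer:",
    ("llama", "sentiment"): "[INST] Is the sentiment of '{text}' positive or negative? [/INST]",
    ("default", "sentiment"): "Sentiment of '{text}' (positive/negative):",
}

def render_prompt(model_family, task, **fields):
    """Pick the best-matching template for (model, task) and fill in the fields."""
    template = TEMPLATES.get((model_family, task)) or TEMPLATES[("default", task)]
    return template.format(**fields)

p = render_prompt("llama", "sentiment", text="Earnings beat expectations")
```

Keeping templates in data rather than code lets each newly onboarded LLM reuse existing tasks immediately and get model-specific overrides only where they measurably help.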

My skills

  • AI Applications
    100%
  • LLMs
    90%
  • Programming
    90%
  • Cloud Computing
    80%

Blog

Contact

Contact Form