Welcome to Yu TIAN's Homepage

About Me

I am a final-year Master of Philosophy (M.Phil.) candidate in Computer Science at the University of Hong Kong (HKU), supervised by Prof. Heming Cui. Before that, I received my B.Eng. degree in Computer Science from Tongji University. Currently, I am a Research Intern at ByteDance Seed-Infra-Training in Shanghai.

Update: I am actively looking for job opportunities and PhD positions (Fall 2026). Feel free to reach out!

My research focuses on Machine Learning Systems, AI Infrastructure, and Distributed Computing. Specifically, I specialize in building high-performance, scalable, and reliable systems for Large Language Models (LLMs). My work tackles critical challenges in distributed training and inference, such as ensuring consistency in large-scale Reinforcement Learning (RL) scenarios and optimizing Pipeline Parallelism (PP) in inference frameworks.

Fun Fact: You can also call me the "Handle Runner" — inspired by one of my favorite movies, "Blade Runner". While we CS guys might not be good at physical running, we are always running handles!

Internship Experience

ByteDance / Seed-Infra-Training

Jan 2026 - Present

Participating in R&D for Agent RL post-training systems. Designing and implementing low-overhead calibration mechanisms for asynchronous streaming RL training to address data consistency challenges and ensure stability in large-scale RL training.

Huawei 2012 Labs (Hong Kong)

May 2025 - Nov 2025

Developed Pipeline Parallelism (PP) modules for Ascend LLM inference systems. Researched optimization techniques including Parallelism Mechanisms, Prefill-Decode (PD) Disaggregation, Speculative Decoding, Sparse/Linear Attention, and Quantization. Addressed resource scheduling and memory bottlenecks in super-node environments.

Shanghai AI Lab

Oct 2022 - Apr 2023

Researched secure microservice architectures. Designed a secure and efficient microservice runtime based on Trusted Execution Environments (TEE) and Rust.

Research Projects

Efficient PD Disaggregation Inference System Optimized by Pipeline Parallelism (PP)

Huawei Research: Integrated Pipeline Parallelism (PP) & PD Disaggregation to optimize long-sequence and multi-node scenarios. First complete solution combining PP, PD disaggregation, and Multi-Token Prediction. Improved throughput by 9.47% and kept pipeline bubbles ≤5%.

Deployed in Huawei OmniInfer (Patent filed as First Inventor)

Secure and Efficient Distributed OLAP System via Fully Homomorphic Encryption (FHE)

HKU Research: First FHE-based distributed OLAP system. Achieved secure and efficient data processing. Reduced end-to-end latency by 44.1% compared to SOTA single-node systems in 4-node settings.

Published in ESORICS DPM '24

High-Throughput OLAP System Accelerated by GPU/NPU

HKU-Huawei Joint Research: Designed a high-performance OLAP system using GPU/NPU for offline scenarios. Achieved a 50% throughput improvement compared to NVIDIA RAPIDS for Spark.

Team Lead (Led 2 RAs)

Publications

HEDAS: Secure and Efficient Distributed OLAP using Fully Homomorphic Encryption

Yu Tian, Tianxiang Shen, Qi Hu, Wei Chen, Heming Cui, and Ji Qi

ESORICS 2024 International Workshops: DPM, CBT, and CyberICPS, 77-93.

Education

The University of Hong Kong (HKU)

Sep 2023 - Present

M.Phil. in Computer Science

Tongji University

Sep 2019 - Jun 2023

B.Eng. in Computer Science and Technology

GPA: 91.63/100

One of the Leaders of CPC Lab (Tongji University)

Teaching Experience

Teaching Assistant @ HKU (COMP3230: Operating Systems)
Fall 2023 & 2024
Teaching Assistant @ Tongji (Advanced Programming Language)
Sep 2020 - Jun 2021

Honors & Awards

National Scholarship (Top 0.2%) 2022
ICPC Asia Regional Contest (Jinan) Silver Medal 2021
Tongji University Qidi Scholarship (Top 1%) 2023
CCF CSP Certification 350 (Top 1.49%) 2021
Outstanding Graduate of Tongji University 2023