I am a Senior Research Scientist @ Meta SuperIntelligence Lab.
I enjoy pushing the frontier of Large Language Models in both performance and compute efficiency. I am fortunate to have contributed to the Llama series, from the text-only Llama 2, to multimodal late fusion in Llama 3, to Mixture of Experts in Llama 4.
Before this, I designed and studied efficient attention architectures for scaling up visual perception during my PhD with Prof. Ehsan Elhamifar. I received my B.Sc. through the Advanced Program in Computer Science at the University of Sciences (Viet Nam).
I work on both the algorithmic and data aspects of post-training LLMs. Specifically, my work spans various corners of training, from automated data quality assurance and distributed training optimization to on-policy distillation (with Rohan Anil), as well as inference speed-ups via pruning and speculative decoding.
If you are interested in my research or in collaborating, I can be reached via: