About Me

Xin chào (Hi),
I am a Senior Research Scientist @ Meta.

I did my Ph.D. at Khoury College of Computer Sciences, Northeastern University, advised by Prof. Ehsan Elhamifar. I received my B.Sc. from the University of Science (Viet Nam), where I was fortunate to study in the Advanced Program in Computer Science.

Contact

If you are interested in my research or in collaborating, I can be reached via:

Research Interests

I currently work on efficient training and serving methods for productionizing large language models, spanning model quantization, pruning, and efficient decoding.
My Ph.D. research focused on significantly reducing the amount of annotation needed to train visual perception models for large-scale recognition, detection, and segmentation. To this end, I developed cross-modal embedding methods that transfer rich knowledge from human language to the data-scarce visual modality. These methods can effectively handle few or even zero training samples, missing annotations, and weak supervision.
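
As a rough sketch of the cross-modal embedding idea (illustrative only, not code from any of my papers; all names, shapes, and data below are placeholders): a class with no training images can still be predicted by comparing an image embedding against text embeddings of class names in a shared space.

```python
# Illustrative sketch only: zero-shot prediction by matching an image embedding
# against text embeddings of class names in a shared cross-modal space.
# All names, shapes, and data are placeholders, not code from the papers.
import numpy as np

def zero_shot_predict(image_emb: np.ndarray, class_text_embs: np.ndarray) -> int:
    """Return the index of the class name whose text embedding has the highest
    cosine similarity with the image embedding."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = class_text_embs / np.linalg.norm(class_text_embs, axis=1, keepdims=True)
    return int(np.argmax(txt @ img))

# Toy usage: random vectors stand in for real visual/text encoder outputs.
rng = np.random.default_rng(0)
image_emb = rng.normal(size=512)             # e.g., output of a visual encoder
class_text_embs = rng.normal(size=(5, 512))  # e.g., embedded class names
print(zero_shot_predict(image_emb, class_text_embs))
```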

News

Publications

[Project] [Supplementary Materials] [Slide]

D. Huynh, J. Kuen (Adobe Research), Z. Lin (Adobe Research), J. Gu (Adobe Research), and E. Elhamifar
Open-Vocabulary Instance Segmentation via Robust Cross-Modal Pseudo-Labeling

CVPR 2022


Description: Developed a pseudo-labeling framework that generates pseudo masks for novel objects without any segmentation annotations

Outcome: Improved the state-of-the-art performance of instance segmentation by 4.5% on MS-COCO and 5.1% on the large-scale Open Images & Conceptual Captions datasets

[Project] [Supplementary Materials] [Slide]

D. Huynh and E. Elhamifar
Interaction Compass: Multi-Label Zero-Shot Learning of Human-Object Interactions via Spatial Relations

ICCV 2021


Description: Developed a compositional model to recognize unseen human interactions based on spatial relations between humans and objects

Outcome: Improved the state-of-the-art performance of unseen human-object interaction recognition by 2.6% mAP on the HICO dataset

[Project] [Supplementary Materials] [Slide]

D. Huynh and E. Elhamifar
Compositional Zero-Shot Learning via Fine-Grained Dense Feature Composition

NeurIPS 2020


Description: Developed a generative model that constructs fine-grained features for unseen classes by recombining features from training samples

Outcome: Improved the state-of-the-art performance of unseen clothing recognition by 4% harmonic mean on the DeepFashion dataset

[Project]

E. Elhamifar and D. Huynh
Self-Supervised Multi-Task Procedure Learning from Instructional Videos

ECCV 2020


Description: Developed a weakly supervised key-frame localization method for multi-task procedure learning in videos

Outcome: Applied self-supervised learning on the CrossTask and ProceL datasets to localize key-frames without human supervision

S. Jafar-Zanjani, M. M. Salary, D. Huynh, E. Elhamifar, and H. Mosallaei
Active Metasurfaces Design by Conditional Generative Adversarial Networks

International Conference on Metamaterials, Photonic Crystals and Plasmonics, 2020
[Project] [Supplementary Materials] [Slide]

D. Huynh and E. Elhamifar
A Shared Multi-Attention Framework for Multi-Label Zero-Shot Learning

CVPR 2020
Oral Presentation


Description: Developed a multi-label recognition system for labels without training samples via attention sharing

Outcome: Improved the state-of-the-art performance by 2% mAP on NUS-WIDE and scaled to 7,000 seen labels and 400 unseen labels on Open Images

[Project] [Supplementary Materials] [Slide]

D. Huynh and E. Elhamifar
Fine-Grained Generalized Zero-Shot Learning via Dense Attribute-Based Attention

CVPR 2020


Description: Developed a dense attribute-based attention mechanism for fine-grained zero-shot learning

Outcome: Improved state-of-the-art performance on CUB and AWA2 by at least 4% harmonic mean by weakly localizing fine-grained attributes of all classes

[Project] [Supplementary Materials] [Slide]

D. Huynh and E. Elhamifar
Interactive Multi-Label CNN Learning with Partial Labels

CVPR 2020


Description: Developed a scalable graph-based framework to regularize multi-label CNN learning with missing labels

Outcome: Improved mAP by 2% on Open Images compared to treating missing labels as absent labels

D. Huynh and E. Elhamifar
Seeing Many Unseen Labels via Shared Multi-Attention Models

ICCVW 2019

Workshop on Multi-Discipline Approach for Learning Concepts - Zero-Shot, One-Shot, Few-Shot and Beyond

Patents

Teaching

Services

I am always proud to serve the research community as:


"hữu duyên thiên lý năng tương ngộ"