About Me

Xin chào (Hi),
I am a Ph.D. candidate at Khoury College of Computer Sciences, Northeastern University, advised by Prof. Ehsan Elhamifar.

I received my B.Sc. from the University of Sciences (Viet Nam), where I was fortunate to study in the Advanced Program in Computer Science.

Contact

If you are interested in my research or in collaborating, I can be reached via:

Research Interest

My research aims at significantly reducing the amount of annotation needed to train visual perceptual models for large-scale recognition, detection, and segmentation.
To that end, I develop cross-modal embedding methods that transfer rich knowledge from human language to the data-scarce visual modality.
These methods can effectively deal with few or even zero training samples, with missing annotations, and with weak supervision.

Specifically, I work on:

News

Publications

[Project]
[Supplementary Materials]
[Slide]

D. Huynh, J. Kuen, Z. Lin, J. Gu (Adobe Research), and E. Elhamifar
Open-Vocabulary Instance Segmentation via Robust Cross-Modal Pseudo-Labeling

CVPR 2022


Description: Developed a pseudo-labeling framework that generates pseudo masks for novel objects without any segmentation annotations

Outcome: Improved the state-of-the-art performance of instance segmentation by 4.5% on MS-COCO and 5.1% on the large-scale Open Images & Conceptual Captions datasets

[Project]
[Supplementary Materials]
[Slide]

D. Huynh and E. Elhamifar
Interaction Compass: Multi-Label Zero-Shot Learning of Human-Object Interactions via Spatial Relations

ICCV 2021


Description: Developed a compositional model to recognize unseen human-object interactions based on spatial relations between humans and objects

Outcome: Improved the state-of-the-art performance of unseen human-object interaction recognition by 2.6% mAP on the HICO dataset

[Project]
[Supplementary Materials]
[Slide]

D. Huynh and E. Elhamifar
Compositional Zero-Shot Learning via Fine-Grained Dense Feature Composition

NeurIPS 2020


Description: Developed a generative model that constructs fine-grained features for unseen classes by recombining features from training samples

Outcome: Improved the state-of-the-art performance of unseen clothing recognition by 4% harmonic mean on the DeepFashion dataset

[Project]

E. Elhamifar and D. Huynh
Self-Supervised Multi-Task Procedure Learning from Instructional Videos

ECCV 2020


Description: Developed a weakly supervised key-frame localization method for multi-task procedure learning in videos

Outcome: Applied self-supervised learning on the CrossTask and ProceL datasets to localize key-frames without human supervision

S. Jafar-Zanjani, M. M. Salary, D. Huynh, E. Elhamifar, and H. Mosallaei
Active Metasurfaces Design by Conditional Generative Adversarial Networks

International Conference on Metamaterials, Photonic Crystals and Plasmonics, 2020
[Project]
[Supplementary Materials]
[Slide]

D. Huynh and E. Elhamifar
A Shared Multi-Attention Framework for Multi-Label Zero-Shot Learning

CVPR 2020
Oral Presentation


Description: Developed a multi-label recognition system for labels without training samples via attention sharing

Outcome: Improved the state-of-the-art performance by 2% mAP on NUS-WIDE and scaled to 7000 seen labels and 400 unseen labels in Open Images

[Project]
[Supplementary Materials]
[Slide]

D. Huynh and E. Elhamifar
Fine-Grained Generalized Zero-Shot Learning via Dense Attribute-Based Attention

CVPR 2020


Description: Developed a dense attribute-based attention mechanism for fine-grained zero-shot learning

Outcome: Improved state-of-the-art performance on CUB and AWA2 by at least 4% harmonic mean by weakly localizing fine-grained attributes of all classes

[Project]
[Supplementary Materials]
[Slide]

D. Huynh and E. Elhamifar
Interactive Multi-Label CNN Learning with Partial Labels

CVPR 2020


Description: Developed a scalable graph-based framework to regularize multi-label CNN learning with missing labels

Outcome: Improved mAP by 2% on Open Images compared to treating missing labels as absent labels

D. Huynh and E. Elhamifar
Seeing Many Unseen Labels via Shared Multi-Attention Models

ICCVW 2019

Workshop on Multi-Discipline Approach for Learning Concepts - Zero-Shot, One-Shot, Few-Shot and Beyond

Teaching

Services

I am always proud to serve the research community as:


"Don't Try"
― Charles Bukowski ―