Featured Projects
MTA: Multimodal Task Alignment
MTA is a novel multimodal task alignment framework that boosts both BEV perception and captioning. It enforces alignment through multimodal contextual learning and cross-modal prompting mechanisms. arXiv.
PaPr: Patch Pruning for Faster Inference
PaPr is a novel background patch pruning method that operates seamlessly with ViTs for faster inference (>2x speedup). PaPr is training-free and can be easily plugged into existing token pruning methods for further efficiency. ECCV 2024.
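A minimal sketch of the general idea of training-free patch pruning before the ViT blocks. The scoring rule here is a simple norm-based proxy, not PaPr's actual patch-significance computation, and names like `vit_blocks` and `patch_embed` are hypothetical:

```python
# Generic illustration of training-free patch pruning ahead of a ViT
# (PaPr's real scoring differs; a norm-based proxy stands in here).
import torch

def prune_background_patches(tokens, keep_ratio=0.5, cls_token=True):
    """Drop low-scoring (likely background) patch tokens, keeping the CLS token."""
    cls, patches = (tokens[:, :1], tokens[:, 1:]) if cls_token else (None, tokens)
    scores = patches.norm(dim=-1)                  # proxy importance per patch
    k = max(1, int(keep_ratio * patches.shape[1])) # number of patches to keep
    top_idx = scores.topk(k, dim=1).indices        # indices of kept patches
    kept = torch.gather(
        patches, 1, top_idx.unsqueeze(-1).expand(-1, -1, patches.shape[-1]))
    return torch.cat([cls, kept], dim=1) if cls_token else kept

# Usage (hypothetical ViT components): prune once, then run the rest of the model.
# tokens = patch_embed(images)                      # (B, 1 + N, D) with CLS token
# tokens = prune_background_patches(tokens, 0.5)    # drop ~half of the patch tokens
# logits = head(vit_blocks(tokens)[:, 0])
```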
VLP: Vision Language Planning
VLP is a vision-language planning approach for enhancing end-to-end autonomous driving. VLP is a training-only approach that distills the power of LLMs into existing autonomous driving stacks for improved performance. CVPR 2024.
ZS-SSL: Zero-Shot Self-Supervised Learning
ZS-SSL is a zero-shot self-supervised learning methodology that performs test-time training with a well-defined stopping criterion and tackles out-of-distribution challenges via domain adaptation. ICLR 2022.
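A minimal sketch of test-time training with an early-stopping criterion on a held-out measurement subset, assuming a PyTorch setting with hypothetical helper names and masks; ZS-SSL's exact k-space partitioning and network differ:

```python
# Sketch of per-scan test-time training with a validation-based stopping
# criterion (hypothetical net signature and masks; not the exact ZS-SSL code).
import copy
import torch

def zero_shot_fit(net, kspace, mask_train, mask_loss, mask_val,
                  criterion, lr=3e-4, max_steps=500, patience=20):
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    best_val, best_state, bad_steps = float("inf"), None, 0
    for step in range(max_steps):
        # Self-supervised step: input from one measurement subset, loss on another.
        pred = net(kspace * mask_train, mask_train)
        loss = criterion(pred * mask_loss, kspace * mask_loss)
        opt.zero_grad(); loss.backward(); opt.step()

        # Stopping criterion: monitor the loss on a third, held-out subset.
        with torch.no_grad():
            val = criterion(net(kspace * mask_train, mask_train) * mask_val,
                            kspace * mask_val).item()
        if val < best_val:
            best_val, best_state, bad_steps = val, copy.deepcopy(net.state_dict()), 0
        else:
            bad_steps += 1
            if bad_steps >= patience:  # stop once the held-out loss stops improving
                break
    net.load_state_dict(best_state)
    return net
```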
SSDU: Self-Supervision via Data Undersampling
SSDU is the pioneering self-supervised learning work that enables MRI reconstruction without fully sampled ground-truth data. SSDU splits the acquired measurements into two disjoint sets: one is used as input to the network, and the other is used to define the loss function. This work received the 2020 IEEE ISBI Best Paper Award.
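A minimal sketch of the measurement-splitting idea, assuming a PyTorch setting with a binary 0/1 undersampling mask and a hypothetical `net` signature; the actual SSDU implementation differs in its sampling strategy and loss:

```python
# Sketch of SSDU-style self-supervised training: disjoint measurement split,
# one subset as network input, the other to define the loss (hypothetical names).
import torch

def ssdu_split(mask, rho=0.4, generator=None):
    """Split a binary 0/1 undersampling mask into two disjoint masks."""
    acquired = mask.nonzero(as_tuple=False)        # indices of acquired measurements
    n_loss = int(rho * acquired.shape[0])          # fraction held out for the loss
    perm = torch.randperm(acquired.shape[0], generator=generator)
    loss_idx = acquired[perm[:n_loss]]
    mask_lambda = torch.zeros_like(mask)
    mask_lambda[tuple(loss_idx.t())] = 1           # loss subset
    mask_theta = mask - mask_lambda                # input subset, disjoint by construction
    return mask_theta, mask_lambda

def ssdu_training_step(net, kspace, mask, criterion):
    mask_theta, mask_lambda = ssdu_split(mask)
    # The network only sees the theta subset of the acquired measurements.
    pred_kspace = net(kspace * mask_theta, mask_theta)
    # The loss is computed only on the held-out lambda subset.
    return criterion(pred_kspace * mask_lambda, kspace * mask_lambda)
```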