PositiveCoOp: Rethinking Prompting Strategies for Multi-Label Recognition with Partial Annotations
Samyak Rawlekar, Shubhang Bhatnagar, Narendra Ahuja
WACV 2025
abstract / project page / paper
Vision-language models (VLMs) like CLIP have been adapted for Multi-Label Recognition (MLR) with partial annotations by leveraging prompt learning, where positive and negative prompts are learned for each class to associate their embeddings with class presence or absence in the shared vision-text feature space. While this approach improves MLR performance by relying on VLM priors, we hypothesize that learning negative prompts may be suboptimal, as the datasets used to train VLMs lack image-caption pairs explicitly focusing on class absence. To analyze the impact of positive and negative prompt learning on MLR, we introduce PositiveCoOp and NegativeCoOp, where only one prompt is learned with VLM guidance while the other is replaced by an embedding vector learned directly in the shared feature space without relying on the text encoder. Through empirical analysis, we observe that negative prompts degrade MLR performance, and that learning only positive prompts, combined with learned negative embeddings (PositiveCoOp), outperforms dual prompt learning approaches. Moreover, we quantify the performance benefit that prompt learning offers over a simple vision-features-only baseline, observing that the baseline performs comparably to the dual prompt learning approach (DualCoOp) when the proportion of missing labels is low, while requiring half the training compute and 16 times fewer parameters.
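The following is a minimal PyTorch sketch of the idea described in the abstract, not the authors' released implementation: each class gets learnable positive context tokens that are encoded by a frozen VLM text encoder, while the negative counterpart is a vector learned directly in the shared vision-text feature space. Names such as PositiveCoOpHead, ctx_len, and the text_encoder / class_token_embs arguments are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PositiveCoOpHead(nn.Module):
    """Sketch: one learned positive prompt per class (encoded by a frozen text
    encoder) and one negative embedding per class learned directly in the shared
    feature space, without going through the text encoder."""

    def __init__(self, num_classes, ctx_len=16, token_dim=512, feat_dim=512):
        super().__init__()
        # Learnable positive context tokens, one set per class.
        self.pos_ctx = nn.Parameter(torch.randn(num_classes, ctx_len, token_dim) * 0.02)
        # Learnable negative embeddings, defined directly in the joint feature space.
        self.neg_emb = nn.Parameter(torch.randn(num_classes, feat_dim) * 0.02)
        self.logit_scale = nn.Parameter(torch.tensor(4.0))

    def forward(self, image_feats, text_encoder, class_token_embs):
        # image_feats: (B, feat_dim) from the frozen image encoder
        # class_token_embs: (C, L, token_dim) token embeddings of the class names
        prompts = torch.cat([self.pos_ctx, class_token_embs], dim=1)  # (C, ctx_len+L, token_dim)
        pos_feats = text_encoder(prompts)                             # (C, feat_dim), frozen encoder

        img = F.normalize(image_feats, dim=-1)
        pos = F.normalize(pos_feats, dim=-1)
        neg = F.normalize(self.neg_emb, dim=-1)

        scale = self.logit_scale.exp()
        pos_logits = scale * img @ pos.t()   # evidence for class presence
        neg_logits = scale * img @ neg.t()   # evidence for class absence
        # Per-class presence probability from the positive/negative pair.
        probs = torch.softmax(torch.stack([pos_logits, neg_logits], dim=-1), dim=-1)[..., 0]
        return probs
```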
Improving Multi-label Recognition using Class Co-Occurrence Probabilities
Samyak Rawlekar*, Shubhang Bhatnagar*, Vishnuvardhan Pogunulu Srinivasulu, Narendra Ahuja
CVPRW 2024, ICPR 2024 (Oral Top-5%)
abstract / project page / paper
Multi-label Recognition (MLR) involves the identification of multiple objects within an image. To address the additional complexity of this task, recent works have leveraged information from vision-language models (VLMs) trained on large image-text datasets. These methods learn an independent classifier for each object (class), overlooking correlations in their occurrences. Such co-occurrences can be captured from the training data as conditional probabilities between pairs of classes. We propose a framework that extends the independent classifiers by incorporating this pairwise co-occurrence information to improve their performance. Specifically, we use a Graph Convolutional Network (GCN) to enforce the conditional probabilities between classes by refining the initial estimates derived from image and text sources obtained using VLMs. We validate our method on four MLR datasets, where our approach outperforms all state-of-the-art methods.
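As an illustration of the refinement step, here is a hedged PyTorch sketch (not the paper's code): a matrix of pairwise conditional probabilities estimated from the training labels is turned into a normalized GCN adjacency, and a small GCN refines class embeddings whose similarity to image features adjusts the independent classifiers' initial scores. The class name CoOccurrenceGCN, the additive fusion of logits, and the feature dimensions are assumptions made for the example.

```python
import torch
import torch.nn as nn

class CoOccurrenceGCN(nn.Module):
    """Sketch: refine per-class scores with a GCN whose adjacency encodes the
    pairwise conditional probabilities P(class_j | class_i) estimated from
    training-set label co-occurrences."""

    def __init__(self, cooccurrence, feat_dim=512, hidden_dim=256):
        super().__init__()
        # cooccurrence: (C, C) conditional-probability matrix from the training labels.
        adj = cooccurrence + torch.eye(cooccurrence.size(0))       # add self-loops
        deg_inv_sqrt = adj.sum(dim=1).clamp(min=1e-6).pow(-0.5)
        self.register_buffer("adj_norm", deg_inv_sqrt[:, None] * adj * deg_inv_sqrt[None, :])
        self.gcn1 = nn.Linear(feat_dim, hidden_dim)
        self.gcn2 = nn.Linear(hidden_dim, feat_dim)

    def forward(self, class_feats, initial_logits, image_feats):
        # class_feats: (C, feat_dim) initial class embeddings (e.g., from the VLM text encoder)
        # initial_logits: (B, C) per-class scores from the independent classifiers
        # image_feats: (B, feat_dim) image embeddings from the VLM image encoder
        h = torch.relu(self.gcn1(self.adj_norm @ class_feats))
        refined_class_feats = self.gcn2(self.adj_norm @ h)          # (C, feat_dim)
        refined_logits = image_feats @ refined_class_feats.t()      # (B, C)
        return initial_logits + refined_logits                      # co-occurrence-aware refinement
```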