Publications

P - Patent (1) | J - Journal (0) | C - Conference (5) | A - ArXiv (1) | Total (7)

* indicates equal contribution; ∂ indicates partial derivative

[C2, C3, C4, Image Retrieval] 

Towards Learning Efficient Multilingual and Multimodal Representation

Gokul Karthik Kumar 

2023 | MBZUAI MSc Thesis (#language #vision #speech)

[C5] VisCon-100K: Leveraging Contextual Web Data for Fine-tuning Vision Language Models with Leaky Visual Conversations

Gokul Karthik Kumar, Iheb Chaabane, Kebin Wu

2024 | Under Review at ICLR (#language #vision)

[C4] Towards Building Text-To-Speech Systems for the Next Billion Users

Gokul Karthik Kumar*, Praveen S V*, Pratyush Kumar, Mitesh M. Khapra, Karthik Nandakumar

2023 | ICASSP (#language #speech)

[C3] Hate-CLIPper: Multimodal Hateful Meme Classification based on Cross-modal Interaction of CLIP features

Gokul Karthik Kumar, Karthik Nandakumar

2022 | EMNLP Workshop (#language #vision)

[C2] MuCoT: Multilingual Contrastive Training For Question-Answering In Low-resource Languages

Gokul Karthik Kumar, Abhishek Singh Gehlot, Sahal Shaji Mullappilly, Karthik Nandakumar

2022 | ACL Workshop (#language)

[A1] An Empirical Study Of Self-supervised Learning Approaches For Object Detection With Transformers

Gokul Karthik Kumar, Sahal Shaji Mullappilly, Abhishek Singh Gehlot

2022 | ArXiv (#vision)

[P1] Method And System For Forecasting Sales Based On N-Gram Model

Gokul Karthik, Avinash Achar, Balaraman Ravindran

2021 | US Patent (#time-series)

[C1] Dynamic Bus Arrival Time Prediction: A Temporal Difference Learning Approach

LKP Vignesh, Avinash Achar, Gokul Karthik

2020 | IJCNN (#time-series)