1. Vipul Gupta, Candace Ross, David Pantoja, Rebecca J Passonneau, Megan Ung, Adina Williams. Improving Model Evaluation using SMART Filtering of Benchmark Datasets. In Proceedings of the Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL) 2025.
  2. Vipul Gupta, David Pantoja, Candace Ross, Adina Williams, Megan Ung. Changing answer order can decrease MMLU accuracy. In Proceedings of the Workshop on Datasets and Evaluators of AI Safety at AAAI 2025. For a quick summary, listen to the 🎧 Podcast by Notebook LM.
  3. Harsh Raj, Vipul Gupta, Domenic Rosati, Subhabrata Majumdar. Semantic Consistency for Assuring Reliability of Large Language Models. Transactions on Machine Learning Research (TMLR) 2025.
  4. Berk Atil, Vipul Gupta, Sarkar Snigdha Sarathi Das, Rebecca J Passonneau. Can LLMs Rank the Harmfulness of Smaller LLMs? We are Not There Yet. arXiv preprint 2025.
  5. Shayne Longpre, Nikhil Singh, Manuel Cherep, Kushagra Tiwary, Joanna Materzynska, William Brannon, Robert Mahari, Manan Dey, Mohammed Hamdy, Nayan Saxena, Ahmad Mustafa Anis, Emad A Alghamdi, Vu Minh Chien, Naana Obeng-Marnu, Da Yin, Kun Qian, Yizhi Li, Minnie Liang, An Dinh, Shrestha Mohanty, Deividas Mataciunas, Tobin South, Jianguo Zhang, Ariel N Lee, Campbell S Lund, Christopher Klamm, Damien Sileo, Diganta Misra, Enrico Shippole, Kevin Klyman, Lester JV Miranda, Niklas Muennighoff, Seonghyeon Ye, Seungone Kim, Vipul Gupta, Vivek Sharma, Xuhui Zhou, Caiming Xiong, Luis Villa, Stella Biderman, Alex Pentland, Sara Hooker, Jad Kabbara. Bridging the Data Provenance Gap Across Text, Speech and Video. arXiv preprint 2024.
  6. Hangzhi Guo, Pranav Narayanan Venkit, Eunchae Jang, Mukund Srinath, Wenbo Zhang, Bonam Mingole, Vipul Gupta, Kush R Varshney, S Shyam Sundar, Amulya Yadav. Hey GPT, Can You be More Racist? Analysis from Crowdsourced Attempts to Elicit Biased Content from Generative AI. arXiv preprint 2024.
  7. Vipul Gupta, Pranav Narayanan Venkit, Hugo Laurençon, Shomir Wilson, Rebecca J Passonneau. CALM: A Multi-task Benchmark for Comprehensive Assessment of Language Model Bias. In Proceedings of the Conference on Language Modeling (COLM) 2024. For a quick summary, listen to the 🎧 Podcast by Notebook LM.
  8. Shayne Longpre, Robert Mahari, Ariel Lee, Campbell Lund, Hamidah Oderinwale, William Brannon, Nayan Saxena, Naana Obeng-Marnu, Tobin South, Cole Hunter, Kevin Klyman, Christopher Klamm, Hailey Schoelkopf, Nikhil Singh, Manuel Cherep, Ahmad Anis, An Dinh, Caroline Chitongo, Da Yin, Damien Sileo, Deividas Mataciunas, Diganta Misra, Emad Alghamdi, Enrico Shippole, Jianguo Zhang, Joanna Materzynska, Kun Qian, Kush Tiwary, Lester Miranda, Manan Dey, Minnie Liang, Mohammed Hamdy, Niklas Muennighoff, Seonghyeon Ye, Seungone Kim, Shrestha Mohanty, Vipul Gupta, Vivek Sharma, Vu Minh Chien, Xuhui Zhou, Yizhi Li, Caiming Xiong, Luis Villa, Stella Biderman, Hanlin Li, Daphne Ippolito, Sara Hooker, Jad Kabbara, Sandy Pentland. Consent in Crisis: The Rapid Decline of the AI Data Commons. In Proceedings of Conference on Neural Information Processing Systems (NeurIPS) 2024.
  9. Pranav Venkit, Tatiana Chakravorti, Vipul Gupta, Heidi Biggs, Mukund Srinath, Koustava Goswami, Sarah Rajtmajer, Shomir Wilson. 'Confidently Nonsensical?': A Critical Survey on the Perspectives and Challenges of 'Hallucinations' in NLP. Conference on Empirical Methods in Natural Language Processing (EMNLP) 2024.
  10. Jiangshu Du, Yibo Wang, Wenting Zhao, Zhongfen Deng, Shuaiqi Liu, Renze Lou, Henry Peng Zou, Pranav Narayanan Venkit, Nan Zhang, Mukund Srinath, Haoran Ranran Zhang, Vipul Gupta, Yinghui Li, Tao Li, Fei Wang, Qin Liu, Tianlin Liu, Pengzhi Gao, Congying Xia, Chen Xing, Jiayang Cheng, Zhaowei Wang, Ying Su, Raj Sanjay Shah, Ruohao Guo, Jing Gu, Haoran Li, Kangda Wei, Zihao Wang, Lu Cheng, Surangika Ranathunga, Meng Fang, Jie Fu, Fei Liu, Ruihong Huang, Eduardo Blanco, Yixin Cao, Rui Zhang, Philip S Yu, Wenpeng Yin. LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing. Conference on Empirical Methods in Natural Language Processing (EMNLP) 2024.
  11. Vipul Gupta, Pranav Narayanan Venkit, Shomir Wilson, Rebecca J Passonneau. Survey on Sociodemographic Bias in Natural Language Processing. Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP) at ACL 2024. For a quick summary, listen to the 🎧 Podcast by Notebook LM.
  12. Pranav Narayanan Venkit, Mukund Srinath, Sanjana Gautam, Saranya Venkatraman, Vipul Gupta, Rebecca J. Passonneau, Shomir Wilson. The Sentiment Problem: A Critical Survey towards Deconstructing Sentiment Analysis. Conference on Empirical Methods in Natural Language Processing (EMNLP) 2023.
  13. Vipul Gupta, Brian R. Belland, Alexander Billups, Rebecca J. Passonneau. AI for coding education meta-analyses: An open science approach that combines human and machine intelligence. In Artificial Intelligence in Education Technology (AIET) 2023.
  14. Vipul Gupta, Zhuowan Li, Adam Kortylewski, Chenyu Zhang, Yingwei Li, Alan Yuille. SwapMix: Diagnosing and Regularizing the Over-Reliance on Visual Context in Visual Question Answering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022. For a quick summary, listen to the 🎧 Podcast by Notebook LM.
  15. Vipul Gupta, Apurva Narayan. Do we need entire training data for adversarial training? arXiv preprint 2023.