Moumita Choudhury

PhD Candidate  ·  Computer Science  ·  UMass Amherst

I am a PhD candidate in Computer Science at the University of Massachusetts Amherst, advised by Prof. Shlomo Zilberstein in the Resource-Bounded Reasoning Lab. I completed my M.Sc. in Computer Science at UMass Amherst in 2024, and from June to September 2025 I was a Research Intern in the Trustworthy AI group at Mitsubishi Electric Research Laboratories (MERL).

My research sits at the intersection of multi-agent reinforcement learning, AI interpretability, and safe human–AI collaboration. I work on optimizing RL fine-tuning for multi-agent multi-turn problems, generating automated explanations for agent behavior, and designing RL algorithms that integrate human feedback while limiting unintended negative side effects.

Before UMass, I was a Research Assistant at the Cognitive Agents and Interaction Lab, University of Dhaka, and a Junior Lecturer at Ahsanullah University of Science and Technology. Outside the lab, I enjoy singing, painting, and travelling.

Multi-agent LLM fine-tuning  ·  Test-time training  ·  Multi-agent coordination  ·  AI safety
positions
Research Intern — MERL Jun – Sep 2025 · Boston, MA
PhD Candidate — UMass Amherst Sep 2021 – Sep 2027 (expected)
Junior Lecturer — AUST Jan – Jun 2021 · Dhaka
Research Assistant — CAIL, DU Feb – Dec 2020 · Dhaka
recent updates

News

Jun 2025 Starting as a Research Intern in the Trustworthy AI group at Mitsubishi Electric Research Laboratories (MERL), Boston. new
Nov 2025 Paper accepted at AAAI 2026 Workshop on Trust and Control in Agentic AI (TrustAgent) — Amplification Effects in Test-Time Reinforcement Learning: Safety and Reasoning Vulnerabilities. new
Mar 2025 Bayesian Inverse Reinforcement Learning Approach for Policy Summarization accepted at ToM4AI @ AAAI 2025. spotlight · top 16%
May 2024 Completed M.Sc. in Computer Science at UMass Amherst.
Mar 2024 Paper accepted at FLAIRS 2024 and as an Extended Abstract at AAMAS 2024 — Minimizing Negative Side Effects in Cooperative Multi-Agent Systems using Distributed Coordination.
Jun 2022 Received the Professor Victor Lesser Graduate Scholarship in Artificial Intelligence, UMass Amherst.
what I work on

Research

Multi-Agent Reinforcement Learning Fine-tuning

I develop post-training techniques for multi-agent, multi-turn LLMs, with a focus on the trustworthiness of test-time training.

multi-agent RL  ·  RL fine-tuning  ·  LLMs  ·  test-time training

Safe and Interpretable AI

I work on making AI systems safer and more interpretable. This includes minimizing negative side effects in cooperative multi-agent systems using distributed coordination, and generating automated explanations of agent behavior via Bayesian Inverse Reinforcement Learning for policy summarization.

safety  ·  negative side effects  ·  interpretability  ·  Bayesian IRL  ·  policy summarization

Multi-agent Coordination

I develop algorithms for Distributed Constraint Optimization Problems (DCOPs) and Functional DCOPs (F-DCOPs), which extend DCOPs to continuous variables. This work includes the provably anytime population-based algorithms AED and PFD, along with continuous DCOP solvers built on evolutionary, particle swarm, and local search methods. In the experiments reported in each paper, these algorithms outperform the state-of-the-art approaches they are compared against.

DCOP  ·  multi-agent coordination  ·  distributed optimization  ·  anytime algorithms
selected work

Publications

workshop papers

TrustAgent @ AAAI 2026

Amplification Effects in Test-Time Reinforcement Learning: Safety and Reasoning Vulnerabilities

Vanshaj Khattar, MD Rafi Ur Rashid, Moumita Choudhury, Jing Liu, Toshiaki Koike-Akino, Ming Jin, and Ye Wang

AAAI 2026 Workshop on Trust and Control in Agentic AI (TrustAgent).

ToM4AI @ AAAI 2025

spotlight paper  ·  top 16%

Bayesian Inverse Reinforcement Learning Approach for Policy Summarization

Moumita Choudhury, Shuwa Miura, and Shlomo Zilberstein

To appear in Advancing Artificial Intelligence through Theory of Mind (ToM4AI) @ AAAI 2025.

OptLearnMAS @ AAMAS 2020 · AAMAS 2020 (Extended Abstract)

C-CoCoA: A Continuous Cooperative Constraint Approximation Algorithm to Solve Functional DCOPs

Amit Sarker, Abdullahil Baki Arif, Moumita Choudhury, and Md. Mosaddek Khan

AAMAS 2020 (Extended Abstract), pages 1990–1992. Also at OptLearnMAS @ AAMAS 2020.

Abstract

Distributed Constraint Optimization Problems (DCOPs) have been widely used to coordinate interactions in cooperative multi-agent systems. The Functional DCOP (F-DCOP) model extends the traditional model to continuous variables. The existing F-DCOP algorithms experience huge computation and communication overhead. This paper applies continuous non-linear optimization methods on the Cooperative Constraint Approximation (CoCoA) algorithm and empirically shows the algorithm provides high-quality solutions at smaller cost.

BibTeX
@inproceedings{sarker2020c,
  title={C-CoCoA: A Continuous Cooperative Constraint Approximation
         Algorithm to Solve Functional DCOPs},
  author={Sarker, Amit and Arif, Abdullahil Baki and Choudhury, Moumita
          and Khan, Md. Mosaddek},
  booktitle={Proceedings of AAMAS},
  pages={1990--1992}, year={2020}
}
conference papers

FLAIRS 2024 · AAMAS 2024

Minimizing Negative Side Effects in Cooperative Multi-Agent Systems Using Distributed Coordination

Moumita Choudhury, Sandhya Saisubramanian, Hao Zhang, and Shlomo Zilberstein

The International FLAIRS Conference Proceedings (Vol. 37), 2024. Also appeared as an Extended Abstract at AAMAS 2024.

AAMAS 2021

A Local Search Based Approach to Solve Continuous DCOPs

Amit Sarker, Moumita Choudhury, and Md. Mosaddek Khan

Proceedings of the 20th International Conference on Autonomous Agents and Multi-Agent Systems, pages 1127–1135, 2021.

IJCAI-PRICAI 2020

Learning Optimal Temperature Region for Solving Mixed Integer Functional DCOPs

Saaduddin Mahmud, Md. Mosaddek Khan, Moumita Choudhury, Long Tran-Thanh, and Nicholas R. Jennings

Proceedings of the 29th International Joint Conference on Artificial Intelligence (IJCAI), pages 268–275, 2020.

Abstract

We combine DCOPs and Functional DCOPs into the Mixed Integer Functional DCOP (MIF-DCOP) framework, and propose Distributed Parallel Simulated Annealing (DPSA), where agents cooperatively learn the optimal parameter configuration while solving the problem. DPSA produces solutions of significantly better quality than state-of-the-art non-exact algorithms.

BibTeX
@inproceedings{mahmud2020learning,
  title={Learning Optimal Temperature Region for Solving Mixed Integer
         Functional DCOPs},
  author={Mahmud, Saaduddin and Khan, Md. Mosaddek and Choudhury, Moumita
          and Tran-Thanh, Long and Jennings, Nicholas R},
  booktitle={Proceedings of IJCAI},
  pages={268--275}, year={2020}
}

AAMAS 2020

AED: An Anytime Evolutionary DCOP Algorithm

Saaduddin Mahmud, Moumita Choudhury, Md. Mosaddek Khan, Long Tran-Thanh, and Nicholas R. Jennings

Proceedings of the 19th International Conference on Autonomous Agents and Multi-Agent Systems, pages 825–833, 2020.

Abstract

We present Anytime Evolutionary DCOP (AED), a novel population-based algorithm that uses evolutionary optimization to solve DCOPs. Agents cooperatively construct random solutions and improve them through a mechanism considering optimistic local benefit approximations. We prove AED is anytime and show it outperforms state-of-the-art DCOP algorithms in solution quality.

BibTeX
@inproceedings{mahmud2020aed,
  title={AED: An Anytime Evolutionary DCOP Algorithm},
  author={Mahmud, Saaduddin and Choudhury, Moumita and Khan, Md. Mosaddek
          and Tran-Thanh, Long and Jennings, Nicholas R},
  booktitle={Proceedings of AAMAS},
  pages={825--833}, year={2020}
}

AAAI 2020

A Particle Swarm Based Algorithm for Functional Distributed Constraint Optimization Problems

Moumita Choudhury, Saaduddin Mahmud, and Md. Mosaddek Khan

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, pages 7111–7118, 2020.

Abstract

We propose Particle Swarm based F-DCOP (PFD), a new algorithm for Functional DCOPs inspired by Particle Swarm Optimization. PFD devises a distributed method that significantly reduces computation and memory requirements. We prove PFD is an anytime algorithm, and empirical results show it outperforms state-of-the-art approaches in solution quality and overhead.

BibTeX
@inproceedings{choudhury2020particle,
  title={A particle swarm based algorithm for functional distributed
         constraint optimization problems},
  author={Choudhury, Moumita and Mahmud, Saaduddin and Khan, Md Mosaddek},
  booktitle={Proceedings of AAAI},
  volume={34}, number={05},
  pages={7111--7118}, year={2020}
}
journal papers

Engineering Applications of Artificial Intelligence, 2023

A Particle Swarm Inspired Approach for Continuous Distributed Constraint Optimization Problems

Moumita Choudhury, Amit Sarker, Md. Mosaddek Khan, and William Yeoh

Engineering Applications of Artificial Intelligence 123 (2023): 106280.

get in touch

Contact

I am happy to discuss research, potential collaborations, or anything else. Feel free to reach out.