Abstract
Large language models typically operate in a question-answering paradigm, where the quality of the input prompt critically affects the response. Automated Prompt Optimization (APO) aims to overcome the cognitive biases of manually crafted prompts and to explore a broader prompt design space. However, existing APO methods often suffer from rigid template structures and inefficient exploration of the prompt space. To this end, we propose MARS, a Multi-Agent Adaptive Reasoning with Socratic guidance framework. It consists of five complementary agents and formulates the optimization process as a Partially Observable Markov Decision Process, enabling adaptive prompt refinement through explicit state modeling and interactive feedback. Specifically, a Planner agent generates flexible optimization trajectories, a Teacher-Critic-Student triad engages in Socratic-style dialogue to iteratively optimize the prompt based on pseudo-gradient signals in the text space, and a Target agent executes the prompt on downstream tasks to provide performance feedback. MARS integrates reasoning, feedback, and state transitions into a unified hidden-state evolution process, improving both the effectiveness and interpretability of optimization. Extensive experiments across multiple datasets show that MARS outperforms existing APO methods in optimization performance, efficiency, and interpretability.
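The five-agent loop described in the abstract can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the authors' implementation: every function name, the `State` class, and the toy scoring rule are hypothetical, and each agent is a deterministic stub standing in for what would be a language-model call. The Target score plays the role of the observation that drives each state update.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class State:
    """Hypothetical optimization state: current prompt, its score, dialogue log."""
    prompt: str
    score: float
    history: List[str] = field(default_factory=list)


def planner(task: str) -> List[str]:
    """Hypothetical Planner: return an ordered list of optimization steps."""
    return ["clarify the task", "specify the output format"]


def teacher(step: str, prompt: str) -> str:
    """Hypothetical Teacher: pose a Socratic question for the current step."""
    return f"How could the prompt better {step}?"


def critic(question: str) -> bool:
    """Hypothetical Critic: accept only text that is phrased as a question."""
    return question.endswith("?")


def student(step: str, prompt: str) -> str:
    """Hypothetical Student: revise the prompt in response to the question."""
    return f"{prompt} [{step}]"


def target(prompt: str) -> float:
    """Hypothetical Target: toy score; a real one would run the downstream task."""
    return float(prompt.count("["))


def optimize(task: str, initial_prompt: str) -> State:
    """Run one pass over the planned steps, keeping revisions that score higher."""
    state = State(initial_prompt, target(initial_prompt))
    for step in planner(task):
        question = teacher(step, state.prompt)
        if not critic(question):
            continue  # rejected question: skip this step
        candidate = student(step, state.prompt)
        score = target(candidate)
        state.history.append(question)
        if score > state.score:  # feedback from the Target drives the transition
            state.prompt, state.score = candidate, score
    return state


result = optimize("sentiment classification", "Classify the review.")
print(result.prompt)
```

In this sketch a revision is accepted only when the Target score improves, which is one simple way to turn performance feedback into a state transition; the framework described above may use a different rule.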
| Original language | English |
|---|---|
| Pages (from-to) | 16307-16315 |
| Number of pages | 9 |
| Journal | Proceedings of the AAAI Conference on Artificial Intelligence |
| Volume | 40 |
| Issue number | 19 |
| State | Published - 2026 |
| Event | 40th AAAI Conference on Artificial Intelligence (AAAI 2026), Singapore, Singapore; 20 Jan 2026 → 27 Jan 2026 |
| Title | MARS: Multi-Agent Adaptive Reasoning with Socratic Guidance for Automated Prompt Optimization |