This illustration is generated using DALL·E 3
This is a curated list of related literature and resources for machine theory of mind (ToM) research. Last Update: Dec 30th, 2024.
If you find our work useful, please give us credit by citing:
@inproceedings{ma2023towards,
title={Towards A Holistic Landscape of Situated Theory of Mind in Large Language Models},
author={Ma, Ziqiao and Sansom, Jacob and Peng, Run and Chai, Joyce},
booktitle={Findings of the Association for Computational Linguistics: EMNLP 2023},
year={2023}
}
- Main Contributors: Martin Ziqiao Ma
- Active Contributors: X. Angelo Huang, Jacob Sansom, Run Peng, Pony Zhang
Contributions to the paper list are welcome, and so are new collaborators!
- To add missing papers: Please create an issue or pull request, so the team can make the update.
- To become a contributor: Please drop an email to Martin.
- Reading List: Advances in Machine Theory of Mind
- (ToM 2024) 2nd Workshop on Theory-of-Mind @ ICLR 2024. [Web]
- (ToM 2023) 1st Workshop on Theory-of-Mind @ ICML 2023. [Web]
- To be updated
- (ToM 2023) The SocialAI School: Insights from Developmental Psychology Towards Artificial Socio-Cultural Agents. [Paper][Web]
- (Preprint 2023) SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents. [Paper][Web]
- (EMNLP Main 2024) Large Language Models: The Need for Nuance in Current Debates and a Pragmatic Perspective on Understanding. [Paper]
- (EMNLP Findings 2023) Towards A Holistic Landscape of Situated Theory of Mind in Large Language Models. [Paper][Data]
- (Preprint 2023) A Review on Machine Theory of Mind. [Paper]
- (EMNLP Findings 2022) Language Models as Agent Models. [Paper]
- (RO-MAN 2022) Understanding Intention for Machine Theory of Mind: A Position Paper. [Paper]
- (Psychological Medicine 2020) Knowing Me, Knowing you: Theory of Mind in AI. [Paper]
- (Neuropsychologia 2020) Theory of Mind and Decision Science: Towards a Typology of Tasks and Computational Models. [Paper]
- (AI 2018) Autonomous Agents Modelling Other Agents: A Comprehensive Survey and Open Problems. [Paper]
- (Preprint 2017) It Takes Two to Tango: Towards Theory of AI's Mind. [Paper]
- (AI 2016) Integrating Social Power Into The Decision-making of Cognitive Agents. [Paper]
- (Premack et al., 1978) Does the Chimpanzee Have a Theory of Mind? [Paper]
- (Dennett, 1988) Précis of The Intentional Stance. [Paper]
- (Gopnik et al., 1992) Why the Child's Theory of Mind Really Is a Theory. [Paper]
- (Baron-Cohen, 1995) Mindblindness: An Essay on Autism and Theory of Mind. [Book]
- (Blakemore et al., 2001) From the Perception of Action to the Understanding of Intention. [Paper]
- (Ho et al., 2022) Planning With Theory Of Mind. [Paper]
- (ToM 2023) EPITOME: Experimental Protocol Inventory for Theory Of Mind Evaluation. [Paper]
- (Stack et al., 2022) Framework for a Multi-dimensional Test of Theory of Mind for Humans and AI Systems. [Paper]
- (Osterhaus et al., 2022) Looking for the Lighthouse: A Systematic Review of Advanced Theory-of-mind Tests beyond Preschool. [Paper]
- (Beaudoin et al., 2020) Systematic Review and Inventory of Theory of Mind Measures for Young Children. [Paper]
- (EACL 2023) Methods for Measuring, Updating, and Visualizing Factual Beliefs in Language Models. [Paper][Code]
- (EMNLP Findings 2021) Tiered Reasoning for Intuitive Physics: Toward Verifiable Commonsense Language Understanding. [Paper][Data]
- (ACL 2021) Implicit Representations of Meaning in Neural Language Models. [Paper][Code]
- (Preprint 2023) Unveiling Theory of Mind in Large Language Models: A Parallel to Single Neurons in the Human Brain. [Paper]
- (Preprint 2023) Sparks of Artificial General Intelligence: Early experiments with GPT-4. [Paper]
- (Preprint 2023) Theory of Mind Might Have Spontaneously Emerged in Large Language Models. [Paper][Web][Code]
- (EMNLP Findings 2021) Effectiveness of Pre-training for Few-shot Intent Classification. [Paper][Code]
- (CoNLL 2023) Theory of Mind in Large Language Models: Examining Performance of 11 State-of-the-Art models vs. Children Aged 7-10 on Advanced Tests. [Paper]
- (EMNLP 2023) FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions. [Paper][Code]
- (EACL 2024) Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models. [Paper]
- (Preprint 2023) Limitation of Theory of Mind In Large Language Model: Anthropomorphize Religious Figure. [Paper]
- (Preprint 2023) Does ChatGPT have Theory of Mind? [Paper]
- (Preprint 2023) Large Language Models Fail on Trivial Alterations to Theory-of-Mind Tasks. [Paper]
- (AI Review 2023) Mind the Gap: Challenges of Deep Learning Approaches to Theory of Mind. [Paper]
- (Preprint 2022) Do Large Language Models Know what Humans Know? [Paper]
- (Preprint 2022) Large Language Models Are Not Zero-shot Communicators. [Paper][Web][Code][Data]
- (EMNLP 2022) Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs. [Paper]
A taxonomized review of existing benchmarks for machine ToM and their settings under ATOMS. We further break beliefs into first-order beliefs (1st) and second-order beliefs or beyond (2nd+), and intentions into Action intentions (Act.) and Communicative intentions (Com.). Tasks are divided into Inference (Infer), Question Answering (QA), Natural Language Generation (NLG), Multi-Agent Collaboration (Collab), and Multi-Agent Competition (Compete). Input modalities consist of Text (Human: H, AI, or Template: T) and Nonlinguistic ones (Nonling.). The latter further break into Cartoon, Natural Images, Chess, 2D Grid World, and 3D Simulation. Situatedness is divided into None, Passive Perceiver (Per.), and Active Interactor (Int.). Symmetricity (Sym.) refers to whether the tested agent is co-situated and engaged in mutual interactions with other ToM agents.
Benchmark | Resources | Task | Input: Text | Input: Nonling. | Physical: Per. | Physical: Int. | Social: Per. | Social: Int. | Belief: 1st | Belief: 2nd+ | Intention: Act. | Intention: Com. | Des. | Emo. | Know. | Per. | NLC | Sym. |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
(Preprint 2021) Epistemic Reasoning | - | Infer | T | - | - | - | - | - | ✔️ | ✔️ | - | - | - | - | - | - | - | - |
(EMNLP 2018) ToMi | Code | QA | T | - | ✔️ | - | - | - | ✔️ | ✔️ | - | - | - | - | - | - | - | - |
(EMNLP Findings 2023) Hi-ToM | Code | QA | T | - | ✔️ | - | - | - | ✔️ | ✔️ | - | - | - | - | - | - | - | - |
(EMNLP Findings 2023) MindGames | Code, Data | Infer | T | - | ✔️ | - | - | - | ✔️ | ✔️ | - | - | - | - | - | ✔️ | - | - |
(ToM 2023) Selective Encoding | - | QA | T | - | ✔️ | - | - | - | - | - | ✔️ | - | ✔️ | - | - | - | - | - |
(Preprint 2023) Adv-CSFB | - | QA | H | - | ✔️ | - | - | - | ✔️ | - | - | - | - | - | - | - | - | - |
(EMNLP 2010) ConvEntail | Data | Infer | H | - | - | - | ✔️ | - | ✔️ | - | - | ✔️ | ✔️ | - | - | - | - | - |
(EMNLP 2019) SocialIQA | Data | QA | H | - | - | - | ✔️ | - | - | - | ✔️ | - | - | ✔️ | - | - | - | - |
(LREC 2022) BeSt | - | - | H | - | - | - | ✔️ | - | ✔️ | - | - | - | - | ✔️ | - | - | ✔️ | - |
(ToM 2023) Loophole | - | NLG | H | - | - | - | ✔️ | - | - | - | - | - | - | - | - | - | ✔️ | - |
(ACL Findings 2023) FauxPas-EAI | - | QA | H,AI | - | - | - | ✔️ | - | ✔️ | - | - | - | - | - | - | - | ✔️ | - |
(Preprint 2023) COKE | - | NLG | AI | - | - | - | ✔️ | ✔️ | - | - | ✔️ | - | - | ✔️ | - | - | - | - |
(Preprint 2022) ToM-in-AMC | Data | Infer | H | - | ✔️ | - | ✔️ | - | - | - | ✔️ | ✔️ | - | - | - | - | - | - |
(ACL 2023) G4C | - | NLG | H,AI | - | ✔️ | - | ✔️ | ✔️ | - | - | ✔️ | ✔️ | - | - | - | ✔️ | - | - |
(Preprint 2016) VisualBeliefs | Web | Infer | - | Cartoon | ✔️ | - | - | - | ✔️ | - | - | - | - | - | - | - | ✔️ | - |
(AAAI 2016) Triangle COPA | Data | QA | H | Cartoon | ✔️ | - | ✔️ | - | - | - | ✔️ | - | - | ✔️ | - | - | - | - |
(NAACL 2022) MSED | Data | Infer | H | Images | ✔️ | - | - | - | - | - | - | - | ✔️ | ✔️ | - | - | - | - |
(NeurIPS 2021) BIB | Code | Infer | - | 2D Grid | ✔️ | - | - | - | - | - | ✔️ | - | ✔️ | - | - | - | - | - |
(ICML 2021) AGENT | Code | Infer | - | 3D Sim. | ✔️ | - | - | - | - | - | ✔️ | - | ✔️ | - | - | ✔️ | - | - |
(ToM 2023) RBC | - | Compete | - | Chess | ✔️ | - | - | - | - | - | - | - | - | - | ✔️ | - | - | - |
(ICML 2018) MToM | Code | Infer | - | 2D Grid | ✔️ | - | - | - | ✔️ | - | ✔️ | - | - | - | - | - | - | - |
(ICML 2022) SymmToM | Code | Collab | - | 2D Grid | ✔️ | ✔️ | ✔️ | ✔️ | - | - | - | - | - | - | ✔️ | - | - | ✔️ |
(EMNLP 2023) Search & Rescue | - | Collab | AI | 2D Grid | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | - | - | - | - | ✔️ | ✔️ | - | ✔️ |
(EMNLP 2021) MindCraft | Code | Infer | H | 3D Sim. | ✔️ | ✔️ | ✔️ | ✔️ | - | - | ✔️ | - | - | - | ✔️ | ✔️ | - | ✔️ |
(IJCAI 2023) CPA | Code | Infer | H | 3D Sim. | ✔️ | ✔️ | ✔️ | ✔️ | - | - | ✔️ | ✔️ | - | - | ✔️ | ✔️ | - | ✔️ |
(EMNLP 2023) FANToM | Code | QA | T | - | - | - | ✔️ | - | ✔️ | ✔️ | - | - | - | - | ✔️ | - | - | - |
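
For readers who want to query the taxonomy programmatically, the following is a minimal sketch of how a few rows of the table above could be encoded and filtered. The `Benchmark` class, its field names, the state labels such as `"belief_1st"`, and the helper functions are all hypothetical and chosen for illustration; they are not part of the paper or of any released code. Only five rows are transcribed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Benchmark:
    """One table row. Field names are illustrative, not taken from the paper."""
    name: str
    task: str                 # Infer, QA, NLG, Collab, or Compete
    text_input: str           # "T", "H", "AI", or "" when there is no text input
    nonling_input: str        # e.g. "2D Grid", or "" when there is none
    mental_states: frozenset  # ATOMS columns ticked in the row (hypothetical labels)
    symmetric: bool = False   # the Sym. column


# Five rows transcribed by hand from the table (a subset, for illustration only).
ROWS = [
    Benchmark("ToMi", "QA", "T", "",
              frozenset({"belief_1st", "belief_2nd+"})),
    Benchmark("SocialIQA", "QA", "H", "",
              frozenset({"intention_act", "emotion"})),
    Benchmark("SymmToM", "Collab", "", "2D Grid",
              frozenset({"knowledge"}), symmetric=True),
    Benchmark("MindCraft", "Infer", "H", "3D Sim.",
              frozenset({"intention_act", "knowledge", "percept"}), symmetric=True),
    Benchmark("FANToM", "QA", "T", "",
              frozenset({"belief_1st", "belief_2nd+", "knowledge"})),
]


def covering(rows, state):
    """Return names of benchmarks whose row ticks the given mental state."""
    return [r.name for r in rows if state in r.mental_states]


def symmetric(rows):
    """Return names of benchmarks marked in the Sym. column."""
    return [r.name for r in rows if r.symmetric]


print(covering(ROWS, "knowledge"))  # ['SymmToM', 'MindCraft', 'FANToM']
print(symmetric(ROWS))              # ['SymmToM', 'MindCraft']
```

A set of labels per row is used instead of one boolean field per column so that new ATOMS categories can be added without changing the class definition.
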
- (ACL 2024) ToMBench: Benchmarking Theory of Mind in Large Language Models. [Paper]
- (ACL 2024) OpenToM: A Comprehensive Benchmark for Evaluating Theory-of-Mind Reasoning Capabilities of Large Language Models. [Paper]
- (IJCAI 2023) Towards Collaborative Plan Acquisition through Theory of Mind Modeling in Situated Dialogue. [Paper]
- (EMNLP 2021) MindCraft: Theory of Mind Modeling for Situated Dialogue in Collaborative Tasks. [Paper][Code]
- (RO-MAN 2021) Deep Interpretable Models of Theory of Mind. [Paper]
- (EMNLP 2020) RMM: A Recursive Mental Model for Dialog Navigation. [Paper][Code]
- (ICML 2018) Machine Theory of Mind. [Paper][Code]
- (ACL 2023) Minding Language Models' (Lack of) Theory of Mind: A Plug-and-Play Multi-Character Belief Tracker. [Paper][Code]
- (ToM 2023) The Neuro-Symbolic Inverse Planning Engine (NIPE): Modeling Probabilistic Social Inferences from Linguistic Inputs. [Paper][Code]
- (EMNLP 2024) A Notion of Complexity for Theory of Mind via Discrete World Models. [Paper]
- (EMNLP 2023) Theory of Mind for Multi-Agent Collaboration via Large Language Models. [Paper]
- (Preprint 2023) How FaR Are Large Language Models From Agents with Theory-of-Mind? [Paper]
- (Preprint 2023) Violation of Expectation via Metacognitive Prompting Reduces Theory of Mind Prediction Error in Large Language Models. [Paper][Code]
- (Preprint 2023) CAMEL: Communicative Agents for "Mind" Exploration of Large Scale Language Model Society. [Paper][Code]
- (Preprint 2023) Boosting Theory-of-Mind Performance in Large Language Models via Prompting. [Paper][Data]
- (Preprint 2023) Suspicion-Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT-4. [Paper][Code]
- (ToM 2023) Theory of Mind as Intrinsic Motivation for Multi-Agent Reinforcement Learning. [Paper]
- (ToM 2023) Iterative Machine Teaching for Black-box Markov Learners. [Paper]
- (ToM 2023) Between Prudence and Paranoia: Theory of Mind Gone Right, and Wrong. [Paper]
- (ToM 2023) Emergent Deception and Skepticism via Theory of Mind. [Paper]
- (ToM 2023) How To Make Social Decisions in a Heterogeneous Society? [Paper]
- (ICML 2022) Symmetric Machine Theory of Mind. [Paper][Code]
- (ICML 2021) Few-shot Language Coordination by Modeling Theory of Mind. [Paper][Code]
- (CogSci 2020) Improving Multi-Agent Cooperation using Theory of Mind. [Paper][Code]
- (Preprint 2019) Modeling Theory of Mind in Multi-Agent Games Using Adaptive Feedback Control. [Paper]
- (EmeComm 2019) Emergence of Theory of Mind Collaboration in Multiagent Systems. [Paper][Code]
- (Current Opinion in Behavioral Sciences 2019) Theory of Mind as Inverse Reinforcement Learning. [Paper]
- (ToM 2023) Language Models are Bounded Pragmatic Speakers: Understanding RLHF from a Bayesian Cognitive Modeling Perspective. [Paper]
- (ToM 2023) Inferring the Future by Imagining the Past. [Paper]
- (ToM 2023) Inferring the Goals of Communicating Agents from Actions and Instructions. [Paper]
- (RSS 2015) Grounding English Commands to Reward Functions. [Paper]
- (CogSci 2011) Bayesian Theory of Mind: Modeling Joint Belief-Desire Attribution. [Paper][Web]
- (ToM 2023) Towards a Better Rational Speech Act Framework for Context-aware Modeling of Metaphor Understanding. [Paper]
- (ACL Findings 2023) Define, Evaluate, and Improve Task-Oriented Cognitive Capabilities for Instruction Generation Models. [Paper]
- (ACL 2022) Learning to Mediate Disparities Towards Pragmatic Communication. [Paper][Code]
- (ICML 2021) Few-shot Language Coordination by Modeling Theory of Mind. [Paper][Code]
- (Science 2012) Predicting Pragmatic Reasoning in Language Games. [Paper]
- (ToM 2023) MindDial: Belief Dynamics Tracking with Theory-of-Mind Modeling for Neural Dialogue Generation. [Paper]
- (ACL Findings 2023) Speaking the Language of Your Listener: Audience-Aware Adaptation via Plug-and-Play Theory of Mind. [Paper][Code]
- (SIGDIAL 2022) Towards Socially Intelligent Agents with Mental State Transition and Human Utility. [Paper]
- (EMNLP 2020) RMM: A Recursive Mental Model for Dialog Navigation. [Paper][Code]
- (ICLR 2023) Computational Language Acquisition with Theory of Mind. [Paper][Code]
- (Preprint 2023) Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Theory of Mind. [Paper][Code]
- (ToM 2023) Preference Proxies: Evaluating Large Language Models in Capturing Human Preferences in Human-AI Tasks. [Paper]
- (CHI 2021) Towards Mutual Theory of Mind in Human-AI Interaction: How Language Reflects What Students Perceive About a Virtual Teaching Assistant. [Paper]
- (Springer 2002) Theory of Mind for a Humanoid Robot. [Paper]
- (iScience 2021) CX-ToM: Counterfactual Explanations with Theory-of-Mind for Enhancing Human Trust in Image Recognition Models. [Paper]
- (ToM 2023) Discovering User Types: Mapping User Traits by Task-Specific Behaviors in Reinforcement Learning. [Paper]