Show me the evidence: Evaluating the role of evidence and natural language explanations in AI-supported fact-checking
Abstract
Although much research has focused on AI explanations that support decisions in complex information-seeking tasks such as fact-checking, the role of evidence remains surprisingly under-researched. In our study, we systematically varied explanation type, AI prediction certainty, and the correctness of AI system advice for non-expert participants, who evaluated the veracity of claims and of the AI system's predictions. Participants were given the option of easily inspecting the underlying evidence. We found that participants consistently relied on this evidence to validate AI claims across all experimental conditions. When natural language explanations were presented, evidence was consulted less frequently, although participants still turned to it when the explanations seemed insufficient or flawed. Qualitative data suggests that participants attempted to infer the reliability of evidence sources, even though source identities were deliberately omitted. Our results demonstrate that evidence is a key ingredient in how people evaluate the reliability of information presented by an AI system and, in combination with natural language explanations, offers valuable support for decision-making. Further research is urgently needed to understand how evidence ought to be presented and how people engage with it in practice.
Community
TL;DR: In an AI-supported fact-checking task, people consistently relied on underlying evidence to judge AI reliability, using explanations as a supplement rather than a substitute, showing that evidence is central to how people evaluate AI-aided decisions.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Fact-Checking with Large Language Models via Probabilistic Certainty and Consistency (2026)
- Do LLM Self-Explanations Help Users Predict Model Behavior? Evaluating Counterfactual Simulatability with Pragmatic Perturbations (2026)
- Can We Trust AI Explanations? Evidence of Systematic Underreporting in Chain-of-Thought Reasoning (2025)
- Large Language Models Require Curated Context for Reliable Political Fact-Checking - Even with Reasoning and Web Search (2025)
- When Medical AI Explanations Help and When They Harm (2025)
- Full Disclosure, Less Trust? How the Level of Detail about AI Use in News Writing Affects Readers' Trust (2026)
- Human Cognitive Biases in Explanation-Based Interaction: The Case of Within and Between Session Order Effect (2025)