Detecting disinformation in a work of fiction requires much more than keyword matching. TrueStories combines several lines of technological innovation to build a tool capable of reasoning about content, cross-referencing sources and delivering well-founded assessments.
The technical challenge: what makes fiction so hard to analyse?
Imagine a scene in which a character explains how the immune system works. Is that explanation correct? To answer the question, an automated system needs, at a minimum, to: extract the claim from the text, identify that it is a factual statement (rather than a metaphor or dramatic device), search for it in reliable sources, and compare the result with what was expressed — all within a narrative flow that includes dialogue, description, time jumps and very different registers.
This is precisely the kind of complex reasoning TrueStories is designed to perform. To support it, the team has structured the work around three technological pillars that operate in an integrated way.
Pillar 1 – Reliable information extraction: data source fusion
The first component of the system is responsible for building a verified knowledge base. Consulting a single source is not enough: different databases, encyclopaedias, scientific records, historical archives and institutional repositories may contain complementary or even contradictory information.
TrueStories develops information fusion algorithms that aggregate data from multiple heterogeneous sources, weighting their relative reliability. This component draws on principles of consensual trusted AI: the system does not assume any single source is absolutely authoritative, but instead weighs the degree of agreement among multiple independent sources to build a more robust, less bias-prone representation of knowledge.
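A minimal sketch of this consensus idea, assuming each source reports a verdict on a claim together with a reliability weight (the data model and the weights are hypothetical, not the project's algorithm):

```python
from collections import defaultdict

def fuse(assertions: list[tuple[str, str, float]]) -> dict[str, str]:
    """Fuse (claim, verdict, source_reliability) triples from heterogeneous
    sources into one verdict per claim via a reliability-weighted vote.
    No single source is authoritative; agreement across sources dominates."""
    scores: dict[str, dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for claim, verdict, reliability in assertions:
        scores[claim][verdict] += reliability
    return {claim: max(votes, key=votes.get) for claim, votes in scores.items()}
```

Even in this toy form, a low-reliability source can never override two independent high-reliability sources that agree, which is the bias-resistance property the paragraph above describes.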
Pillar 2 – Certified ontologies: the ground truth
An ontology, in the context of artificial intelligence, is a formal structure that organises knowledge within a domain: it defines concepts, the relationships between them and logical rules. In TrueStories, ontologies represent the ground truth against which claims extracted from fictional works are checked.
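As a toy illustration of what such a structure looks like, here is a tiny ontology of concepts and `is_a` relations with one logical rule, transitivity. The concepts and the rule are illustrative only, not the project's certified ontology:

```python
# Concepts and relations as subject-relation-object triples.
triples = {
    ("antibody", "is_a", "protein"),
    ("protein", "is_a", "molecule"),
}

def entails(triples, subj, rel, obj):
    """Check whether a fact holds, applying transitivity of is_a as a
    logical rule: if antibody is_a protein and protein is_a molecule,
    then antibody is_a molecule follows even though it is never stated."""
    if (subj, rel, obj) in triples:
        return True
    if rel == "is_a":
        return any(entails(triples, mid, "is_a", obj)
                   for s, r, mid in triples if s == subj and r == "is_a")
    return False
```

A claim extracted from a fictional work can then be checked not only against what the ontology states explicitly, but against everything it logically entails.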
The novelty of this project lies in the certification of those ontologies. To guarantee that the ground truth is trustworthy and auditable, the team investigates the use of distributed ledger technology (DLT), similar to the technology underpinning blockchain systems. This mechanism allows immutable recording of when, how and by whom the stored knowledge has been modified, creating a verifiable history that reinforces confidence in the system.
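A hash-chained ledger is one simple way to obtain this kind of tamper-evident history. The sketch below is an assumption about the general mechanism, not the project's implementation: each entry incorporates the hash of its predecessor, so altering any past record invalidates every record after it.

```python
import hashlib
import json
import time

def append_entry(ledger, author, change):
    """Append a tamper-evident record of who changed the ontology, and how.
    Each entry hashes its predecessor, so any later edit breaks the chain."""
    prev_hash = ledger[-1]["hash"] if ledger else "0" * 64
    entry = {"author": author, "change": change,
             "timestamp": time.time(), "prev_hash": prev_hash}
    payload = json.dumps(entry, sort_keys=True).encode()
    entry["hash"] = hashlib.sha256(payload).hexdigest()
    ledger.append(entry)
    return ledger

def verify_chain(ledger):
    """Recompute every hash; a single altered record invalidates the history."""
    prev = "0" * 64
    for entry in ledger:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev:
            return False
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if digest != entry["hash"]:
            return False
        prev = entry["hash"]
    return True
```

Real DLT systems add distribution and consensus on top of this chaining, but the auditability property described above, a verifiable record of when, how and by whom knowledge was modified, is already visible here.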
Managing knowledge under this model poses its own challenges: how to represent uncertainty, how to handle contradictions between sources, how to update the ontology as scientific consensus evolves. The University of Salamanca team brings decades of experience in knowledge representation and ontological engineering to bear on these questions.
Pillar 3 – Natural language processing and classification
The third major component is the one that engages directly with the text of fictional works. Using advanced Natural Language Processing (NLP) techniques, the system automatically extracts factual claims, disambiguates them in context and prepares them for cross-referencing with the ontology.
But TrueStories goes beyond simply detecting whether a claim is true or false. The system is designed to classify disinformation into two broad categories: intentional (where there are indications that the content was deliberately introduced to deceive) and unintentional (where it appears to be an error without manipulative intent). This distinction carries very different implications from an ethical, legal and communicative standpoint.
Language models and contextual understanding
Large-scale language models have transformed the ability of automated systems to understand natural language. TrueStories leverages these advances, but with a specific orientation: it is not enough for the model to understand the text — it must also be able to reason about its veracity in relation to an external knowledge base. This involves retrieval-augmented generation (RAG) techniques and neuro-symbolic reasoning mechanisms that represent the current frontier of AI research.
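The retrieval half of RAG can be illustrated with a toy ranker that selects knowledge-base passages relevant to a claim and folds them into the model's prompt. Real systems use dense embeddings rather than word overlap; this sketch only shows the information flow:

```python
def retrieve(claim: str, passages: list[str], k: int = 2) -> list[str]:
    """Toy retrieval step: rank knowledge-base passages by word overlap
    with the claim (a real system would use dense vector embeddings)."""
    claim_words = set(claim.lower().split())
    ranked = sorted(passages,
                    key=lambda p: len(claim_words & set(p.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(claim: str, passages: list[str]) -> str:
    """Augment the verification query with retrieved evidence so the
    language model reasons against external knowledge, not just its weights."""
    evidence = "\n".join(f"- {p}" for p in passages)
    return f"Evidence:\n{evidence}\n\nClaim: {claim}\nSupported by the evidence?"
```

The key point is the second function: the model is never asked to judge the claim from memory alone, but always relative to evidence pulled from the verified knowledge base.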
Integration: a unified tool
The final product of the project is a tool that integrates the three pillars seamlessly: it ingests a work of fiction, extracts factual claims, cross-references them against the ontological ground truth and produces a report identifying the disinformative elements found, their classification and the system's confidence level in each verdict.
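The shape of such a report might resemble the following data structure, whose fields mirror the outputs named above (the field names and format are illustrative assumptions, not the tool's actual output):

```python
from dataclasses import dataclass

@dataclass
class Finding:
    claim: str          # factual claim extracted from the work
    verdict: str        # e.g. "supported" / "contradicted"
    category: str       # "intentional" / "unintentional"
    confidence: float   # system's confidence in this verdict, 0..1

def report(findings: list[Finding]) -> str:
    """Render one line per disinformative element found."""
    lines = [f"{f.claim}: {f.verdict} ({f.category}, conf={f.confidence:.2f})"
             for f in findings]
    return "\n".join(lines)
```

Exposing a per-verdict confidence, rather than a bare binary label, is what lets a human reviewer in the workflow triage borderline cases.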
From the perspective of the audiovisual and publishing industry, such a tool can be integrated into production and post-production workflows, acting as an additional layer of review before content reaches audiences. This holds value both for production companies seeking rigour and for distribution platforms managing responsibility for the content they host.
Funded by MICIU/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR
Project reference: CPP2021-008358