Denotation Without Answerability
Deleuze, AI, and The Logic of Sense — Part 3
Can language point to the world before any speaker, tool, or institution has made it answerable to what is there?

This series seems like it might go a while: please let me know if it’s interesting! Last time we concluded with three counterfeits of sense and a question: if AI prose stopped inventing sources, smoothing away idiolects, and adopting the patterns of confident competence… would the result be sense?
Which is really three questions. (1) Can an LLM-produced sentence refer to the world? (2) Can the model itself reliably determine whether the reference succeeded (or anything at all)? (3) Are there potential mechanisms by which the system that produced it can be made to answer for its failures?
If this is indeed too long to finish, let me give you my conclusions up front: often yes; not reliably; only through a broader technical, human, and institutional arrangement.
The popular view
There are several different objections circulating here. I think we owe it to ourselves to a) address the strongest versions of each, while b) being honest about the differences.
Emily Bender and Alexander Koller argue that a system trained only on linguistic form cannot thereby learn meaning in the relevant sense; their octopus telegraph example is designed to show that merely understanding the conversational norms without understanding the reality they refer to is inadequate: when the conversation turns to a novel physical topic (a coconut catapult) the octopus can’t keep up. And when a bear attacks, the octopus is useless and the conversation fails fatally.
Steven Piantadosi and Felix Hill reply that models may nevertheless acquire something semantically important (though not hints for building a coconut catapult) by way of conceptual role. On their view, LLMs can come to understand patterns of inferential relationships among internal representations, such as the relationships between ideas like big and small or safe and dangerous. They could at least recommend a smaller coconut, or ask what color the bear is and then advise you whether or not to climb a tree.
More recently, Anandi Hattiangadi and Anders Schoubye defend a stronger conclusion: that LLM outputs are literally meaningless because the models lack the intentions or attitudes required for speakers to mean anything by their utterances. So even if users can extract true beliefs from what the models produce, it’s not because of what the models meant by the words. In short: the model can try to help with the catapult or the bear, and maybe succeed, but that won’t count as being intentionally helpful. Hattiangadi and Schoubye get there by telling a careful and true story about the nature of LLM training, tokenization (text into numerical IDs), and generation (statistically congruent new numerical IDs). Arguably no part of that system has attitudes or intentions, so, they conclude, the model as a whole cannot either.
The rub here is that a similar story about neurons and language in the human brain has the same problem: no part of the story of neurons firing has attitudes or intentions… yet humans do. (Searle’s Chinese Room supplies an older version of the background anxiety: syntax is not yet semantics, and fluent symbol manipulation may not lead to understanding.)
My argument does not need to settle that entire dispute. Part of what makes it slippery is that the participants are asking about different things under the word meaning: communicative intention, conceptual role, literal speaker meaning, successful reference, useful uptake. I can’t show that the model means “Paris” as a speaker means Paris, or that it understands a citation as a lawyer understands one. But for this series, we only need the narrower claim to work: that a sentence produced through an LLM can enter our practices as a purported claim about Paris or about a case, and can succeed or fail there.
The narrower externalist claim
Denotation is not a property of mental states. (Meaning ain’t in the head!) It is a relation between propositions and states of affairs, mediated by linguistic conventions, naming chains, and the practices of speech communities. When I say “the cat is on the mat,” the denotational relation is fixed by what “cat” means in English, what “mat” means in English, what “the” picks out, and the actual cat. It is not fixed by my seeing the cat. A blind person can denote the cat. A historian can refer to a dead monarch they know only from historical records. The lawyer can refer to an opinion she hasn’t read fully, just found in a database. Externalism is usually associated with Kripke and Putnam, but it just means this: successful reference does not require that each speaker privately navigate the entire trajectory by which a name reaches its object.
Unsurprisingly, externalism is pretty lenient with LLM meanings. Even if LLMs don’t occupy the same referential position that speakers do, they can still carry the torch of meaning a bit of the way. Functionally, when a user deploys model-generated language inside an ordinary linguistic practice, the resulting sentence can purport to refer, can be checked against the world, and can turn out true or false. The chatbot’s status as a “speaker” matters less than whether the output enters a practice as a candidate for assessment.
If meaning ain’t in the head, then maybe the headless can play a role, even if externalism alone doesn’t get LLMs all the way to meaning. Externalism by itself does not settle whether the model is a speaker, or whether its outputs possess literal meaning independently of the users who take them up. For the narrower claim, those questions can stay open.
Purported reference without an object
A fabricated citation raises a second worry. Matter of Bourguignon doesn’t exist. If propositions need to be about something that exists to count as denoting, then the LLM’s hallucinated citation isn’t a failed denotation. It’s a non-denotation. The whole Snark hunt falls apart from the other direction: not “LLMs can’t denote anything” but “you can’t fail at denoting something that wasn’t going to exist anyway.”
Meinong is relevant here, because we can pretty clearly think and reason about objects that don’t exist: the golden mountain, the round square, Pegasus. Fantasies, impossible objects. These objects are intelligible and thinkable, in some sense, even though they don’t exist! Their so-being is independent of their being: they have properties (Pegasus is a horse with wings, the round square is both round and square) whether or not they exist. Russell rejected this framework, arguing that apparently referring descriptions can be analyzed without populating the universe with non-entities, because such descriptions need not name objects at all. The phrase round square is meaningful not because there is a strange non-existent object waiting to be characterized, but because the description can be unpacked into a quantified claim… one that fails, since nothing is both round and square.
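For readers who want to see the unpacking, here is a minimal sketch of the Russellian analysis in standard first-order notation. The predicate letters are illustrative labels only (R for round, S for square, F for whatever is being said of the thing); they are an assumption of this sketch and not notation taken from Russell or Meinong. A sentence of the form “the round square is F” becomes:

```latex
\exists x \,\Big( \big(R(x) \land S(x)\big)
  \;\land\; \forall y \,\big( (R(y) \land S(y)) \rightarrow y = x \big)
  \;\land\; F(x) \Big)
```

The first conjunct already fails, since nothing satisfies both R and S. The whole claim therefore comes out false rather than meaningless, and no round square has to be admitted to explain why.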
It’s tempting to think an LLM could not even follow this dispute, because, much like the octopus trying to give advice on coconuts, it can’t tell the difference between the really existing and the merely linguistic. On that view a racehorse and Pegasus are equally real to it, and the round square is just as possible as the square square. But back to Piantadosi and Hill: this is precisely where LLMs shine. From the corpus of English-language texts that trained them, they can chart that Pegasus belongs with mythology, racehorse with zoology, and round square with contradiction. The statistical signature of real-vs-imaginary is different in ways they can detect, even if they can’t check “out in the world” to test that. The model has the conceptual distinction between the real and the imaginary. What it lacks, absent tools or some other route of correction, is an independent way to make a claim answer to the world before the sentence is produced.
Now, there’s no hope of settling the Meinong-Russell dispute here. But the hallucinated citation has enough determinate structure to organize conduct before its failure is discovered. It fits inside legal practices because it’s been given the formal indicia by which legal authority is located and assessed. The Snark names interrelated practices, too: an elaborately characterized, patiently hunted absence. I tend to side with Russell in thinking that such absences can occupy us without ontological inflation, and that this is a satisfactory answer to ontological arguments for God. (I imagine that this is much more difficult for LLMs trained on a corpus written by a theistic culture, where that word is usually placed on the real side of things.) Either way, the Matter of Bourguignon found its way into Lee’s brief and did end up occupying the court’s time, as do the many lawsuits against Satan and his staff.
Can LLMs picture?
My friend Carl Sachs, who edited Interpreting Sellars: Critical Essays, pressed me on a harder version of this problem, one that runs through Wilfrid Sellars’s account of picturing.
Sellars distinguishes the normative dimension of language, the “space of reasons” where claims stand in relations of inference and justification, from what he calls “picturing”: a natural, non-semantic relation between linguistic tokens, considered as objects in the world, and the worldly objects they map. An empirically adequate language must both contain inferentially connected sentences and be reliably patterned by how things are across observation and inquiry.
This is not the same as saying that each meaningful sentence must be produced by someone who has directly perceived its object. A blind speaker can refer to a cat; a historian can refer to Caesar. Nor is picturing simply another word for intention or responsibility. It is a relation between a token-producing practice and the world that constrains it.
Back to the octopus: he can imitate his way into the conversation, but when the conversation demands action on land, he’ll be unavailable.
A bare language model occupies an unstable position here. It inherits the inferential traces of human world-tracking practices, and its outputs can enter those practices to be assessed. But ordinarily it has no independent route from the generated claim back to the object that would confirm or defeat it. It can produce “Matter of Bourguignon holds X” because legal discourse has supplied the shape of that claim. Nothing in next-token generation itself requires that any actual opinion answer to the name.
Tool use complicates this picture: a model connected to a case database, a browser, a camera, or a laboratory instrument may become one component in a larger token-producing system whose outputs are causally corrected by worldly objects. In that sense, Sellars does not permit the simple claim that AI cannot picture. Instead, he sets us a limit: LLM-generated language becomes epistemically trustworthy only insofar as it is embedded in practices that let the world push back. (And the same thing goes for human-generated language!)
The empirical case
Magesh and colleagues’ Stanford study of leading AI legal-research tools, circulated in 2024 and published in 2025, found substantial rates of factual hallucination, varying by product and task. That is enough for the Snark.
But now it gets awkward. The Stanford study dates from 2024. When no legal object answers to a citation-shaped claim, the failure is discoverable, because legal objects are contained in databases that are machine-readable and searchable. Model-generated sentences can plainly enter practices of verification. A legal-research answer naming an actual case and accurately stating its holding survives fact-checking. Earlier models performed worse, but my own tools would catch a mistake like Matter of Bourguignon. So my own tools seem to have more of a world-picture than earlier ones, and AI-building corporations can create scarily accurate legal tools that can successfully automate some legal work.
This does not prove that the model itself refers in the lenient Meinongian sense. It establishes the more practical point: model-generated language can become the vehicle of successful or failed reference inside the human practices that take it up.
Denotation without answerability
A model-generated sentence can point. In the hands of a user, it can become part of an argument: it can then succeed or fail against what is actually there. That is enough for the practical problem of denotation.
What is missing from a bare LLM is not necessarily reference in every sense philosophers have given that word. What is missing is an independent route by which the world constrains the saying before the saying is handed to us as finished prose. A human can tell a tall tale; a model can generate the shape of a legal citation without retrieving the opinion. A human can sound puffed up and overly confident; a model can produce the shape of an explanation. Both can sound like the Bellman without ever having looked at the (blank) map.
Sellars helps. But here’s a “Not X, it’s Y” sentence: he doesn’t help by giving us an answer, but by telling us how we’ll decide. Picturing belongs to systems of linguistic tokens disciplined by worldly objects. A model connected to reliable tools, evidence, correction, and accountable users can participate in such a system. Naked and tool-free, a large language model does not. It inherits the results of other people’s contact with the world while remaining frighteningly free to rearrange those results into propositions no object confirms.
Lee was answerable not because she had perceived the case, but because she practiced law badly. You’re supposed to retrieve the opinion, read it, verify the reporter citation, confirm the holding. Her conduct became sanctionable because she advanced a proposition while bypassing those procedures, and because she was admitted to the bar to practice law and could be disbarred. An LLM does not occupy that professional position, and thus cannot practice law. But the lawyer using the model does occupy it, and a legal-research system can be built either to assist that contact with the world or to hide its absence.
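The procedure described above (retrieve the opinion, verify the citation, confirm the holding) can be sketched as a small gate that a legal-research system might place between generation and the user. This is a toy under loud assumptions: `CASE_DB`, `Opinion`, `Check`, and `check_citation` are hypothetical names invented for illustration, the in-memory dictionary stands in for a real searchable case database, and the one-line holding is an illustrative summary, not a legal statement.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Opinion:
    """A record in a hypothetical case database."""
    name: str
    citation: str
    holding: str  # illustrative one-line summary only


# Toy stand-in for a real, searchable case database (assumption).
CASE_DB = {
    "5 U.S. 137": Opinion(
        name="Marbury v. Madison",
        citation="5 U.S. 137",
        holding="courts may review the constitutionality of legislation",
    ),
}


@dataclass(frozen=True)
class Check:
    status: str  # "no_object", "wrong_name", or "needs_reading"
    opinion: Optional[Opinion] = None


def check_citation(name: str, citation: str) -> Check:
    """Gate a citation-shaped claim on whether any record answers to it."""
    opinion = CASE_DB.get(citation)
    if opinion is None:
        # Nothing answers to the citation: the claim has the shape of
        # authority but no object behind it.
        return Check("no_object")
    if opinion.name.casefold() != name.casefold():
        # A real opinion sits at this citation, but under another name.
        return Check("wrong_name", opinion)
    # Retrieval succeeded. Reading the opinion and confirming the holding
    # remain the lawyer's job; the gate only hands over the record.
    return Check("needs_reading", opinion)


if __name__ == "__main__":
    print(check_citation("Marbury v. Madison", "5 U.S. 137").status)
    print(check_citation("Matter of Bourguignon", "[citation as generated]").status)
```

Note the design choice: the best result the gate can return is `needs_reading`, never “verified.” A system built this way assists the lawyer’s contact with the opinion; one that silently passed every citation-shaped string through would hide its absence.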
This is what I mean by denotation without answerability. Language enters our practices as being about things; its aboutness arrives before any adequate practice of correction has forced the claim to answer to what is there. The danger is the satisfaction we take in language that arrives before the world has had its say. That is what we have to build practices to keep at bay.
Next time I’ll turn to what Deleuze thought sense actually was, and to John Sellars’s worry that the Stoic apparatus Deleuze built it from isn’t quite the apparatus the Stoics had.


