What is a Confabulator?
Copyright 2024 Kurt Cagle / The Cagle Report
I've made this observation many times in LinkedIn feed posts, that an LLM is not a database but a confabulator, but I've been deliberately coy about describing what exactly the latter is. The word, of course, is a portmanteau of fabula, a story, and con, with, from the Latin, and can be roughly translated as meaning - to make stories with. It is, put simply, a means of feeding information into a system and having that system, then have that system hallucinate a story about that information for us.
Confabulators: A New Piece of the Puzzle
We've never really had confabulators in the computational zoo. We have algorithms, which generally are used to store and manipulate data structures, and those algorithms in turn are used used for everything from building databases to running applications to controlling devices. Indeed, arguably, the bulk of the last half century can be seen as the rise of the algorithm in computing.
Confabulators are something different. Most algorithms work upon the idea of providing keys to retrieve extant data structures, whether such keys are pointers to locations in memory, keys for extracting data structures from databases, or keys for connecting resources together different processes, the lookup table and its closely aligned cousin the hash table serves to create abstractions that allow you to talk about relational structures indirectly. This is actually one of the most powerful aspects of semantics, which can be thought of as very complex keys..
A confabulator, on the other hand, relies upon prompts, conceptual spaces, and contexts. A prompt with one word will usually bring back a potentially vast number of associations, a prompt with a few words will do a Venn diagram that better narrows down a topic, a prompt with a couple dozen lines of text will usually be pretty good at creating a narrow enough overlap to retrieve meaningful content. However, there are no keys anywhere, just tokens.
As those tokens are returned, they are then matched with expressions that most closely align to their potential usage in order to create responses, using a core set of foundational templates that are usually invisible to the user. These responses could be text, parts of images, sound files, or anything else that can be effectively tokenized.
Diffusion systems (used primarily for media) use a variation of this built around GANs - Generative Adversarial Networks. GANs involve a tension between different prompts in order to create images and videos, from very realistic to truly surreal. In the discussion about the utility of generative AI (confabulators), this particular category stands apart as perhaps the most rapidly evolving (and certainly with the greatest potential value) of all aspects of generative AI.
The upshot is that we have created a software solution that tells stories. Unlike a database that returns a recitation of the facts that were stored in it with no interpretation, an LLM takes facts as "inspiration", using some facts, changing some, ignoring others, and weaving them into a cohesive narrative. just as an historical biography may change some information for narrative coherence (and a film adaptation may change even more information to better tell the story, the idea is that the LLM (or any Generative AI) is transforming the information to better adapt to a given format or narrative voice.
Implications of Confabulation
Let me address a very subtle point here. Why do I talk about Confabulation rather than AI, or even Generative AI? There are four reasons:
So what are confabulators good for? As a general rule of thumb, a confabulator is your go-to guy if you need a story. Thus, this particular question can be reframed as "where do you need a story?":
Media Production
Media production includes images, videos, speech, text, 3D and any situation where you want to PRODUCE something. The product may not actually be what you intend to produce, but then again, it is rare for anything that human beings produce to meet needs in its first iteration. You try something, build a prototype, find out where it's flaws are, refine the parameters, build a prototype, find out where it's flaws are and so forth. Each media tells a story, whether as an illustration, a film, or a podcast,
This increasingly goes for document production as well. A very common misconception with most confabulations is that they are pushbutton magic. Lay out a quick plot, press a button, get a book. You'll get a book. It will be a very bad book, but it will be a book. This is not really the fault of the confabulation, but of the user. Once you have something you've produced, rewrite, edit, remove, tighten, rework. The second draft will be better than the first by a considerable degree. Pass this new document in, rinse and repeat, and you will likely find that after a while, what you get is a workable output.
This holds true for any created project, because creation is intrinsically iterative. When i'm doing a piece of art, I will frequently start from a sketch or a 3D layout, will often create several different versions using an AI app, take the ones I like and refine them further, sometimes going back to the sketch stage when I find a like something better than when I started. I see AI as "finishing", especially when I start with a conceptual idea and want to turn it into a photographic (or hyperrealistic) rendition.
Put another way, there is an agile process at work here. The first prototype that you create is frequently quite crude, and the implementation only gets me to the point where I can see a better target to head towards. This doesn't change with AI, other than reducing the time necessary to render the output.
Does this necessarily speed up productivity? Perhaps, though not as much as you might think. When you are dealing with physical media - pens, paint, pencils, etc., it takes a lot of time. If your goal is to create a drawing or a painting in these media, the time involved is worth it, because you are creating a physical artefact that has value in and of itself.
If, on the other hand, the goal is creating an image, a video, or even a story, you are creating a piece of IP, typically creating it under a very tight deadline and usually creating it for a larger product. A confabulator provides you with the means to produce more potential amounts and forms of media for your project quicker and often opens the door to utilizing media—such as video—that was simply too far out of reach earlier.
However, there's a flipside to this. If you're an accountant or a lawyer or creating a report for a presentation, or even writing code, you need to verify that the output accurately reflects the input information, and you probably should edit the document to make sure the conclusions being made are in fact consistent. You have to iterate. Without that step, what you are producing is going to be crap.
This is not what a lot of people (especially the people shelling out the money) want to hear. Confabulators replace design time with editorial time. Creation requires both - the design time to hash out ideas and to refine them, the editorial time to determine whether the output is worthwhile and the discrimination to see what needs to be changed.
If this sounds like Agile, it is, sort of. Agile was largely predicated upon multiple people providing different functions towards the evolution of a given product. With Confabulators, many of those functions are increasingly managed by AI confabulators, with the creator increasingly playing the role of a conductor or orchestrator. It's possible that you will see some specialization in this space, but we're not quite there yet.
Outside of the Code function (which I'll consider below), how much of an organization's efforts revolve around the creation of media assets? My suspicion is that it is fairly low, as these are functions that are typically outsourced to private agents. This means perhaps improvement in the quality of product, but it is likely that the generative benefits will more likely accrue to the producers of the media, rather than the consumer (the company in this case).
There's another point to consider as well - media companies are getting something of a boost here, but the quality of tools that most media organizations have in general is superior to what is currently producible via generative methods. It's one thing to generate images for a blog post or a fifteen-second spot for Instagram or TikTok, it's quite another to produce consistent, high-quality image or video output this way.
My estimate is that it will likely be mid-2025 by the time that changes significantly. Controllability of generated content is still wildly consistent, and most of the tools that have begun developing a modicum of sophistication in the image space are still at least a year away for video.
For instance, one of the most powerful tools in the AI imaging toolset is the ability to do inpainting, in which you create a region in an existing image and tell the confabulator what you want to replace in that image. Doing this with video production is not yet widely available, though when it is, it will be a game changer. You could, as an example, remove the modern sportscar in that Victorian era period piece you've just produced via such video inpainting, just by identifying the car and having the system remove it as it moves from frame to frame.
The Code Conundrum
Does code figure in the Confabulation Handbook? This is actually one of the biggest unanswered questions concerning LLMs, in particular. The answer again comes back to the notion of person-in-the-middle iteration.
At the moment, the code that is produced by LLMs can be thought of as Intellisense on steroids. Intellisense worked pre "AI" days by reading and parsing the relevant files in order to determine interfaces for classes. In essence, it read the metadata that it could determine from importing resources and then keeping track on what the particular context is that a particular "dot" or period represents (or similar structures for other languages than those that can trace their origin to Algol and C.
Microsoft Copilot and similar tools are trained on code, and are also trained on identifying local context and adjusting the content accordingly. This capability is quite powerful for a variety of tasks, especially if you understand how to code and are needing the ability to replicate code blocks.
Code development is similarly iterative, though despite nearly fifty years of research, the process whereby we actually write code is still not terribly well understood. There is very likely a qualitative point at which point the ability to think through an algorithm (which generally can be automated) shifts into the ability to think systemically. It is also likely that this transition may be what keeps LLMs by themselves from being able to take the next leap.
This statement is not made because I believe that humans are more capable of system thinking than AI systems are. It has to do more with the fact that humans have a better innate capability of abstracting, of thinking of systems recursively as containers of other functionality, There is a certain degree of self-reflection that needs to happen in order to make that leap, and that self-reflection is mathematically very difficult to bridge algorithmically.
To put it simply, people are surprisingly good at making intuitive leaps that arrive at a plausible answer, at which point they then have to backtrack and prove their thesis. It is possible that there is something "quantum" going on there that moves beyond Turing processes, and that is otherwise very difficult to do via brute force methods.
This again suggests that you can use LLMs and similar confabulation tools to generate code, but only up to a point, at which point system complexity becomes a factor. In other words, this isn't just a question of making the model faster or adding more processors, but may very well be fundamental. This consequently has implications for agented systems - there is only so far one can take them before they break down, probably due to insufficient context to be able to maintain the state of such systems. In other words, these systems have a stack overflow problem.
Recommended by LinkedIn
Analysis, Watersheds, and Fractals
One useful way of thinking about confabulators is that they are essentially map / reduce (M/R) algorithms writ large. While such algorithms have been around since the dawn of computing -- gather the data, process the data in parallel, then condense the processed data into a single result -- map / reduce systems lend themselves especially well to analysis and summarization, though at the risk of being lossy.
Summarization - the building of summaries - is a powerful tool, in that it condenses information into fractal structures akin to the branching structures of a river system, making it much easier to identify and retrieve the important concepts (the principal tributaries) quickly. Once you have these particular tributaries, you can then dig deeper by following their tributaries, and so forth in an ever-expanding network. By establishing a boundary about how deep you want to go, you can, in effect, store information holographically.
As with any such network, however, there's a cost to navigating it. The deeper you go, the fuzzier and more interconnected information becomes, meaning the time it takes to traverse the network grows geometrically, while at the same time, it becomes more difficult to determine relevancy of that information to the larger tributary.
A particularly well-studied area of systems theory is the exploration of watersheds. Water flows in the direction of gravity, but it is also subject to the contours of the land over which it flows. Not surprisingly, most river systems start up in the mountains, but a ridge or hill, even a small one, could make the difference between water flowing into one watershed or another, even if they start in the same general proximity.
Moreover, changes in the contours of those regions, in many cases induced by the very water moving through those hills, can in turn change whether any given point belongs in watershed A or watershed B at a specific moment in time. This is why determining the source of a river is so difficult - there are many sources, and they shift continuously.
If you've ever seen a low-res JPEG image, it resembles nothing more than nebulous clouds of colour. As you decrease the compression, you increase the fidelity of the image, but you also increase the file size. Eventually, you reach a point where there is no compression, and the image has the fidelity of the original because it is, in essence, the original.
Confabulators are compressed information spaces, and moreover they are compressed with imperfect algorithms. It is easy to get very high-level information (summaries) from such encodings, but as you look to get more detail, you also increase the size and complexity (and hence performance and retrieval time) of that information. At some point you have to create a cutoff in the depth of information to make the system even remotely useful, and this, ultimately is where hallucinations come from - they are artefacts of the compression process.
This is also why, if you build a system that does a deep dive on the your data, you eventually reach a stage where it seems like you're getting back the same content just worded in slightly different ways. There's a phenomenon that infovores are familiar with - the Wikipedia rabbit hole. Once you start following links within Wikipedia, you can frequently find yourself starting out with a topic in Quantum Physics and end up in a discussion about Phoenician outposts in the 5th century BC. This is because while there is a certain degree of hierarchy in the linking structure of each entry, there's no real cutoff below which you can't link. With LLMs, there is.
This is a big part of the reason why LLM responses can often seem bland. A linked system such as Wikipedia is chaotic - at any given locus in that information space, there are innumerable links that can take you to related content. An LLM, on the other hand, is fractal - it mathematically resembles the chaos of the Wikipedia system, but it is at best an approximation, and even given that you have to add noise into the encoding process to keep it from falling into completely predictable stable patterns. That noise is irrelevancy - it is a necessary part of how LLMs (and most confabulators) work but it is not relevant information to the topic at hand, regardless of what that topic is.
RAG, Linking and Education
What that suggests is that an LLM (or any confabulator, for that matter) should not be seen as being a vehicle for education by itself. Rather, it needs both RAG (the ability to reference external data stores), and spidering (the ability to following links to dynamic, richly linked content. In other words, for any confabulator to be truly useful, it should not be seen as the repository of information but only as an interpreter of that information.
This is a hard pill for many in the Chat-space, in particular, to swallow, but like any foul-tasting medicine, it's necessary. The confabulator in this case then becomes not the store of knowledge, but its mediator, having enough context to be able to allow for deeper retrievals of information without necessarily locking that information into the matrix of the fractal structure.
This is one of the reasons that I (and many others) advocate knowledge graphics, with the assumption that such knowledge graphs are themselves dynamically refreshed. A knowledge graph can be used for reasoning (though there are benefits to making available the schema of the knowledge graph to enhance reasoning capabilities within the LLM itself. Note that what a schema (or more specifically, an ontology) does here is not replace the superb capabilities of the confabulator. Instead it provide a better mechanism for introducing linking into the LLM, not just to an arbitrary set of conceptual links but also links to external web resources in a systematic fashion.
This becomes important, because while confabulators are good at presenting information (a very key prerequisite of any educational system, whether formal schooling, corporate training, or providing information) they are fairly indifferent as keepers of that information.
Confabulators could play a vital role in education for several reasons:
The Role-Playing Storyteller
The one underappreciated aspect of the LLMs is the capacity of such LLMs to be used for role-playing. Role-playing is, at its core, participatory storytelling. The storyteller in this case is the gamemaster, the "person" with knowledge about the world, non-player characters, and objectives who sets the plot of the story and keeps the characters moving towards some form of resolution of that plot.
Most company executives do not understand role-playing. This is unfortunate, because it is one of the primary ways that we both learn and communicate. A use-case is a form of role playing - you are the user of a given system and you want to do X. By assuming a certain role, you as the user set specific constraints on the possible. By assigning a certain role to an LLM, you are in effect telling the LLM to role play the scenario not from the user's standpoint but from the standpoint of everyone (and everything) else.
This is, in effect, what the original configurers of the larger models do: they tell the model that it is a model that is roleplaying as a chat agent. When you interact with such a system, you should further refine it by saying something like: "you are a relational database", "you are an economist", "you are an air traffic control system". It does not magically turn the LLM into any of these, but it does establish the behaviours that it is expected to emulate.
Note again here that the emulation will only work if there is enough information in its context (whether directly generated or via rag) for it to respond in an accurate fashion. For instance, I'm currently working on a pre-processor for a language called Terrapin. I use various LLMs, specifying a BNF description of the language, then use the LLM as a testing device. One of the first commands I give as part of the BNF data is "You are a Terrapin processor". Once in that mode, it is a surprisingly good Terrapin processor.
It is, in effect, roleplaying a processor. Now, without that BNF file, roleplaying the processor is meaningless - it can (and has) generated garbage, because it had no reference upon which to work. Additionally, when I make a logical error with the BNF, what gets produced is similarly going to be incorrect.
This role-playing area is one that I think has the potential to be far more extensively developed. It does require a bigger investment than just sitting down and asking questions - the author must place constraints upon how the characters (whether hobbits or management trainees) navigate from point to point in the broad graph, what conditions are necessary for doing so, and how this gets communicated to the user in a way that takes them on a (potentially gradeable) journey.
The Emergence of Confabulator Best Practices
Historically, every major innovation in the last fifty years has gone through a period of extreme hype in which the innovation is guaranteed game-changing, followed by a more sober analysis as people discover the limitations about what the technology can do, after which there is a period of discovering what the tech is really good for. The Internet was first seen as a glorified magazine, then a new form of TV, and even a platform for a while, until it simply became The Internet.
I see the same thing happening with confabulators. A confabulator is not a database, though it has the potential to be a powerful data transformer. It is not a search engine, though again it's a pretty decent transformer of search engine results. It is not a replacement for your coding, accounting, helpdesk, marketing, or C level departments, though it can certainly improve capabilities within each of these.
A confabulator is primarily a generative technology intended to create things from other things. It should, for the most part, be paired with both network capabilities and a reliable knowledge graph (in this case, any reliable data source, rather than the narrower definition I normally use).
We're at the outer edge of the Best Practices phase, though I suspect we'll still be in the pits of despair (or whatever the latest Gartner name for it is) for some time. The technology will continue to improve, I have no doubt about that, but it's time that we stopped trying to see GenAi as the second coming of Sliced Bread, and instead look at it for what it is - a great tool for content creation, a potentially good tool for content dissemination, and with sufficient grounding in more traditional databases, a reasonably good tool for analysis.
I'll make another prediction here - I think in the long run, the use of external data resources (such as those used for video and image production) will be hashed out, not as wholesale grabs of IP, but through some form of licensing. Part of this has to do with the fact that already there is strong regulatory and industry pushback that is making the more egregious players accountable, while another part is that while no one likes the idea that their content can be used, they generally are becoming used to the idea that everyone else's is probably fair game, and they like exploring the technology.
Best practices take a while to emerge. These reflect acquired (experimental) knowledge that can then be codified. I suspect that Confabulator best practices are already being tested out even now.
In media res,
Editor, The Cagle Report
Chief Architect genAI, AI, HC & LS @ Progress | M.Eng.
3moThey are definitely not databases. Not even a search engine. I think of them as lossy knowledge compression engines.
Bible lover. Founder at Zingrevenue. Insurance, coding and AI geek.
3moKurt Cagle, thanks! So let me get this straight. Without a LLM, the KG is an pretty precise unified data store with which one can issue powerful, deep queries and get back unique insights about the org, its business, customers and suppliers like never before in the history of the company. Insights that make a tired executive team suddenly energised realising that, "HEY! We have had the answers right in front of us all this time!" Insights that suddenly result in the company going all battle stations because they found a key advantage over their competition. Pure excitement. The KG is now a mission critical tool. The KG team is now a board level advisory function. ----- With an LLM, now what we get is a Tom Hanks box of chocolates. Which we try and try and try to get to fruitlessly work with the KG. The ML team never gets the glory it deserves, half of it retrenched. The KG team is laid off, with management citing "challenging economic conditions". All because of ideology. ... Why, oh why, are we such a foolish species of creature I shall never understand! 🤦🏾♂️
Bible lover. Founder at Zingrevenue. Insurance, coding and AI geek.
3moKurt Cagle thank you for putting a most suitable name to the tech. Now for that uphill task... executive buy in! 😱 I can just imagine a CEO asking, "I asked you to lead the AI effort and you built me a what?" "A confabulator, ma'am." CEO buries her head in her hands, mumbling something about explaining it to the board, then asks you to leave the room... Gulp! 😅
Bible lover. Founder at Zingrevenue. Insurance, coding and AI geek.
3moWhere _are_ all the ladies to chip in their two cents worth?! Or truly and deeply that KGs are an area where males dominate again (monocultural challenges and all the user acceptance problems that come with them, anyone)? 😓
Bible lover. Founder at Zingrevenue. Insurance, coding and AI geek.
3moThanks Kurt Cagle! The problem is that now the source of knowledge is obscured from staff. The KG already has the problem of lossy data transfer only because it is pretty impossible for the KG team to document everything from a department, whether it's due to uncooperative colleagues, from a sheer volume of data much of which hasn't been digitised or are not fully extractable, or from a deadline & cost perspective (have to prioritise the top 10 attributes/entities/relationships). Now put in front of that a wayward chatbot plus the expectations of management which had approved the massive budget and we have now an error prone IT tool that serves paradoxically as yet another reason why management doesn't need to connect with the ground troops, because there's this thing that they spent so much money on. And can't use as a key piece of their financial and risk evaluation toolset. But CEOs won't know all this, because nobody dares tell them the truth, which sucks as they need yet another win to impress the board and to splash all over the next investor relations piece that goes out next week. Hey, it's not that easy to get a spot on the "Business Leaders Today" section of the local news, right after this ad break! 😑