Knowledge Graphs are Essential for Safe AI
AIs will only be safe for general use when they have and use goals and values that are identical to those of humans. In theory, the particular goals and values – very much like Asimov's original Laws of Robotics – could be legislated and enforced, so that we would all be safe from harm from AI.
In theory.
This is the AI alignment problem, and it captures the common assumption that safe AIs are those that can understand and reliably use only our goals and values in planning and executing behaviors.
But in practice, current approaches to AI make this assumption seem much more naïve than feasible. Let's see why.
As progress in AI continues at an astonishing pace, we need much more than naïve assumptions – solutions for AI safety are becoming a dramatically more urgent need.
AI Alignment
The discussion about AI alignment – as shown by Ji et al. (2024) in an up-to-date, comprehensive, 100-page survey of the field – focuses on the RICE principles (emphasis added):
The RICE principles define four key characteristics that an aligned system should possess, in no particular order:
(1) Robustness states that the system’s stability needs to be guaranteed across various environments;
(2) Interpretability states that the operation and decision-making process of the system should be clear and understandable;
(3) Controllability states that the system should be under the guidance and control of humans;
(4) Ethicality states that the system should adhere to society’s norms and values.
Why these principles specifically?
There are many issues hidden in this formulation of the parameters of a solution. The challenges to actually implementing the RICE principles shouldn't be underestimated.
Here I want to focus on the most fundamental and most problematic issue: the dramatically naïve assumption that we can control current AIs’ ethicality through language.
Three of the four RICE principles rest on this same requirement. Implementing these principles requires us to communicate our norms, values, and instructions to AI systems in ways that are clear and understandable for both human and machine.
But exchanging words is simply not enough to ensure a shared point of view.
And then we have to verify that both parties have understood those words in the same way. And we need to ensure that AIs cannot modify, distort, substitute, or ignore that understanding across a wide range of decisions and tasks (one aspect of the Robustness Principle). Otherwise, we won't be able to monitor, guide, or control these systems in any meaningful way to guarantee our safety.
Communicating with AIs is like Communicating with Teens
Communication and verification – especially of values and norms – depend directly on shared knowledge and some willingness to come to an alignment, as anyone with a teenager knows all too well. But both large language model-based AIs and our own lovable, unruly teenage offspring display a series of behaviors that undermine communication – many more than just hallucinations.
For teens and presumably for AIs, the norms and concepts they associate with strings like d-a-n-g-e-r are very different from the norms, concepts, and decisions of the adults around them. What could possibly go wrong if we entrusted the safety of our friends and loved ones to the assumption that a machine "understood" the word danger in the same way that we do?
When we "order" an AI to follow a law like "A robot may not injure a human being or, through inaction, allow a human being to come to harm" (Asimov's First Law), we assume it will understand the law as we do and comply as we ordered.
In essence, though, we are telling the AI: Read my mind to see what I'm thinking of, and if I forgot anything, add that to the list, too.
But AIs still aren't very good at mind reading.
In the context of AIs with the behaviors described above, the RICE Principles seem incurably naïve and the goal of AI alignment seems doomed. The Sturm und Drang around end-of-species threats from AI starts to seem understandable.
Knowledge Graphs for AI Alignment
Knowledge Graphs are gaining traction in solving a range of other problems: eliminating hallucinations, imposing guardrails, improving relevance, ensuring the accuracy of AI responses, and more. And Knowledge Graphs are likely to provide the only viable solution to the AI Alignment problem.
Text-based LLMs model huge collections of strings to predict, in very robust and useful ways, how to generate continuations for arbitrary input strings: a continuation from a prompt to an email, from a question to an answer, from an article to a summary, and so on. They do all this without storing or accessing anything like meanings or concepts that we might be able to identify or monitor. This creates an explainability problem that makes an AI's inner workings difficult or impossible to understand: we simply do not know in which sense LLMs use a particular string. The outputs, however, are so natural that we essentially hallucinate semantic processes in the background.
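To make this concrete, here is a minimal sketch of that string-in, string-out loop. It assumes the Hugging Face transformers library and the small gpt2 checkpoint are available; the details matter far less than what is absent – there is no concept store and no fact base, nothing between input and output that we could inspect or monitor.

```python
# A minimal sketch of the string-in, string-out view described above,
# assuming the Hugging Face `transformers` library and the small `gpt2`
# checkpoint. The only artifact we can examine is the continuation string;
# no concepts or facts are exposed anywhere in between.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "A robot may not injure a human being because"
result = generator(prompt, max_new_tokens=30, num_return_sequences=1)

# The continuation is all we get back - a string, not a monitorable concept.
print(result[0]["generated_text"])
```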
Researchers were quick to notice that this issue of (lack of) explainability or scrutability blocks implementation of the Interpretability Principle, which is essential for actually controlling, guiding, and stopping these systems. So in fact we have no safety mechanisms in place.
Rich Knowledge Graphs complement this string-based approach by capturing explicit, monitorable representations of concepts and facts. Knowledge Graphs already contribute to controllability by being used to guide the training and evaluation of LLMs, by improving explainability, and by imposing explicit guardrails on LLM outputs. In fact, Knowledge Graphs help at every step of the development, evaluation, and deployment of LLMs.
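To illustrate the guardrail idea, here is a minimal sketch in which an explicit, human-curated set of triples is used to reject an answer that asserts something the graph does not support. The triples and the extract_claims helper are hypothetical placeholders – in a real system the graph would live in a curated triple store and claim extraction would be a genuine entity-linking step – but the shape of the check is the point.

```python
# A minimal sketch of a Knowledge Graph used as an explicit, monitorable
# guardrail on LLM output. CURATED_GRAPH and extract_claims are hypothetical
# placeholders, not any particular product's API.
CURATED_GRAPH = {
    ("aspirin", "treats", "headache"),
    ("aspirin", "contraindicated_for", "hemophilia"),
}

def extract_claims(llm_answer: str) -> set[tuple[str, str, str]]:
    """Placeholder: map an answer onto subject-predicate-object claims."""
    # A real system would use entity linking against the graph's vocabulary;
    # here we return one fabricated claim purely for illustration.
    return {("aspirin", "treats", "hemophilia")}

def check_against_graph(llm_answer: str) -> list[tuple[str, str, str]]:
    """Return every claim in the answer that the curated graph does not support."""
    return [claim for claim in extract_claims(llm_answer) if claim not in CURATED_GRAPH]

unsupported = check_against_graph("Aspirin is a standard treatment for hemophilia.")
if unsupported:
    # The guardrail fires: the output is held back and the offending claims
    # are reported in terms a human curator can read and correct.
    print("Blocked: unsupported claims", unsupported)
```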
More to the point of the present discussion, with rich Knowledge Graphs it is possible for humans to track and visualize which facts and concepts are being activated as an AI processes inputs and decisions. We can refine this knowledge as issues occur. Since humans create and curate these concepts, we can architect AIs to use them and only them as the basis for processing, and so recapture some control over the decision-making process – or at the very least make it possible to monitor that process in a human-accessible way. The quickly growing research on graph neural networks already shows that merging neural networks with graph-structured concepts and facts is not only feasible but leads to significant improvements across the board.
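Here is a minimal sketch of what that kind of monitoring could look like: the input is mapped onto nodes of a tiny hand-curated concept graph, and the activated concepts are written to a human-readable audit log. The CONCEPTS lexicon and the log format are illustrative assumptions, not an existing system's API.

```python
# A minimal sketch of human-monitorable concept activation, assuming a tiny
# hand-curated concept lexicon; the CONCEPTS structure and the audit-log
# format are illustrative assumptions only.
CONCEPTS = {
    "danger": {"broader": "harm", "related": ["injury", "risk"]},
    "injury": {"broader": "harm", "related": ["danger"]},
}

def activated_concepts(text: str) -> list[str]:
    """Return the curated concept nodes mentioned in the input, for audit logging."""
    tokens = text.lower().split()
    return [concept for concept in CONCEPTS if concept in tokens]

audit_log = []
user_input = "Warn the operator about any danger of injury near the press."
hits = activated_concepts(user_input)

# The trace is human-readable: a curator can see exactly which concepts the
# system activated for this input and refine the graph when issues occur.
audit_log.append({"input": user_input, "concepts": hits})
print(audit_log[-1])
```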
Knowledge graph-centric AI will be a big step forward in ensuring AI safety.
Comments

Chief Architect genAI, AI, HC & LS @ Progress | M.Eng.
We have been using #SemanticRAG for combining knowledge graphs with vectors and other indexes to (1) analyze the question with semantics, (2) analyze, categorize, and fact-extract the content, (3) retrieve the most relevant data, (4) link concepts back to the graph, and (5) check the output of the graph with respect to allowable/safe concepts.
Principal Content Strategist, IKEA | Contextual Content, Knowledge Domain Modelling, Systems Thinking, Game Design Thinking, AI Content Ops
Aside from ensuring responsible, explainable AI, structured knowledge AND structured content are essential ingredients for responsible, explainable #personalisation. Investing in structured knowledge-content integration and transformation is a rising tide that lifts all ships!
DevOps Engineer @ i/o Werx
I have seen several articles describing how some problem was solved or something new invented by AI and ... "researchers don't know how." It really should be disallowed, at least for any system doing reasoning or decision making, because it might as well be a hallucination.
Microcontent champion, Terminologist, Ontologist, Professor of Terminology, Translation and Localization
1mo"humans create and curate these concepts" - yes, and those humans are called "terminologists"
i-inf.net - get to know what you know. IT Consultant. Information-Oriented Software Architectures, ISAQB-Cert., IREB-Cert.Prof.f. Requirements Engineering, UXQB-Cert.Prof.f.Usability&UX, Data Protection
What kind of solutions do we actually call AI? I tend to believe that especially neural networks are called intelligent because of their use of "neurons" and even more because humans do not really understand everything that's happening inside of them. We show lots of training material to them and in the end they can solve a set of problems without the need for us to really bother how they do it. We watch and wonder. Maybe that's why people call them intelligent.

So, trying to tame NNs by controlling their behavior, and by making them consider data structures we as humans can understand, seems to take away part of the mystique from the NNs and thus seems to convert them into conventional IT solutions, where we have to care about our problems by ourselves. This might contrast with many AI business models and might reduce ROI significantly.

So, I wonder if we should strive to change the way AI solutions work or if we should rather put them into contexts of non-AI algorithms to reach our final goals while having them do what they do best: pattern recognition. I'm not sure, and maybe there's not just one answer to this. I guess both alternatives are called symbolic AI, but there might be better terms for it.