A dream LLM can turn into a deployment nightmare

This is the first article of Beyond Entropy, a space where the chaos of the future, the speed of emerging technologies and the explosion of opportunities are slowed down into short posts. For longer, more detailed posts, check out my newsletter Turning bits into dreams (link below).



Are Large Language Models (LLMs) changing the Machine Learning workflow? If so, what are the main differences and issues?

Over the past year, many AI specialists, from data scientists to project owners, have surely been pondering these questions. Having personally worked on several LLM projects over the last few months, I am starting to get a feel for the answers, and I will try to summarise my thoughts in this short post.

On the one hand, LLMs make us dream: they open up many possibilities, and it is easy to create something beautiful with them. On the other hand, building production-ready applications with them can become a nightmare, so it is necessary to be aware of their limitations. Below I list the ones I have come across and consider most relevant.

The ambiguity of prompting

In computer science, instructions written in programming languages such as Python, C++ or JavaScript are exact: the same code always means the same thing. With LLMs, by contrast, instructions are written in natural language, and given its inherently ambiguous nature, prompt engineering is a programming paradigm that lacks rigour. This makes it very flexible and easy to use, but it can also cause frustration; combined with the nascent state of the discipline, it makes for a rather poor development experience.
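To make the contrast concrete, here is a minimal sketch (the prompt is purely illustrative): the Python call has exactly one meaning, while its natural-language counterpart leaves several decisions to the model.

```python
# Exact: this line means the same thing on every machine, every time.
prices = sorted([3.5, 1.2, 9.9])  # -> [1.2, 3.5, 9.9]
print(prices)

# Ambiguous: the "same" instruction in natural language leaves the
# model to decide the order, the output format, the number type...
prompt = "Sort these prices: 3.5, 1.2, 9.9"
# Ascending or descending? A list, a sentence, JSON? The prompt
# does not say, so the model has to guess.
print(prompt)
```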

Silent failures

If you take standard software, e.g. written in Python, and you add a random character or accidentally remove a line, it will simply stop working and raise an error. If you slightly modify a prompt, on the other hand, an LLM will still run, but it may give very different results. This is how prompting leads to many silent failures.
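A minimal sketch of the difference, assuming the official `openai` Python client (the model name and prompts are illustrative):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def ask(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",  # illustrative model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Corrupted code fails loudly:
#   pritn("hello")  ->  NameError: name 'pritn' is not defined

# A corrupted prompt fails silently: both calls below succeed, but the
# accidental truncation can change the output with no warning at all.
print(ask("Summarise the history of Rome in exactly three bullet points."))
print(ask("Summarise the history of Rome in exactly three bullet"))
```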

Stochasticity

Generative algorithms, such as current LLMs, are stochastic: they produce slightly different results on every run, in contrast to non-generative programs or standard machine learning models, which are deterministic at inference time. Deterministic results are generally safer, especially when an application chains several components together, because the unpredictability of the final output grows enormously if there is no control over the individual outputs. When it comes to LLMs, therefore, we must accept ambiguity. Yet, despite its scientific interest, stochasticity is an unwelcome property in industrial applications.
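A quick way to see this, again assuming the `openai` client (model name and prompt are illustrative), is to send the identical prompt several times and count the distinct answers:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Send the identical prompt five times and collect the distinct answers.
answers = set()
for _ in range(5):
    response = client.chat.completions.create(
        model="gpt-4",  # illustrative model name
        messages=[{"role": "user",
                   "content": "Name one colour. Answer with a single word."}],
    )
    answers.add(response.choices[0].message.content.strip())

# A deterministic function would always yield a set of size one; with
# default sampling settings this set often contains several answers.
print(answers)
```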

Maintenance

In prompt engineering there are some general rules and paradigms, but for the best performance each LLM requires specific variations. For example, if a set of prompts is carefully designed to solve a task with one LLM, there is no guarantee that those prompts will also work with a newer LLM. This can lead to severe headaches and a huge maintenance cost.
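One way to keep that cost visible is to treat prompts like code and pin their behaviour with regression tests. A hedged sketch using pytest and the same assumed `openai` client as above (the model names, prompt and expected answer are all illustrative):

```python
import pytest
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT = "Extract the ISO 8601 date from: 'The invoice is due on 4 March 2024.'"

def ask(prompt: str, model: str) -> str:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Run the same prompt suite against every model you plan to migrate to:
# a green run on one model guarantees nothing about the next one.
@pytest.mark.parametrize("model", ["gpt-4", "gpt-4-turbo"])  # illustrative names
def test_date_extraction(model: str) -> None:
    assert "2024-03-04" in ask(PROMPT, model)
```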

Robustness and dependence

Prompt programming is not yet robust to changes. For example, suppose you have built a series of prompts using the role-playing technique, such as "You are a creative writer and you must help me to...", and a few months later Meta or OpenAI update Llama or GPT-4 with role-playing already integrated: all your prompts then need to be modified to accommodate the change.


A question arises: how can we tackle these problems? To answer it completely, I think I need more time and experience; in a few months everything will perhaps be clearer. For the moment, I can suggest exploring prompt versioning on W&B and this guide from OpenAI, which collects some tricks and best practices. In addition, to mitigate stochasticity you can set the temperature to zero (see the sketch below), even though this does not completely solve the problem, as explained in this discussion. Do you know of other techniques to overcome these problems?
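For the temperature trick, the change is a single parameter. A minimal sketch, assuming the same `openai` client as above (model name and prompt are illustrative):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4",  # illustrative model name
    temperature=0,  # near-greedy decoding: far less variance across calls
    messages=[{"role": "user",
               "content": "Name one colour. Answer with a single word."}],
)
# Much more repeatable than default sampling, but still not a hard
# guarantee: server-side changes and numerical non-determinism remain.
print(response.choices[0].message.content)
```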

In this short post, without wanting to dampen enthusiasm for LLMs, I wanted to point out a few aspects that demonstrate some of their problems. If this has discouraged you from using them, don't worry. To regain your interest, I invite you to read this recent post of mine on the fascinating emergent properties of LLMs, which is part of my newsletter Turning bits into dreams. There, I deal more extensively with topics of scientific interest, ranging from AI to Fundamental Physics.


Opportunities, talks, and events

Here are some opportunities you might find interesting (if you are interested, please contact me for more info):

Job & Research opportunities

👗 A startup in Milan, using Machine & Deep Learning in the fashion sector, is looking for a junior Data Scientist or AI developer;

🍷 A startup in Milan, using Deep Learning & NLP in the food and wine sector, is looking for a Data Scientist with 2-4 years of experience;

🇪🇸 A Barcelona-based venture studio, Antai Venture, is looking for a full-time AI Specialist;

⚛️ The quantum company Quantinuum is looking for a Research Software Engineer;

⚛️ Covestro is opening a PhD opportunity in Quantum Computing for computational chemistry;

Talks, Conferences, and Courses

🎙 Tech Talk at Pi School (August 30th): Classic and Explainable AI Methods in Vaccine Development by Francesco Patanè;

🌊 CodingWaves is organizing several AI courses and workshops in Milan and in many European outdoor locations.

If you would like to get in touch with me or view my lectures and courses (technical and non-technical), you can find everything here.

