Docker Labs: GenAI No. 15

In previous explorations, we focused on how AI-based tools can help developers streamline tasks, and we explored ideas for enabling agentic workflows, like reviewing branches and understanding code changes.

Today we’ll be sharing our experiments around the idea of creating a Docker AI Agent, something that could help both new users learn about our tools and products and power users get things done faster.

During our explorations around this Docker Agent and AI-based tools, we kept running into the same few pain points:

  • LLMs need good context to provide good answers (garbage in -> garbage out);
  • Using AI tools often requires context switching (moving to another app, to a different website, etc.);
  • We’d like agents to be able to suggest and perform actions on behalf of the users;
  • Direct product integrations with AI are often more satisfying to use than chat interfaces.

At first, we tried to see what’s possible using off-the-shelf services like ChatGPT or Claude.

Testing prompts such as “optimize the following dockerfile, following all best practices” and providing the model with a sub-par but very common Dockerfile, we could sometimes get decent answers! Often, though, the resulting Dockerfile had subtle bugs or hallucinations, wasn’t particularly well optimized, or missed many of the best practices we would’ve hoped for. It was not reliable enough.
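To make that concrete, below is a rough sketch of the kind of one-off prompt test we mean, using the OpenAI Python SDK; the model name and the sub-par Dockerfile are illustrative placeholders, not what we actually ran:

```python
# Rough sketch of a one-off prompt test against an off-the-shelf model.
# Assumes the OpenAI Python SDK; model name and Dockerfile are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A deliberately sub-par but very common style of Dockerfile.
SUBPAR_DOCKERFILE = """\
FROM ubuntu:latest
RUN apt-get update
RUN apt-get install -y python3 python3-pip
COPY . /app
RUN pip3 install -r /app/requirements.txt
CMD python3 /app/main.py
"""

prompt = (
    "optimize the following dockerfile, following all best practices\n\n"
    + SUBPAR_DOCKERFILE
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```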

Data ended up being the main issue. Training data for LLMs is always outdated by some amount of time, and the bad Dockerfiles you can find online vastly outnumber the up-to-date ones that follow current best practices.

After doing some proof-of-concept tests using a retrieval-augmented generation (RAG) approach, including documents full of useful advice for creating good Dockerfiles in the model’s context, we realized that the AI Agent idea was definitely possible, but setting up everything required for a good RAG pipeline would’ve taken too much bandwidth from our small team.
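For context, here is a minimal sketch of what such a RAG proof of concept can look like; the snippets, helper names, and model choices are hypothetical, not the pipeline we built:

```python
# Minimal RAG sketch: retrieve the most relevant advice snippets and include
# them in the prompt. Hypothetical example, not our actual proof of concept.
import numpy as np
from openai import OpenAI

client = OpenAI()

# Stand-ins for documents with Dockerfile best-practice advice.
DOCS = [
    "Prefer small base images, such as slim or alpine variants.",
    "Use multi-stage builds to keep build tools out of the final image.",
    "Pin base image versions instead of relying on the latest tag.",
]

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

DOC_VECTORS = embed(DOCS)

def answer(question: str, top_k: int = 2) -> str:
    q = embed([question])[0]
    # Cosine similarity between the question and each advice snippet.
    sims = DOC_VECTORS @ q / (np.linalg.norm(DOC_VECTORS, axis=1) * np.linalg.norm(q))
    context = "\n".join(DOCS[i] for i in np.argsort(sims)[::-1][:top_k])
    prompt = f"Using this Docker advice:\n{context}\n\nAnswer the question: {question}"
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```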

Because of this, we opted to use Kapa AI for that specific part of our agent. Docker already uses them to provide the AI docs assistant on docs.docker.com, so most of our high-quality documentation is already available for us to reference through their service. Kapa AI let us experiment more, get high-quality results faster, and try out a bunch of different ideas around the AI Agent concept.

Out of this experimentation came a new product that you can try out: Gordon (working name).

With Gordon we’d like to tackle these pain points. By integrating Gordon into Docker Desktop and the Docker CLI, we can:

  • Have access to much more context that the LLMs can use to better understand the user’s questions, provide better answers, or even perform actions on the user’s behalf;
  • Be where the users are. If you launch a container via Docker Desktop and it fails, you can quickly debug it with Gordon. If you’re in the terminal hacking away, docker ai will be there too;
  • Avoid being a purely chat-based agent by providing Gordon-based features directly as part of Docker Desktop UI elements. If Gordon detects certain scenarios, like a container that failed to start, a button will appear in the UI to directly get suggestions, run actions, etc.

What it can do

We want to start by optimizing Gordon for Docker-related tasks, not general-purpose questions, but we don’t rule out expanding the scope to more development-related tasks as work on the agent continues.

Work on Gordon is at a very early stage and its capabilities are constantly evolving, but it’s already really good at some things! Here are a few things that are definitely worth trying out:

  • Ask general Docker-related questions. Gordon knows Docker very well and has access to all of our documentation;
  • Get help debugging container build or runtime errors;
  • Remediate policy deviations from Docker Scout;
  • Get help optimizing Docker-related files and configurations;
  • Ask it how to run specific containers (e.g. “how can I run mongodb?”).

How it works

The Gordon backend lives on Docker servers, while the client is a CLI that lives on the user’s machine and is bundled with Docker Desktop. Docker Desktop uses the CLI to access the local machine’s files, asking the user for the directory each time it needs that context to answer a question. When the CLI is used directly, it has access to the working directory it’s executed in. For example, if you are in a directory containing a Dockerfile and you run “docker ai rate my dockerfile”, it will find the one present in that directory.
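As a purely hypothetical illustration of that flow (this is not Gordon’s actual implementation), a CLI client could gather read-only working-directory context roughly like this before sending a question to the backend:

```python
# Hypothetical sketch of a CLI client gathering read-only working-directory
# context before sending a question to a backend. Not Gordon's actual code.
from pathlib import Path

def gather_context(question: str) -> dict:
    cwd = Path.cwd()
    payload = {"question": question, "cwd": str(cwd)}
    dockerfile = cwd / "Dockerfile"
    if dockerfile.exists():
        # Read-only: the client never modifies the user's files.
        payload["dockerfile"] = dockerfile.read_text()
    return payload

if __name__ == "__main__":
    # e.g. what "docker ai rate my dockerfile" might send along with the question
    print(gather_context("rate my dockerfile"))
```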

Currently, Gordon does not have write access to any files, so it will not edit any of your files. We’re hard at work on future features that will allow the agent to do the work for you, instead of only suggesting solutions.

The following diagram gives a rough overview of how we are thinking about things behind the scenes.

The first step of this pipeline, “Understand the user’s input and figure out which action to perform”, is done using “tool calling” (also known as “function calling”) with the OpenAI API. Despite this being quite a popular approach, we noticed that the documentation available online isn’t very good and general best practices aren’t well defined yet. This led us to experiment a lot with the feature and figure out what works for us and what doesn’t.
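For readers who haven’t used the feature, here is a minimal sketch of what tool calling with the OpenAI API looks like; the tool name and schema are hypothetical examples rather than Gordon’s actual tools:

```python
# Minimal tool-calling sketch with the OpenAI API. The tool is a hypothetical
# example, not one of Gordon's actual tools.
import json
from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "rate_dockerfile",
            "description": (
                "Analyze a Dockerfile and report how well it follows best practices. "
                "Use this when the user asks to rate, review, or optimize a Dockerfile."
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {
                        "type": "string",
                        "description": "Path to the Dockerfile, e.g. ./Dockerfile",
                    }
                },
                "required": ["path"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": "rate my dockerfile"}],
    tools=tools,
)

# The model either answers directly or asks us to invoke one of the declared tools.
message = response.choices[0].message
if message.tool_calls:
    for call in message.tool_calls:
        print(call.function.name, json.loads(call.function.arguments))
else:
    print(message.content)
```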

What we noticed:

  • Tool descriptions are very important, and we should prefer more in-depth descriptions with examples (see the sketch after this list);
  • Testing around tool-detection code is also important. Adding new tools to a request can confuse the LLM and cause it to no longer trigger the expected tool;
  • The LLM model used influences how the whole tool-calling functionality should be implemented, as different models might prefer descriptions written in a certain way, behave better or worse in certain scenarios (e.g. when using lots of tools), etc.
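To illustrate the first point, here is the kind of difference we mean; both tool definitions are hypothetical, but the more in-depth style with concrete examples tends to trigger the right tool far more reliably:

```python
# Two descriptions for the same hypothetical tool. The in-depth version with
# concrete examples is far more likely to be picked by the model when relevant.
terse = {
    "name": "run_container",
    "description": "Runs a container.",
}

in_depth = {
    "name": "run_container",
    "description": (
        "Start a container from an image and explain the docker run command used. "
        "Use this when the user asks to run, start, or launch a specific image, "
        "e.g. 'how can I run mongodb?' or 'start nginx on port 8080'. "
        "Do not use this for building images or editing Dockerfiles."
    ),
}
```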

Trying it out

Gordon is available as an opt-in Beta feature starting with Docker Desktop version 4.37!

To participate in the closed beta, all you need to do is fill out the form found here.

Initially, Gordon will be available both in Docker Desktop and the Docker CLI, but our idea is to surface parts of this tech in other areas of our products as well.
