Get Answers From Your PDFs: Build a Chatbot Without Coding
Imagine sitting at your desk, staring at a 100-page well completion report. You only need a few key details, but the process of scrolling through, hunting for the right data, feels endless.
It's a tedious task that eats up time and patience. Maybe you find what you’re looking for, maybe you don’t, but you know the frustration of wasted time. Now, imagine a different scenario: You simply ask a question like,
“Which logs were run?”,
“What was the cored section?”, or
“What was the DST-3 result?”
– and instantly receive the precise answers you need.
What if you could do this for any document: technical reports, reserves audits, financial statements, meeting minutes? And not just you—imagine your entire team being able to instantly query critical documents and get answers within seconds.
Today, I’ll walk you through how to build your own chatbot that can do exactly this—answer questions from PDF files—with no coding required. In just 5 minutes, you’ll be able to deploy a bot that can be embedded on any platform, such as your team’s intranet or public web pages.
Why Build Your Own Chatbot?
While there are plenty of services that offer “chat with PDF” bots, they often come with limitations. Most rely on third-party platforms or restrict customization. Building your own bot gives you the flexibility and control to adapt it to your organization’s specific needs—plus, where’s the fun in outsourcing it all?
I’ve previously demonstrated how to leverage Custom GPTs by OpenAI to build bots with an interface similar to ChatGPT. In that article, I outlined step by step how I built a PRMS Reserves Guidelines advisor. Today’s example will take things a step further: I’ll show you how to build a custom chatbot that you can integrate anywhere, without relying on external GPT libraries or third-party services.
Leveraging Google’s Vertex AI
Instead of depending on ready-made platforms, we’ll use Google’s infrastructure, particularly Google’s Vertex AI, a cutting-edge technology for building machine learning models that you can train on your own data—all without writing a single line of code. I’ll guide you through the setup, and I’ll make it really easy and practical.
Our Use Case: Clean Energy Report
To demonstrate, let’s pick a document to work with.
Each year, the Clean Energy Australia report, published by the Clean Energy Council, offers a comprehensive overview of Australia’s clean energy sector. For this example, I’ll use the 2024 Clean Energy Report, a 96-page PDF available on the Clean Energy Council’s website.
We’ll build a Clean Energy Chatbot that can instantly answer questions about the report.
The end result? A fully functioning chatbot that understands natural language queries and retrieves relevant data from a complex document—ready to streamline your work processes and save hours of manual searching.
Let’s dive in.
First, have the document ready to upload to Google Cloud Storage when needed. The file I’ll use is named “clean-energy-australia-report-2024.pdf” and, as mentioned above, can be downloaded from the Clean Energy Council’s website.
Go to cloud.google.com, log in with your Google credentials, and click the ‘Console’ link in the top right corner.
Type ‘Agent Builder’ in the search box, and click on it when it appears just below in the results list:
Next, you’ll need to enable billing and activate ‘Vertex AI Agent Builder’ to start using this technology.
Click on ‘Create New Project’ and give it a name. I’ve named mine ‘Energy ChatBot’.
After that, select the ‘Chat’ option for the agent we want to implement.
Next, you’ll enable the Dialogflow API and fill in the following prompts:
After completing the basic agent setup, it’s time to create our Data Store, which will contain the file(s) that our bot will be trained on.
Click Create Data Store, and for this example, select Cloud Storage to use files that will be uploaded to the cloud.
Since we’re building a bot to specialize in a PDF file, select ‘Unstructured’ data, then click FILE, and choose Browse to upload your document.
You’ll need to create a Bucket and, within it, a Folder to organize the data store. This will serve as the repository for our PDF file(s). You can move quickly by accepting all default settings.
To upload our PDF file to the data store, open a new tab and return to the Cloud home page. Click on ‘Console’ and find the ‘Cloud Storage’ button.
Next, navigate into the bucket and folder you just created until you find the ‘Upload files’ button. This is where you’ll browse for the PDF to upload.
Once uploaded, it will look like this:
Now, back in the tab with the Data Store, the file is available for us to select.
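If you would rather script the upload than click through the console, the step can be sketched in Python. This is only an optional illustration and the walkthrough itself needs no code. It assumes the `google-cloud-storage` package is installed and that you are already authenticated with Google Cloud. The bucket name `energy-chatbot-docs` and the folder `reports` are hypothetical placeholders, not values from this walkthrough.

```python
from pathlib import Path


def blob_path(folder: str, local_file: str) -> str:
    """Return the object name the PDF will have inside the bucket."""
    return f"{folder.strip('/')}/{Path(local_file).name}"


def gcs_uri(bucket: str, folder: str, local_file: str) -> str:
    """Return the gs:// URI of the uploaded file."""
    return f"gs://{bucket}/{blob_path(folder, local_file)}"


def upload_pdf(bucket: str, folder: str, local_file: str) -> str:
    """Upload a local PDF to Cloud Storage and return its gs:// URI."""
    # Imported here so the helpers above work without the library installed.
    from google.cloud import storage

    client = storage.Client()  # uses your default credentials and project
    blob = client.bucket(bucket).blob(blob_path(folder, local_file))
    blob.upload_from_filename(local_file, content_type="application/pdf")
    return gcs_uri(bucket, folder, local_file)


if __name__ == "__main__":
    # Hypothetical bucket and folder names; replace with the ones you created.
    print(gcs_uri("energy-chatbot-docs", "reports",
                  "clean-energy-australia-report-2024.pdf"))
```

Calling `upload_pdf(...)` with your own bucket and folder should leave the file in the same place the ‘Upload files’ button would.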
Next, name your Data Store and click ‘Create’. Then, select the Data Store and click ‘Create’ again. It should look like this:
Next, click the ‘Preview’ button on the left, then click ‘Test Agent’ at the top right. A preview window will appear where you can test a conversation with the agent. Type ‘Hello’ and see the response.
Great! The bot is responding. Under the ‘Agent Settings’ option, check the ‘Enable Conversation History’ box to keep track of all interactions.
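The same test can also be run from a script instead of the preview window. The sketch below is an assumption on my part rather than part of the console flow. It supposes the agent is reachable through the Dialogflow CX API (which the Dialogflow setup step suggests), that the `google-cloud-dialogflow-cx` package is installed, and that you are authenticated. The project, location, and agent IDs are placeholders you would copy from your own console.

```python
import uuid


def api_endpoint(location: str) -> str:
    """Dialogflow CX uses a regional endpoint for every location except 'global'."""
    if location == "global":
        return "dialogflow.googleapis.com"
    return f"{location}-dialogflow.googleapis.com"


def session_path(project: str, location: str, agent: str, session: str) -> str:
    """Build the resource name that identifies one conversation with the agent."""
    return (f"projects/{project}/locations/{location}"
            f"/agents/{agent}/sessions/{session}")


def ask(project: str, location: str, agent: str, question: str) -> list[str]:
    """Send one question to the agent and return its text replies."""
    # Imported here so the helpers above work without the library installed.
    from google.cloud import dialogflowcx_v3 as cx

    client = cx.SessionsClient(
        client_options={"api_endpoint": api_endpoint(location)})
    request = cx.DetectIntentRequest(
        # A fresh random session ID starts a new conversation.
        session=session_path(project, location, agent, uuid.uuid4().hex),
        query_input=cx.QueryInput(
            text=cx.TextInput(text=question), language_code="en"),
    )
    response = client.detect_intent(request=request)
    replies = []
    for message in response.query_result.response_messages:
        replies.extend(message.text.text)  # empty for non-text messages
    return replies
```

A call such as `ask("my-project", "global", "my-agent-id", "Hello")` (all three IDs hypothetical) should return the same greeting you see in the preview window.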
Finally, go to ‘Manage’ and select ‘Dialogflow Messenger (text)’. Enable the unauthenticated API, and you’ll receive a piece of code that looks like this:
This snippet of code allows the ChatBot to be embedded wherever you need it, such as on your organization’s intranet page (talk to your webmaster).
Once embedded, anyone can chat with the bot and ask questions about the PDF file, offering a much smoother user experience compared to digging through countless pages—don’t you think?
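If you maintain the page yourself, embedding usually means pasting the snippet just before the closing `</body>` tag. Here is a minimal Python sketch of that step, with a hypothetical helper name (`embed_snippet`) and a placeholder where the real snippet belongs. Always paste the exact code the console gives you rather than anything reconstructed from memory.

```python
import re


def embed_snippet(page_html: str, snippet: str) -> str:
    """Insert the chat widget snippet just before the last closing </body> tag."""
    matches = list(re.finditer(r"</body\s*>", page_html, flags=re.IGNORECASE))
    if not matches:
        # No </body> tag (for example an HTML fragment): append at the end.
        return page_html + "\n" + snippet + "\n"
    index = matches[-1].start()
    return page_html[:index] + snippet + "\n" + page_html[index:]


if __name__ == "__main__":
    # Placeholder only: paste the real snippet from the console here.
    snippet = "<!-- Dialogflow Messenger snippet from the console goes here -->"
    page = "<html><body><h1>Team intranet</h1></body></html>"
    print(embed_snippet(page, snippet))
```

Placing the widget at the end of the body keeps it out of the way of the page's own content, and the rest of the page is left unchanged.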
To wrap up this demo, I’ve embedded the Clean Energy ChatBot on my CrowdField website, which you can explore yourself. Click here and look for the typical chat icon in the bottom-right corner to launch the bot and give it a try.
Here are a few questions you can try asking the bot: