Build Your Own Multimodal Image Search Demo, Choose the Right AI Model for your GenAI Application, and More!

In this issue: 

  • Multimodal RAG with Milvus
  • AI Models for Your GenAI Apps
  • Milvus 🤝 Cohere
  • Upcoming Events 

🔎 Multimodal RAG with Milvus

Create your own multimodal image search demo powered by…

🐦 Milvus for efficient retrieval

👁️ Visualized BGE model for precise image processing and matching

🔁 GPT-4o for advanced ranking

See what the final result looks like in the interactive demo here

Build your own with the tutorial here
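If you want a feel for the moving parts before diving into the tutorial, here is a minimal retrieval sketch using pymilvus with Milvus Lite. The encode_image helper is a hypothetical stand-in for the Visualized BGE encoder, and the GPT-4o reranking step is omitted; the linked tutorial covers the real setup.

```python
# Minimal sketch of the retrieval step (Milvus Lite via pymilvus).
# `encode_image` is a hypothetical stand-in for the Visualized BGE encoder;
# GPT-4o reranking (from the tutorial) is omitted here.
from pymilvus import MilvusClient

client = MilvusClient("multimodal_demo.db")  # local Milvus Lite file

# Dimension must match the encoder's output size.
client.create_collection(collection_name="image_search", dimension=768)

# Index a few images: ids, vectors, and the file path as an extra field.
images = ["cat.jpg", "dog.jpg", "car.jpg"]
client.insert(
    collection_name="image_search",
    data=[
        {"id": i, "vector": encode_image(path), "path": path}
        for i, path in enumerate(images)
    ],
)

# Search with a composed image + text query and print the top matches.
results = client.search(
    collection_name="image_search",
    data=[encode_image("query.jpg", text="same object, different scene")],
    limit=5,
    output_fields=["path"],
)
for hit in results[0]:
    print(hit["entity"]["path"], hit["distance"])
```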

🫵 AI Models for Your GenAI Apps

❌ MYTH: It doesn't matter what embedding model you use.

✅ FACT: To get optimal, accurate search results, choose an embedding model that was trained on data similar to yours. Pay attention to whether it is designed for images, text, or another type of unstructured data.

Some examples:

▶️ Jina AI / jina-embeddings-v2-base-en

A specialized embedding model for English text and long documents; it supports sequences of up to 8,192 tokens.

▶️ Voyage AI / voyage-large-2

Voyage AI's general-purpose text embedding model optimized for retrieval quality (e.g., better than OpenAI V3 Large). It is also ideal for tasks like summarization, clustering, and classification.

▶️ Cohere / embed-multilingual-v3.0

Tailored for multilingual text, this model is part of Cohere's newly released Embed V3 family. It supports 100+ languages and can be used to search within a language (e.g., a French query on French documents) and across languages (e.g., a Chinese query on Finnish documents).
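To make this concrete, here is a minimal sketch of generating embeddings with the Cohere model above via the Cohere Python SDK; the API key and texts are placeholders, and Embed V3 distinguishes document embeddings from query embeddings through input_type.

```python
# Illustrative sketch: Cohere embed-multilingual-v3.0 via the Cohere SDK.
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key

docs = [
    "Milvus est une base de données vectorielle open source.",  # French
    "Milvus on avoimen lähdekoodin vektoritietokanta.",         # Finnish
]

# Documents and queries use different input_type values in Embed V3.
doc_emb = co.embed(
    texts=docs,
    model="embed-multilingual-v3.0",
    input_type="search_document",
).embeddings

query_emb = co.embed(
    texts=["什么是向量数据库？"],  # Chinese query searching non-Chinese docs
    model="embed-multilingual-v3.0",
    input_type="search_query",
).embeddings

print(len(doc_emb[0]))  # 1024-dimensional vectors
```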

Compare the models

Milvus 🤝 Cohere

Create a question-answering system based on the SQuAD dataset using Milvus as the vector database and Cohere as the embedding system.

Steps (follow the code here):

1. Prepare the dataset 

In this example, we use the Stanford Question Answering Dataset (SQuAD) as our source of truth for answering questions. The dataset comes as a JSON file, and we use pandas to load it.
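A minimal sketch of this step, assuming the SQuAD v1.1 JSON has already been downloaded locally (the file path below is a placeholder); the linked notebook does the equivalent flattening with pandas.

```python
# Sketch: flatten the nested SQuAD JSON into (question, answer) pairs.
import json
import pandas as pd

FILE = "squad_dev.json"  # placeholder path to the downloaded SQuAD JSON

with open(FILE) as f:
    squad = json.load(f)

pairs = [
    {"question": qa["question"], "answer": qa["answers"][0]["text"]}
    for article in squad["data"]
    for paragraph in article["paragraphs"]
    for qa in paragraph["qas"]
    if qa["answers"]                      # skip unanswerable questions
]
df = pd.DataFrame(pairs).drop_duplicates(subset="question")
print(df.head())
```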

2. Create a collection 

Within Milvus, we need to set up a collection and index it.
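A sketch of the collection setup with pymilvus, assuming Cohere's embed-multilingual-v3.0 (1024-dimensional vectors); the field names and index parameters are illustrative and may differ from the linked notebook.

```python
# Sketch: define the schema, create the collection, and build a vector index.
from pymilvus import (Collection, CollectionSchema, DataType, FieldSchema,
                      connections)

connections.connect(host="localhost", port="19530")

fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
    FieldSchema(name="original_question", dtype=DataType.VARCHAR, max_length=1000),
    FieldSchema(name="answer", dtype=DataType.VARCHAR, max_length=1000),
    FieldSchema(name="original_question_embedding",
                dtype=DataType.FLOAT_VECTOR, dim=1024),  # Cohere v3 dimension
]
collection = Collection("question_answering", CollectionSchema(fields))

# Index the vector field so similarity search is fast, then load the collection.
collection.create_index(
    field_name="original_question_embedding",
    index_params={"index_type": "IVF_FLAT", "metric_type": "IP",
                  "params": {"nlist": 1024}},
)
collection.load()
```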

3. Insert data

Once we have the collection set up, we need to start inserting our data. This is done in three steps:

1️⃣ reading the data,

2️⃣ embedding the original questions, and

3️⃣ inserting the data into the collection we’ve just created on Milvus.

In the example, the data includes the original question, the original question’s embedding, and the answer to the original question.
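A sketch of those three sub-steps, reusing df from step 1 and collection from step 2; the batch size and model choice are illustrative.

```python
# Sketch: embed the original questions with Cohere and insert rows into Milvus.
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key

BATCH = 96  # keep batches small to stay under Cohere's per-request text limit

for start in range(0, len(df), BATCH):
    batch = df.iloc[start:start + BATCH]
    embeddings = co.embed(
        texts=batch["question"].tolist(),
        model="embed-multilingual-v3.0",
        input_type="search_document",
    ).embeddings
    # Column order matches the schema: question, answer, embedding.
    collection.insert([
        batch["question"].tolist(),
        batch["answer"].tolist(),
        list(embeddings),
    ])

collection.flush()  # make sure the inserted data is sealed and searchable
```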

4. Answer Questions

Once all the data is inserted into the Milvus collection, we can ask the system questions by taking a question phrase, embedding it with Cohere, and searching the collection.
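A sketch of the query path, reusing the Cohere client and the Milvus collection from the previous steps; the helper name ask is hypothetical.

```python
# Sketch: embed the incoming question and search the collection for matches.
def ask(question, top_k=3):
    query_embedding = co.embed(
        texts=[question],
        model="embed-multilingual-v3.0",
        input_type="search_query",
    ).embeddings
    results = collection.search(
        data=query_embedding,
        anns_field="original_question_embedding",
        param={"metric_type": "IP", "params": {"nprobe": 10}},
        limit=top_k,
        output_fields=["original_question", "answer"],
    )
    return [
        (hit.entity.get("original_question"), hit.entity.get("answer"), hit.distance)
        for hit in results[0]
    ]

print(ask("Who founded Wikipedia?"))
```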


Follow the tutorial

Example - performing a similarity search using Cohere embeddings

In this article, we embed the query ‘Who founded Wikipedia’ and use it to search a Milvus collection.

Read more

👥 Upcoming Events

Sept 9: The AI Alliance + The Unstructured Data Meetup (in-person)

Join us and The AI Alliance for an SF meetup on Sept 9 at GitHub! Spots are limited; save yours below.

▶️ Industrial Problem-Solving through Domain-Specific Models and Agentic AI: A Semiconductor Manufacturing Case Study with Christopher Cuong T. Nguyen & Shruti Raghavan from AITOMATIC

▶️ Evaluating Safety & Alignment of LLM in Specific Domains with Zhuo Li from HydroX AI

▶️ Introduction to Llama 3.1 with Amit Sangani from Meta

▶️ AI Alliance Working group for Materials and Chemistry (WG4M) by Jed Pitera from IBM

Save your spot

Sept 10-11: CIVO Navigate Europe (in-person)

See Zilliz Developer Advocate Stephen Batifol at the following sessions:

📈 Sept 10th: Scaling Generative AI Solutions with Open-Source and K8s. We'll look at how Milvus makes it possible to do vector search at billion+ scale.

👥 Sept 11th: Panel discussion about the Berlin tech community with Sophia McKee, Nele Uhlemann, Benazir Khan, and Kadir Keles.

Reach out to him for a free ticket while supplies last! 

Register here

Sept 12: Voxel51 AI, Machine Learning and Computer Vision Meetup (virtual)

Join virtually at 10:00 AM PT and listen to talks by The Julia Language, Voxel51, and Zilliz.

“It’s in the Air Tonight. Sensor Data in RAG” with Tim Spann, Developer Advocate, Zilliz

Register 

Sept 12: AI & Tech Talks Zoom HQ (in-person)

Join Zilliz and other companies at Zoom HQ in San Jose for the second event in their developer meetup series. They’re bringing together developers, product managers, and AI enthusiasts to hear from industry leaders and dive into some of the most exciting developments in AI and technology. Frank Liu, Head of AI/ML at Zilliz, will be speaking!

Register

Sept 12: AI breaks Privacy: How PrivateGPT Fixes It (virtual)

AI tools store your prompts and documents, so compliance becomes a risk. Daniel Gallego Vico, Co-Founder of PrivateGPT, will talk about:

  • Different levels of privacy you find in B2B AI products.
  • Functionalities PrivateGPT provides for developers.
  • Ensuring your organization’s use of AI complies with data privacy regulations.

Save Your Spot
