💊 DATA Pill #104 - What can LLMs never do?, Kafka Connect: A Love/Hate Relationship
Hi,
This pill dives into the nuances of Kafka Connect, explores the challenges with LLMs, highlights efficient ETL/ELT processes, examines AI's role in reshaping data teams, and discusses the changing landscape of Terraform usage.
Also, this is your last chance to take part in our contest and win a pass to InfoShare 2024.
Good luck!
ARTICLES
Building an Efficient ETL/ELT Process for Data Delivery | 15 min | Data Science | Mateusz Kujawski | Personal Blog
This post outlines strategies for constructing a resilient ingestion and ETL/ELT process to facilitate seamless data delivery for our data platform.
Kafka Connect: A Love/Hate Relationship | 12 min | Data Streaming | Abraham Leal | Personal Blog
Apache Kafka is the leading streaming platform businesses use, with Kafka Connect facilitating data transfer to and from Kafka using pre-made, configuration-driven Connectors. This article discusses the advantages and challenges of using Kafka Connect, including practical solutions for common issues and examples with Debezium's Postgres Connector and Confluent's JDBC Sink Connector.
What can LLMs never do? | 9 min | LLM | Rohit Krishnan | Personal Blog
Despite LLMs excelling at complex tasks, they struggle with simple ones, and the reasons behind these failures still need to be clarified. This article explores LLMs' limitations, showing that their failures reveal more about their capabilities than their successes. It analyzes why models like GPT-4 and Opus fail at tasks like Wordle and cellular automata, highlighting challenges in reasoning and generalization.
SKILL LAKE
LLM Zoomcamp | 10 weeks | LLM | DataTalks.Club
It’s a free online course about real-life applications of LLMs. In 10 weeks you will learn how to build an AI bot that can answer questions about your knowledge base.
Topics:
・Introduction to LLMs and RAG
・Open-Source LLMs and self-hosting LLMs
・Vector databases
・LLM orchestration
・Monitoring and Guardrails
・Tips and Tricks
TUTORIALS
Creating a Kubernetes cluster from scratch in 1 hour using automation | 18 min | DevOps | Martin Hodges | Personal Blog
This tutorial walks through the 8 steps needed to create a Kubernetes cluster from scratch.
NEWS
Stack Overflow and OpenAI Partner to Strengthen the World’s Most Popular LLMs | 4 min | LLM | Stack Overflow
Stack Overflow and OpenAI today announced a new API partnership that will give developers the combined strengths of the world’s leading knowledge platform for highly technical content and the world’s most popular LLMs for AI development.
DATA TUBE
Cloud-Native LLM Deployments Made Easy Using LangChain | 34 min | LLM | Ezequiel Lanza, Arun Gupta | CNCF
This talk walks you through how to smoothly and efficiently move your trained models into working applications by deploying an end-to-end, containerized LangChain LLM application in a cloud-native environment. You'll learn how quickly and easily it can be done.
CONFS, EVENTS AND MEETUPS
Infoshare | Gdańsk | 22nd-23rd May
Software developers, business leaders, and tech enthusiasts gather annually in Gdańsk for Infoshare, CEE's most prominent event, to connect and evolve. This year, DataMass is collaborating with Infoshare to introduce a new stage focused on AI/ML innovation, data engineering, and cloud scalability.
Use the code SC24-DATAPill10 to get a 10% discount.
CONTEST!
Win a free Developer Pass to InfoShare!
🤔 What would be the most clickbait and most cringe presentation title for the DataMass Stage at the InfoShare Conference?
We already have some examples made by your competitors:
So, what do you need to do to win an InfoShare pass?
👉 Answer the above question
and, optionally:
👉 Subscribe to the datapill.tech weekly data & AI newsletter
👉 Follow InfoShare
🏆 Rules:
1. Submit your suggestion in the comments of this post or by sending your answer to the DATA Pill newsletter email address by 13th May, 23:55.
2. The Organizer of the contest is GetInData.
3. The winner will be chosen based on the most interesting proposal, as selected by the Organizer. We value your creativity and unique ideas, and we're excited to see what you come up with!
4. By submitting your proposal, you agree that the Organizer may use this idea for marketing purposes.
5. We will announce the winner in the comments on 14th May.
6. The Organizer reserves the right not to select a winner if the proposed answers are not distinctive, or if they are offensive or discriminatory.
_______________________
Have any interesting content to share in the DATA Pill newsletter?
➡ Join us on GitHub
➡ Dig into previous editions of DATA Pill
Adam from GetInData | Part of Xebia