DATA Pill #056 - Fine Tuning vs. Prompt Engineering LLM, Kedro-Snowflake plugin, and more…

Hi,

Your ultimate source for all things data has landed!

In this power-packed edition, I've curated a treasure trove of insights and innovations to fuel your data-driven journey.

The Kedro-Snowflake plugin, LLMs, unlocking the power of JunoDB, and much more are waiting for you to enjoy.


ARTICLES

From Image Classification to Multitask Modeling: Building Etsy’s Search by Image Feature | 7 min | Data Engineering | Eden Dolev, Alaa Awad | Etsy Blog

This image-based discovery tool is available now on Etsy’s mobile apps. Read the story of how the Etsy team took a proof-of-concept hackathon project and turned it into a production feature that helps make the millions of unique and special items on Etsy more discoverable for buyers.

Dependency Management at Scale | 5 min | Data Engineering  | Adrian Comisel | Yelp Engineering Blog

Keeping project dependencies up to date is crucial, and that is the job of Yokyo Drift. It actively scans all repositories in use at Yelp, submits pull requests that upgrade any outdated dependencies, and tracks and monitors the progress of these upgrades. Let’s take a quick look at this solution.


Run your first, private Large Language Model (LLM) on Google Cloud Platform | 16 min | LLM | Michał Bryś | GetInData | Part of Xebia Blog

Imagine you want to develop a personalized assistant, powered by a large language model (LLM), for generating financial report summaries. Guaranteeing the utmost privacy for your organization while doing so is a challenge.

In his latest blog post, Michał delves into three essential aspects:

  • The obstacles that must be surmounted to achieve this goal.
  • The approach you can adopt to construct your very own LLM-based assistant. 
  • Detailed instructions on how to implement this solution on the Google Cloud Platform.


Fine Tuning vs. Prompt Engineering Large Language Models | 9 min | LLM | Niels Bantilan | MLOps Community Blog

Let's dive into this blog post, where Niels describes prompt engineering and fine-tuning in more detail, gives a practical sense of how they are different, and provides you with a few heuristics that will help you begin your fine-tuning journey.


In MORE LINKS you will read about unlocking the Power of JunoDB, and why Modern Data Platforms don’t do ETL anymore

{ MORE LINKS }



TUTORIAL

From 0 to MLOps with ❄️ Snowflake Data Cloud in 3 steps with the Kedro-Snowflake plugin | 8 min | MLOps | Marcin Zabłocki, Marek Wiewiórka, Michał Bryś | GetInData | Part of Xebia Blog

Marcin, Marek and Michał unveil their newest Kedro-Snowflake plugin. With it, you can streamline your ML pipelines in Kedro and effortlessly execute them in a scalable Snowflake environment, and all it takes is three simple steps.


In MORE LINKS you will read about joining a streaming data source with CDC data for real-time serverless data analytics using AWS Glue, AWS DMS, and Amazon DynamoDB

{ MORE LINKS }


NEWS

Announcing NVIDIA DGX GH200: The First 100 Terabyte GPU Memory System | 4 min | AI | Pradyumna Desale | Nvidia Developer

During COMPUTEX 2023, NVIDIA made an exciting revelation by introducing the NVIDIA DGX GH200. This groundbreaking development in GPU-accelerated computing is set to revolutionize handling massive AI workloads. Apart from highlighting the critical elements of the NVIDIA DGX GH200's architecture, this announcement also explores the capabilities of NVIDIA Base Command, which facilitates swift deployment, expedites user onboarding and streamlines system management processes. 

PODCAST

Data Strategy: Key Principles and Best Practices | 56 min | Data Engineering | Guest: Boyan Angelov | DataTalks.Club Podcast

In this episode, you will discover how organizations leverage data to make informed decisions, drive innovation and gain a competitive edge. Tune in to this episode to uncover critical strategies for building a robust data foundation, optimizing data governance and unlocking the true potential of your data assets.



CONFS, EVENTS AND MEETUPS

LLMs in Production | 15-16th June | Online

Join 50 Speakers from Stripe, Meta, Canva, Databricks, Anthropic, Cohere, Redis, Langchain, Chroma, Humanloop and so many more.

It is a two-day conference of talks with some of our favorite people at the forefront of using LLMs in the wild, plus an in-person workshop in San Francisco, hosted by Anyscale, on how to build and deploy LLM-based apps.

________________________


Have any interesting content to share in the DATA Pill newsletter?

➡ Join us on GitHub

➡ Dig into previous editions of DATA Pill


Adam from GetInData | Part of Xebia



