In conversation with Kristen Kehrer

Further

Further | Own The Unknown

Published Oct 31, 2024

Welcome to the second October issue of Further's Own the Unknown™ LinkedIn newsletter, which means it is time to reflect on this month’s guest, Kristen Kehrer, and our conversation on October 16th, which is available as a recording. Twice monthly, we share some of the knowledge we've gained from our thought leadership interviews, with the second issue each month dedicated to highlights from the conversation. In our first October issue, we discussed Kristen’s background and her book.

Effective Leadership of Data Science Teams

Let’s jump right in with the first topic we discussed. Further’s Principal Data Scientist Keith McCormick asked Kristen about organizational structure and leadership structure for data science, adding the context that he’s seen many different structures including reporting up through IT, centers of excellence, and data science embedded in lines of business. Kristen had a lot to say in response.

"When the business feels like they're not getting the results they need from the data science team, it can lead to frustration, especially if there’s a perception that the team isn’t delivering. Sometimes, to address this, a department like marketing might decide to spin off and say, 'You know what, we’re going to hire our own analyst directly into the team.' But I’ve seen it happen more than once where that person—often a junior analyst—gets inundated with ad hoc requests, sometimes on very tight timelines and with limited guidance. Without the support of a structured data team, they often don’t have the proper context, scoping, or background in hypothesis testing, which leaves them without the tools to push back on unrealistic expectations. So, they end up buried in requests, and it doesn’t usually end well. The lack of alignment and understanding about what data science can deliver is part of why we sometimes see data teams struggle to demonstrate value when they’re siloed like that."

Keith followed up with a question about the alternative of a center of excellence structure.

Kristen: "I do love the idea of a center of excellence. I think, you know, we’re constantly thinking about... these are data projects, but the majority of the time when things break down, it’s due to communication, because these organizations are so large. You have stakeholders in multiple areas... and there’s intricacies everywhere. But really, trying to get those people in a room who can set that data scientist up for success and fully understand the problem, so that they can go get data and start it... it makes sense to have things centralized in a way where we’re able to go and speak to the people who need to help us."

This inspired a question about setting priorities with this structure. We posted Kristen’s reply as a video clip in our feed.

The Rise of MLOps

MLOps is a big topic, but Keith and Kristen were able to identify the period around 2016 and 2017 as the point when momentum started to build for MLOps, after which it rapidly became critical. The older approaches of tracking algorithmic experiments in spreadsheets or notebooks (which both Keith and Kristen used early in their careers) became impossible. Those techniques simply couldn’t scale to today’s complexity.

Kristen: "Ten years ago, I don’t think I was doing any of those things right, and these tools for MLOps really sort of started to come on the scene around 2017 or 2018. In 2012, when I was fitting a model, I’d be trying... if it’s time series, I’m trying all these different lags, I’m trying different ways of smoothing price, trying different variables, and I’m writing it all down in a notebook. And I remember the look of horror on my boss’s face when I gave my notice, because he knew I was taking all of what I had tried—everything that went into building these models—with me. And, you know, I do like to think that I’m pretty good with documentation... but at the same time, without the MLOps tools we have now, it was hard to retain everything."

Keith: "What we were doing ten years ago (is) impractical today."

Kristen: "And then with these MLOps platforms, you’re not just tracking experiments; you’re able to do data versioning too, so if the business says, 'Hey, that thing you did, can you analyze this metric?' I can go back and pull that exact same data set. And, I mean, it’s critical because if you try to grab a different dataset, you might end up with different numbers, and now you’re spending two weeks trying to figure out why the numbers don’t match. So, MLOps is saving time, helping with collaboration, and just making sure that we’re producing good science."
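As a rough illustration of the idea Kristen describes, the sketch below pairs each logged experiment with a fingerprint of the exact data it used, so the same dataset can be pulled back later. It is a minimal standard-library sketch, not how any particular MLOps platform works; `ExperimentLog`, `fingerprint`, and the parameter and metric values are all hypothetical names and numbers invented for the example.

```python
import hashlib
import json


def canonical(rows):
    # Serialize rows with sorted keys so identical data always yields identical text.
    return json.dumps(rows, sort_keys=True)


def fingerprint(rows):
    # The SHA-256 of the canonical text acts as the dataset's version ID.
    return hashlib.sha256(canonical(rows).encode("utf-8")).hexdigest()


class ExperimentLog:
    """Hypothetical in-memory log; a real platform would persist this."""

    def __init__(self):
        self.runs = []
        self.snapshots = {}

    def record(self, params, metrics, rows):
        # Store a frozen copy of the data alongside what was tried.
        data_id = fingerprint(rows)
        self.snapshots.setdefault(data_id, canonical(rows))
        self.runs.append({"params": params, "metrics": metrics, "data_id": data_id})
        return data_id

    def dataset_for(self, run_index):
        # Recover the exact rows a past run was fitted on.
        data_id = self.runs[run_index]["data_id"]
        return json.loads(self.snapshots[data_id])


log = ExperimentLog()
prices = [{"week": 1, "price": 9.99}, {"week": 2, "price": 10.49}]

# Made-up parameters and metric, standing in for one modeling attempt.
original_id = log.record({"lag": 2, "smoothing": "moving_average"}, {"mape": 0.12}, prices)

# Later, the "live" data changes, but the logged snapshot does not.
prices[0]["price"] = 8.99
```

Because the snapshot is keyed by its hash, a follow-up question from the business can be answered against the original rows rather than whatever the source table holds today, which is the mismatch Kristen warns can cost two weeks of reconciliation.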

Working with Large Language Models

Keith, Kristen, and our audience would all agree that we could have spoken about LLMs for much longer than a one-hour interview allows.

Kristen: "I know that I use [LLMs] frequently, and now, of course, we are using them more in business... The challenges that I’m really seeing is that then you go to move that to production and... what’s really the problem is that the data scientist or the dev gets really excited to go build something. But we don’t want the content that’s coming out to have a ton of exclamation points. We want it to be at a fifth-grade reading level so that it’s not using big words and aligns with how the business wants to communicate."

Those criteria tend to be highly specific to the business situation. For instance, customers who interact with a chatbot in English as a second language, or customers with lower levels of education, might require these kinds of adjustments or customization.
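Style criteria like the ones Kristen mentions (few exclamation points, roughly a fifth-grade reading level) can be checked automatically before generated text reaches a customer. The sketch below is one possible way to do that, not a description of Kristen's approach: it uses the standard Flesch-Kincaid grade formula with a crude vowel-group syllable count, so the grade is only approximate. The function names and default thresholds are assumptions made for the example.

```python
import re


def count_syllables(word):
    # Rough heuristic: each run of vowels counts as one syllable, minimum one.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def reading_grade(text):
    # Flesch-Kincaid grade level:
    # 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def style_violations(text, max_grade=5.0, max_exclamations=0):
    # Hypothetical thresholds; a business would set its own.
    problems = []
    if text.count("!") > max_exclamations:
        problems.append("too many exclamation points")
    if reading_grade(text) > max_grade:
        problems.append("reading level too high")
    return problems


plain = style_violations("Your money was sent. It will be there soon.")
excited = style_violations(
    "Congratulations! Your international remittance authorization is finalized!"
)
```

A check like this could sit between the model and the user, sending any response with violations back for a rewrite instead of showing it as is.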

Keith followed up with the following observation: "I get the impression that sometimes (clients or internal customers) think that basically, what’s happening is you’re taking the text that maybe was, let’s say, typed in a chat bot... as data scientists, we’re basically taking that text unaltered, sending it to the large language model, and then the model comes back with some text, mostly unaltered. So, you’ve mentioned a couple of things that indicate that that’s clearly not the case."

Of course, fine-tuning is the hot topic now, and Kristen elaborated on her technique. We encourage you to watch the entire recording for more. Also, Further’s Head of Data Science and AI, Cal Al-Dhubaib, recently gave a talk, The Key to Designing Winning Generative Experiences, which included some high-level discussion of fine-tuning.

More About Kristen

Kristen Kehrer is the coauthor, with Caleb Kaiser, of Machine Learning Upgrade: A Data Scientist's Guide to MLOps, LLMs, and ML Infrastructure, which was released in August. She also has a LinkedIn Learning course, AI and the Future of Work: Workflows and Modern Tools for Tech Leaders.

In addition to her thought leadership, she serves as Head of Decision Science and AI at MoneyGram International. Kristen has been delivering innovative and actionable machine learning solutions in industry since 2010 in the utilities, healthcare, and eCommerce sectors. She was also named #8 global LinkedIn Top Voice - Data Science & Analytics in 2018 and has 98k followers in data science. She is the creator of Data Moves Me, LLC and was previously Faculty/SME at Emeritus Institute of Management.

Upcoming interviews and events

If you haven't done so, follow Further here on LinkedIn. That's the best way to get the latest news. And go to our upcoming event with Donald Farmer, and click on "attend" so that you won't miss our November interview. You'll also be able to watch the recording in your LinkedIn feed.

Coauthors Ian Barkin and Tom Davenport will be joining us for our interview in December to discuss their new book. And if you are in the Bay Area, mark your calendar for a very special event, Leveraging Google Cloud & AI for Competitive Advantage in Marketplace, held in the Google Cloud offices in Sunnyvale on Wednesday, November 13, 2024. Finally, join the waitlist for a major sold-out event. Further is the presenting sponsor of the inaugural Ohio AI Summit powered by OhioX. That this event sold out so quickly underscores the caliber of Ohio’s amazing AI community.
