🚨We have a new working paper full of experiments on how AI affects work, and the results suggest a big impact using just the technologies available today🚨

Over the past months, I have been working with a team of amazing social scientists on a set of large pre-registered experiments to test the effect of AI at Boston Consulting Group, the elite consulting firm. The headline is that consultants using the GPT-4 AI finished 12.2% more tasks on average, completed tasks 25.1% more quickly, and produced 40% higher quality results than those without. And low performers had the biggest gains.

But we also found that people who used AI for tasks it wasn't good at were more likely to make mistakes, trusting AI when they shouldn't. It was often hard for people to know when AI was good or bad at a task because AI is weird, creating a "Jagged Frontier" of capabilities. But some consultants navigated the Frontier well, acting as what we call "Cyborgs" or "Centaurs," moving back and forth between AI and human work in ways that combined the strengths of both. I think this is the way work is heading, very quickly.

All of this was done by a great team, including the Harvard social scientists Fabrizio Dell'Acqua, Edward McFowland III, and Karim Lakhani; Hila Lifshitz-Assaf from Warwick Business School and Katherine Kellogg of MIT (plus myself). Saran Rajendran, Lisa Krayer, and François Candelon ran the experiment on the BCG side.

There is a lot more in the paper: https://lnkd.in/eZUp34CW And in the summary: https://lnkd.in/eASQ_CVr
Thank you for sharing this!
For tasks that depend on standardized knowledge, where it's relatively easy to reach peak performance, LLMs aid the most inexperienced. For tasks that depend on tacit knowledge, where peak performance is hard to achieve and there is generally lots of room to improve, LLMs uniquely accelerate experts.
Wow, those are significant gains — and it shows that using these tools can help bring everyone up to speed. Totally agree with the AI + human combo. That is going to be the future of work, and the great news is that everyone, regardless of tech background, can leverage these tools for their benefit.
Earlier research showed that consultants performed #worse than children aged 6-12 on assignments building towers from spaghetti and marshmallows, mainly due to their fixation on a #planthenexecute approach rather than the #tryandimprove approach used by kids. What was the motivation for choosing #consultants as the group to study? Their tasks appear random, vague, and insignificant to outsiders (make a spreadsheet, complete a PowerPoint), so quality is therefore hard to measure. Who defined quality: their bosses? Beads-and-mirrors productivity imho. If a job can be significantly improved using the current state of AI, it is likely a #bullshitjob (Graeber). But then we live in a #mimetic society
Definitely worth a read if you want to understand how AI is transforming work. The research was done by a number of highly respected authors. Will Meneray, you will want to read this for your foresight work in this area.
The 40% higher quality results by consultants using AI is an eye-opener. While the speed and efficiency gains were expected, the substantial boost in quality underscores the real, tangible benefits of integrating AI in professional work.
Ethan Mollick really interesting working paper. I found this fascinating about AI as a skill leveler or would you say booster? “We also found something else interesting, an effect that is increasingly apparent in other studies of AI: it works as a skill leveler. The consultants who scored the worst when we assessed them at the start of the experiment had the biggest jump in their performance, 43%, when they got to use AI. The top consultants still got a boost, but less of one. Looking at these results, I do not think enough people are considering what it means when a technology raises all workers to the top tiers of performance”.
Somehow understandable that the lower performers initially benefit the most and the stronger ones naturally struggle to improve even further. The higher the level, the smaller and more difficult the improvements are - it's like that everywhere. What I have not found and do not quite understand: did each consultant work with their own AI prompt system, or were common templates and prompt libraries provided beforehand?
The paper included a bit of Cyborg work as well. For example, our illustration of the Jagged Frontier and the 54 line graphs showing the effect of AI use on work were both rendered by ChatGPT with Code Interpreter. I provided an initial prompt & guidance, but I don’t know Python.
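As a hypothetical illustration of the kind of Python that Code Interpreter produces when asked for line graphs like those described above: a minimal matplotlib sketch, where the task labels and scores are invented placeholders, not data from the paper.

```python
# A minimal sketch of Code Interpreter-style plotting code.
# NOTE: the tasks and scores below are invented for illustration only.
import matplotlib
matplotlib.use("Agg")  # render to file, no display needed
import matplotlib.pyplot as plt

tasks = [1, 2, 3, 4, 5]                      # placeholder task indices
score_with_ai = [6.1, 6.8, 7.2, 7.0, 7.5]    # placeholder quality scores
score_without = [4.9, 5.2, 5.0, 5.4, 5.3]

fig, ax = plt.subplots()
ax.plot(tasks, score_with_ai, marker="o", label="With GPT-4")
ax.plot(tasks, score_without, marker="s", label="Without AI")
ax.set_xlabel("Task")
ax.set_ylabel("Quality score")
ax.set_title("Illustrative per-task quality (invented data)")
ax.legend()
fig.savefig("task_quality.png")
```

One such chart per condition and task, iterated with short follow-up prompts, is all it takes to produce a full set of figures without writing the Python by hand.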