From Prototype to Production, AI Accelerated
Artificial Intelligence (AI) is the most transformative technology since the internet itself ... you hear it time and time again from the smartest names in tech. Behind the magical algorithms that route our Amazon Prime packages, re-route us around traffic accidents, and auto-reply to messages with our favorite salutation, an army of data scientists is designing, writing, and deploying the algorithms that make it possible. Coiled's mission is simple: to make life easier for data scientists when they're ready to migrate their latest creation from development to production at cloud scale, by automating away the many traditional barriers to success.
Hugo Bowne-Anderson, Head of Data Science Evangelism and Marketing at Coiled, joined me via the magic of Zoom from the future in Sydney, Australia. Hugo practiced data science at the Max Planck Institute for Molecular Biology and Yale University before jumping over to the education side, co-developing and delivering data science classes with DataCamp that reached over 500,000 learners. His experience as a practitioner, instructor, and now evangelist gives him a broad perspective when evaluating the "Pythonic Data Science" landscape.
DISCLOSURE*: This interview was sponsored by Coiled. Neither Coiled nor other sponsors have editorial control over the content.
(Jump to the full-length interview by clicking on YouTube.)
Hugo detailed how Python is being used by millions to prototype all sorts of things on their laptops. The problems begin when they want to apply those models to larger data sets and migrate to production.
Hundreds of thousands, if not millions of people are using Python for data science work, and prototyping on their laptops. The big disconnect is between getting stuff from the laptop into production, on cloud or big clusters
- Hugo Bowne-Anderson
So it's a two-step: from Python to Dask for scale, then from Dask to Coiled for easier integration into production environments, whether high-performance computing (HPC) clusters or the cloud.
This is consistent with two macro innovation trends we see over and over again. First, the democratization of data and tooling gives more people the opportunity to experiment and build. It turns out that the adoption inhibitor isn't simply availability, but something slightly more nuanced and frustrating.
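As a rough illustration of the first half of that two-step, here is a minimal sketch of taking a laptop-style Python loop and expressing the same work with Dask. It assumes the open-source dask package is installed (and falls back to plain Python if it is not). The summarize function and the chunks data are hypothetical stand-ins, not anything from the interview, and the Coiled step is deliberately not shown.

```python
# Minimal sketch: the same function run eagerly, then lazily with Dask.
# `summarize` and `chunks` are made-up stand-ins for real per-file work.
from statistics import mean

try:
    import dask
    from dask import delayed
except ImportError:  # keeps the sketch runnable where Dask is not installed
    dask = None


def summarize(chunk):
    """Stand-in for per-chunk work, e.g. processing one file of a larger data set."""
    return mean(chunk)


# Hypothetical data: four small chunks standing in for four files.
chunks = [list(range(i, i + 5)) for i in range(0, 20, 5)]

# Laptop prototype: plain Python, one chunk at a time.
eager = [summarize(c) for c in chunks]

if dask is not None:
    # Dask version: build the task graph lazily, then run the tasks in parallel.
    tasks = [delayed(summarize)(c) for c in chunks]
    scaled = list(dask.compute(*tasks))
else:
    scaled = eager

print(eager)
print(scaled)
```

The function itself does not change between the two versions; only the way the calls are scheduled does, which is a large part of why the move from Python to Dask tends to feel incremental.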
There were all these tools, but they didn't know how to access them, how to choose them, ... so I wanted to figure out how to get the right tools to the right people
- Hugo Bowne-Anderson
Let the data scientists do what they do best, rather than act as accidental DevOps, MLOps, or public cloud infrastructure engineers.
Trend number two: remove undifferentiated heavy lifting. Coiled's goal is to automate away the things that stand in the way of better data scientist utilization.
Running Dask on your own laptop, really smooth experience ... but if you want to get up and running on the cloud, you've got to set up your account, credentials, authentication, etc., we abstract all of that away
- Hugo Bowne-Anderson
Beyond easing the data scientist's day-to-day, Coiled also has to serve IT and team leads, who have specific regulatory, control, security, and compliance requirements that must be met. No one wants a high-performance public cloud resource left 'ON' for days or weeks when the job only ran for a few hours. The results can be hazardous to your budget.
Our three major stakeholders are individual contributors, IT, & team leads
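To put rough numbers on that budget hazard, here is a small back-of-the-envelope sketch. Every figure in it (the hourly rate, the job length, the number of days left on) is a hypothetical assumption chosen for illustration, not a real cloud price or a number from the interview.

```python
# Hypothetical figures for illustration only; these are not real cloud prices.
HOURLY_RATE_USD = 3.00  # assumed price of one high-performance instance
JOB_HOURS = 4           # how long the job actually ran
LEFT_ON_DAYS = 7        # how long the instance stayed switched on


def cost(hours, rate=HOURLY_RATE_USD):
    """Cost of keeping one instance running for `hours` hours."""
    return hours * rate


needed = cost(JOB_HOURS)          # what the work itself required
billed = cost(LEFT_ON_DAYS * 24)  # what a forgotten instance is charged
wasted = billed - needed

print(f"needed ${needed:.2f}, billed ${billed:.2f}, wasted ${wasted:.2f}")
```

Under these assumed numbers, nearly all of the bill pays for idle time, which is why automatic shutdown matters to IT and team leads as much as convenience matters to individual contributors.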
- Hugo Bowne-Anderson
Many stakeholders, differing priorities.
This brings us to Hugo's role as Evangelist. What exactly does that mean, and what does he do all day?
Evangelism and Developer relations is one of the most important things happening at the moment ... my job is to increase the signal-to-noise ratio
- Hugo Bowne-Anderson
I Love this space, I Love all the stories, I Love the scientists
Where are we seeing this technology change the world?
They stay in a flow state with their work, they don't need to wait days (to break up their work) ... the gains in productivity are exceptional
- Hugo Bowne-Anderson
The combination of open-source software and a commercial entity that meets enterprise needs gives large organizations an innovation engine, one that draws on the contributions of data scientists all over the world who build libraries for a wide array of applications.
... Python data science packages weren't built by software engineers, they were built by research scientists who needed them for a particular task
- Hugo Bowne-Anderson
Helping broaden access and remove barriers for the data scientists building the algorithms that power the AI playing an increasing role in our lives every day.
Pretty important work in 2021. Thanks for the conversation, Hugo.
Links and References
Hugo Bowne-Anderson, LinkedIn Profile, Twitter, GitHub, DataCamp, O'Reilly
Brian Granger, Project Jupyter, Jupyter Notebook, LinkedIn
DevOps - Wikipedia
Fernando Perez, IPython, Project Jupyter, LinkedIn
High-Performance Computing (HPC) - Wikipedia
Linus Torvalds, The mind behind Linux, TED Talk, TED2016
Matt Rocklin, LinkedIn Profile, Blog, MatthewRocklin.com, GitHub
MLOps - Wikipedia
Reproducible Hyper Parameter Sweeps in Machine Learning, Pedro Rodriguez, Pedro's Blog
Resilience and Vibrancy: The 2020 Data & AI Landscape, Matt Turck, FirstMark, MattTurck.com, Sept 2020
Wes McKinney, Python Pandas, Apache Arrow, LinkedIn, WesMcKinney.com, GitHub, Twitter
Work Queue + Python: A Framework for Scalable Scientific Ensemble Applications, Peter Bui et al, University of Notre Dame
Disclaimer and Disclosure*
DISCLOSURE*: This interview was sponsored by Coiled. Neither Coiled nor other sponsors have editorial control over the content.
Quotations are attributed to the original authors and sources.
All products, product names, companies, logos, names, brands, service names, trademarks, and registered trademarks (collectively, *identifiers) are the property of their respective owners. All *identifiers used are for identification purposes only. Use of these *identifiers does not imply endorsement. Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and/or names of their products and are the property of their respective owners.
We disclaim proprietary interest in the marks and names of others. No representation is made or warranty given as to their content. User assumes all risks of use.