Kick start your company's data science practice (Part 1)
Over the past five years, a lot of noise has been generated around implementing a data science practice within a company. Quite often, I have seen companies jump into establishing this program without thinking through the long-term ramifications. In this age of open source technology disruption, companies need to be careful about which paths they choose as a go-forward strategy, so that it remains sustainable for the next several years.
There are several factors companies need to consider before attempting to get into this space:
What kind of data do you possess internally?
Understanding the data already available within the company is a critical step in the process. We have seen that when companies grow inorganically through acquisitions, the same data gets processed and churned through multiple groups. It helps to collate all of it into a central repository for further analysis. By doing so, you have started building the foundation for a data lake.
Segregate or anonymize data
Any data lake you establish should have built-in security and privacy controls to help segregate and anonymize data.
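One way such a control might look is sketched below: identifier fields are replaced with keyed hashes before a record lands in the lake. This is a minimal illustration, not a complete privacy solution. The field names, the sample record and the `SECRET_KEY` value are all assumptions made up for this example, and keyed hashing is pseudonymization rather than full anonymization, since anyone holding the key can re-derive the tokens.

```python
import hashlib
import hmac

# Hypothetical secret; in practice this would come from a secrets manager.
SECRET_KEY = b"replace-with-a-managed-secret"

# Hypothetical identifier columns, chosen only for this illustration.
IDENTIFIER_FIELDS = {"email", "customer_name"}


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed SHA-256 digest so the same input always maps to the same token."""
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


def mask_record(record: dict) -> dict:
    """Replace identifier fields with tokens and leave all other fields untouched."""
    return {
        field: pseudonymize(str(value)) if field in IDENTIFIER_FIELDS else value
        for field, value in record.items()
    }


# Made-up sample record.
raw = {"email": "Jane.Doe@example.com", "customer_name": "Jane Doe", "region": "EMEA", "spend": 1250}
masked = mask_record(raw)
```

Because the token is deterministic, records for the same customer can still be joined across groups without exposing the underlying identifier.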
Search within your enterprise for all external data partnerships
Large companies tend to buy or use external data (third-party data sources) to enhance and enrich their own data. For example, marketers may look for demographic data sold by third parties to enrich their campaigns.
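As a rough sketch of what enrichment means in practice, the snippet below attaches third-party demographic attributes to internal customer records through a shared key. The records, the `postal_code` join key and the attribute names are hypothetical and exist only for illustration.

```python
# Hypothetical internal customer records.
customers = [
    {"customer_id": 1, "postal_code": "10001", "spend": 420.0},
    {"customer_id": 2, "postal_code": "94105", "spend": 130.0},
    {"customer_id": 3, "postal_code": "60601", "spend": 75.0},
]

# Hypothetical third-party demographic data keyed by postal code.
demographics = {
    "10001": {"median_age": 36, "income_band": "high"},
    "94105": {"median_age": 33, "income_band": "high"},
}


def enrich(records, lookup, key="postal_code"):
    """Left join: keep every internal record, adding external attributes when found."""
    enriched = []
    for record in records:
        extra = lookup.get(record[key], {})
        enriched.append({**record, **extra, "matched": record[key] in lookup})
    return enriched


enriched_customers = enrich(customers, demographics)

# Share of internal records that the external source could enrich.
match_rate = sum(r["matched"] for r in enriched_customers) / len(enriched_customers)
```

Tracking the match rate is one simple way to judge how much value an external data partnership actually adds to your own data.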
Look for someone to lead your data science practice
When you make your first hire, ensure the candidate meets many of the following criteria:
- Statistical Background
- Exposure to open source tools and languages
- Exposure to machine learning algorithms
- Hands-on person who likes to code and explore
- Good communicator
- Prior leadership experience
- Worked on your specific data domain
- Implemented production ready use cases
- Public profile in competitions such as KDD, Kaggle and others
Decide on tools
There are two categories of needs, exploration and production, when you start looking for tools to implement data science within the company.
Best avenue to expedite go-to-market (GTM)
The best chance of success for GTM needs is to partner with a company that can provide platform-based consulting to identify and build your use cases. A platform-based approach should reduce the effort needed to build new use cases as you continue to mature your offerings. The external company should be able to offer seasoned consultants with past experience building these in the same or a similar domain. It should also be a long-term candidate for some kind of OEM or partnership model.
In Part 2, I am going to talk about the state of the marketplace for tools, technologies, machine learning models and other measurable KPIs.