Hyperparameter Optimization Techniques: Will AI make this conversational in the future?
Is hyperparameter tuning complex work reserved for data scientists, or have we reached a stage where sophisticated models could turn it into a conversation that leads to optimal results?
Hyperparameters: What are they and why are they important in Machine Learning?
I connected with an old friend yesterday and we were exchanging notes about our interest in Machine Learning. One topic from the conversation that caught my attention was hyperparameters. In the world of data science, this is a topic for debate. Before I get into the debate, let’s first define what a hyperparameter is.
A machine learning model has model parameters, which are learned automatically from the training data during the training process; they are not set beforehand. A hyperparameter, on the other hand, is a parameter that is not learned during training and has to be set beforehand. For example, in a Gaussian Mixture Model (GMM), the number of mixture components (clusters) N is a hyperparameter. In a neural network, the number of epochs is a hyperparameter, where an epoch is one complete pass of the entire training dataset through the training algorithm. Other examples are the learning rate, the number of nodes and layers in a neural network, and the maximum allowed depth of a decision tree. These are just a few examples; there are many more.
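To make the distinction concrete, here is a minimal sketch in plain Python. It assumes a toy one-variable linear model trained with gradient descent; the function name `train_linear_model` and the data are made up for illustration. The learning rate and the number of epochs are hyperparameters chosen before training, while `w` and `b` are model parameters learned from the data.

```python
def train_linear_model(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y ~ w*x + b by batch gradient descent on mean squared error.

    learning_rate and epochs are hyperparameters: set by us, not learned.
    w and b are model parameters: learned from the training data.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):  # one epoch = one full pass over the training data
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Illustrative data generated from y = 2x + 1 (no noise).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = train_linear_model(xs, ys, learning_rate=0.05, epochs=2000)
print(f"learned parameters: w={w:.4f}, b={b:.4f}")
```

With these settings the learned values end up close to the true slope and intercept. Changing the hyperparameters (for example, far fewer epochs) changes how close the learned parameters get, which is exactly why tuning matters.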
Each machine learning model has a set of hyperparameters and they are not equally important. So as a data scientist, it is important that we carefully choose them and tweak them. That leads to the next topic - hyperparameter tuning.
Hyperparameters directly control model structure, function, and performance; they are like dials for the model. Unfortunately, there is no silver bullet that tells us which hyperparameters and which values will provide the optimal results. Hyperparameter tuning is the slow, tedious, iterative process of searching for those values. Numerous tuning algorithms exist to help the data scientist get to the answer more quickly.
Hyperparameter optimization techniques are the methods a data scientist uses to reach the optimal result. There are many of them: manual search, grid search, random search, and Bayesian optimization, to name a few. On AWS, Amazon SageMaker lets you perform automatic model tuning. In addition, a number of Python libraries now make hyperparameter tuning simple to implement for any machine learning model.
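As a rough illustration of two of these techniques, the sketch below compares grid search and random search using only the Python standard library. Everything in it is an assumption made for the example: the toy linear model, the data, the candidate values, and helper names such as `evaluate` and `best_of`. Real projects would normally use a tuning library rather than hand-written loops.

```python
import itertools
import random


def train(xs, ys, learning_rate, epochs):
    """Fit y ~ w*x + b by batch gradient descent (toy model for the demo)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mse(w, b, xs, ys):
    """Mean squared error of the model on the given points."""
    return sum((w * x + b - y) * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)


# Illustrative training and validation data from y = 2x + 1.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [2 * x + 1 for x in train_x]
val_x = [0.5, 1.5, 2.5, 3.5]
val_y = [2 * x + 1 for x in val_x]


def evaluate(config):
    """Train with one (learning_rate, epochs) pair and score it on validation data."""
    learning_rate, epochs = config
    w, b = train(train_x, train_y, learning_rate, epochs)
    return mse(w, b, val_x, val_y)


def best_of(configs):
    """Return (validation_loss, config) for the best configuration tried."""
    return min(((evaluate(c), c) for c in configs), key=lambda pair: pair[0])


# Grid search: try every combination of a fixed set of candidate values.
grid = list(itertools.product([0.001, 0.01, 0.05], [50, 200, 1000]))
grid_loss, grid_config = best_of(grid)

# Random search: sample the same number of configurations at random.
# The learning rate is drawn log-uniformly between 0.001 and 0.1.
rng = random.Random(0)  # fixed seed so the run is repeatable
random_configs = [
    (10 ** rng.uniform(-3, -1), rng.randint(50, 1000)) for _ in range(len(grid))
]
random_loss, random_config = best_of(random_configs)

print("grid search best:  ", grid_config, grid_loss)
print("random search best:", random_config, random_loss)
```

Both searches spend the same budget of nine training runs. Grid search is limited to the values listed up front, while random search can land on values in between, which is one reason it is often preferred when only a few hyperparameters really matter.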
So, back to my earlier point about the debate: there are two points of discussion.
1. Are data scientists just programmers who tweak models with these hyperparameters to find the best answer? If so, couldn’t we simply teach someone to do this and remove the need for highly skilled data scientists? Yes, this is controversial, and that's why it is a point for debate. Obviously, the ones who really do this job well are highly skilled in math and understand the details behind the models. However, with every university trying to spit out data scientists and engineers, are we just getting a bunch of programmers who can write good code to find the optimal result?
2. Have we reached a state where complex models can be created to take over the data scientist's work of finding the optimal result? Will models be built in the future that present options directly to business and product owners? That would turn tuning into a conversation with the models to find the optimal result.
I don’t know the answer to these two questions, but the future does look different as we keep growing and learning more about these models and their usage.
Thank you for taking the time to read through this article. I welcome you to share your feedback and thoughts in the comments. Obviously, I am expecting some extreme comments, but let’s keep it lively and inclusive so everyone can share their opinion. Let’s go!!!
AI Engineer | Web Developer | Certified Data Scientist
10mo
Guys. It will be automated, with human oversight. Just like autopilot in airplanes. Tuning hyperparameters & optimization requires reasoning, which requires knowing the ground-truth. So be my guest and p-value hack your way to better performance - yet you have to always ask - "does it make sense?".
Co-Founder of Altrosyn and Director at CDTECH | Inventor | Manufacturer
10mo
You mentioned the evolving landscape of hyperparameter tuning and the potential automation of this process. Similar to how automation has impacted various industries, the history of automation in AI tasks indicates a trend towards increased efficiency. For instance, automated feature engineering tools have streamlined data preparation. However, the nuanced decisions involved in hyperparameter tuning might still require human expertise. Considering the ongoing advancements, do you foresee a complete shift towards automation, or do you believe a harmonious collaboration between automated tools and skilled data scientists will be the future norm in optimizing machine learning models?