How does AI work? A 3-minute overview for non-engineers.
Odds are that you’ve heard someone talk about “Artificial Intelligence” or “neural networks” recently, especially in light of the cultural phenomenon known as ChatGPT.
If you are like most non-engineers, you may have no idea what that actually means.
Hopefully, this will give you a general idea of what someone is really talking about when they mention a “neural network”. NOTE: This is a very high-level overview and there are a lot of different variants and methodologies for how these models can work.
One of the primary goals of “Artificial Intelligence” is using past data to teach a computer how to make better predictions and smarter decisions (based on whatever your goal is).
A few examples of this would be predicting stock prices, identifying cancerous tumors, building chatbots that emulate humans, driving cars autonomously, estimating housing prices, etc.
Linear and logistic regression models were very popular methods for solving these kinds of predictive problems before the AI boom and the availability of massive computing power.
For example, if you were trying to predict housing prices you might believe that a home’s price depends on the average price of houses around it, the quality of the school district it’s in, interest rates, etc.
You could convert this into a linear regression model that might look something like:
Ax1 (average price of neighboring houses) + Bx2 (school district) + Cx3 (interest rates) = Y (price of the house)
A, B, C represent the respective weights you place on each of those factors in predicting the price of the house.
As a data scientist, your goal would be to figure out the optimal values for A, B, C that get your predictions closest to the actual home prices (Y).
If you had a big set of historical data about home prices, you’d hopefully be able to make fairly accurate future predictions once you determined the optimal values for A, B, C (i.e. how much to weight each variable).
Fortunately, there are better ways to find the best values for A, B, C than the classic “guess and check” method we all learned in middle school, which would take forever.
There is a popular mathematical method called “gradient descent” that will help determine what the best values for A, B, C should be so that your model best fits your historic data set.
I won’t go into any detail about how it works here, but it would be a good thing to Google if you are curious.
It’s a very important concept in this field.
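To give a flavor of the idea, here is a minimal sketch of gradient descent for the housing example, written in Python. Everything in it is made up for illustration: the handful of houses, the prices, and the learning rate are all hypothetical, and real projects use libraries that handle this step for you.

```python
# A hypothetical sketch of gradient descent for the model A*x1 + B*x2 + C*x3 = Y.
# The data points and learning rate below are invented for illustration.

# Historic data: each row is (x1, x2, x3) for one house; prices holds each Y.
houses = [
    (300_000, 8, 0.05),   # avg neighbor price, school rating, interest rate
    (450_000, 9, 0.04),
    (250_000, 6, 0.06),
]
prices = [310_000, 470_000, 240_000]

A = B = C = 0.0          # start with arbitrary weights
learning_rate = 1e-12    # tiny step size, because the inputs are on
                         # wildly different scales in this toy example

for step in range(10_000):
    for (x1, x2, x3), y in zip(houses, prices):
        prediction = A * x1 + B * x2 + C * x3
        error = prediction - y          # how far off the guess is
        # Nudge each weight in the direction that shrinks the error.
        A -= learning_rate * error * x1
        B -= learning_rate * error * x2
        C -= learning_rate * error * x3

print(A, B, C)  # the fitted weights for the linear model
```

Each pass nudges the weights slightly toward values that make the predictions less wrong, which is the whole trick behind “descending” toward the best fit.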
In any case, the outcome of your efforts to find the best values for A, B, C (variable weights) would create an equation that looks something like:
0.3x1 + 7.1x2 - 2.5x3 = Y
This would be your linear regression equation that best fits your historic home price data.
To describe it in English: for each house whose price you want to predict, you would input that house’s x1 (average price of neighboring houses), x2 (school district), and x3 (interest rates) values into the equation above, and it would output the predicted price.
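To make that concrete, here is a tiny hypothetical Python snippet that applies the fitted equation (the function name and the input numbers are made up for illustration, not real housing data):

```python
# Applying the hypothetical fitted equation 0.3*x1 + 7.1*x2 - 2.5*x3 = Y
def predict_price(x1, x2, x3):
    return 0.3 * x1 + 7.1 * x2 - 2.5 * x3

# A house where neighboring homes average $400,000, the school-district
# score is 9, and interest rates are 5%:
print(predict_price(400_000, 9, 0.05))  # -> the model's predicted price
```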
Logistic regression models are different from the linear regression example above, but they share the same requirement: the person creating the model must define the inputs they think are important (i.e. which pieces of data x1, x2, x3 should be used to make the most accurate equation).
The problem with both of these techniques is that they require a lot of imperfect guessing by a human about which inputs matter for an accurate model (and even whether combinations or quadratic versions of those inputs should be used, e.g. “interest rates²” or “school district × interest rates”).
That is too much input guessing, and such hand-built models typically lack the complexity a neural network model can have.
Human minds aren’t designed to accurately build extremely complicated data models with a wide array of non-obvious inputs to make predictions.
This is where neural networks are different.
They don’t require a person to define the inputs for a model.
So no one is on the hook to figure out whether school districts, interest rates, average home prices in the area, weather, etc. are the main variables in predicting a home price.
The neural network handles this for you.
It takes your set of historic data and runs a set of calculus-based procedures (if you want to Google, check out “forward propagation” and “backward propagation”) that figure out not only what the values for A, B, C should be (the weights of your inputs) but also which inputs are important (i.e. x1, x2, x3, x4…xN).
So to say it one more time, you no longer need to guess what factors you think impact a home price.
The ‘neural network’ does this part.
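For the curious, here is a toy sketch in Python (using NumPy) of what forward and backward propagation look like for a tiny network with a single hidden layer. The data, layer sizes, and learning rate are all invented for illustration; this is nothing like production code, but the two phases are the real mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 4 houses, 3 raw inputs each, and their (made-up) prices.
X = rng.random((4, 3))
y = rng.random((4, 1))

# One hidden layer with 5 nodes: two weight matrices instead of one.
W1 = rng.standard_normal((3, 5)) * 0.1
W2 = rng.standard_normal((5, 1)) * 0.1
learning_rate = 0.1

for step in range(1000):
    # Forward propagation: push the inputs through each layer in turn.
    hidden = np.maximum(0, X @ W1)      # ReLU lets the network bend lines
    prediction = hidden @ W2

    # Backward propagation: work out how much each weight contributed to
    # the error, layer by layer, from the output back toward the input.
    error = prediction - y
    grad_W2 = hidden.T @ error
    grad_hidden = (error @ W2.T) * (hidden > 0)  # ReLU derivative
    grad_W1 = X.T @ grad_hidden

    W1 -= learning_rate * grad_W1
    W2 -= learning_rate * grad_W2
```

The forward pass makes a prediction; the backward pass traces the error back through every connection so each weight, in every layer, gets its own nudge.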
The reason it’s called a “neural network” is that its architecture is somewhat similar to the human brain’s.
The human brain has a network of billions of “neurons” which are interconnected nodes that allow people to process information.
A neural network is likewise made up of many interconnected nodes organized into “layers”.
The more data you feed a neural network, the smarter and more accurate it can become.
In the linear regression example, Ax1 + Bx2 + Cx3 = Y could be considered the equivalent of one layer in a neural network.
But unlike a linear model, a neural network can have many interconnected layers, allowing for the creation of much more complex and accurate models (kind of similar to how a human brain works…but still not nearly as complex).
For example, there is no way a person would realize that an input like “interest rate³ × average home prices × school district⁵” is a helpful predictor of a home price.
How could someone possibly know that?
But neural networks can uncover this type of input on their own.
Because of this, it’s worth noting that a person inspecting a neural network likely would not know what the actual inputs mean in the human terms that I described above.
This poses interesting legal questions about liability since no one really knows what exactly is happening inside a neural network.
There are a lot of different types of neural networks (feedforward, recurrent, convolutional, etc.), and they have different use cases and advantages.
This turned out to be a little longer than I had hoped for.
If you take anything away from this article it should be that neural networks don’t require a person to guess the inputs of a data model.
This alone is a quantum leap in how accurate models can become, and it vastly expands the types of things that can be modeled.
It’s an arms race for access to data and processing power.
Google is making a big push to be a leader in the space.
GCP (Google Cloud Platform) has a lot of dedicated machine learning services and the company also created TensorFlow, which is one of the most popular software libraries for creating neural networks.
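As a taste of what that looks like in practice, here is roughly how you might define a small multi-layer neural network with TensorFlow’s Keras API. The layer sizes are arbitrary, and the fit call is commented out because it assumes hypothetical data that doesn’t exist here:

```python
import tensorflow as tf

# A small feedforward network: three inputs (like our x1, x2, x3),
# two hidden layers, and one output (the predicted price).
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(3,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),
])

model.compile(optimizer="adam", loss="mean_squared_error")
# model.fit(historic_inputs, historic_prices, epochs=100) would then run
# forward and backward propagation over your data for you.
```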