DeepMind .. ready to blow your mind !

Some concepts are just complex: we grasp them at the edges, but it’s really hard to fully understand them or appreciate their implications. Artificial Intelligence (AI) can be a bit like that at times, but one AI story caught public attention like no other: the success of DeepMind’s AlphaGo in beating Lee Sedol, one of the world’s top-ranked Go players. But why was this such a big deal? And, more importantly, why is DeepMind’s recent announcement, which has gathered hardly any attention at all, a much bigger deal?

Unlike chess, where ‘brute force’ AI (searching huge numbers of potential moves in a decision tree) can work, the complex strategic nature of Go prohibits such an approach. For example, once each player has made their first move there are 400 possible positions in chess; in Go there are about 130,000! And it only gets more complicated from there :-)
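The figures above come from simple counting, and a quick sketch makes the gap concrete. It assumes the standard chess opening position and an empty 19x19 Go board; the variable names are purely illustrative.

```python
# Count the positions reachable after each player has made one move.

# Chess: White has 20 legal first moves (8 pawns x 2 squares each,
# plus 2 knights x 2 squares each), and Black has the same 20 replies.
chess_first_moves = 8 * 2 + 2 * 2
chess_positions = chess_first_moves * chess_first_moves

# Go: an empty 19x19 board has 361 points. Black may play on any of them,
# then White may play on any of the remaining 360.
go_points = 19 * 19
go_positions = go_points * (go_points - 1)

print(chess_positions)  # 400
print(go_positions)     # 129960
```

So the often-quoted 130,000 is 361 x 360 rounded up, and the ratio between the two games keeps growing with every further move.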

So the success of AlphaGo became a very potent and comprehensible example of artificial intelligence or 'computers starting to think like humans'.

But how did AlphaGo do this? In simple terms, it was essentially just trained. AlphaGo was fed a database of over 30 million expert moves, so it effectively had the same ‘experience’ as someone playing the game for 80 years! While this ‘supervised learning’ approach has led to huge advances in AI (speech recognition, image classification etc.), it has one very obvious limitation - what happens when there is no human expertise or large dataset to train on?

Enter DeepMind’s new algorithm - AlphaGo Zero.

AlphaGo Zero uses a ‘reinforcement learning’ approach. Unlike AlphaGo, which was trained on millions of expert moves, AlphaGo Zero skips this step: it starts with only the rules and learns by playing against itself, over and over again, each time getting a little better - in effect, it becomes its own teacher.
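To get a feel for what ‘learning by playing against itself’ means, here is a minimal sketch on a toy game. It is not DeepMind’s algorithm: AlphaGo Zero combines a deep neural network with tree search, whereas this sketch swaps in a simple lookup table and a stone-taking game (single-pile Nim, where whoever takes the last stone wins). The function name, the parameters and their values are all assumptions made for illustration.

```python
import random


def train_self_play(pile=10, max_take=3, episodes=50_000,
                    epsilon=0.1, lr=0.05, seed=0):
    """Learn single-pile Nim purely from self-play (last stone wins)."""
    rng = random.Random(seed)
    # q[(stones, take)]: estimated outcome (+1 win, -1 loss) for the player
    # who takes `take` stones when `stones` remain. No expert games are used.
    q = {(s, a): 0.0
         for s in range(1, pile + 1)
         for a in range(1, min(max_take, s) + 1)}

    def greedy(stones):
        # Pick the move that currently looks best for the player to move.
        return max(range(1, min(max_take, stones) + 1),
                   key=lambda a: q[(stones, a)])

    for _ in range(episodes):
        # Assumption: random starting piles, so every position gets visited.
        stones = rng.randint(1, pile)
        history = []  # (stones, take) for every move; the players alternate
        while stones > 0:
            if rng.random() < epsilon:
                take = rng.randint(1, min(max_take, stones))  # explore
            else:
                take = greedy(stones)                         # exploit
            history.append((stones, take))
            stones -= take
        # Whoever moved last took the final stone and won. Walk backwards,
        # flipping the sign because the two sides alternate moves.
        outcome = 1.0
        for state_action in reversed(history):
            q[state_action] += lr * (outcome - q[state_action])
            outcome = -outcome

    policy = {s: greedy(s) for s in range(1, pile + 1)}
    return q, policy


q_values, policy = train_self_play()
print(policy)
```

In this game the known winning strategy is to leave your opponent a multiple of four stones, so the learned policy can be checked against it for any pile that is not already a multiple of four. The same table plays both sides of every game, which is the ‘own teacher’ idea in miniature.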

And the results? 

  • After 3 hours, AlphaGo Zero plays like a human beginner - focusing on short-term stone capture rather than long-term strategy
  • After 19 hours, AlphaGo Zero has learnt the fundamentals of more advanced Go strategies
  • After 70 hours, AlphaGo Zero is now playing at super-human level 
  • And in a little over 3 days, AlphaGo Zero surpasses the AlphaGo version that beat Lee Sedol - beating it 100 games to 0!

Here I believe we glimpse the future of AI - just think about the following two things: 

Firstly, think about the problems that need to be solved where human knowledge (or data) is unavailable or too expensive/unreliable - we can now start to tackle those with the help of AI. In the words of Demis Hassabis, DeepMind’s co-founder, AlphaGo Zero can provide ‘a thinking engine for scientific research’.

Secondly, one of the reasons AlphaGo Zero is so much better than previous versions is that it is not constrained by human knowledge. Grasping this opens the door to a world where insight and intelligence will likely be generated independently of humans!

Personally, I think this is truly amazing, and maybe also just a little like the ‘Skynet’ countdown from Terminator :-) but what do you think? 

Kelvin Gillen

CIO & Transformation Director

7y

Spot on, Dave Ferguson. Biased (and opaque) algorithms are increasingly a real concern - see the recent comments by John Giannandrea (Head of AI at Google) in MIT Technology Review.

Hi Kelvin, another thought-provoking post! And to think only a 'few' years ago we were impressed when we got a different workflow to trigger depending on which systemised task was closed by a user... One area of AI that I feel always challenges is its potential for bias (or indeed worse). Supervised learning approaches rely on the data available to them, so they can suffer because of what data they are fed. From what I understand, reinforcement techniques rely on creating a framework of rewards (in addition to the rules) so the algorithm knows what good looks like... --> some years hence, "Thank you Skynet for all you give us"... There is also interesting research on inverse reinforcement learning, whereby the rewards are inferred from observed behaviour, which is worth a look. Irrespective, this whole area and its potential for positive impacts on humanity is fascinating!
