Author Topic: Artificial intelligence bests humans at classic arcade games (Read 10265 times)

mouser · « **on:** February 26, 2015, 12:09 PM »

There has been some buzz recently around a few articles that demonstrate machine learning in the video game domain.

Here's one writeup:

Artificial intelligence bests humans at classic arcade games
http://news.sciencem...classic-arcade-games

For the academically inclined, I would recommend:
Playing Atari with Deep Reinforcement Learning
http://www.cs.toront.../~vmnih/docs/dqn.pdf

Which talks in detail about the methods used.

The term "deep" seems to me to be mostly a catchy label that has gone viral and is being hyped like mad, with little innovation behind it. Still, the new wave of practitioners using neural networks for large-scale problems is getting undeniably impressive results.

Again, getting back to the video game results:

There is nothing particularly novel in the approach, but the domain is wonderful, and the basic focus on using the same architecture and parameters to tackle a large collection of learning problems, with high-dimensional raw input, is great. And the results are impressive. Again, in my mind this is more a story of the new wave of practitioners who are getting very good at leveraging fairly standard neural network techniques on larger and larger problems.

Having said that, this line of work offers little qualitative improvement on the hard problems in AI -- on serious multiscale hierarchical planning, scene recognition, etc. For that we are still waiting for some paradigm shifts.

TaoPhoenix · « **Reply #1 on:** February 26, 2015, 01:03 PM »

My enthusiasm for AI often exceeds my editing, but here's a couple of thoughts.

Certainly "Q or Reinforcement Learning" feels like a bit of partly complicating the obvious. So it seems that from what little I know of chess programming, they can't guarantee the best move, so they "monte-carlo simulate it" - aka run scads of iterated tests and then the program "tends to notice that such a certain X move tends to lose or win". Sometimes the other side escapes, but it's that "tends" that matters.
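To make the "run scads of iterated tests" idea concrete, here is a minimal sketch of Monte Carlo move evaluation. It uses a toy take-away game (take 1 to 3 stones, whoever takes the last stone wins) rather than chess, and all names in it are made up for illustration. Random playouts like this are more commonly associated with Go programs than with classic chess engines, which mostly rely on tree search.

```python
import random

def random_playout(pile, my_turn, rng):
    """Play random legal moves until the pile is empty.

    Returns True if 'we' took the last stone (and therefore won).
    """
    while pile > 0:
        pile -= rng.randint(1, min(3, pile))
        if pile == 0:
            return my_turn
        my_turn = not my_turn
    # The pile was already empty, so whoever moved just before has won.
    return not my_turn

def estimate_win_rates(pile, playouts=2000, seed=0):
    """Estimate, for each legal first move, how often it tends to win."""
    rng = random.Random(seed)
    rates = {}
    for take in range(1, min(3, pile) + 1):
        # After our move it is the opponent's turn, hence my_turn=False.
        wins = sum(random_playout(pile - take, False, rng) for _ in range(playouts))
        rates[take] = wins / playouts
    return rates

def best_move(pile, playouts=2000, seed=0):
    """Pick the move with the highest estimated win rate."""
    rates = estimate_win_rates(pile, playouts, seed)
    return max(rates, key=rates.get)

if __name__ == "__main__":
    print(estimate_win_rates(5))
    print("best first move from 5 stones:", best_move(5))
```

No single playout proves anything; it is the average over many playouts that separates moves that "tend to win" from moves that "tend to lose".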

So in Space Invaders, if you get stuck on the side, you "tend to get trapped" because you're missing half your movement range. In certain conceptual ways, that feels like "sorta easy" programming to me.

What I don't see is any interaction with "precursor tutorials" such as if your friend comes over and hangs out with pizza for an hour to show you stuff. You still have to play the game, but it sounds like the games tested were "easy to play with clever middle level tricks". So unlike hardcoded strategies, you make your friend's suggestions "a hypothesis" - that's how he always played, so the computer looks there first with at least a baseline. Then some of the friend's suggestions turn out to be sub-optimal. (I think that was called H:0 and H:A in statistics. Yes?) PacMan sounds like it would be a good test here.

Take a game where the human says "I don't know what I am doing" regarding gameplay and I bet the computer will get stuck.

mouser · « **Reply #2 on:** February 26, 2015, 01:18 PM »

I don't mean to sound harsh, and please don't take offense, but I don't think it's helpful saying things like "Q or Reinforcement Learning feels like a bit of partly complicating the obvious" without understanding the math and foundation for these algorithms. Q-learning and other reinforcement learning techniques are elegant, efficient, and based on very sound principles. They aren't the holy grail of human-level intelligence but they are very elegant algorithms. There are great books on this stuff for those who want to learn about it. The now-classic book on reinforcement learning is Sutton and Barto's Reinforcement Learning: An Introduction, which I recommend.
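For anyone curious what Q-learning looks like in practice, here is a minimal tabular sketch. The five-cell corridor environment, the hyperparameters, and the function names are all assumptions made up for illustration; this is not the deep network setup from the Atari paper, just the basic update rule that setup builds on.

```python
import random

N_STATES = 5      # corridor cells 0..4; arriving at cell 4 ends the episode
ACTIONS = (0, 1)  # 0 = step left, 1 = step right

def step(state, action):
    """Toy environment: reward 1.0 only for arriving at the rightmost cell."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[state])          # exploit, breaking ties at random
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            nxt, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best value of the next state.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q

def greedy_policy(q):
    """Best action in each non-terminal cell according to the learned table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]

if __name__ == "__main__":
    table = train()
    print(greedy_policy(table))
```

The agent is never told that "right" is good. It only sees a reward at the far end, and the update propagates that value backwards one cell at a time.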

ps. Your idea to use an expert to initialize training and start as a baseline is an area of active research in current AI -- and in fact was part of the early days of AI.

Deozaan · « **Reply #3 on:** March 29, 2016, 04:45 PM »

NECRO-THREAD ARISE!

Someone wrote a neural network for Super Mario World and explains a bit about how it works.

MarI/O - Machine Learning for Video Games

EDIT:

And the follow up:

Deozaan · « **Reply #4 on:** July 03, 2018, 12:22 AM »

NECRO-THREAD ARISE AGAIN!

Machine learning AIs are starting to get better, and can now defeat humans in 5v5 Dota 2 matches!

Our team of five neural networks, OpenAI Five, has started to defeat amateur human teams at Dota 2. While today we play with restrictions, we aim to beat a team of top professionals at The International in August subject only to a limited set of heroes. We may not succeed: Dota 2 is one of the most popular and complex esports games in the world, with creative and motivated professionals who train year-round to earn part of Dota’s annual $40M prize pool (the largest of any esports game).

OpenAI Five plays 180 years worth of games against itself every day, learning via self-play. It trains using a scaled-up version of Proximal Policy Optimization running on 256 GPUs and 128,000 CPU cores — a larger-scale version of the system we built to play the much-simpler solo variant of the game last year. Using a separate LSTM for each hero and no human data, it learns recognizable strategies. This indicates that reinforcement learning can yield long-term planning with large but achievable scale — without fundamental advances, contrary to our own expectations upon starting the project.
-https://blog.openai.com/openai-five/

A really interesting (and long) read.

https://blog.openai.com/openai-five/
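For context on the Proximal Policy Optimization mentioned in the quote: its core is a clipped surrogate objective that stops any single update from moving the policy too far from the one that collected the data. Below is a minimal sketch of just that objective in plain Python. The function name and the `epsilon=0.2` default are illustrative choices, and this is in no way OpenAI Five's actual training code, which adds a value loss, advantage estimation, and a great deal of distributed machinery.

```python
def ppo_clip_objective(ratios, advantages, epsilon=0.2):
    """Mean over a batch of min(r * A, clip(r, 1 - eps, 1 + eps) * A).

    ratios:     new_policy_prob / old_policy_prob for each sampled action
    advantages: how much better each action was than expected
    """
    if not ratios or len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must be non-empty and equal length")
    total = 0.0
    for r, adv in zip(ratios, advantages):
        clipped = min(max(r, 1.0 - epsilon), 1.0 + epsilon)
        # Taking the minimum makes the objective a pessimistic bound:
        # there is no extra credit for pushing the ratio outside the clip range.
        total += min(r * adv, clipped * adv)
    return total / len(ratios)

if __name__ == "__main__":
    # A good action whose probability already rose 50% is only credited up to +20%.
    print(ppo_clip_objective([1.5], [1.0]))
    # A bad action whose probability was halved is only credited down to -20%.
    print(ppo_clip_objective([0.5], [-1.0]))
```

In training, this value is maximized by gradient ascent on the policy parameters that determine the ratios.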

mouser · « **Reply #5 on:** July 03, 2018, 12:30 AM »

Interesting stuff, thanks for sharing

Here are two recent AI articles cited on that page for people who want to learn more:

Asudem · « **Reply #6 on:** December 26, 2018, 02:51 PM »

Let's exploit really old games for MAXIMUM points!

An AI has managed to cheat with the best humanity has to offer after discovering an exploit in classic arcade game Q*bert and running with it.

While earlier iterations of the AI would play Q*bert properly, at some point in its learning of how the game works, it discovers an exploit that lets it rack up insane points. Naturally, as any score-hunting player would, it repeats the process so it can boost its score in the most effective way possible.

You can see the AI working its way around platforms in the video below. At first, it looks as if it’s aimlessly jumping between platforms. Instead of seeing the game progress to the next round, Q*bert becomes stuck in a loop where all its platforms begin to flash – it’s here the AI can then go on a score-frenzy racking up huge points.

At what point could an AI, after completing its goal in an alternate way like this, continue "training" if something like quality assurance at the source level fails due to human error? Interesting times.
