11/05/2007

A Game Too Human

by Jenna Beck

This story won The Comment 2007 feature writing contest and appears below with permission of The Comment's editor.

 

In the faculty break room of a junior high school in Japan, Principal Iwasa points to a laptop computer screen. “There,” he tells the junior teacher, who taps an optical mouse to place a digital black stone on the game board. Instantly, a white stone appears at another point: the computer has made its move. At the beginning of the game, Iwasa Kocho Sensei admired the computer for its speed. Now, he is altogether exasperated with his adversary. “This machine,” he emphasizes the noun, “does not know alive and dead!” The game finishes in a few minutes, an easy win for Iwasa.

Computers are good at games. Games, with their definite rules and discrete, win-or-lose outcomes, are the stuff of binary dreams. But there's an exception: the eastern game of Go. Go is played on a 19x19 grid with black and white stones. The object of the game is to keep more of your stones alive and to secure more territory on the board than your opponent. Though it is a win/lose game that follows simple rules, in 30 years of trying, no one has been able to create a computer program that can beat even intermediate human players. The stubborn game seems to be too human for computers. The response: make computers more human.

Good human Go players rely on such nebulous powers as judgment, intuition, and pattern recognition to evaluate the life and death of their stones during a game. An apparently “dead” group can come alive with the placement of a single stone, and an apparently “live” group can be killed just as easily. Go players rely so heavily on intuition and experience to evaluate their plays that the game seems, in those respects, to be unteachable. David Fotland, an advanced Go player who also created one of the best commercial Go programs for computers, says that “You can’t write down a set of rules and directions that would make someone play a good game of Go. Good players are drawn to the right place on the board.” If you can’t write the directions for humans, you can’t write them for computers.

Beyond Blue 

When IBM’s Deep Blue computer defeated the world chess champion Garry Kasparov in 1997, its programmers used two classical programming methods known as tree search and evaluation. The evaluation function is the key to a program’s success because it shows the computer which moves are better than others. Chess programs play by searching hundreds of thousands of possible moves and evaluating each one to see which would lead to the highest score. If a programmer can write a good evaluation function for a game, the computer will beat out a human player by brute force because it can search through so many more possibilities.

Go is a far more difficult game to evaluate than chess because there are more potential moves and because the life or death of the stones often remains unfixed until the end of the game. If a player thinks he will be able to kill an enemy group of stones later on, he will judge the group to be dead and use his current move to capture more territory or attack the enemy in a different place. Throughout most of the game the life and death of the stones hangs in the realm of human judgment. The challenge for modern Go programmers is to create the evaluation function that replaces that judgment in their computers.
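To make tree search and evaluation concrete, here is a minimal Python sketch on a toy game far simpler than chess or Go: players alternately take one to three stones from a pile, and whoever takes the last stone wins. The game, the function names, and the search depth are all illustrative assumptions, not anything taken from Deep Blue. The toy evaluation function happens to be exact, which is precisely the luxury Go programmers lack.

```python
def evaluate(pile):
    """Hypothetical evaluation function: score a position for the player to move.

    In this toy game a pile that is a multiple of 4 is lost for the player
    to move, so the evaluation is exact. Nothing this tidy is known for Go.
    """
    return -1 if pile % 4 == 0 else 1


def search(pile, depth):
    """Depth-limited tree search; returns the score for the player to move."""
    if pile == 0:
        return -1  # the opponent just took the last stone, so we have lost
    if depth == 0:
        return evaluate(pile)  # stop searching and trust the evaluation
    # Our score is the best of the negated scores our opponent can reach.
    return max(-search(pile - take, depth - 1)
               for take in (1, 2, 3) if take <= pile)


def best_move(pile, depth=4):
    """Pick the number of stones to take that leads to the highest score."""
    moves = [take for take in (1, 2, 3) if take <= pile]
    return max(moves, key=lambda take: -search(pile - take, depth - 1))


if __name__ == "__main__":
    print(best_move(10))
```

With a reliable evaluation, even a shallow search finds the right move. With a poor one, searching deeper only spreads the error across more positions.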

Digital Neurons

For some, this means humanizing the programs. Programmer Markus Enzenberger from the University of Alberta created a program called “NeuroGo,” which won a medal in a computer Go tournament in 2003. As its name suggests, NeuroGo is a program that emulates the function of the neural networks found in human brains.

Just as a certain amount of stimulation will trigger an electrical signal in a neuron, a digital “neuron” has a weight of significance similar to a real neuron’s activation threshold. If the neuron receives enough positive input from a move it searches and evaluates, the weight of significance is reached and the computer places a stone on the board. The “intelligence” lies in the weights of significance: after each game finishes, it is scored and played again in reverse. The program compares the desired outcome to what actually happened, and adjusts its weights of significance for the next game. So by trial and error, it adjusts its own evaluation function to reward better moves. More simply, we can say the system learns from experience.

Enzenberger trained NeuroGo like a professional athlete. He played it against itself, and watched it grow slowly stronger as its weights adjusted over thousands of games. Not only the strength of the program, but the style of play changed over the course of the training sessions. It developed in the same way many human beginners learn to play Go: first it placed stones randomly on the board. Then it moved on to safe, defensive plays. Finally, it aggressively went after territory and enemy stones.

NeuroGo has not made much progress since 2003. Despite its impressive achievements and hundreds of thousands of games, there are some basic Go concepts that it still hasn’t mastered. Meanwhile, Enzenberger is getting tired of the interminable training sessions. He may soon abandon neural networks for another type of program, Monte Carlo, which gets its name from the gambling quarter in Monaco.
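The idea of adjusting weights by comparing the desired outcome with the predicted one can be sketched with a single digital "neuron" in Python. This is a generic textbook update rule, not NeuroGo's actual training method, and the two input features and the four training games are invented purely for illustration.

```python
import math


def predict(weights, features):
    """Squash the weighted sum of the inputs into a win estimate between 0 and 1."""
    total = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-total))


def train(games, rounds=200, rate=0.5):
    """Nudge the weights after every game toward the result that actually occurred."""
    weights = [0.0, 0.0]
    for _ in range(rounds):
        for features, outcome in games:
            error = outcome - predict(weights, features)  # desired minus predicted
            weights = [w + rate * error * x for w, x in zip(weights, features)]
    return weights


# Invented training data: (two hypothetical board features, 1 = win or 0 = loss).
GAMES = [
    ((1.0, 0.5), 1),
    ((0.8, -0.2), 1),
    ((-0.6, 0.1), 0),
    ((-1.0, -0.7), 0),
]

if __name__ == "__main__":
    learned = train(GAMES)
    for features, outcome in GAMES:
        print(features, outcome, round(predict(learned, features), 2))
```

After training, the estimates for the won games sit above one half and those for the lost games sit below it. A real program would use many neurons and far richer inputs than two numbers per position.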

Randomness Rules

At first glance, Monte Carlo seems less human than other programs. Where classical programs struggle with evaluation functions, and where neural networks learn evaluation for themselves, Monte Carlo steps around it. Rather than trying to evaluate the importance of a move, a Monte Carlo program searches by randomly filling in stones on the board position that the move gives rise to. If more random games are won than lost, it deems the move a favorable one. The probabilistic program thinks nothing like a human being.

But it is human in some respects. In May 2006, at the Computer Olympiad in Turin, a Monte Carlo program called CrazyStone turned heads by taking the gold medal. Remi Coulom, CrazyStone’s programmer, says that, “in a way, it does play a little more like a human than other programs” because its probabilistic nature makes risky moves when it's in trouble and safe moves when it's ahead. Other programs tend to play uniformly throughout a game. Now, because of programmers’ interest and increased computing power, Monte Carlo programs are the best in the world.
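The random-playout idea described in this section can be sketched in Python on a toy game: players alternately take one to three stones from a pile, and whoever takes the last stone wins. The game, the function names, the seed, and the number of playouts are assumptions made for illustration. This is a bare-bones version of the idea, not CrazyStone's algorithm.

```python
import random


def random_playout(pile, my_turn, rng):
    """Finish the game with random legal moves; True if we take the last stone."""
    while True:
        pile -= rng.randint(1, min(3, pile))
        if pile == 0:
            return my_turn  # whoever just moved took the last stone
        my_turn = not my_turn


def win_rate(pile, take, playouts, rng):
    """Estimate how often taking `take` stones wins if both sides then play randomly."""
    remaining = pile - take
    if remaining == 0:
        return 1.0  # taking the last stone wins on the spot
    # After our move the opponent is next, so the playout starts on their turn.
    wins = sum(random_playout(remaining, False, rng) for _ in range(playouts))
    return wins / playouts


def choose_move(pile, playouts=2000, seed=0):
    """Pick the move whose random playouts were won most often."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same answer
    moves = [take for take in (1, 2, 3) if take <= pile]
    return max(moves, key=lambda take: win_rate(pile, take, playouts, rng))


if __name__ == "__main__":
    print(choose_move(5))
```

No evaluation function appears anywhere: a move is judged only by how often the random games that follow it end in a win.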

Diminishing Returns?

The real test of computer Go will be for a program to beat a professional human player. Pro Go players are scouted like prodigies in Asian countries, raised on the game from as young as five years old. They might become first-level professionals by the age of 12; some masters reach 10th-level before death. “Before this year,” says NeuroGo creator Markus Enzenberger, “if you asked me if a computer would beat a professional Go player, I would have said ‘Not in my lifetime.’ Now, I think we’ll see it happen in the next 10 years.” David Fotland, the classical programmer, guesses 20 years, but insists that he “won’t be the one to do it.” Like Enzenberger, he has been worn out by years of Go programming with diminishing returns.

Remi Coulom, whose small Monte Carlo program is much easier to improve than a classical program or a neural network, won’t say when he thinks a program will beat a first-level professional. As for whether a computer can ever beat a high-level master, he says, “I believe this will happen. Maybe not within my lifetime, but it will happen.”