As Kahneman (2011) points out in his book *Thinking, Fast and Slow*, we have two modes of thinking: fast and slow. For example, we do not need to think much about how to walk or how to eat, but we do need to think slowly about complex tasks such as planning our travel routes.

In reinforcement learning, there are two main categories of methods: model-free and model-based.

**Model-free methods:** never learn task *T* and environment *E* explicitly. At the end of learning, the agent knows how to act, but doesn’t explicitly know anything about the environment. Many deep reinforcement learning algorithms are model-free methods.

**Model-based methods:** explicitly learn task *T*. (See model-based reasoning to get a sense of it.)
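To make the distinction concrete, here is a minimal sketch on a made-up five-state chain. Everything in it (the `step` environment, the function names, the hyperparameters) is an assumption chosen for illustration, not something taken from a particular library or paper. The model-free learner is tabular Q-learning, which only ever stores action values; the model-based learner first records what each action does and then plans on that record with value iteration.

```python
import random

N_STATES = 5        # states 0..4; state 4 is the terminal goal
ACTIONS = (0, 1)    # 0 = step left, 1 = step right
GAMMA = 0.9         # discount factor


def step(state, action):
    """Hypothetical toy environment: a deterministic chain with a goal at the right end."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.5, seed=0):
    """Model-free: adjust action values from sampled transitions; no model is stored."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)  # random exploration (Q-learning is off-policy)
            nxt, reward, done = step(state, action)
            target = reward if done else reward + GAMMA * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


def greedy_policy(q):
    """Read the learned behaviour straight off the action values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


def learn_model(steps=2000, seed=0):
    """Model-based, part 1: record what each (state, action) pair leads to."""
    rng = random.Random(seed)
    model, state = {}, 0
    for _ in range(steps):
        action = rng.choice(ACTIONS)
        nxt, reward, done = step(state, action)
        model[(state, action)] = (nxt, reward, done)
        state = 0 if done else nxt
    return model


def model_value(model, values, state, action):
    """One-step lookahead through the learned model."""
    if (state, action) not in model:
        return 0.0  # never tried: treated as worthless in this sketch
    nxt, reward, done = model[(state, action)]
    return reward if done else reward + GAMMA * values[nxt]


def plan_with_model(model, sweeps=50):
    """Model-based, part 2: value iteration on the learned model, then act greedily."""
    values = [0.0] * N_STATES
    for _ in range(sweeps):
        for s in range(N_STATES - 1):
            values[s] = max(model_value(model, values, s, a) for a in ACTIONS)
    return [max(ACTIONS, key=lambda a: model_value(model, values, s, a))
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print("model-free policy: ", greedy_policy(q_learning()))
    print("model-based policy:", plan_with_model(learn_model()))
```

On this toy chain both approaches should end up preferring "right" in every non-terminal state. The difference is in what they keep: `q_learning` returns only a table of action values, while `learn_model` returns an explicit description of the environment that could be reused to plan for a different goal.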

AlphaGo involves both model-free methods (convolutional neural networks (CNNs)) and model-based methods (Monte Carlo Tree Search (MCTS)). In fact, AlphaGo is pretty similar to how we humans think: it involves both fast intuition (i.e., the policy and value estimates produced by the CNNs) and careful, slow thinking (i.e., MCTS).
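The "fast intuition plus slow search" idea can be sketched on a much smaller game than Go. The example below is not AlphaGo's algorithm: it swaps in a depth-limited negamax search as a simple stand-in for MCTS, and a hand-written heuristic (`intuition`, a hypothetical name) as a stand-in for a trained network. The game is a toy assumption as well: players alternately take 1 to 3 stones from a pile, and whoever takes the last stone wins.

```python
def legal_moves(pile):
    """Rules of the toy game: take 1, 2 or 3 stones; taking the last stone wins."""
    return [k for k in (1, 2, 3) if k <= pile]


def intuition(pile):
    """Fast guess of how good `pile` is for the player to move (in [-1, 1]).

    Hypothetical stand-in for a learned evaluator; deliberately crude.
    """
    return 0.5 if pile % 2 == 1 else -0.5


def search(pile, depth):
    """Slow thinking: negamax over the known rules, falling back on intuition at depth 0."""
    if pile == 0:
        return -1.0  # the opponent just took the last stone, so the mover has lost
    if depth == 0:
        return intuition(pile)
    return max(-search(pile - k, depth - 1) for k in legal_moves(pile))


def choose_move(pile, depth):
    """Pick the move whose resulting position looks worst for the opponent."""
    return max(legal_moves(pile), key=lambda k: -search(pile - k, depth - 1))


if __name__ == "__main__":
    print("intuition only:", choose_move(7, depth=1))
    print("with search:   ", choose_move(7, depth=7))
```

From a pile of 7, the winning move is to take 3 and leave the opponent a multiple of 4. With `depth=1` the choice rests entirely on the crude heuristic and takes 1 stone instead; with `depth=7` the search reaches the end of the game and finds the winning move. The heuristic is cheap but fallible, and the search corrects it at the cost of more computation.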

Combining model-free and model-based methods is probably the way to go for many real-world problems (fast intuition + careful planning).

**References:**

Kahneman, Daniel. *Thinking, fast and slow*. Macmillan, 2011.