# Naïve Learning in Networks

## Synonyms

Evolutionary learning in networks; Imitation in networks; Local imitation

## Definition

In the following, we use the term naïve learning in networks to describe simple models of behavior in strategic situations in which boundedly rational decision makers adjust their choices over time by reacting to the behavior and success of their neighbors in a network.

## Theoretical Background

Behavior in strategic situations can be modeled in a static or in a dynamic way. An example of a *static* concept is the concept of *equilibrium*. In a (hypothetical) world of perfectly rational decision makers, we assume that decision makers form expectations about their mutual behavior. Given these expectations, decision makers optimize. A situation is an *equilibrium* if all decision makers behave optimally, that is, play a *best reply* given their expectations, and expectations are in line with the actual decisions of the other participants. This analysis assumes that decision makers understand the situation immediately and that *no learning* takes place. Understanding all strategic aspects of a situation sometimes requires a high degree of sophistication.

The alternative is a dynamic approach where decision makers do not find the “correct” choice instantly but, instead, adjust their behavior incrementally and learn over time. We will call this behavior more *naïve*. An extreme example of this approach is *evolutionary learning*, which assumes a world of decision makers who possess no rationality at all but rather follow a preprogrammed strategy (like plants or simple animals). In such a world, we assume that more successful strategies grow as the result of an evolutionary process. This evolutionary dynamics requires a definition of a reference group with respect to which decision makers compare their choices and their success and, eventually, *learn*. Technically, such a population can be seen as a network. In an *unstructured network*, where potentially all members interact with all other members in the same way, the reference group is the entire population. In a highly *structured network*, where each member is connected to only a small number of other members, the reference group is rather local. If fitness on the local level matters, then we assume that strategies that outperform local competitors grow (at least locally), regardless of how they compare globally.

Such a *local dynamics* is plausible not only in a biological context but also in the context of human decision makers. Here, the paradigm of *learning* replaces that of *evolution*. Strategies that perform well do not grow due to their evolutionary fitness but rather as a result of *imitation* of successful strategies.

Regardless of whether *learning* or *evolution* is our guiding paradigm, the dynamics of a population with a *local* reference group can differ markedly from one that is based on global and unstructured interaction.

## Important Scientific Research and Open Questions

### A Theoretical Consideration of Naïve Learning on Networks

To see that naïve learning in networks can generate patterns of behavior that are different from global learning, let us look at the following prisoners’ dilemma game.

Strategies are denoted *C* and *D*; in each cell of the payoff matrix below, the first number is the payoff of player *A* (the row player) and the second the payoff of player *B* (the column player).

|               | *B* plays *C* | *B* plays *D* |
|---------------|---------------|---------------|
| *A* plays *C* | 5, 5          | 0, 6          |
| *A* plays *D* | 6, 0          | 1, 1          |

The two players, *A* and *B*, are in a dilemma situation. If both play *C*, they obtain a payoff of 5 each; however, each of them has an incentive to play *D*. Regardless of what the other player does, *D* is always a *best reply*, and hence, in a rational world, both players obtain only 1 each. Individual rationality does not seem to offer a way out of this social dilemma.
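The dominance argument can be checked in a few lines. The payoffs 5, 6, and 1 appear in the text; the payoff of 0 for a *C* meeting a *D* is inferred from the lattice calculations later in this section:

```python
# Payoff matrix of the prisoners' dilemma above (row player's payoff).
# C/C = 5, D/C = 6, D/D = 1 are stated in the text; C/D = 0 is inferred
# from the lattice payoffs computed later in the section.
PAYOFF = {("C", "C"): 5, ("C", "D"): 0,
          ("D", "C"): 6, ("D", "D"): 1}

def best_reply(opponent):
    """Return the strategy maximizing the row player's payoff."""
    return max(("C", "D"), key=lambda s: PAYOFF[(s, opponent)])

# D is a best reply against either choice of the opponent:
print(best_reply("C"), best_reply("D"))  # D D
```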

To find out whether boundedly rational players find a better solution in this game, let us first assume that members of a population of naïve decision makers all follow a predetermined strategy; they either play always *C* or always *D*. Hence, on an individual level they do not learn at all; on a population level, however, newborn players replace old players. Hence, the population changes its behavior and *learns*. In an *unstructured population*, regardless of what the rest of the population does, the *D* players always obtain a higher payoff than the *C* players. Hence, if payoffs are helpful for survival, that is, payoff translates into evolutionary fitness, then *D* players reproduce more quickly and *C* players will eventually die out. On a *network*, things are less clear.
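For the unstructured case, the dominance of *D* is easy to verify numerically. Assuming random matching and the payoffs of the game above (including the inferred payoff of 0 for a *C* meeting a *D*), the expected payoff of *D* exceeds that of *C* at every population mix:

```python
# Expected payoffs under random matching in a well-mixed population in
# which a fraction p plays C. Payoffs: C/C = 5, C/D = 0, D/C = 6, D/D = 1.
def expected_payoffs(p):
    pi_c = p * 5 + (1 - p) * 0   # a C meets a C with probability p
    pi_d = p * 6 + (1 - p) * 1   # a D earns 6 against C, 1 against D
    return pi_c, pi_d

for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    pi_c, pi_d = expected_payoffs(p)
    assert pi_d - pi_c == 1.0    # D earns exactly 1 more at every mix
```

Since the payoff advantage of *D* is strictly positive at every mix, any payoff-monotone (replicator-like) dynamics drives *C* to extinction in the unstructured population.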

The diagram shows a part of a network where most decision makers choose *C*. In the center of the diagram, we have a single *D*. Since this decision maker has four *C*s as neighbors, the payoff is, according to the above matrix, 4 × 6 = 24. (In the graph, payoffs are shown next to the nodes in the network.) The four immediate *C*-playing neighbors each have only three other *C*s around them and, hence, a smaller payoff of only 3 × 5 = 15. Of course, the remaining *C*s, which are surrounded by four fellow *C*s, obtain 4 × 5 = 20. Now let us look at *learning*. Assume that all decision makers in this simple network learn at the same time and use the same rule. In this example, we assume they use *copy best-average* as a learning rule, that is, each decision maker in the network determines the average success of both strategies (as far as these strategies are “visible,” i.e., in the immediate neighborhood, including the decision maker’s own choice). The four immediate neighbors of the *D* player will find that *D* yields a higher average payoff than *C* (24 versus (15 + 20 + 20 + 20)/4 = 18.75) and, thus, choose *D* in the next round. We will have the following network.

Now the situation of the four *D*s at the periphery of the cluster of *D*s has changed. The three *C*s they can see in their neighborhood have an average payoff of (10 + 10 + 15)/3 ≈ 11.67, while the two *D*s they can see (themselves and the central *D*) obtain only (19 + 4)/2 = 11.5 on average. With copy best-average, these decision makers will switch back to *C*. This example shows that the dynamics of a naïve learning rule like *copy best-average* on even this simple network is complex and can lead to multifarious patterns of coexistence of the two strategies, *C* and *D*.
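The two rounds just described can be replayed in a short simulation. The payoff matrix is the one of the game above (with the inferred payoff of 0 for a *C* meeting a *D*); the grid size and the torus boundary are illustrative choices:

```python
# Copy best-average on a torus lattice with a von Neumann (4-cell)
# neighbourhood: a single D in a sea of Cs, as in the example above.
PAYOFF = {("C", "C"): 5, ("C", "D"): 0, ("D", "C"): 6, ("D", "D"): 1}
N = 9  # grid size; any size large enough to isolate the cluster works

def neighbours(x, y):
    return [((x + 1) % N, y), ((x - 1) % N, y),
            (x, (y + 1) % N), (x, (y - 1) % N)]

def payoffs(grid):
    return {(x, y): sum(PAYOFF[(grid[(x, y)], grid[nb])]
                        for nb in neighbours(x, y))
            for x in range(N) for y in range(N)}

def step(grid):
    """Synchronous copy best-average: adopt the strategy with the higher
    average payoff among oneself and one's four neighbours."""
    pay = payoffs(grid)
    new = {}
    for cell in grid:
        group = [cell] + neighbours(*cell)
        avg = {}
        for s in ("C", "D"):
            members = [c for c in group if grid[c] == s]
            if members:
                avg[s] = sum(pay[c] for c in members) / len(members)
        new[cell] = max(avg, key=avg.get)
    return new

grid = {(x, y): "C" for x in range(N) for y in range(N)}
grid[(4, 4)] = "D"
pay = payoffs(grid)
print(pay[(4, 4)], pay[(4, 5)])              # 24 15: the D and an adjacent C
grid = step(grid)
print(sum(s == "D" for s in grid.values()))  # 5: a plus-shaped D cluster
grid = step(grid)
print(grid[(4, 5)])                          # C: the periphery reverted
```

In the second step, the peripheral *D*s revert to *C* exactly as in the text, while some of the surrounding *C*s, which now see a high-earning peripheral *D*, switch to *D*; running the dynamics further displays the shifting coexistence patterns the example alludes to.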

### An Experimental Consideration of Naïve Learning on Networks

One way to better understand strategic interaction among decision makers is a controlled experiment in a laboratory where participants are paid according to their success. Since participants are expensive and research budgets are tight, the networks that are studied in the lab are usually simpler and smaller than the one shown above. Such a network is also easier to analyze theoretically. Eshel et al. (1998) study a ring of players where each player sees and interacts with the two players to the left and the two players to the right. Here, we see a part of such a network with five *C*s in the middle of the picture (imagine that the two ends of the line are joined to form a ring). Payoffs are again shown next to the nodes in the graph.

The *D* who is just at the border of the cluster of *C*s has a payoff of 2 × 1 + 2 × 6 = 14. Let us introduce another learning rule here and assume that learning is based on *copy best* (the learning rule we mentioned above, *copy best-average*, implies a similar dynamics). The best strategy this *D* sees is the *C* on the right who obtains 15. Hence, our *D* will become a *C* in the next round if he uses *copy best*. Our cluster of *C*s will be larger in the next round.
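This step, too, can be checked in a small simulation of the ring under *copy best*. The payoffs are those of the game above (with the inferred 0 for a *C* meeting a *D*); the ring size is an illustrative choice:

```python
# Copy best on a ring where each player interacts with the two
# neighbours on each side, as in the Eshel et al. (1998) setup.
PAYOFF = {("C", "C"): 5, ("C", "D"): 0, ("D", "C"): 6, ("D", "D"): 1}
n = 16                       # ring size (illustrative choice)
ring = ["D"] * n
for i in range(5, 10):       # a cluster of five Cs, as in the example
    ring[i] = "C"

def neighbours(i):
    return [(i - 2) % n, (i - 1) % n, (i + 1) % n, (i + 2) % n]

def payoff(state, i):
    return sum(PAYOFF[(state[i], state[j])] for j in neighbours(i))

def step(state):
    """Synchronous copy best: adopt the strategy of the highest-payoff
    player among oneself and the four visible neighbours."""
    pay = [payoff(state, i) for i in range(n)]
    return [state[max([i] + neighbours(i), key=lambda j: pay[j])]
            for i in range(n)]

print(payoff(ring, 4))   # 14: the D at the border of the C cluster
ring = step(ring)
print(ring.count("C"))   # 7: both border Ds copied a C; the cluster grew
```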

The following graph shows the typical result of a laboratory experiment by Kirchkamp and Nagel (2007) with 18 human subjects. Columns show the state of the (simple) network in a given round. Rows show the sequence of choices of *D* or *C* of a single subject. Decision makers interact repeatedly for 80 periods with their two fixed neighbors to the left and their two neighbors to the right. White dots denote *D* and black dots denote *C* (Fig. 1).

Fig. 1 Evolution of cooperation in an experiment (Kirchkamp and Nagel 2007). White dots denote *D*, black dots denote *C*

In the first round of this experiment, a large number of participants play *C*. This happens to provide almost optimal conditions for the above-mentioned theory. Many players can now see that payoffs are large in a cluster of *C*s and, hence, the number of *C*s should stay large. In this (typical) experiment, however, *C* does not grow but, instead, dies out entirely. In the last (80th) round, all decision makers choose *D*. The reason is that decision makers are perhaps more naïve than *copy best* or *copy best-average* assume. In the experiment, choices do not depend much on the success of neighbors but mainly on a player’s own success. In other words, decision makers do not learn from others; they mainly learn from their own experience, neglecting a large amount of the information available to them. There is, in fact, a reason to treat the experience of other decision makers in the network with caution: these other players are in a different position in the network and have different interaction partners; hence, what is good for them need not be good for us.
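One possible formalization of such own-experience learning is an aspiration rule; both the rule and the aspiration level here are purely illustrative assumptions, not the model estimated by Kirchkamp and Nagel (2007). The sketch shows how this rule can move a player away from the strategy that imitation of the most successful visible player would pick:

```python
# An aspiration-based own-experience rule: keep the current strategy
# only if the own payoff reaches an aspiration level, otherwise switch.
# The rule and the aspiration level are illustrative assumptions, not
# the learning model estimated by Kirchkamp and Nagel (2007).
PAYOFF = {("C", "C"): 5, ("C", "D"): 0, ("D", "C"): 6, ("D", "D"): 1}
n = 16
ring = ["D"] * n
for i in range(5, 10):       # a cluster of five Cs on the ring
    ring[i] = "C"

def neighbours(i):
    return [(i - 2) % n, (i - 1) % n, (i + 1) % n, (i + 2) % n]

def payoff(state, i):
    return sum(PAYOFF[(state[i], state[j])] for j in neighbours(i))

ASPIRATION = 12              # hypothetical aspiration level

def step_aspiration(state):
    """Switch strategy whenever the own payoff falls short of the
    aspiration level -- no attention is paid to neighbours' success."""
    flip = {"C": "D", "D": "C"}
    return [state[i] if payoff(state, i) >= ASPIRATION else flip[state[i]]
            for i in range(n)]

new = step_aspiration(ring)
# The border C (own payoff 10) abandons C, although copy best would have
# kept it: the best payoff it can see is a fellow C's 20.
print(new[5])   # D
```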

## Cross-References

Adaptation

Imitation

Learning

Social Networks

## References

- Axelrod, R. (1984). *The evolution of cooperation*. New York: Basic Books.
- Eshel, I., Samuelson, L., & Shaked, A. (1998). Altruists, egoists and hooligans in a local interaction model. *The American Economic Review, 88*, 157–179.
- Kirchkamp, O., & Nagel, R. (2007). Naïve learning and cooperation in network experiments. *Games and Economic Behavior, 58*(2), 269–292.