A field experiment on social preferences using Google Answers
The experimental economics literature has documented that individuals consistently make voluntary payments. Two questions arise from this fact: first, what exactly drives this pro-social behavior and, second, whether these findings extend to real-life situations.
Tobias Regner (2014) 1 addresses these questions by comparing theoretical and laboratory results with observational data collected from “Google Answers” (GA). In this online service a user posts a question and sets a price for an answer. A Google Answers researcher (GAR) responds, and the user may then optionally tip the GAR. The price is only paid if an answer is given, but there is a non-refundable fee of US$0.50 to file a question. Once a GAR takes a question, it is locked and no other GAR can answer it for a few hours. If the answer received is not satisfactory, the user can first ask for additional research through an ‘answer clarification’ request. If still unsatisfied, users can request to have the question re-posted or apply for a refund. When the answer is completed, they can also rate its quality.
The service ran from April 2002 until December 2006, and more than 50,000 questions were answered, at an average price of more than $20. GARs were freelancers, some of whom took the job seriously, answering more than 1,000 questions. Two important features of the GA design allow empirical investigation. First, during the first months of the service there was no possibility of tipping; its introduction provides an opportunity to compare behavior before and after tipping was permitted. Second, the closing date of the service was announced only shortly before questions stopped being accepted, meaning that no reputation could be gained from the last questions.
With the data, Regner can analyze pricing, effort and tipping decisions in a real-life market. He considers two possible explanations for tipping and conducts empirical tests of their ability to explain behavior at GA. One is reciprocity, which implies a social preference for rewarding good behavior even when this is costly and could be avoided (e.g., when one leaves a tip after good service in a restaurant one will never visit again); Dufwenberg and Kirchsteiger (2004) 2 provide the theory. The other is reputation, where self-interested users may imitate reciprocators in order to induce GARs to exert high effort in delivering the answer; in this case the theory is in Kreps et al. (1982) 3. Lab experiments that follow a similar structure are reported in Fehr et al. (1997) 4 and Fehr et al. (2007) 5.
The interaction between GARs and users is modeled as a repeated game in which GARs can exert different levels of effort, users get a positive payoff from a high-effort answer even after paying a tip, and exerting high effort is profitable for the GAR if he receives a tip. If the interaction occurs just once, under self-interest it is always better not to tip. Anticipating this, the GAR will exert no effort. This is a bad equilibrium, in the sense that both the user and the GAR would be better off if the effort were made and a tip given (an illustration of what is called the moral hazard problem). Under social preferences, and under certain conditions (reciprocity gains compensate the cost of the tip, and the proportion of reciprocal individuals is high enough), the optimal situation (effort and tip) may be achieved in equilibrium.
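The one-shot logic can be sketched by backward induction. The payoff numbers below are hypothetical illustrations chosen only to satisfy the stated conditions (the user values a high-effort answer even net of the tip, and the tip covers the GAR's effort cost); they are not parameters from the paper.

```python
# Backward-induction sketch of the one-shot GA interaction under pure
# self-interest. All payoff numbers are illustrative assumptions.

TIP = 2.0          # size of the voluntary tip (assumed)
EFFORT_COST = 1.0  # GAR's extra cost of a high-effort answer (assumed)
VALUE = {"high": 5.0, "low": 1.0}  # user's value of the answer by effort level

def user_payoff(effort, tips):
    return VALUE[effort] - (TIP if tips else 0.0)

def gar_payoff(effort, tips):
    return (TIP if tips else 0.0) - (EFFORT_COST if effort == "high" else 0.0)

# Stage 2: a self-interested user tips only if it raises his own payoff,
# which a voluntary payment never does -- so he never tips.
def best_tip(effort):
    return user_payoff(effort, True) > user_payoff(effort, False)

# Stage 1: the GAR anticipates the user's tipping decision and picks
# the effort level that maximizes his own payoff.
def best_effort():
    return max(["high", "low"], key=lambda e: gar_payoff(e, best_tip(e)))

effort = best_effort()
print(effort, best_tip(effort))  # -> low False: no effort, no tip

# Yet (high effort, tip) Pareto-dominates the equilibrium outcome:
print(user_payoff("high", True) > user_payoff("low", False))  # True: 3 > 1
print(gar_payoff("high", True) > gar_payoff("low", False))    # True: 1 > 0
```

The two final comparisons illustrate the moral hazard problem described above: both parties prefer the (effort, tip) outcome, but self-interested backward induction rules it out.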
In the repeated interaction, the model considers that each user has a finite number of questions to ask and that past behavior regarding tipping and effort is perfectly observed. Users may be of three types: reciprocal (R), strategic self-interested (S), and myopic self-interested (M). R always tips a good answer, S tips only if the tip, seen as a reputation investment to induce high effort, is profitable, whereas M never tips. A user's type is not observed, but GARs may use Bayesian updating to estimate the probability of a user being of each type as they collect information about his behavior. This repeated model yields predictions that allow Regner to formulate the following null hypotheses:
- The tip rate of single users is not significantly higher than zero.
- The total amount of a user’s questions has no effect on the tendency to tip.
- The tip rate in a “last period”-like situation is, ceteris paribus, higher than the tip rate of single users.
- There is no individual heterogeneity among users with respect to their tendency to tip.
- The tip history of a user has no effect on the effort level of the GAR.
- Effort levels do not increase compared to when tipping was not possible.
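The belief updating attributed to GARs in the model can be sketched with a simple application of Bayes' rule over the three types. The per-type tipping probabilities below are purely illustrative assumptions, not estimates from the paper; they are made noisy so that a single observation never fully rules out a type.

```python
# Hedged sketch of a GAR's Bayesian updating about a user's type,
# given tip / no-tip observations after good answers.

TYPES = ("R", "S", "M")  # reciprocal, strategic, myopic
# Prob(tip after a good answer | type) -- illustrative assumptions:
P_TIP = {"R": 0.95, "S": 0.90, "M": 0.05}

def update(prior, tipped):
    """One Bayes step: posterior(type) is proportional to prior * likelihood."""
    likelihood = {t: (P_TIP[t] if tipped else 1.0 - P_TIP[t]) for t in TYPES}
    unnorm = {t: prior[t] * likelihood[t] for t in TYPES}
    z = sum(unnorm.values())
    return {t: unnorm[t] / z for t in TYPES}

belief = {t: 1.0 / 3 for t in TYPES}  # uniform prior over types
for tipped in [True, True, False]:    # a hypothetical tipping history
    belief = update(belief, tipped)
print(belief)  # mass shifts toward R and S after tips, toward M after none
```

Each observed tip shifts posterior mass toward the R and S types, which is exactly what makes the imitation strategy of S types pay off in the model: a tipping history induces the GAR to expect a tip and hence to exert high effort.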
Rejection of these hypotheses supports the existence of the different user types and of the equilibrium behavior in the repeated game. In fact, all null hypotheses except #3 are rejected.
For instance, almost 15% of all single users left a tip; occasional (circa 25%) and frequent (circa 35%) users tip even more often. The regression analysis shows that the tendency to tip is correlated with reciprocity proxies (“Effort of the GAR as perceived by the user”, “Timeliness of the answer”, “Has an answer clarification been provided?”) and a reputation proxy (frequency of use). These results confirm the model’s predictions and complement results observed in the lab.
Lab experiments have also shown a positive relationship between effort and tip in three-stage gift-exchange games (which include a last stage of voluntary tipping). There is likewise abundant lab evidence of a wage–effort relationship in the two-stage gift-exchange game (without the tipping), but little hard field evidence. GA, as a real-life setting, provides evidence that the three-stage design is needed to reap the benefits of reciprocity.
The analysis of the data also suggests that two additional conditions are essential. First, the existence of genuine reciprocators is crucial: without them, strategic types have no one to imitate, and the positive feedback loop of mutual opportunities to reciprocate would never start. Second, agents need to be able to update beliefs about principals' types; only then does the imitators' strategy pay off and attract high effort.
Some 36% of frequent users tip, compared with only 15% of single users. Thus, at least 15% of users are reciprocal and at least 20% are strategic self-interested (about 15% of the frequent users, included in the 36%, are reciprocal and must be subtracted). Among super-users, a smaller sample but with better observations, 27% are myopic (they give negligible tips), 35% are strategic selfish (they tip good answers, but not after their last question), and 38% are reciprocal (they always tip after a good answer).
During the first six months of the service no tipping was possible, so the behavioral default was not leaving a tip. After a slow start in October (6 of the first 1,000 answers were tipped), reciprocity and reputation proxies already explain tipping in November 2002. It appears that tipping was adopted slowly by some users motivated solely by reciprocity, and was then quickly recognized as a strategy by users motivated by reputational concerns.
- Regner, T. 2014. Social preferences? Google Answers! Games and Economic Behavior 85, 188–209. DOI: 10.1016/j.geb.2014.01.013 ↩
- Dufwenberg, M., Kirchsteiger, G. 2004. A theory of sequential reciprocity. Games and Economic Behavior 47, 268–298. ↩
- Kreps, D., Milgrom, P., Roberts, J., Wilson, R. 1982. Rational cooperation in the finitely repeated prisoners’ dilemma. Journal of Economic Theory 27, 245–252. ↩
- Fehr, E., Gächter, S., Kirchsteiger, G. 1997. Reciprocity as a contract enforcement device: experimental evidence. Econometrica 65, 833–860. ↩
- Fehr, E., Klein, A., Schmidt, K. 2007. Fairness and contract design. Econometrica 75, 121–154. ↩