Robbers Cave Experiment: Procedure, Results & Legacy

In the summer of 1954, the Turkish-American social psychologist Muzafer Sherif and a small team of colleagues ran what looked, from outside the gate, like an ordinary boys' summer camp in the rolling sandstone country of southeastern Oklahoma. Twenty-two eleven- and twelve-year-old boys swam, hiked, played baseball, raided each other's cabins, fought, made peace, and went home. None of them knew they were participants in one of the most ambitious field experiments in the history of social psychology: a controlled three-phase study designed to show how intergroup hostility forms and how it can be undone.

The study, conducted at Robbers Cave State Park near Wilburton, became the empirical core of Sherif's realistic conflict theory: the claim that competition between groups for scarce resources is sufficient to generate prejudice, stereotyping, and hostility, and that the surest route out is the introduction of goals that require both groups to cooperate. Seven decades later, the Robbers Cave findings still anchor textbook treatments of intergroup relations, and the failed earlier attempt at the same study, which Sherif never published, has reopened questions about how confidently those findings should be read.

Quick Facts About the Robbers Cave Experiment

Conducted in June–July 1954 by Muzafer Sherif, Carolyn Wood Sherif, O. J. Harvey, B. Jack White, and William Hood
Setting: Robbers Cave State Park, Latimer County, Oklahoma
Participants: 22 boys aged 11–12, all white, Protestant, middle-class, with no prior acquaintance
Two groups self-named the Rattlers and the Eagles
Three phases: in-group formation, intergroup competition, intergroup cooperation
Conflict was generated by a tournament with prizes; cooperation was generated by superordinate goals
Key superordinate goals included a sabotaged water supply and a stalled camp truck
An earlier 1953 version of the experiment at Middle Grove, New York, failed and was not published

1. Historical and Intellectual Context

In 1954 the question of how groups come to hate each other had unusual urgency. The Second World War was less than a decade gone, the Holocaust had been documented but not assimilated, segregation was still entrenched across much of the United States, and the U.S. Supreme Court had handed down Brown v. Board of Education that May, only weeks before the camp opened. Postwar American social psychology had taken on the implicit assignment of explaining how ordinary people could be drawn into ethnic and racial conflict and, more hopefully, how that pull could be loosened.

Muzafer Sherif's biography mattered here. Born in Ödemiş, Turkey, in 1906, he had witnessed sectarian violence during the collapse of the Ottoman Empire and the Greco-Turkish war. He studied at Harvard and Columbia, working with Gardner Murphy, and in the 1930s produced his classic autokinetic studies on the formation of group norms. His political views as a Turkish leftist had led to a brief imprisonment in Turkey in the 1940s before he settled permanently in the United States. He approached intergroup conflict less as an abstract theoretical puzzle than as a problem he had seen at close range.

Two existing theoretical traditions framed the work. The first was Gordon Allport's contact hypothesis, then taking shape and published in mature form in The Nature of Prejudice in 1954. Allport argued that, under the right conditions — equal status, common goals, institutional support, no competition — bringing groups into contact would reduce prejudice. The second was the psychoanalytically influenced "authoritarian personality" tradition that located prejudice in individual personality structure shaped by harsh child-rearing. Sherif rejected both as primary explanations. He wanted to show that the structure of intergroup relations itself — what groups were competing for and what they could only achieve together — was sufficient to determine attitudes, regardless of personality or mere contact.

The Robbers Cave study was the third in a series. The first, in 1949, was a preliminary camp study near New Haven, Connecticut, that produced suggestive but limited data. The second, in 1953 at Middle Grove, New York, was meant to be the full three-phase test but went wrong in a way that mattered. Robbers Cave was the do-over, and Sherif wrote it up as if the previous attempt had not happened.

2. Research Questions

Sherif posed three connected questions. First, what is sufficient to create a sense of group identity in a collection of strangers — does merely living and working together produce in-group attachment, even without an opposing group? Second, what is sufficient to produce hostility between two such groups — is structural competition for scarce resources enough to generate stereotyping, name-calling, and aggression, even among unselected boys with no prior history? Third, what is sufficient to reduce that hostility — does mere contact between the groups in pleasant settings suffice, or is something stronger needed?

The conjectures behind the questions were specific. Sherif predicted that an in-group sense would emerge spontaneously through joint activity without any reference to outsiders. He predicted that competition with another in-group, even briefly, would produce hostility and the standard markers of prejudice. He predicted that ordinary friendly contact would not be sufficient to undo that hostility, and that what he called "superordinate goals" — goals genuinely shared by both groups and unachievable by either alone — would be both necessary and sufficient to reverse it.

These were strong predictions. They could in principle have been falsified at every stage. As we will see, in the 1953 Middle Grove run several of them effectively were falsified, and the 1954 design was adjusted accordingly.

3. Method and Procedure

Overall Design

The Robbers Cave study was structured in three sequential phases, each lasting roughly a week. The first was in-group formation. The second was intergroup competition. The third was intergroup cooperation through superordinate goals. Throughout, the camp staff were not staff at all in the conventional sense: they were research personnel acting as counselors, with planned activities, observation routines, and measurement tools woven into camp life.

Phase 1: In-Group Formation

The boys were assigned to one of two groups before they arrived, and the two groups were kept entirely separate for the first phase. They lived in different parts of the park, were unaware of each other's existence, and engaged in cooperative activities — hiking, swimming at a creek, pitching tents, preparing meals, building a rope bridge, improving a swimming hole. They developed leadership structures, status hierarchies, in-group jargon, and group names. One group called themselves the Rattlers; the other, the Eagles.

Phase 2: Intergroup Competition

At the start of the second week the groups discovered each other, and the staff announced a tournament: baseball, tug-of-war, tent pitching, cabin inspection, treasure hunts. Prizes — penknives, medals, and a four-bladed knife — were awarded to the winning group, with nothing to the losers. The competition was zero-sum by design. The staff also engineered small frustrations layered on top of the formal competition: a planned camp picnic was timed so that one group arrived first and consumed most of the food, leaving the other group to find scraps.

Hostility escalated rapidly. The Eagles burned the Rattlers' flag; the Rattlers retaliated by burning the Eagles' flag and raiding their cabin. Both groups carried out night raids in which beds were overturned, mosquito netting torn, and personal belongings disturbed. Group members hoarded rocks and socks filled with stones as weapons. Name-calling, derogatory stereotypes, and refusal to share spaces were universal across both groups by mid-tournament.

Phase 3: Intergroup Cooperation

Sherif first tested whether non-competitive contact alone could reverse hostility. He brought the groups together for shared pleasant activities: watching a movie, eating in the same dining hall, attending a fireworks display. These contact-only interventions produced no reduction in hostility and in some cases gave the groups fresh opportunities to insult one another (food fights in the dining hall became a regular event).

The staff then introduced superordinate goals — situations the experimenters had engineered to be jointly necessary and impossible without cooperation. The camp's water supply was sabotaged at the source by the staff, who attributed the failure to vandalism; the boys had to inspect a roughly one-mile water line together and clear it. A truck carrying food for an outing supposedly broke down on a remote road; both groups had to pull it with a rope to get it started. A movie all the boys wanted to see required a joint financial contribution to rent. Each scenario forced the two groups to work together, with shared incentives.

Measures

The team gathered systematic observational data through trained adult observers embedded in each group. They administered sociometric questionnaires asking each boy to rank his preferences for companionship, and adjective-rating questionnaires assessing attitudes toward the in-group and the out-group. They also recorded behavioral incidents and tallied indices of cooperation and conflict. Photographs and field notes filled out the data.

4. Participants and Setting

The Boys

The 22 boys were drawn from a careful screening pool. They were all 11 to 12 years old, all white, all Protestant, all from middle-class, intact, two-parent families in the Oklahoma City area, all with average to above-average school performance, all psychologically screened, and none knew any other participant before the camp. None had a history of family disruption, behavioral problems, or psychiatric care. The selection was deliberately homogeneous so that whatever conflict emerged could not be attributed to pre-existing demographic differences.

The boys were told their parents had enrolled them in a normal summer camp. They paid a token fee. They did not know they were research participants. Sherif viewed this concealment as necessary; he believed knowledge of observation would corrupt the data.

The Site

Robbers Cave State Park covers a large wooded area in the San Bois Mountains of southeastern Oklahoma. It includes a real cavern said by local tradition to have sheltered the outlaw Belle Starr and the Dalton Gang. The site provided two physically separate camp areas connected by trails, ample wilderness in which the groups could be kept unaware of each other in Phase 1, and a swimming creek and dining facilities that could be shared in later phases.

The Staff

The research team played the role of camp counselors but in fact directed the program. Sherif himself supervised from a distance, with O. J. Harvey, Jack White, and William Hood serving as group counselors and observers. The staff intervened actively to produce the conditions of the experiment, including the sabotage of the water supply and the staged failure of the truck — actions that have drawn ethical scrutiny in subsequent decades.

5. Results

Phase 1: Cohesion Without Outsiders

Within days, each group had spontaneously developed an in-group culture. The Rattlers and the Eagles each adopted a name, designed a flag, painted their group identity on signs at the swimming hole and on tee-shirts, and developed informal status hierarchies in which leadership and roles like "tough guy" and "joker" became visible. Sociometric data showed strong preferences for fellow group members. Importantly, this cohesion appeared without any reference to an external out-group; in Phase 1 the boys did not know that another group existed.

Phase 2: Rapid Emergence of Hostility

From the moment the groups discovered one another and learned of the tournament, intergroup hostility began. Each group held a meeting on the morning of contact in which strategies and insults were rehearsed. Within a day, the Eagles were referring to the Rattlers as cheaters, sneaks, and stinkers; the Rattlers returned the favor with their own epithets. Within three days, both groups were burning each other's flags, raiding cabins, and hoarding weapons. Sociometric data after Phase 2 showed a near-complete split: in-group members were rated overwhelmingly positively, out-group members overwhelmingly negatively. Boys who had not known of each other's existence a week earlier now wanted nothing to do with one another.

Phase 3: Contact Failed, Superordinate Goals Worked

The contact-only interventions failed clearly. Shared meals in the same dining hall produced food fights; shared entertainment produced jeering. Mere proximity did not reduce hostility, and in some respects it offered new arenas in which the groups could express it.

The superordinate-goal interventions reversed the pattern, but slowly, and incompletely after any single episode. After the joint repair of the water line, hostility briefly resurfaced. After the joint pulling of the food truck, it diminished further. After multiple superordinate-goal episodes, sociometric data showed substantial improvement: out-group ratings became more positive, cross-group friendships developed, and on the final bus ride home the Rattlers and Eagles sat together, with the Rattlers using their own funds to buy malted milks for everyone. The change was real but gradual; a single cooperative episode did not undo a week of hostility.

Quantitative Snapshots

The published data are mostly observational and sociometric. After Phase 2, the share of friendship choices that Eagles and Rattlers gave to out-group members stood in the single-digit percentages. After the Phase 3 superordinate goals, that share rose to roughly a quarter to a third, depending on the group. Stereotyped adjectives applied to the out-group ("sneaky," "smart-alecks") gave way during Phase 3 to more neutral or positive descriptors.
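To make the sociometric measure concrete, here is a minimal Python sketch of how an out-group share of friendship choices could be tallied. It is an illustration only: the boy identifiers, the choice lists, and the function name out_group_share are all hypothetical, and none of the numbers are the 1954 data or Sherif's actual scoring procedure.

```python
from collections import Counter

# Hypothetical roster: identifiers and group labels are invented for
# illustration and are NOT the 1954 participants or data.
GROUP = {
    "r1": "Rattlers", "r2": "Rattlers", "r3": "Rattlers",
    "e1": "Eagles", "e2": "Eagles", "e3": "Eagles",
}


def out_group_share(choices, group=GROUP):
    """For each group, return the fraction of its members' friendship
    choices that name a boy from the other group."""
    total = Counter()
    outside = Counter()
    for chooser, chosen_list in choices.items():
        for chosen in chosen_list:
            total[group[chooser]] += 1
            if group[chosen] != group[chooser]:
                outside[group[chooser]] += 1
    return {g: outside[g] / total[g] for g in total}


# Invented choice lists: each boy names two preferred companions.
after_competition = {
    "r1": ["r2", "r3"], "r2": ["r1", "r3"], "r3": ["r1", "r2"],
    "e1": ["e2", "e3"], "e2": ["e1", "e3"], "e3": ["e1", "e2"],
}
after_cooperation = {
    "r1": ["r2", "e1"], "r2": ["r1", "r3"], "r3": ["r1", "e2"],
    "e1": ["e2", "r1"], "e2": ["e1", "e3"], "e3": ["e1", "e2"],
}

if __name__ == "__main__":
    print("after competition:", out_group_share(after_competition))
    print("after cooperation:", out_group_share(after_cooperation))
```

Comparing the two dictionaries group by group gives the kind of before-and-after contrast the section describes; with real questionnaires the same tally would be run on each administration of the sociometric form.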

6. The Researchers' Interpretation

Sherif's interpretation was direct and theoretically ambitious. He argued that intergroup behavior is determined by the functional relations between the groups — what they are doing with respect to each other in the structure of incentives — and not primarily by the personality characteristics of their members. If the structure rewards competition, ordinary boys will produce hostility, stereotypes, and aggressive behavior in days. If the structure rewards cooperation, the same boys will reduce hostility and revise stereotypes. The participants are essentially a constant; the structure is the variable.

This is the core of what came to be called realistic conflict theory or realistic group conflict theory. The label "realistic" is meant to distinguish it from theories that emphasize displaced aggression or projected anxiety. The conflict in Sherif's view is not symbolic; it is anchored to actual competition for actual resources.

Sherif also drew a sharp methodological lesson. He argued that the contact-only phase had falsified a naive reading of Allport's contact hypothesis: friendly contact in the absence of cooperative structure was demonstrably insufficient. Reduction of hostility required common goals that the groups could only meet together. This was not a refutation of Allport — Allport had specified cooperative interdependence as one of the necessary conditions — but it was a sharpening, and Sherif treated the superordinate-goals mechanism as central in a way Allport's broader checklist did not.

Implications were drawn outward. Sherif and his colleagues argued that the same principles applied to interethnic, interreligious, and international conflict. Reducing prejudice in adult society, on this view, required identifying or creating shared projects whose success depended on the cooperation of antagonistic groups. The framework was congenial to the postwar internationalist mood; it suggested that peace work was structural, not therapeutic.

7. Modern Reanalyses and Criticisms

The Failed Middle Grove Experiment

The most consequential modern reassessment came from the Australian writer Gina Perry, whose 2018 book The Lost Boys reconstructed the 1953 Middle Grove run that Sherif had not published. At Middle Grove, in upstate New York, the same general design was attempted, but the boys behaved very differently. They quickly recognized that the staff were manipulating them, refused to be drawn into the competition the experimenters tried to engineer, formed cross-group friendships in defiance of group lines, and explicitly objected to the framing the experimenters tried to impose. The experiment collapsed; by Perry's account, Sherif's own reaction included an incident in which he berated a colleague.

Sherif treated Middle Grove as a failed pilot rather than as a finding to be published. At Robbers Cave, the design was tightened to make resistance more difficult: the groups were kept apart in Phase 1 for longer, the staff were more directive in producing frustration, and the tournament structure was more aggressive. Critics argue that this redesign means Robbers Cave does not show that competition produces hostility under natural conditions; it shows that an active research team can engineer hostility under controlled conditions and that, under different conditions, the same kind of boys may instead resist.

The Active Role of the Experimenters

Across both runs, the experimenters were not passive observers. They scheduled the tournaments, controlled scarce resources, and planted incidents (the stolen picnic, the responses to the burned flag, the sabotaged water line). At points, staff influence shaded into provocation: there are reports that the Eagles' flag burning was tacitly suggested or rationalized by staff comments. This violates a strong reading of the field-experiment ideal in which the researcher merely sets a structure and watches behavior unfold.

Sample Constraints

The participants were a narrowly selected demographic: 22 white, Protestant, middle-class, behaviorally screened boys aged 11–12 from a small geographic area. Generalization beyond this group requires assumption rather than evidence. The exclusion of girls, racial minorities, working-class boys, and children with prior conflict or behavioral histories means the Robbers Cave findings speak to a particular slice of a particular era. Subsequent work has shown that intergroup dynamics vary by gender, age, culture, and prior experience.

Suppression and Publication Bias

The non-publication of Middle Grove is itself a finding of broader methodological interest. If Robbers Cave succeeded only after a previous attempt had failed, the published study is the survivor of a small but biased sample. This does not mean its findings are wrong, but it means they should be read more cautiously than the textbook narrative suggests. Field experiments are often presented as single decisive demonstrations; in practice, they are often the last of several runs.

Replications and Modern Parallels

Faithful direct replications of Robbers Cave have been rare, both because deceptively running such an extended camp would not pass modern ethics review and because the historical specificity of the design makes exact replication difficult. However, the basic findings, that intergroup competition rapidly produces hostility and that superordinate goals reduce it, are consistent with laboratory studies of intergroup bias, with school-based jigsaw classroom interventions (Aronson and colleagues), and with a substantial meta-analytic literature on the contact hypothesis (Pettigrew and Tropp). The Robbers Cave conclusions, even if methodologically vulnerable as standalone evidence, fit a larger body of converging evidence.

8. Ethical Considerations

By contemporary standards Robbers Cave would not be approvable. The participants were children. Their parents had been told the boys were enrolled in a research-affiliated camp, but the nature of the manipulation (deliberate provocation of conflict, including physical aggression and property destruction) was not disclosed in the form that modern informed consent requires. The boys themselves were unaware throughout that they were participants.

Several specific incidents raise ethical concern. The night raids that escalated into physical aggression and damage to personal property were anticipated and partly facilitated by the staff. The hoarding of rocks and weapons was observed and not stopped until late. The sabotaged water supply put the boys into a state of real thirst. The food shortage at the picnic was real. Children were exposed to genuine distress, real conflict, and meaningful frustration as the price of the experimental design.

Defenders of the study emphasize that no participant was seriously injured, that the third phase was designed precisely to repair the relationships the second phase had damaged, and that long-term follow-up — limited as it was — did not report lasting harm. Sherif and his colleagues took the question of debriefing seriously by the standards of the day, though they did not conduct the kind of structured debriefing later required.

The deeper ethical concern is structural rather than incident-by-incident. The participants were children whose participation was a product of their parents' enrollment in a "summer camp," and the central manipulation — conflict between groups — was concealed from both parents and boys. Concealment of this magnitude with minors is, under modern human-subjects review, almost never permissible.

9. Influence on Psychology

Realistic Conflict Theory

Robbers Cave is the single most cited empirical anchor of realistic conflict theory. The theory has been extended and refined: by Donald Campbell into the propositions of realistic group conflict theory in the 1960s; by sociological work on labor and ethnic competition; by international relations scholars analyzing resource conflict. The basic claim — that competition for valued resources is a sufficient cause of intergroup hostility — has held up well, with later qualifications about the role of perceived (rather than actual) competition.

The Contact Hypothesis

Sherif's findings put pressure on a naive reading of Allport's contact hypothesis and helped shift the empirical literature toward emphasizing the structural conditions under which contact succeeds. The Robbers Cave demonstration that mere contact failed but cooperative interdependence succeeded is a standard reference in this literature. Pettigrew and Tropp's later meta-analysis of contact effects across hundreds of studies has refined the picture, but the cooperative-interdependence component remains central.

Cooperative Learning and the Jigsaw Classroom

Elliot Aronson and colleagues developed the jigsaw classroom in the early 1970s in Austin, Texas, partly as an applied response to Robbers Cave. By structuring classroom learning so that each student possessed a unique piece of the material that other group members needed, the design built superordinate-goal interdependence into the structure of school work. Outcome studies showed reduced intergroup prejudice and improved performance among minority students.

Peace Psychology and Conflict Resolution

The Robbers Cave model has been an explicit reference in peace-psychology work on ethnic and national conflicts, including programs that bring members of antagonistic groups into shared task projects rather than simply into shared rooms. The Seeds of Peace camps, dialogue projects in Northern Ireland, and various Israeli-Palestinian youth initiatives have all drawn on the logic that interdependent tasks reduce prejudice in ways that contact alone does not.

Social Identity Theory

The next major theoretical step in intergroup-relations research, Henri Tajfel and John Turner's social identity theory in the 1970s, both built on and partly displaced Sherif's framework. Tajfel's minimal-group experiments showed that even without competition, arbitrary categorization alone produced in-group favoritism. This suggested that competition was sufficient but not necessary for intergroup bias, which makes it a friendly amendment to Sherif rather than a refutation. Modern intergroup-relations textbooks usually present realistic conflict theory and social identity theory side by side as complementary accounts.

10. What the Experiment Means Today

The Robbers Cave study now sits in an interesting double position. Methodologically, it has been substantially shaken by Perry's reconstruction of the failed Middle Grove run, by attention to the active role of the experimenters, and by the limits of its sample. Conceptually, its core findings have been corroborated again and again in different methodologies and across different populations: competition tends to generate hostility, mere contact often does not undo it, and structured cooperative interdependence often does.

For students of social psychology, the study is now best taught with both stories alongside each other. The first story is Sherif's: that you can take ordinary boys, divide them, set them in competition, and produce hostility, then introduce shared goals and reverse it. The second story is the meta-story: that the first story is itself the product of a research process that included one failed run, considerable experimenter activity, and ethically dubious concealment from participants and parents.

Read in the context of current intergroup conflicts — ethnic, political, online — Sherif's claims have an almost uncomfortable applicability. The structural emphasis suggests that producing peace requires more than tolerance campaigns and contact events; it requires building projects whose success depends on the cooperation of the antagonists. That is harder to engineer in a polarized democracy than it is in an Oklahoma state park, but the diagnosis remains the same.

For contemporary research, Robbers Cave's most lasting bequest may be a research-design instinct. Field experiments are powerful because they generate behaviors at a scale and depth laboratory studies cannot. They are also fragile because they are run once, in one place, with one experimenter team, with high context-specificity. The right response is not to dismiss them as anecdote, nor to canonize them as proof, but to read them as the first chapter of a longer empirical conversation.

Conclusion

The Robbers Cave experiment was an ambitious, ethically problematic, methodologically vulnerable, and conceptually generative piece of field research. It took ordinary preadolescent boys, divided them, watched them turn hostile under competition, and watched them reconcile under cooperation. From those weeks in southeastern Oklahoma, Sherif derived a theory of intergroup conflict that has influenced school integration, peace work, organizational design, and the textbook understanding of prejudice for seven decades.

It would be a mistake to read the study as a clean morality tale and equally a mistake to dismiss it as merely a contrived theater piece. The mid-twentieth century did not provide tools for the kind of decisive demonstrations that current social science aims at, but it did permit the kind of richly observed, structurally informative work that Robbers Cave represents. The findings have been corroborated by later experimental and applied work even where the original methodology is now seen as flawed.

The honest summary is that Sherif identified something real, namely the centrality of structural interdependence in shaping intergroup attitudes, and that he identified it through a research process whose ethics and methods modern psychology has been right to outgrow. The challenge for the present generation is to keep what the Rattlers and Eagles taught us, and to find ways of confirming and extending it that do not depend on doing to children what was done in that Oklahoma summer.