The Unz Review - Mobile
A Collection of Interesting, Important, and Controversial Perspectives Largely Excluded from the American Mainstream Media

In 2007, Dan Dediu and Bob Ladd published a paper claiming there was a non-spurious link between the non-derived alleles of ASPM and Microcephalin and tonal languages. The key idea emerging from this research is that certain alleles may bias language acquisition or processing, subsequently shaping the development of a language within a population of learners. Investigating potential correlations between genetic markers and typological features may therefore open up new avenues of thinking in linguistics, particularly in our understanding of the complex levels at which genetic and cognitive biases operate. Specifically, Dediu & Ladd refer to three necessary components underlying the proposed genetic influence on linguistic tone:

[…] from interindividual genetic differences to differences in brain structure and function, from these differences in brain structure and function to interindividual differences in language-related capacities, and, finally, to typological differences between languages.

That the genetic makeup of a population can indirectly influence the trajectory of language change differs from previous hypotheses linking genetics and linguistics. First, it is distinct from attempts to correlate the genetic features of populations with language families (e.g. Cavalli-Sforza et al., 1994). Second, it differs from Pinker and Bloom's (1990) assertion of genetic underpinnings for a language-specific cognitive module. Furthermore, the authors do not argue that languages act as a selective pressure on ASPM and Microcephalin; rather, the bias is a selectively neutral byproduct. Since then, there have been numerous studies covering these alleles, with the initial claims for positive selection (Evans et al., 2004) coming under dispute (Yu et al., 2007), as have claims of a direct relationship between these alleles and dyslexia, specific language impairment, working memory, IQ, and head size (Bates et al., 2008).

A new paper by Dediu (2010) delves further into this potential relationship between ASPM/MCPH1 and linguistic tone, suggesting this typological feature is genetically anchored to the aforementioned alleles. Generally speaking, cultural and linguistic processes proceed on shorter timescales than genetic change; however, in tandem with other recent studies (see my post on Greenhill et al., 2010), some typological features appear to be more consistently stable than others. The reasons for this stability are broad and varied. For instance, frequency of word use within a population is a good predictor of rates of lexical evolution (Pagel et al., 2007). Genetic factors, then, may also be a stabilising force, with Dediu claiming linguistic tone is one such instance:

From a purely linguistic point of view, tone is just another aspect of language, and there is no a priori linguistic reason to expect that it would be very stable. However, if linguistic tone is indeed under genetic biasing, then it is expected that its dynamics would tend to correlate with that of the biasing genes. This, in turn, would result in tone being more resistant to ‘regular’ language change and more stable than other linguistic features.

To test whether these features are indeed stable, Dediu employs a Bayesian phylogenetic approach to estimate the stability of a large body of linguistic features across the language families in WALS. Without going into specific methodological details, the study uses “multiple software implementations, data codings, stability estimations, linguistic classifications and outgroup choices” to provide a greater degree of robustness. His general findings suggest linguistic tone is comparatively stable: both as a polymorphic feature (8th out of 68 polymorphic features) and in its two binary aspects (simple tone systems, 23rd out of 86; complex tone systems, 8th out of 86).

Besides tone, the current study also turned up other interesting patterns. For instance, the presence of front rounded vowels appears to be extremely stable and shows a skewed geographical distribution, perhaps indicating another case of genetic biasing. Dediu additionally points towards alternative methods for directly testing the genetic biasing hypothesis, such as correlated evolution (the extent to which change in one trait is associated with change in another). For now, though, I want to remain focused on tone. In particular, another new study, by Järvikivi et al. (2010), calls into question some of the distinctions linguists make between tone and duration.

For those of you who are relatively new to the subject: speakers of tone languages use variation in fundamental frequency (F0) to mark and perceive differences in lexical meaning. Conversely, for those of us who speak non-tonal languages, such as English, fundamental frequency serves other functions (e.g. extra-linguistic aspects, such as emotional content). A further commonly made distinction picks out languages where durational information conveys these differences in lexical meaning; Finnish is a good example of this, and such languages are known as quantity languages. One last point: these phonological phenomena of segments, tones and length appear discrete and categorical, much like colours in vision, yet the speech signal underlying this apparently categorical perception is actually continuous.

Now, what Järvikivi et al. present is evidence suggesting that duration and pitch are intimately related, challenging the assumption of a clear-cut conceptual distinction between the two. Besides reviewing a substantial body of literature pointing to such a relationship, the authors also offer two experiments, whose conclusions I'll cite:

Experiment 1 showed that, in addition to duration, participants’ categorization of syllables as long or short was robustly dictated by whether the relevant (first) syllable had a level or a falling tone corroborating similar observations in previous studies… More importantly, however, the result from Experiment 1 extended to Experiment 2 where we showed for the first time that the effect of tonality is not restricted to simple offline categorization, but is systematically and automatically used by speakers in rapid online word recognition to identify phonological quantity.

In relation to Dediu's current paper, and his previous work with Robert Ladd, the typological generalisations, which include quantity languages in the non-tone sample, may need some revising. Given Järvikivi et al.'s results, the use of pitch to mark and process phonological distinctions does not necessarily mean a language should be categorised as a tone language. A remedy to this problem, albeit one which would be very difficult to implement, would be to directly test whether the genotypes associated with tone and non-tone languages are “equally sensitive to tone information without the confounding influence of language background entering the picture”.

The geographic distribution of non-tonal languages taken from WALS. Note that Finnish is included in this sample.

Of course, even if the typological categories were reclassified, this would in no way directly refute the genetic anchoring hypothesis, nor would it probably be adequate to rule out the significance of the ASPM/MCPH1 and tone correlation. Järvikivi et al.'s point is that we cannot unequivocally state that the Finnish quantity system is based solely on a heightened sensitivity to acquiring and processing durational information; we must also take its sensitivity to pitch information into account. Still, there may be differences between prototypical tone languages and languages such as Finnish. For one thing, Finnish speakers could be less sensitive to pitch, as the authors note:

Thus, pitch sensitivity could be a continuous trait that affects cross-population variance with respect to the sensitivity to tone information; some speakers may carry alleles that make them relatively more sensitive to tone information; or, vice versa, Finnish speakers might carry alleles that make them less sensitive to tone information than speakers of tone languages.

To conclude, if future experimental work confirms the results and hypotheses reported by Dediu, then approaches in linguistics and biology will be significantly altered. Still, even if the relationship between these two derived haplogroups and tone turns out to be non-existent, the overarching approach applied in this paper certainly warrants further investigation and elucidation.


Dediu, D. (2010). A Bayesian phylogenetic approach to estimating the stability of linguistic features and the genetic biasing of tone. Proceedings of the Royal Society B: Biological Sciences. PMID: 20810441

Järvikivi, J., Vainio, M., & Aalto, D. (2010). Real-time correlates of phonological quantity reveal unity of tonal and non-tonal languages. PLoS ONE, 5(9). PMID: 20838615

• Category: Science 

Mirrored from A Replicated Typo

It's long since been established that demography drives evolutionary processes (see Hawks, 2008 for a good overview). Similar attempts are being made to describe cultural (Shennan, 2000; Henrich, 2004; Richerson & Boyd, 2009) and linguistic (Nettle, 1999a; Wichmann & Holman, 2009; Vogt, 2009) processes by considering the effects of population size and other demographic variables. Even though these ideas are hardly new, until recently there was a ceiling on the amount of data any one researcher could draw upon. In linguistics, this paucity of data is being remedied through large-scale projects, such as WALS, Ethnologue and UPSID, that bring together a vast body of linguistic fieldwork from around the world. Providing a solid direction for how this might be utilised is a recent study by Lupyan & Dale (2010), in which the authors compare the structural properties of more than 2,000 languages against three demographic variables: a language's speaker population, its geographic spread and its number of linguistic neighbours. The salient point is that certain differences in structural features correspond to the underlying demographic conditions.

With that said, a few months ago I found myself wondering about a particular feature, phoneme inventory size, and its potential relationship to the underlying demographic conditions of a speech community. What piqued my interest was that two languages I retain a passing interest in, Kayardild and Pirahã, both have small phoneme inventories and small speaker communities. The question: is there a correlation between the population size of a language and its number of phonemes? Despite work suggestive of such a relationship (e.g. Trudgill, 2004), there is little empirical evidence to support such claims. Hay & Bauer (2007) perhaps represent the most comprehensive investigation to date, reporting a statistical correlation between the number of speakers of a language and its phoneme inventory size.

In their paper, the authors provide evidence for the claim that the more speakers a language has, the larger its phoneme inventory. Without going into the sub-divisions of vowels (e.g. separating monophthongs, extra monophthongs and diphthongs) and consonants (e.g. obstruents), as that would extend the post by about 1,000 words: both the vowel inventory and the consonant inventory are correlated with population size (and the analysis also rules out language families driving the results). As they note:

That vowel inventory and consonant inventory are both correlated with population size is quite remarkable. This is especially so because consonant inventory and vowel inventory do not correlate with one another at all in this data-set (rho=.01, p=.86). Maddieson (2005) also reports that there is no correlation between vowel and consonant inventory size in his sample of 559 languages. Despite the fact that there is no link between vowel inventory and consonant inventory size, both are significantly correlated with the size of the population of speakers.

Using their paper as a springboard, I decided to look at how other demographic factors might influence the size of the phoneme inventory, namely: population density and the degree of social interconnectedness.

Phoneme Inventory Size and Demography

The first step was to gather demographic data and segment inventory data from two sources: Ethnologue and UPSID. Ethnologue is a great resource for finding out speaker population size and geographic spread, from which we can then work out the speaker density per km2. The UCLA Phonological Segment Inventory Database (UPSID) contains statistical surveys of the phoneme inventories of 451 world languages. The final number of languages in my sample was 397; the removal of some languages was based simply on the lack of data pertaining to geographic spread. In line with Hay & Bauer, I also removed any languages that fell more than four standard deviations from the mean: in this sample, !Xu (141 phonemes) and Archi (91 phonemes). Next, I plugged the data into R and performed simple correlations on speaker size, geographic spread and speaker density (see below).
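The four-standard-deviation criterion can be sketched as a simple filter. The toy inventory below is invented apart from !Xu's 141 phonemes, which is quoted above:

```python
import statistics

def drop_outliers(inventories, n_sd=4):
    """Remove languages whose phoneme count lies more than
    n_sd standard deviations from the sample mean."""
    counts = list(inventories.values())
    mean = statistics.mean(counts)
    sd = statistics.stdev(counts)
    return {lang: n for lang, n in inventories.items()
            if abs(n - mean) <= n_sd * sd}

# Toy data: fifty typical inventories plus one extreme outlier.
sample = {f"lang{i}": 20 + (i % 15) for i in range(50)}
sample["!Xu"] = 141  # the kind of extreme value excluded above

filtered = drop_outliers(sample)
print("!Xu" in filtered)  # prints False
```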

As you can see, the linear relationships for population/segments (rho=.33) and density/segments (rho=.38) are highly significant (p<.0001). However, there appears to be no significant relationship between the geographic spread of a language (area) and its phoneme inventory size (rho=.06, p=.2069). By the way, in case you hadn't guessed: each point corresponds to a language, and the red line shows a non-parametric scatterplot smoother fit through these points (Fox, 2002). To measure the degree of social interconnectedness (SI), I multiplied speaker size and density together and took the log to limit the effect of outliers, the idea being that SI is a product of these two interacting variables (Lycett & Norton, 2010). Again, this shows a highly significant correlation (rho=.39, p<.0001).
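Since the analysis reports Spearman's rho, here is a minimal, stdlib-only sketch of the two steps: a rank-based correlation and the log-transformed interconnectedness measure. The five data records are invented for illustration; they are not the UPSID/Ethnologue values.

```python
import math

def rank(xs):
    """Average ranks (1-based), with tied values sharing their mean rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the ranks."""
    rx, ry = rank(xs), rank(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Invented records: (speaker population, density per km2, phoneme inventory)
data = [(120, 0.5, 11), (3000, 2.0, 18), (50000, 10.0, 22),
        (2000000, 40.0, 30), (90000000, 300.0, 35)]
pops = [d[0] for d in data]
dens = [d[1] for d in data]
segs = [d[2] for d in data]

print(round(spearman(pops, segs), 2))  # monotone toy data, so 1.0
si = [math.log10(p * d) for p, d in zip(pops, dens)]  # interconnectedness
print(round(spearman(si, segs), 2))
```

With real data you would of course substitute the 397-language table and use a vetted implementation, such as R's cor.test with method = "spearman".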

Obviously it's important to remember that correlation is not equivalent to causation. Also, despite spending a fair amount of time collecting the data, I'm not going to take my results too seriously: I would prefer to do a far more comprehensive study, with a larger sample and finer-grained demographic information, before considering these observations in a theoretical context. Unlike Lupyan & Dale, for example, I did not include the degree of inter-language contact. I decided against this not because it's an unimportant factor (on the contrary, I think it might prove highly relevant in explaining how large-scale inter-language contact may drive speaker populations into supporting larger phoneme inventories), but because it would simply take too long to collect all that additional information for no motivating factor other than a blog post.

There are also numerous ways to measure the capacity of a language's phonological resources (Ke, 2006). A commonly employed approach is to count its phonemes (as in this study). There are plenty of languages, however, where differences in meaning are marked through changes in the duration or tone of vowels and consonants. The current dataset, for example, does not consider the extensive use of phonemic length distinctions in the vowels of Japanese, a language with an apparently small inventory yet a large and dense population. Conversely, there are languages at the opposite end of the spectrum, such as Yulu, with a large phoneme inventory and a small, not particularly dense population. This could be one of two things: either (1) my density information was inaccurate, and the Yulu actually live in a tight social network, albeit one limited in population size; or (2) this is to be expected, as Trudgill (2004) claims, because of “the ability of such communities to encourage continued adherence to norms from one generation to another, however complex they may be”.

Still, to give a sense of why I think studying this relationship is a laudable pursuit, I'll offer a tentative hypothesis based primarily on several recent papers on linguistics and cultural transmission.

Cultural Transmission, the Linguistic Niche Hypothesis, and Pressures for Learnability

In recognising the broad similarities between cultural and genetic forms of transmission, researchers are now starting to apply population genetic models to cultural data. The idea is that those factors known to influence patterns of genetic variation and transmission might also shape the patterns of variation we see in cultural products (Lycett & Norton, 2010). Henrich (2004), for instance, looked at how a decrease in effective population size may ultimately result in the loss of pre-existing, socially transmitted cultural elements. Here, demography shapes the cultural landscape through three inter-related factors: population size, density and social interconnectedness. As Lycett & Norton note:

Social interconnectedness reflected the likelihood of encountering a given craft skill and the regularity of such encounters. Social interconnectedness is thus somewhat proportional to the parameters of effective population size (i.e. number of skilled practitioners) and population density (i.e. probability of encounter due to degree of aggregation).

Although this example focuses on the transmission of technology, I think the same logic broadly applies to linguistic transmission (albeit with different outcomes): the degree of social interconnectedness provides different pressures through which a language will adapt (see diagram below). Going back to my dataset, we can clearly see where such thinking might be useful with languages like Faroese, which, as Hay & Bauer point out, has a small number of speakers (48,000) yet a large phoneme inventory (37 segments), contra the general positive trend between population and inventory size. However, if the density of the speaker population is taken into account (34.5 per km2), then Faroese actually fits in quite nicely. This is especially true when compared with a language from a similarly sized yet less dense population, as is the case with Pohnpeian (29,000 speakers; 3 per km2; 20 segments).
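Plugging the figures quoted above into the SI measure (log of population times density) shows how the two languages separate on density despite broadly similar speaker numbers:

```python
import math

def social_interconnectedness(population, density):
    """SI = log10(population x density), the measure used in my analysis."""
    return math.log10(population * density)

faroese = social_interconnectedness(48_000, 34.5)   # 37 segments
pohnpeian = social_interconnectedness(29_000, 3.0)  # 20 segments
print(round(faroese, 2), round(pohnpeian, 2))  # prints 6.22 4.94
```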

As you can see from my graphs there's a lot of scatter, so many examples will run counter to the general trend, and this may be due to other factors influencing what Lupyan & Dale (2010) refer to as different linguistic niches. In particular, languages spoken in exoteric niches (large number of speakers, large area, many linguistic neighbours) and esoteric niches (small number of speakers, small area, few linguistic neighbours) will become adapted to their socio-demographic conditions. In their study, Lupyan & Dale found that languages spoken in an exoteric niche are prone to simpler inflectional morphology, with an increased reliance on lexical strategies to encode certain linguistic features (e.g. evidentiality). Conversely, languages spoken in esoteric niches are morphologically complex, and as such show greater levels of redundancy.

In fact, Lupyan & Dale's use of the word niche is not coincidental: they take inspiration from the way biological organisms adapt to their ecological niches. The basic premise is that the structure of language emerges from its interaction with the linguistic environment (Wray & Grace, 2007). As such, underlying demographic conditions are a vital component in defining the linguistic environment, which in turn generates different pressures for languages to adapt to. One of these pressures is learnability: languages adapt to become increasingly learnable for their speaker-hearer community (Brighton et al., 2005). Lupyan & Dale identify the different learnability biases of children and adults as determinants of the relative differences in trajectory between esoteric and exoteric niches:

With the increased geographic spread and an increased speaker population, a language is more likely to be subjected to learnability biases and limitations of adult learners.

The link with Lupyan & Dale's study is more than just methodological. I predict that certain demographic conditions will induce different learnability pressures, and that these pressures will influence the trajectory of a language at multiple levels of organisation. Take, for instance, the claim that a language operating in an exoteric niche will increase its learnability for adult language acquisition. At one end of the scale, the phoneme inventory size is predicted to increase, opening new avenues for a language to develop adaptive strategies with regard to learnability. From an information-theoretic perspective, as more phonemes become available, they can combine to produce shorter word lengths. This is evident in laboratory experiments (Costa et al., 1998; Selten & Warglien, 2007) and statistical studies (Nettle, 1999). Nettle, for instance, found a strong negative correlation between mean word length and the size of the phoneme inventory across ten languages (see below). Lexical strategies such as shorter word lengths (among others) will gradually reduce a language's reliance on inflectional morphology (an excessively redundant method of coping with a limited inventory) and cause another transition at the other end of the scale: increased transparency between word-forms and meanings (form-meaning compositionality).
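The information-theoretic point can be made concrete: with N phonemes there are N^L possible strings of length L, so distinguishing a vocabulary of V words requires a minimum length of roughly log(V)/log(N). A quick sketch, with an arbitrary vocabulary size (not a figure from Nettle):

```python
import math

def min_word_length(vocab_size, n_phonemes):
    """Shortest length L such that n_phonemes ** L >= vocab_size."""
    return math.ceil(math.log(vocab_size) / math.log(n_phonemes))

V = 50_000  # hypothetical lexicon size
for n in (11, 20, 30, 45):
    print(n, min_word_length(V, n))
```

Larger inventories shrink the minimum word length (here from 5 segments at 11 phonemes down to 3 at 45), which is the direction of Nettle's correlation.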

The learnability pressure may also manifest itself in the diversity of exposure. Exoteric languages are much more likely to be exposed to a larger range of speakers than those spoken in small esoteric communities. If “variability causes the need for abstraction” (Pierrehumbert, Beckman and Ladd, 2001) then, as Hay & Bauer speculate, exposure to less variability would lead to a decrease in the robustness of phonemic categories. Experiments on the acquisition of phoneme categories show that infants discern phoneme boundaries through the use of distributional information in the signal:

That is, when an infant (or adult) is exposed to tokens from a particular phonetic space in a uni-modal distribution, they tend to learn this as a single category. When a distribution over the same phonetic space is bimodal, it is learned as two categories. Increased exposure to [a] large number of speakers would lead to denser distributions and so (presumably) make learning of this kind more robust. With sufficient exposure, categories could be easily learned which would be difficult with more limited, less varied, exposure.
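The distributional learning described in the quote above can be caricatured with a two-means split along one phonetic dimension (durations in ms here; all values are invented): bimodal exposure yields two well-separated category centres, while unimodal exposure collapses them towards each other.

```python
import random
import statistics

def two_means(xs, iters=25):
    """Lloyd's k-means (k=2) on one phonetic dimension."""
    c1, c2 = min(xs), max(xs)
    for _ in range(iters):
        g1 = [x for x in xs if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in xs if abs(x - c1) > abs(x - c2)]
        if not g1 or not g2:
            break
        c1, c2 = statistics.mean(g1), statistics.mean(g2)
    return c1, c2

random.seed(42)
# Bimodal exposure: tokens from two duration categories (short vs long).
bimodal = ([random.gauss(80, 10) for _ in range(500)]
           + [random.gauss(160, 10) for _ in range(500)])
# Unimodal exposure: tokens from a single category.
unimodal = [random.gauss(120, 10) for _ in range(1000)]

for label, tokens in (("bimodal", bimodal), ("unimodal", unimodal)):
    c1, c2 = two_means(tokens)
    print(label, "centre separation:", round(abs(c2 - c1), 1))
```

The bimodal learner recovers centres near 80 and 160 ms, whereas the unimodal learner's two centres end up nearly coincident, mirroring the one-category versus two-category outcome described above.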

Ideally, all these factors would be considered under one large-scale project. I'll reiterate my point about not taking the results I generated too seriously, but it's certainly something I want to continue pursuing. Also, if anyone can provide information to improve any of the following points, I'd be happy to hear from you. In particular, more work needs to be done on the following:

  1. Larger sample sizes, fine-grain demographic data, and a more rigorous statistical analysis;
  2. Testing for additional factors, such as inter-language contact;
  3. Taking into consideration other features, besides the inventory, that distinguish between meanings (e.g. tone).

Main References
Hay, J., & Bauer, L. (2007). Phoneme inventory size and population size. Language, 83(2), 388-400. DOI: 10.1353/lan.2007.0071

Lupyan, G., & Dale, R. (2010). Language structure is partly determined by social structure. PLoS ONE, 5(1). PMID: 20098492

Lycett, S., & Norton, C. (2010). A demographic model for Palaeolithic technological evolution: The case of East Asia and the Movius Line. Quaternary International, 211(1-2), 55-65. DOI: 10.1016/j.quaint.2008.12.001

Trudgill, P. (2004). Linguistic and social typology: The Austronesian migrations and phoneme inventories. Linguistic Typology, 8(3), 305-320. DOI: 10.1515/lity.2004.8.3.305

Nettle, D. (1999). Linguistic Diversity. Oxford University Press, Oxford.

• Category: Science 

Cultural differences are often attributed to events far removed from genetics. This belief typically rests on the assertion that if you take an individual at birth from one society and implant them in another, they will generally grow up well-adjusted to their adopted culture. Whilst this is more than likely true, even if certain cultural features may disagree with someone of a different ethnic background (e.g. degrees of alcohol tolerance), the situation is not as clear-cut as certain political factions would have you believe. Largely thanks to studies of gene-culture coevolution, we are now starting to understand the complex dynamics through which genes and culture interact.

First, a particular culture may exert selection pressures on genes that provide an advantage in adopting a particular cultural trait. This is evident in the strong selection of the lactose-tolerance allele following the spread of dairy farming. Second, pre-existing gene distributions provide pressures through which culture adapts. One proposed example, off the top of my head, is the paper by Dediu and Ladd (2007), which looked at how the distribution of the derived haplotypes of ASPM and Microcephalin may have subtly influenced the development of tonal languages. The paper in question, however, looks more broadly at culture. Specifically, the authors, Baldwin Way and Matthew Lieberman, examine recent genetic association studies showing how variation within genes involved in central neurotransmitter systems is associated with differences in social sensitivity. In particular, they highlight a correlation between the relative frequencies of certain gene variants and the relative degree of individualism or collectivism within certain populations.

Genetic Variation and Social Sensitivity

The first part of their review covers genetic variation in genes associated with the serotonin transporter, the μ opioid receptor and monoamine oxidase A. Each of these genes influences social sensitivity in various ways. For instance, the widely studied polymorphism of the serotonin transporter gene (SLC6A4), known as 5-HTTLPR, is believed to influence personality traits. At this locus an individual can have one of three genotypes: short/short, short/long or long/long. In a report looking at the influence of life stress on depression, Caspi et al. (2003) found that individuals with one or two copies of the short allele were more likely to exhibit depressive symptoms than individuals homozygous for the long allele. Later studies have shown that the interaction between 5-HTTLPR and stress extends to other phenotypes influenced by the serotonin system, including:

[…] post-traumatic stress disorder (Xie et al., 2009), antisocial behaviour (Lie and Lee, in press), substance use (Brody et al., 2009a), suicidality (Roy et al., 2007), sleep quality (Brummett et al., 2007) and anxiety sensitivity (Stein et al., 2007). The multiple phenotypes affected by this interaction attests to the robustness of the effect.

The serotonin transporter gene (SLC6A4). 5-HTTLPR is located on chromosome 17.
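As an aside on how allele frequencies relate to the three genotypes above, here is a Hardy-Weinberg sketch; the short-allele frequencies are illustrative stand-ins, not values from the review:

```python
def genotype_frequencies(p_short):
    """Hardy-Weinberg expectations for a biallelic locus:
    short/short = p^2, short/long = 2pq, long/long = q^2."""
    q = 1.0 - p_short
    return {"short/short": p_short ** 2,
            "short/long": 2 * p_short * q,
            "long/long": q ** 2}

# Hypothetical short-allele frequencies for two populations.
for label, p in (("population A", 0.7), ("population B", 0.4)):
    freqs = genotype_frequencies(p)
    print(label, {k: round(v, 2) for k, v in freqs.items()})
```

Even a modest shift in allele frequency (0.7 versus 0.4) roughly triples the proportion of short/short homozygotes, which is why population-level allele frequencies matter for the arguments that follow.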

Despite making individuals more susceptible to stress-induced depression, the short allele is also associated with a greater sensitivity to positive environments. Taylor et al. (2006) demonstrated that when short/short individuals experienced more positive than negative events over a six-month period, they tended to show the lowest levels of depression. As Way & Lieberman (2010) note:

Subsequent research has shown that this relationship between life events and affect for individuals with the short/short genotype was primarily driven by the social events, as the nonsocial events were not significantly related to the affect (Way and Taylor, 2010). Other groups have found heightened sensitivity to positive social influences amongst short allele carriers as well, which has even been documented using neurochemical measures (Manuck et al., 2004). Thus, these results suggest that the 5-HTTLPR moderates sensitivity to social influence regardless of its valence.

The argument developed from these studies, among others, is that short/short individuals embedded in a highly interconnected social network, as seen in collectivist cultures, are somewhat insulated against negative events that would impact them more harshly in individualist societies. Similar conclusions are drawn for polymorphisms influencing the opioid system (A118G) and monoamine oxidase A (MAOA-uVNTR): here, individuals with social sensitivity alleles are particularly prone to the effects of social exclusion. With all of these studies being at the individual level, the authors ask the following question: if a population had more or fewer of these alleles, might this affect its preferred forms of social interaction?

Social Sensitivity Alleles and Cultural Norms

Looking at populations, the general trend of the studies mentioned in this review suggests a higher prevalence of social sensitivity alleles in East Asian than in Caucasian populations. Take the A118G polymorphism: in a forthcoming study, Way et al. find a robust correlation between this polymorphism and individualism-collectivism (see figure below); in populations showing a greater degree of collectivism, the G allele is more prevalent. Furthermore, this relationship remained significant even when per capita GDP and other factors were controlled for. The same holds for 5-HTTLPR and, to a lesser extent, the MAOA-uVNTR alleles.

As mentioned earlier, the idea the authors are putting forward is one of a relationship between the relative proportion of these alleles and the predominant cultural features of a population:

In collectivistic cultures, relationships are enduring due to social ties that are reified by mutual obligations between members of the family, clan, or religion. These relationships are so salient that the self is defined by them. Thus, the implicit construction of the self in members of these cultures is inherently relational (Fiske et al., 1998). This social construction of the self may function akin to an implicit social support network (Kim et al., 2008) that is likely to buffer individuals with social sensitivity alleles from the adverse consequences of stress and improve life satisfaction.

Conversely, in individualistic cultures there is a tendency towards a high degree of personal autonomy; that is, individual needs can often trump the requirements of the group. This has implications for those who are susceptible to the influence of social sensitivity alleles, including people from ethnic backgrounds where such alleles appear at high frequencies. Take, for instance, people of East Asian descent living in the USA: these individuals suffer higher levels of major depression than Asians still living in Asia. The same is apparently true of US-born Latinos, who experience higher rates of depression than those born abroad. A potential explanation, then, is that collectivism provides a network suitable for sustaining emotional well-being in populations containing high levels of social sensitivity alleles.

Of course, there are numerous problems with making such interpretations. The correlations here could simply be the product of random processes such as genetic drift. The big question that remains, then, concerns the relationship between these alleles and the type of culture they foster, which brings us back to the start. One interpretation, which ultimately depends on controlling for demographic processes, is that the culture of collectivist societies provided a significant pressure in shaping the frequencies of the social sensitivity alleles. A similar scenario can be proposed for individualistic societies, where the pressure instead decreases the prevalence of social sensitivity alleles. However, it may simply be the case that the correlation is not significantly different from background genetic variation, with the culture instead being shaped by pre-existing allele distributions. As the authors note:

[…] it would suggest that genetic selection at these loci is not the explanation for the correlation. Rather, it would suggest that collectivism was ‘stickier’, representing a better fit in populations with a high proportion of putative social sensitivity alleles (Lieberman, 2009). In other words, the psychological and behavioral tendencies associated with collectivism may have been more likely to have been adopted and transmitted in populations with a higher prevalence of such social sensitivity alleles. Similarly, individualism may have represented a better fit for populations with a low proportion of social sensitivity alleles where less reactivity to social rejection or exclusion would have been beneficial.

Citation: Way & Lieberman (2010). Is there a genetic contribution to cultural differences? Collectivism, individualism and genetic markers of social sensitivity. Social Cognitive and Affective Neuroscience, 5 (2-3), 203-211 DOI: 10.1093/scan/nsq059.

• Category: Science 
🔊 Listen RSS

Most of you in the science blogosphere have probably come across Razib’s recent post on linguistic diversity and poverty. The basic argument is that linguistic homogeneity is good for economic development and general prosperity. I was quite happy to let the debate unfold and limit my stance on the subject to the following few sentences I posted previously:

From the perspective of a linguist, however, I do like the idea of really obscure linguistic communities, ready and waiting to be discovered and documented. On the flip side, it is selfish of me to want these small communities to remain in a bubble, free from the very same benefits I enjoy in belonging to a modern, post-industrialised society. Our goal, then, should probably be more focused on documenting, as opposed to saving, these languages.

Since then, the debate has become a lot more heated, with Neuroanthropology wading in against Razib; the second half of that post, at least, is worth reading to get the general flavour of the other side of this debate. Having said that, I wasn’t convinced by the evidence Greg Downey used to dismiss Razib’s hypothesis, so I decided to actually look at the literature on the subject. The first paper I found was one by Nettle et al, in which they examine the relationship between cultural diversity and societal instability using a large cross-national data set of 212 nations. Importantly, they look at cultural diversity in three domains: language, ethnicity and religious affiliation. They also draw a distinction between within-nation (alpha) diversity and between-nation (beta) diversity. Lastly, unlike other studies on the subject, which use simple regression or correlation methods, the current study employs structural equation modelling (SEM):

SEM is a multiequational modeling system suitable for asking complex questions about the responses of systems to interconnected sets of explanatory factors [9,10]. Using SEM we probed the contributions of multiple cultural diversity measures to international variations in economics and societal instability. We evaluated the direct and indirectly-mediated effects of linguistic, religious, and ethnic diversity (both within-nations and across-nations) on indicators of societal instability while controlling for the correlated effects of population size and number of borders.
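The indirectly-mediated effects described in the quote can be illustrated with a toy mediation analysis, where diversity affects instability only through an economic variable. This is a hypothetical sketch using ordinary least squares in place of full SEM; the variable names and coefficients are invented, not taken from the paper:

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y ~ X (with intercept prepended)."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(X1, y, rcond=None)[0]

rng = np.random.default_rng(1)
n = 5000
diversity = rng.normal(size=n)
# Diversity lowers GDP (path a = -0.5); GDP lowers instability (path b = -0.8).
gdp = -0.5 * diversity + 0.1 * rng.normal(size=n)
instability = -0.8 * gdp + 0.1 * rng.normal(size=n)

a = ols(diversity, gdp)[1]                                  # diversity -> GDP
b = ols(np.column_stack([gdp, diversity]), instability)[1]  # GDP -> instability
indirect = a * b  # mediated effect of diversity on instability (~ +0.4)
```

The indirect effect is the product of the two paths; estimating such mediated routes, with proper standard errors and several of them at once, is the basic job SEM does here.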

Looking at societal instability, rather than economic performance alone, the current study’s key finding is that more diversity is associated with more instability. What’s really interesting, however, is that different types and domains of diversity have interacting effects. As Razib predicted, linguistic diversity does have a negative effect on economic performance — and it is through this economic mechanism that societal instability increases. This holds for both within- and between-nation measures of diversity. On the other hand, religious diversity between nations produces the opposite effect: it reduces instability. Now, the surprising part is that religious diversity is especially effective at reducing instability in the presence of high linguistic diversity.

But why should large religious variability between neighbours produce this stabilizing effect? The authors offer one interesting solution:

Alexander[19] argued that religions are cultural inventions which function to extend nations, suggesting that the unit of a ‘nation’ is an emergent property of unifying distinctive belief systems. It therefore may be that a shared religious or moral system within a country which differs from those surrounding countries leads to a sense of shared identity, common purpose or harmony.

I don’t expect this to be the final say on the matter and, like all studies of such a complex subject, there are limitations to the scope of the current study (some of which are noted in the paper). It also doesn’t really answer the question of why linguistic diversity is correlated with poor economic performance. This is clearly a job for economists. Though, for now at least, I’m reasonably convinced by Razib’s take on the matter.

Citation: Nettle, D et al. (2007). Cultural Diversity, Economic Development and Societal Instability. PLoS One, 2(9) PMID: 17895970.


I always remember 2008 as the year when the entire UK media descended upon the former mining town of Bridgend. The reason: over the course of two years, 24 young people, most of them between the ages of 13 and 17, took their own lives. At the time I was working in Bridgend, so I can appreciate the claim of the local MP, Madeleine Moon, that media influence had become part of the problem. After all, as most editors will tell you, the aim is to sell newspapers. And when this rule is rigorously applied, the depths to which some journalists will sink to recycle a news story should come as no surprise. Even at a local level, where you’d think some civic responsibility might exist, journalists clambered over themselves to find a new angle, generating ridiculous claims such as: electromagnetic waves from mobile phones caused the suicides.

In fact, the role the media plays in suicide dates back to the publication of Johann Goethe’s 1774 novel, The Sorrows of Young Werther, in which the protagonist shoots himself after being caught in a love triangle. The book was subsequently banned after several men throughout Europe were believed to have taken their own lives in imitation of Werther. Myths aside, there’s plenty of research showing that suicides increase following media reporting (Phillips, 1974; Stack, 2003; Ishii, 2004). And while the causes of suicide are immensely complicated, their localisation in time and/or space may partially be the result of social learning: exposure to another individual’s suicide may lead to the imitation of suicidal behaviour. In a recent paper, Alex Mesoudi (2009) looks at several hypotheses relating to two general patterns of suicide clusters: point clusters and mass clusters.

The first of these, point clusters, are confined both temporally and spatially: that is, there is a temporary increase in the frequency of suicides within a small community or institution. These are the copycat suicides mentioned above: suicidal behaviour spreads through a localised network via social learning mechanisms such as imitation and emulation. Mass clusters, by contrast, lack spatial clustering, showing a temporary increase in the total frequency of suicides across an entire population. An example is when a celebrity suicide garners a huge amount of media attention, which, given the wide reach of the mass media, leads to people across the country imitating the suicidal behaviour. As Mesoudi notes:

Consistent with a social learning effect, this increase is found to be proportional to the amount of media coverage, e.g. the number of column inches devoted to the suicide[8] or the number of television networks covering the suicide[10]. Moreover, suicide rates do not show a corresponding drop some time after the publicised suicide, suggesting that the immediate increase is not caused by already-vulnerable people committing suicide earlier than they otherwise would have[8] […] There is also evidence that people are more likely to imitate the suicides of celebrities who match them in gender and nationality[9], although this effect is less robust than the celebrity effect[11].

To further investigate these two types of clustering, Mesoudi uses agent-based simulations, combined with statistical cluster-detection analyses, to determine the population-level patterns of behaviour generated by interactions between individuals. One question is whether spatiotemporal point clusters are caused by social learning or by some other mechanism, such as homophily: the tendency for similar individuals to preferentially associate. Under homophily, suicides cluster spatially because each member of the group is independently at high risk, not because suicidal behaviour is transmitted via social learning (non-copycat).
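The two point-cluster mechanisms can be contrasted in a toy agent-based simulation (a sketch of the general idea, not Mesoudi's actual model; all parameters are invented). Agents sit on a line; under social learning a suicide temporarily raises the risk of spatial neighbours, while under homophily a spatial cluster of agents is independently at high risk:

```python
import numpy as np

def simulate(n=200, steps=100, mode="social", rng=None):
    """Positions and times of suicides under one point-cluster mechanism."""
    rng = rng or np.random.default_rng(0)
    base = np.full(n, 0.001)            # baseline per-step suicide risk
    if mode == "homophily":
        base[90:110] = 0.03             # a spatial cluster of high-risk agents
    alive = np.ones(n, dtype=bool)
    boost = np.zeros(n)                 # extra risk from recent nearby suicides
    events = []
    for t in range(steps):
        risk = np.clip(base + boost, 0.0, 1.0)
        died = alive & (rng.random(n) < risk)
        for i in np.flatnonzero(died):
            events.append((i, t))
            if mode == "social":        # neighbours within distance 2 become at-risk
                boost[max(0, i - 2):min(n, i + 3)] = 0.3
        alive &= ~died
        boost *= 0.5                    # social influence decays over time
    return events

def mean_time_gap(events, max_dist=2):
    """Mean time difference between spatially close pairs of suicides."""
    gaps = [abs(t1 - t2)
            for k, (i1, t1) in enumerate(events)
            for i2, t2 in events[k + 1:]
            if abs(i1 - i2) <= max_dist]
    return np.mean(gaps) if gaps else np.nan
```

Under social learning, spatially close suicides also cluster in time; under homophily they are spread across the run. That difference in the time gaps between nearby events is the kind of signature cluster-detection analyses look for.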

For mass suicide clusters, researchers tend to focus on three explanations: (i) the prestige bias associated with the celebrity is alone responsible; (ii) the effect is enhanced because the celebrity shares common characteristics with the target individual (similarity bias); and (iii) the dissemination of suicide information by the mass media is responsible for the suicidal behaviour. For the first two hypotheses, evolutionary models suggest both prestige bias and similarity bias are adaptive methods for acquiring accurate information, which, for the most part, are far more efficient than trial-and-error individual learning and unbiased copying. Although these behaviours are generally considered adaptive social learning rules, there is also room for maladaptive behaviours, such as suicide, to spread throughout the population when exhibited by prestigious and/or similar individuals.

The last hypothesis places the mass media as a central component of mass suicide clusters. It’s important to note that, rather than being mutually exclusive with the biases above, the mass media actually amplifies our social learning rules:

Formally, mass media dissemination resembles “one-to-many” cultural transmission[24], where a single individual can influence a large number of other individuals simultaneously. Cultural evolution models suggest that the extreme one-to-many transmission that is permitted by the mass media can greatly increase the rate at which behavioural traits spread[24], thus potentially generating temporal clusters.

The general results of the simulations support the claim that social learning generates spatiotemporal point clusters. As for the other hypothesis, that homophily generates spatiotemporal clusters, Mesoudi found it was only partially supported (see figure 1 below). Homophily-based clusters occurred only when there was large individual variation in agents’ suicide risk, and these clusters were mainly spatial but never solely temporal. This is fairly obvious when you consider these are high-risk agents who, without social learning, have no reason to cluster their suicides around some point in time. On the basis of these findings, Mesoudi suggests future empirical tests should proceed by taking into account the “degree of individual variability in known suicide risk factors (e.g. age, sex, ethnicity) in a region, and by distinguishing between the spatial-but-not-temporal clusters generated by homophily and the spatiotemporal clusters generated by social learning.”

For the second set of simulations, neither prestige bias nor similarity bias were capable of generating mass clusters alone:

Both prestige and similarity bias act to reduce the subset of potential models from whom suicide-related behaviour can be learned. For prestige bias, this is because only a minority of the population can be, by definition, prestigious. For similarity bias, requiring that models must be similar to oneself in some respect reduces the number of potential models from whom one can learn. Both biases therefore reduce the frequency of social learning events and reduce the probability of clustering. This reduction in the probability of clustering was counteracted under certain conditions […] Yet even under these conditions (strong prestige bias, homophily) mass clusters were no more likely to emerge than purely spatial clusters or spatiotemporal clusters.

But what about the one-to-many transmission produced by the mass media? Well, this did generate mass clusters, but only when social influence was relatively weak — either directly, via a reduced strength of social learning, or indirectly, via prestige bias or similarity bias. When social influence was unrealistically strong, it resulted in suicide pandemics in which all agents eventually committed suicide (see figure 2 below). Of course, this isn’t really reflective of real-world situations. As Mesoudi notes in his discussion:

In summary, prestige and similarity bias were neither necessary nor sufficient for mass clusters, while one-to-many transmission was necessary but not sufficient. The three processes in combination generated mass clusters, which is consistent with sociological evidence for each in actual cases of mass suicide clusters. However, the model highlights the very different roles that each plays: one-to-many transmission acts to spread suicide behaviour across the entire population thus eliminating spatial clustering, while prestige and similarity bias somewhat counter-intuitively […] prevent copycat suicides from persisting and becoming pandemic.

Obviously there are many assumptions made in this model; this is, after all, an extremely complex phenomenon. That said, the current model does highlight the need to restrict the dissemination and glorification of suicides, and this is backed up by real-world evidence. From 1983 to 1986, a large number of people threw themselves in front of trains on the recently introduced Vienna subway system. The reporting during this period provided detailed expositions of the victims and their lives. Finally, in an effort to combat the media influence, strict guidelines were introduced to limit reporting on suicides. From the first to the second half of 1987, subway suicides and attempts dropped by more than 80% — and they have remained at a low level ever since (Etzersdorfer & Sonneck, 1998).

Despite the Viennese experience, media coverage, in the UK at least, is still very dubious. We can clearly see this in media attempts to find any excuse to recycle, and continually maintain, a particular news story. And this isn’t just about suicides. Recently, the British media has been providing non-stop coverage of serial murderer Raoul Moat, not long after lavishly presenting Derek Bird‘s rampage across Cumbria. I’m not saying there’s necessarily a causal relation between the two, but it makes me wonder whether we’ll once again find ourselves in a similar situation.

Citation: Mesoudi, A (2009). The Cultural Dynamics of Copycat Suicide. PLoS One 4(9): e7252. doi: 10.1371/journal.pone.0007252


For me, recent computational accounts of language evolution provide a compelling rationale that cultural, as opposed to biological, evolution is fundamental to understanding the design features of language. The basis for this rests on the simple notion that language is not only a conveyor of cultural information but also a socially learned and culturally transmitted system: that is, an individual’s linguistic knowledge is the result of observing the linguistic behaviour of others. This well-attested process of acquisition and transmission, often termed Iterated Learning, emphasises the effects of differential learnability on competing linguistic variants. Sounds, words and grammatical structures are therefore seen as the products of selection and directed mutation. As terms such as selection and mutation suggest, we can draw many parallels between the literature on language evolution and analogous processes in biology. Indeed, Darwin himself noted such similarities in the Descent of Man. However, one thing evolutionary linguists don’t seem to borrow is a null model. Is it possible that the changes we see in languages over time are just the products of processes analogous to genetic drift?

Such questions are asked in a paper by Reali & Griffiths (2009), which also takes some steps toward remedying the situation by defining a neutral model at the level of linguistic variants. Specifically, they apply this neutral model to three linguistic phenomena: the s-shaped curve of language change; the distribution of word frequencies; and the relationship between word frequencies and extinction rates. But before getting into these three phenomena, a quick overview of the model is needed (for reasons you’ll see later on):

Defining a model of language evolution that is neutral at the level of linguistic variants requires an account of learning that is explicit about the inductive biases of learners–those factors that make some variants easier to learn than others–so that it is clear that these biases do not favour particular variants. We model learning as statistical inference, with learners using Bayes’ rule to combine the clues provided by a set of utterances with inductive biases expressed through a prior distribution over languages. We define a neutral model by using a prior that assigns equal probability to different variants of a linguistic form. While it is neutral at the level of variants, this approach allows for the possibility that learners have more general expectations about the structure of a language–such as the amount of probabilistic variation in the language, and the tendency for new variants to arise–that can result in forces analogous to directed mutation at the level of entire languages.

So, if this is the case, we can appeal to high-level inductive biases as explanatory mechanisms for the structure of languages, without necessarily appealing to selective forces at the level of linguistic variants. By combining the Iterated Learning Model (ILM) with Bayesian learners, the authors arrive at two surprising conclusions: (1) the model is equivalent to the Wright-Fisher model of allele transmission; and (2) the model is able to reproduce basic regularities in the structure and evolution of languages without the need for selection or directed mutation of linguistic variants. To reach this equivalence with the Wright-Fisher model, linguistic variants are treated as different alleles of a gene, with the Markov chain produced by iterated learning matching the model of genetic drift. Essentially, they are proposing that results from population genetics can help define the dynamics and stationary distribution of the Markov chain: this gives a good indication of what kinds of languages will emerge.
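The equivalence can be sketched for the two-variant case: each learner sees N utterances, infers the variant probabilities under a symmetric Beta prior, and then produces N utterances for the next learner. The resulting chain of counts behaves like Wright-Fisher drift with symmetric mutation. This is a toy illustration of the idea with invented parameter values, not the authors' code:

```python
import numpy as np

def iterated_learning(generations=500, N=100, alpha=0.5, rng=None):
    """Chain of Bayesian learners; returns frequency of variant 1 per generation."""
    rng = rng or np.random.default_rng(0)
    x = N // 2                 # initial count of variant 1 among N utterances
    freqs = []
    for _ in range(generations):
        # Posterior over the variant-1 probability: Beta(x + alpha, N - x + alpha).
        # Sampling a production probability from the posterior and then producing
        # N utterances mirrors Wright-Fisher sampling with symmetric mutation.
        theta = rng.beta(x + alpha, N - x + alpha)
        x = rng.binomial(N, theta)
        freqs.append(x / N)
    return np.array(freqs)
```

Because the Beta prior is symmetric, neither variant is favoured at the level of variants; alpha plays a role analogous to the mutation rate, shaping how long runs linger near the boundaries and hence the stationary distribution of the chain.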

By using Bayesian learners, the authors are able to explicitly relate language change to individual inductive biases: by manipulating the parameters of the prior and seeing the consequences produced via Iterated Learning. Also, their mathematical formulation allows them to generalize the biological-linguistic equivalence to the case of an unbounded number of variants:

Following an argument similar to that for the finite case, iterated learning with Bayesian learners considering distributions over an unbounded vocabulary can be show to be equivalent to the Wright-Fisher model for infinitely many alleles (see the electronic supplementary material for a detailed proof).

Just to reiterate:

  • Their model is neutral in regards to both selection and directed mutation at the level of linguistic variants, assuming a symmetric mutation between competing variants.
  • The goal is to provide a null hypothesis to evaluate certain claims about the importance of selective pressures, which are often used to explain the “statistical regularities found in the form and evolution of languages”.
  • The authors also note that the model “allows for a kind of directed mutation at the level of entire languages, with expectations about the amount of probabilistic variation in a language shaping the structure of that language over time. These expectations play a role analogous to setting the mutation rate in the Wright-Fisher model.”

Frequency effects in lexical replacement rates

As you’ve probably guessed, the authors find that their null model can account for the three linguistic phenomena mentioned above. I’m going to focus on lexical replacement rates, because it’s what I’m most familiar with. Two recent studies show how frequency of use predicts: (1) the rate at which verbs change from irregular to regular forms (Lieberman et al., 2007); and (2) word replacement rates in Indo-European languages (Pagel et al., 2007). So, if a word is used frequently, it is replaced much more slowly than a less frequently used one. For instance, Pagel et al. found that, over a 10,000-year time scale, frequently used words show minimal cognate replacement (zero to one replacements), while less frequently used words might have up to nine cognate replacements. Furthermore, certain classes of words evolve at different rates, with prepositions and conjunctions changing more quickly than pronouns and numbers. When plotted, this shows an inverse relationship between frequency of use and replacement rate.

One suggestion is that some form of linguistic, frequency-dependent, purifying selection is a central factor in determining the slow rate of evolution in highly expressed words. However, as Reali & Griffiths show, their neutral model alone is sufficient to account for the frequency effect:

In the infinite case, mutation occurs only to new variants, thus, all variants are eventually lost from the population. A new cognate is represented by a new variant. Replacement happens when the variant corresponding to the old cognate becomes extinct. The case of verb regularization is modelled by assuming that irregular and regular verbs coexist as two variants among other words in the vocabulary. Extinction of the irregular verb happens when the regular form replaces completely the irregular one.

Importantly, their analytic results and simulations indicate the replacement rate follows an inverse power-law relationship with frequency (see figure below).
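The neutral explanation can be illustrated directly: under pure drift, a variant's expected time to extinction grows with its current frequency, so no purifying selection is needed for frequent words to be replaced more slowly. A toy Wright-Fisher simulation (with invented parameters, not the paper's analysis):

```python
import numpy as np

def time_to_loss(p0, N=100, max_gen=10_000, rng=None):
    """Generations until a neutral variant at initial frequency p0 is lost.

    Returns max_gen if the variant instead fixes or survives the whole run,
    i.e. the word is never 'replaced' within the simulated window."""
    rng = rng or np.random.default_rng(0)
    x = int(p0 * N)
    for t in range(max_gen):
        if x == 0:
            return t                   # variant lost: the old cognate is replaced
        if x == N:
            return max_gen             # fixed: treat as surviving the run
        x = rng.binomial(N, x / N)     # neutral Wright-Fisher resampling
    return max_gen

rng = np.random.default_rng(7)
rare = np.mean([time_to_loss(0.05, rng=rng) for _ in range(300)])
common = np.mean([time_to_loss(0.60, rng=rng) for _ in range(300)])
# On average, low-frequency variants are lost (replaced) far sooner than
# high-frequency ones, with no selection anywhere in the model.
```

Averaged over many runs, this reproduces the qualitative frequency-replacement relationship purely from drift, which is the intuition behind Reali & Griffiths' result.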

Now, for me, I think this paper is quite important for the field of language evolution: all too often we assume some sort of purifying selection is shaping the trajectory of language. However, one of the apparent problems with their model is the use of a single chain of Bayesian learners. Recent studies (see Ferdinand & Zuidema, 2009) show that the outcome of iterated learning is sensitive to more factors than are explicitly encoded in the prior, including population size. This is true even when a small modification is made: e.g. expanding the population size to two individuals at each generation. I’m not entirely sure whether or not this has a profound impact on the current results, but it does stress the need for more work on developing null models for language evolution. As the authors themselves note:

While this analysis of one of the most basic aspects of language–the frequencies of different variants–emphasizes the connections between biological and cultural evolution, it also illustrates that the models developed in population genetics cover only one small part of the process of cultural evolution. We anticipate that developing neutral models that apply to the transmission of more richly structured aspects of language will require a deeper understanding of the mechanisms of cultural transmission–in this case, language learning.

Citation: Florencia Reali and Thomas L. Griffiths. Words as alleles: connecting language evolution with Bayesian learners to models of genetic drift. Proc R Soc B (2010). 277: 429-436. DOI: 10.1098/rspb.2009.1513.

N.B. You can currently download this article, and the whole backlog of Royal Society articles, for free until the end of July.


Throughout much of our history language was transitory, existing only briefly within its speech community. The invention of writing systems heralded a way of recording some of its recent history, but for the most part linguists lack the stone tools archaeologists use to explore the early history of ancient technological industries. The question of how far back we can trace the history of languages is therefore an immensely important, and highly difficult, one to answer. However, it’s not impossible. Like biologists, who use highly conserved genes to probe the deepest branches on the tree of life, some linguists argue that highly stable linguistic features hold the promise of tracing ancestral relations between the world’s languages.

Previous attempts using cognates to infer the relatedness between languages are generally limited to predictions within the last 6000-10,000 years. In the present study, Greenhill et al (2010) decided to examine more stable linguistic features than the lexicon, arguing:

If some typological features are consistently stable within language families, and resistant to borrowing, then they might hold the key to uncovering relationships at far deeper levels than previously possible. For example, Nichols (1994) uses typological features to argue for a spread of languages and cultures around the Pacific Rim, connecting Australia, Papua New Guinea, Asia, Russia, Siberia, Alaska and the western coasts of North and South America. If this is correct, then these typological features must be reflecting time depths at least 16 000 years and possibly as deep as 50 000 years ago

Still, to really get the most information possible, it’s best to use a large corpus reflecting the diversity of the world’s languages. This is where the World Atlas of Language Structures (WALS) comes in: it contains a vast body of information about 141 typological features across 2561 languages. It’s a great resource, comparable to the online tools available for geneticists, with Greenhill et al employing phylogenetic analyses of this typological data. They break up their approach into three parts. First, a network method is applied to the observed patterns of typological variability in an effort to find any deep signals within the data. Second, they “quantify the fit of typological and lexical features onto known family trees for two of the world’s largest and best-studied language families — Indo European and Austronesia”. Lastly, they estimate the rates of evolution for typological and lexical features within these families, subsequently comparing the two.

Using a network technique (see figure below), the authors are able to visualise the divergence between languages by looking at the length of the branches, with the box-like structures representing a conflict between signals when certain typological features support incompatible language groupings. So if typological features are stable, then we would expect to see instances where known linguistic history is displayed in the groupings, whilst having a relatively minimal amount of conflicting signals. Conversely, those typological features tending to evolve too rapidly, or undergo diffusion between adjacent languages, will produce a star-like network — creating many boxes and lots of clustering.

So how did they fare? Well, the network shown above does group some of the languages into known families, such as Indo-European, Altaic and Nakh-Daghestanian. In other instances, however, the language families were not recovered — including Sino-Tibetan, Uralic and Trans-New Guinea. The authors also note a substantial number of conflicting signals (box-like structures), leading to an inaccurate recovery of many well-attested phylogenetic relationships within major language families. An example is the network linking German to French, when in fact German is more closely related to English. There are high-level clusters in the data, including languages from continental Eurasia, which may suggest an ancient common ancestry. This is consistent with the hypothesis that typological features evolve slowly enough to allow linguists to identify deep historical relationships, but as the authors note:

[…] phylogenetic networks cannot distinguish between similarity owing to common ancestry and similarity owing to areal diffusion or chance resemblances arising through independent innovation… If some typological features are highly stable and good indicators of common ancestry, then we would expect them (i) to fit well with established language groupings and (ii) to show slower rates of change than lexical features as a whole.

To assess the shape of language evolution they fitted typological and lexical data onto the established family trees. In both Indo-European and Austronesian, the lexical data provided a significantly better fit to the expected family trees than the typological data, with lexical networks displaying a much more tree-like signal. Now, as for estimating the rates of change, they calculated the maximum-likelihood estimate for the rate of evolution across the posterior distribution of trees in each family:

In both families, the distributions of lexical and typological rates are comparable. The similar ranges evident in these plots indicate that there is in fact no substantial difference between the slowest rates of lexical and typological change in either family.

Lastly, they found that, in agreement with their previous research, rates of lexical change are correlated across language families. In contrast, the rate of typological feature change shows no significant correlation between Indo-European and Austronesian. The general conclusion from this absence of correlation is that there is no set of universally stable typological features. In fact, their analysis of rates of evolution failed to identify any typological features that evolve at consistently slower rates than the basic lexicon. Assuming, then, that the signal in the lexicon stretches back 10,000 years, the authors suggest the typological data is constrained to a similar temporal horizon. And this is not the only difficulty in inferring deep ancestral relationships. First, there are high rates of homoplasy across the typological features, so shared typological features are an even less reliable indication of common ancestry than shared basic vocabulary. Second, languages situated geographically close to one another might undergo diffusion:

This can occur through processes like language shift (Thomason & Kaufman 1988)–where speakers of one language change to another owing to societal influences, yet retain morphology or phonology from their original language, or metatypy (Ross 1996)–where a language rearranges some aspect of typology (e.g. morphosyntax) owing to contact between languages without explicit borrowing between the languages, usually as an outcome of intimate cultural contact.

Ultimately, I think the current study highlights how little linguists know about the shape and tempo of language change. Contrary to the notion that structural elements of language change on a near-glacial time scale, structural change appears comparable to lexical change; nor does structure resist diffusion between languages. Another important finding is the difference between the rates of structural evolution across language families. According to Greenhill et al., just as frequency of use is crucial in lexical change, so the use of different structural elements may be crucial in determining structural change. Complicating the situation somewhat is that whereas word use is relatively constant across languages, structural features depend on what other structural constraints are operating within a language. It might invite incredulity, but despite the problems outlined above, I do think future studies will be able to use phylogenetic methods, and the increasing body of data available, to test specific hypotheses about the underlying mechanisms driving the shape and tempo of language evolution.

Citation: Greenhill et al. (2010). The shape and tempo of language evolution. Proceedings of the Royal Society B. doi: 10.1098/rspb.2010.0051.

• Category: Science 

Here is a far-reaching and crucially relevant question for those of us seeking to understand the evolution of culture: Is there any relationship between population size and tool kit diversity or complexity? This question is important because, if the answer is affirmative, then the emergence of modern human culture may be explained by changes in population size, rather than by a species-wide cognitive explosion. Some attempts at an answer have led to models which make certain predictions about what we expect to see when populations vary. For instance, Shennan (2001) argues that in smaller populations, the number of people adopting a particular cultural variant is more likely to be affected by sampling variation. Conversely, in larger populations, learners potentially have access to a greater number of experts, which means adaptive variants are less likely to be lost by chance (Henrich, 2004).
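Shennan's sampling-variation argument is easy to demonstrate with a toy simulation (my own illustration, with invented parameters): under purely neutral copying, a variant held by half of a small population is frequently lost by chance within a few dozen generations, whereas in a larger population it almost never is.

```python
import random

def chance_of_loss(pop_size, generations=50, trials=200, seed=42):
    """Neutral copying model: each learner copies a random model from the
    previous generation. Returns the fraction of trials in which a variant
    starting at 50% frequency is lost entirely."""
    rng = random.Random(seed)
    losses = 0
    for _ in range(trials):
        pop = [1] * (pop_size // 2) + [0] * (pop_size - pop_size // 2)
        for _ in range(generations):
            pop = [rng.choice(pop) for _ in range(pop_size)]
            if sum(pop) == 0:  # the variant has drifted out of the population
                losses += 1
                break
    return losses / trials

small, large = chance_of_loss(10), chance_of_loss(200)
print(small, large)  # sampling variation bites much harder in the small group
```

Nothing here favours one variant over the other; the loss in small populations is pure drift, which is exactly why models like Henrich's predict that small, isolated groups struggle to retain adaptive variants.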

Models aside, existing empirical evidence is limited and the results are mixed. I previously mentioned the gradual loss of complexity in Tasmanian tool kits after the population was isolated from mainland Australia. Elsewhere, Golden (2006) highlighted the case of the isolated Polar Inuit, who lost kayaks, the bow and arrow and other technologies when their knowledgeable experts were wiped out during a plague. Yet two systematic studies (Collard et al., 2005; Read, 2008) of the Inuit case found no evidence for population size being a predictor of technological complexity.

In the current paper, Kline & Boyd (2010) investigate the effects of population size on the complexity of marine foraging tool kits among island populations in Oceania. Importantly, they consider contact between populations:

However, the sample used in both analyses did not include any measure of contact between populations and was drawn mostly from northern coastal regions of the western North America where intergroup contact was probably common (Balikci 1970; Jordan 2009), but difficult to estimate. If, as the cultural adaptation models predict, frequent contact between groups mitigates the effects of small population size, then the results from these analyses do not provide a test of the models.

Using the electronic Human Relations Area Files (eHRAF), Kline & Boyd take a sample of information on indigenous marine foraging tool kits from 10 island societies. This also includes data on the rates of contact between populations and controls for aspects such as resource failure. The general results support the hypothesis in three ways. First, large populations retain a larger repertoire of tools than small island populations (see graph below) — with population size being a much better predictor than other explanatory variables. Second, there is some support for the prediction that contact will be less important in larger populations. For instance, four of the five high-contact societies have more tool types than expected given that they fall within the intermediate range of population sizes. Conversely, low contact groups displayed a trend of having fewer tools than expected, with the overall predictive power of population size and contact being second to population size and fish genera.

Their third and final point concerns how complex tools will be particularly prone to loss, because they are so much harder to learn and make. To quantify tool complexity, the authors used techno-units: defined as “an integrated, physically distinct and unique structural configuration that contributes to the form of a finished artefact”. As an example, at one end of the scale (one techno-unit) there is a stick used for prying shellfish, whilst at the other end, with 16 techno-units, is an untended crab trap utilising a baited lever. Applied to the current dataset, the mean number of techno-units is significantly higher in large populations than in small ones.

If Kline & Boyd’s view of human adaptation is correct, then the constraints imposed on cultural adaptation by population size and rate of contact are more relevant determinants than ecological factors (in this case at least). As they note:

To test this hypothesis, we chose to study island populations because they are ecologically similar and because population size and contact rates are easier to estimate than in continental populations. Then by limiting the analysis to marine foraging tools, we hoped to minimize the effects of ecological variation on tool kit complexity. Thus, our observation that larger populations have more kinds of marine foraging tools and more complex tools than smaller, isolated populations supports the hypothesis that gradual cultural evolution plays an important role in human adaptation.

There are alternative explanations for the relationship between population size and tool kit complexity. It could be that more complex marine foraging technology merely increases the local carrying capacity, subsequently allowing for larger population sizes. However, this explanation does not account for the relationship between population contact and tool kit complexity. Still, there are some aspects of the study I would be cautious about. Take the data they used to measure the rate of contact between populations: it only distinguishes between high and low levels of contact. These limitations certainly set the stage for a more extensive study using finer-grained measures of contact. But as the authors themselves conclude:

These findings are a first step in understanding the nature of cumulative cultural gains and losses. Although our sample size is small and our analysis is restricted to a limited range of tool types, our results suggest that cultural drift or the treadmill mechanism may have influenced the evolution and adaptive radiation of Homo sapiens as a cultural species.

Citation: Kline, M., & Boyd, R. (2010). Population size predicts technological complexity in Oceania. Proceedings of the Royal Society B: Biological Sciences DOI: 10.1098/rspb.2010.0452.

• Category: Science 

How does natural selection account for language? Darwin wrestled with it, Chomsky sidestepped it, and Pinker claimed to solve it. Discerning the evolution of language is therefore a much sought-after endeavour, with a vast number of explanations emerging that offer a plethora of choice, but little in the way of consensus. This is hardly new, and at times the debate has seemed completely frivolous. So much so that in the 19th century, the Société de Linguistique de Paris actually went as far as to ban any discussion and debate on the origins of language. Put simply: we don’t really know that much. Often quoted in these debates is Alfred Russel Wallace, who, in a letter to Darwin, argued that: “natural selection could only have endowed the savage with a brain a little superior to that of an ape whereas he possesses one very little inferior to that of an average member of our learned society”.

This is obviously relevant for those of us studying language evolution. If, as Wallace challenged, natural selection (and more broadly, evolution) is unable to account for our mental capacities and behavioural capabilities, then what is the source of our propensity for language? Well, I think we’ve come far enough to rule out the spiritual explanations of Wallace (although they still persist in some corners of the web), and whilst I agree that biological natural selection alone is not sufficient to explain language, we can certainly place it in an evolutionary framework.

Such is the position of Prof Terrence Deacon, who, in his current paper for PNAS, eloquently argues for a role for relaxed selection in the evolution of the language capacity. He’s been making these noises for a while now, as I previously mentioned here, and he also recognises evolution-like processes in development. However, with the publication of this paper I think it’s about time I disseminated his current ideas in more detail, which, in my humble opinion, offer a more nuanced position than the strict modular adaptationism previously championed by Pinker et al. (I say previously, because Pinker also has a paper in this issue, and I’m going to read it before making any claims about his current position on the matter).

At its core, Deacon’s proposal is that the relaxation of selection pressures allows genetic control to be offloaded onto epigenetic processes. This in turn allows for a greater influence of social transmission due to development being open to experiential modification. Our capacity for language, then, is a story of how developmental and evolutionary dynamics interact. As a recent post over at Babel’s Dawn notes, this is basically a three-phase scenario:

  1. Standard primate brain in which midbrain areas (older parts of the brain) control vocal emotional communications.
  2. A duplication of a section of the genome leads to “relaxed selection” and extensive cross talk between many cerebral cortical systems (newer parts of the brain).
  3. “Unmasked selection” fixes new functional coordination and drives the brain’s anatomical reorganization.

The first aspect we need to appreciate is how Darwinian-like processes operate at the developmental level. Deacon cites many instances, such as the fine-tuning of axonal connection patterns in the developing nervous system, where developmental processes are achieved through selection-like operations. Importantly, though, the logic differs from natural selection in one respect: “selection of this sort is confined to differential preservation only, not differential reproduction. In this respect, it is like one generation of the operation of natural selection”. The point he’s trying to get across is that these intraselection processes are taking place right across nature. Take, for instance, the genus Spalax (the blind mole rat): during development its thalamic visual nucleus is dominated by brainstem auditory and somatic projections. This is because the blind mole rat has vestigial eyes (hence the name), with projections from its small retinas being out-competed in favour of somatic and auditory functions. As Deacon notes:

Experimental manipulations in other species, in which projections from one sensory modality are reduced in early development, likewise exhibit analogous takeover effects, and manipulations of the sensory periphery likewise demonstrate that intraselection adapts neural functional topography with respect to functional experience.

Such developmental flexibility is crucial in that it provides a general mechanism for natural selection to recruit in brain evolution. And as such, it is almost certainly relevant to the evolution of the human brain in relation to language. From this evo-devo perspective, Deacon goes on to highlight how the Darwinian processes characterising natural selection (replication, variation, and differential fitness) have analogous counterparts in intraorganismic processes (redundancy, degeneracy, and functional interdependencies):

First, they involve processes that produce functional integration and/or adaptation even though they are generated by mechanisms that are dissociated from this consequence. Second, they all involve the generation of redundant variant replicas of some prior form (gene, cell, connection, antibody, etc.) brought into interaction with each other and with an external context in a way that allows these differences to affect their subsequent distribution. And third, their preservation and expression are dependent on correlation with context.

According to Deacon, these parallels with evolutionary processes are generally distinguished through the level at which selection operates — and how these interactions generate functional redundancy. Specifically, he looks at three types of redundancy: (i) internal redundancy; (ii) external redundancy; and, (iii) global external redundancy.

Types of Redundancy

In internal redundancy, duplicated genes enable relaxed selection. It’s an odd inversion of natural selection, where instead of the competitive elimination of gene variants, evolution favours preservation. Under this scenario, the original gene continues its functional role, whilst the duplicate gene is allowed to accrue mutations, subsequently increasing the possibility of deleterious interactions and exposing synergistic possibilities. Selection then either removes these deleterious interactions or takes advantage of the new synergistic interactions. Furthermore, this is happening both within and between organisms, as Deacon explains:

For example, the duplication and differentiation of regulatory genes, such as the well-studied homeobox containing genes that control segmental organization in insects and vertebrates via their regulation of the expression of a diverse range of other genes, enables duplication-degradation-complementation at the phenotypic level. The generation of structural redundancy of body parts (e.g., limbs) via segmental duplication similarly relaxes selection on some with respect to others. Again, this increases the probability that random walk degradation will expose synergistic possibilities (e.g., locomotor function) that will become subject to selective stabilization in their own right.

Next is external redundancy, which is the product of functional duplication rather than gene duplication. Here, an extrinsic source provides the organism with a function previously supplied by a particular gene. Although these forms of duplication are analogous in their influence, they can produce significantly different consequences. Take, for instance, the loss of endogenous vitamin C synthesis in some lineages. Instead of internally synthesising vitamin C, humans, and some other species, must frequently acquire vitamin C from external sources. But as Deacon notes, “the human genome includes a pseudogene for the final enzyme in the ascorbic acid synthesis pathway: l-gulono-gamma lactone oxidase (GULO)”. Thanks to external sources of vitamin C, the human GULO gene underwent functional degradation and, if you think about it, shifted selection pressures onto a form of dietary addiction:

Because this essential nutrient was only available extrinsically, selection to maintain its antioxidant function shifted to any sensory biases, behavioral tendencies, and digestive-metabolic mechanisms that increased the probability of obtaining it. What was once selection focused on a single gene locus became fractionally distributed across a great many loci instead.

His last type, global external redundancy, is basically when the relaxation of selection produces global dedifferentiation effects. An example of this is when a species is not under strict reproductive and survival limitations, as seen in domestication. One example Deacon focuses on is the White-Backed Munia: over the course of approximately 250 years, Japanese breeders bred the Munia for colouration, and eventually came up with the Bengalese Finch. What’s relevant about these domesticated hybrids is their ability to acquire songs via social learning. Conversely, the White-Backed Munia does not acquire its song via social learning; rather, the song is genetically inherited. This leads to differences in the rigidity of the songs, with the Bengalese Finch having far more variability within and between individuals than its wild cousins.

Oddly enough, it appears that by inhibiting the stabilising effects of natural and sexual selection on birdsong, domestication actually increased the Bengalese Finch’s behavioural complexity. This is clear in how the socially-acquired songs recruit far more forebrain nuclei and their interconnections than the innately pre-specified song of the White-Backed Munia. As Deacon explains:

As constraints on song generation degraded with prolonged domestication, other neural systems that previously were too weak to have an influence on song structure could now have an effect. These include systems involved in motor learning, conditionally modifiable behaviors, and auditory learning. Because sensory and motor biases can be significantly affected by experience, song structure could also become increasingly subject to auditory experience and the influence of social stimuli. In this way, additional neural circuit involvement and the increased importance of social transmission in the determination of song structure can be reflections of functional dedifferentiation, and yet can also be sources of serendipitous synergistic effects as well.

By appealing to the dedifferentiation and redistribution effects of relaxed selection, Deacon argues for a tendency to shift from an innate, localised function onto a more distributed array of systems. Of course, birdsong is far simpler than human language, and we should be careful to avoid drawing too many conclusions. But there are some interesting parallels between the Finch/Munia example and the features differentiating human language from primate vocal communication:

These include (i) a significant decrease in the specific arousal-coupling of vocal behaviors, (ii) minimization of constraint on the ordering and combination of vocal sounds, (iii) reduction, simplification of the innate call repertoire, (iv) subordination of innate call features to a secondary role in emotional tone expression via speech prosody, (v) a significantly increased role of auditory learning via social transmission, (vi) widely distributed synergistic forebrain control of language compared with highly localized subcortical control of innate vocalizations, and, of course, (vii) an increased social-cognitive regulation of the function of vocal communication.

All this basically leads to the following question: Are humans a self-domesticated species? If so, genetic dedifferentiation of the nervous system may not only have led to the functional complexity of human language, but also contributed to more widespread degeneration, influencing our suite of seemingly unique cognitive, social and emotional abilities. Deacon’s central point is that these processes are not exclusively the products of natural and sexual selection. Instead, we must also appreciate the many levels of inter-linked dynamics at play, including phylogeny and ontogeny. It is also the case that language itself exhibits an evolutionary dynamic. This additional twist allows language to evolve and adapt irrespective of human biological evolution, which, as I mentioned here, can account for the arbitrary features of language, such as X-bar theory and case marking, without appealing to a domain-specific language module.

This is somewhat of a return to Deacon’s older argument: a coevolutionary scenario between language evolution and biological evolution. So that whilst our brains have undergone adaptation to the special demands of language processing, languages themselves are also utilising similar evolutionary mechanisms to favour advantageous variants. We should therefore see language adapting to constraints like learnability: where languages become increasingly learnable for their speaker-hearer community. Importantly, language is generally evolving faster than biology:

This means that brain functions selected for the special cognitive, perception, and production demands of language will reflect only the most persistent and invariant demands of this highly variable linguistic niche. This is another reason to expect that the synergistic constellation of human brain adaptations to language will not include specific grammatical content, and to suspect that much of the rich functional organization of any language is subject to influences on this extragenomic form of evolution. In other words, the differential reproduction of language structures through history will be dependent on the fidelity and fecundity of their transmission.

We’re only dealing with the basic framework here, and not any systematic treatment for how language evolved. What’s now needed is to incorporate findings, and test specific hypotheses, relating to these three basic evolutionary systems: phylogeny, ontogeny and glossogeny (picture taken from Kirby & Hurford, 2001).

Citation: Deacon, T.W (2010). A role for relaxed selection in the evolution of the language capacity. PNAS. DOI: 10.1073/pnas.0914624107.

• Category: Science 

For those of you familiar with the formal mathematical models of cultural evolution (Cavalli-Sforza & Feldman, 1981; Boyd & Richerson, 1985), you’ll know there is a substantive body of literature behind the process of cultural transmission. In this respect, we have a great deal of theoretical knowledge regarding the three vectors of transmission: vertical, oblique and horizontal. We also know about the myriad of ways in which individuals may copy one another: via their parents or peers; conformist transmission (copying the most popular variant) or anti-conformist; randomly selecting someone to copy; and copying cultural traits produced by the most prestigious individuals within a group. Complicating the situation even more are social learning mechanisms: emulation, imitation, explicit teaching, and language. In addition to this, models have addressed when cultural transmission is preferable to individual learning and/or genetic evolution, with the general conclusion being: “that cultural transmission should be favoured when (i) environments change too rapidly for genes to track them effectively, but not so rapidly that the behaviour of a potential model becomes outdated, and/or (ii) individual learning is particularly costly or difficult” (Mesoudi & Whiten, 2008, my emphasis).

Keeping in mind all of these factors, it comes as a surprise that experiments into cultural transmission are lacking. If we look at evolutionary biology, there are many experiments into small-scale microevolutionary processes, such as natural selection, sexual selection, mutation and drift, which are then applied in showing how these processes generate population-level, macroevolutionary patterns. It follows, then, that this sort of population-level thinking can be applied to cultural evolution: the forces and biases of cultural transmission can be studied experimentally to see if they fit with population-level patterns of cultural change documented by scientists. As the current paper by Mesoudi & Whiten (2008) notes, this potentially gives cultural transmission experiments added significance: “cultural transmission should not only be studied for its own sake (i.e. in order to better understand cultural transmission itself), but also in order to explain broader cultural patterns and trends, all as part of a unified science of cultural evolution”.

Indeed, lab experiments into cultural transmission are growing in popularity. I think this is important because these experiments not only add to the insights revealed by mathematical modelling, theoretical literature and real-world data, they also provide alternative explanations for the observed patterns and idiosyncrasies of human behaviour. The current paper provides a nice overview of the literature and methodology used to study “some kind of transmission of information (knowledge or behaviour) along a chain or within a group of more than two participants”. Specifically, they focus on three experimental paradigms: the linear transmission chain, the replacement model and the closed-group method.

Linear Transmission Chain Method

Originally created by Frederic Bartlett (1932) to investigate the role of memory, diffusion chains are comparable to the children’s game of Chinese whispers: here, some sort of cultural material (usually a sentence or phrase) is passed along a linear chain of individuals (see figure 1, taken from Mesoudi & Whiten), until it reaches the final person. At this point, the sentence or phrase is normally different from its original incarnation, having accrued errors through repeated retellings.

These early experiments examined a whole host of material: from Native American folktales to descriptions of sporting events. In each of these studies the original material retained its overall meaning once it reached the end of the chain, but through repeated retellings by successive participants, the material also displayed two consistent patterns of change: 1) the material became much shorter in length, and 2) the material lost much of its original detail. Bartlett also observed what he believed to be evidence for memory being reconstructive, with cultural material becoming distorted through a process of conforming to pre-existing mental schemas.
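The shortening effect Bartlett observed falls out of even the crudest model of a diffusion chain. The following sketch (a hypothetical illustration of mine, not a model from the paper) treats a story as a list of details, each of which a participant independently fails to recall with some small probability:

```python
import random

def transmission_chain(message, chain_length=10, recall_prob=0.9, seed=1):
    """Pass a message down a chain of learners, each of whom independently
    forgets each remaining detail with probability 1 - recall_prob."""
    rng = random.Random(seed)
    history = [message]
    for _ in range(chain_length):
        message = [w for w in message if rng.random() < recall_prob]
        history.append(message)
    return history

story = [f"detail_{i}" for i in range(40)]  # a 40-detail "folktale"
history = transmission_chain(story)
lengths = [len(m) for m in history]
print(lengths)
```

Because detail can only be lost in this model, the message monotonically shrinks along the chain; real chains are more interesting precisely because participants also reconstruct and distort material, rather than merely omit it.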

Recent studies using diffusion chains have further supported Bartlett’s claims of generalisation, whilst also investigating his claims of assimilation, such as pre-existing gender stereotypes and prior cognitive biases. In my own field of evolutionary linguistics, we have combined the diffusion chain methodology with artificial language learning to show that languages, as a consequence of intergenerational transmission, evolve to increase their own ability to be transmitted: by becoming easier to learn and increasingly structured (note: this is for the initial evolution of language (and its underlying structures), and is probably not as relevant to language change). The chains have also been adapted to investigate foraging techniques of chimpanzees and children, the transmission of social learning techniques in chimpanzee tool use, and observing the establishment of a wild-type song culture in zebra finch.

The general findings of all these studies demonstrate that humans of all ages, non-human primates, and even birds are capable of high-fidelity cultural transmission – and that this can be studied empirically. In language evolution, for instance, there are ongoing debates surrounding the relative importance of transmission versus pre-existing biases. However, the current paper focuses on the important point that some cultural transmission experiments resemble reconstruction rather than replication. Now, for those of you familiar with memetics, this is certainly a counterpoint to the differential selection of high-fidelity memes. But importantly it does not contradict cultural evolution:

[…] the broader cultural evolution literature has long recognized that cultural transmission can be imperfect, vulnerable to distortion by content biases, and based on continuous rather than discrete (meme-like) traits (Cavalli-Sforza & Feldman 1981; Boyd & Richerson 1985). Models that make these assumptions are just as useful as models that assume high-fidelity particulate inheritance (Henrich & Boyd 2002). Similarly, while certain patterns of cultural variation might be explained by the operation of cognitive attractors, as argued by Sperber & Hirschfeld (2004), this should not preclude the possibility that cultural variation can be influenced by other cultural transmission biases too (e.g. conformity, see §5), as acknowledged by Claidiere & Sperber (2007). Or perhaps both model-based and content-based biases operate simultaneously but at different levels: for example, content biases might favour the transmission of minimally counter-intuitive concepts in general, but which specific minimally counter-intuitive concept a person adopts is determined by model-based biases such as conformity.

The Replacement Method and Microsocieties

First proposed by Gerard et al. (1956), the replacement method seeks to simulate generational turnover by the repeated removal and replacement of participants within small groups. Taking place over a short period of time, a single replacement can be viewed as a single cultural generation:

Researchers can then examine how group performance changes over successive generations, and how the socialisation of each new participant into the group affects this change… Generally, the replacement method is useful for simulating cultural change that occurs with changing group membership, as is found, for example, in business organizations with frequent staff turnover or traditional hunter-gatherer societies in which small groups maintain stable traditions despite continual population replacement via births, deaths and migration.

Three areas of research are highlighted in relation to the replacement method: cultural group selection; cumulative cultural evolution; and cultural innovation. All are yielding intriguing insights, but because I think the most interesting developments are those emerging from cumulative cultural evolution I’m going to focus solely on this area. First, to help you visualise the replacement method in action, here is an example of the design (taken from Mesoudi & Whiten, 2008):

Cumulative culture is simply the ability to accumulate cultural innovations across successive generations. As such, each new generation benefits from, and subsequently builds upon, the prior generations’ cultural knowledge. This observation is hardly new and is nicely captured by Isaac Newton’s popularisation of the utterance: “If I have seen a little further it is by standing on the shoulders of Giants”. By this, Newton was paying homage to previous generations (or it was a veiled jibe at Hooke) by saying that his particular insights and innovations in physics and mathematics were derived from past thinkers. Without these foundational bricks (such as the ancient Egyptian successes in calculating volumes and areas) it is highly unlikely that a complex cultural product (such as integral calculus) could be developed by a single individual within a single lifetime. In fact, that the fundamental theorem of calculus was independently discovered by Newton and Leibniz suggests the accumulation of knowledge in this time and place had reached a point at which the connection between integration and differentiation was ripe for a genius, or in this case, geniuses, to formulate.

Returning to the point: most replacement studies reached the conclusion that participants at later generations generally demonstrate a marked improvement over those in the initial generations, regardless of the particular task. An example the current paper points to is the pair of studies by Insko et al. (1980, 1983), which found:

[…] the voluntaristic groups of traders increased their productivity and earning during successive replacements due to the emergence and intergenerational transmission of increasingly efficient trading tactics (e.g. soft bargaining: giving more than is received) and division of labour (e.g. seniority rules for leadership, where the longest serving member took charge).

A more recent study by Caldwell & Millen (2008) used the replacement method to show how successive groups developed increasingly effective artefacts (paper aeroplanes and spaghetti towers). For instance, paper aeroplanes in later generations flew significantly further than those created by earlier generations. However, a common problem with this, and other studies into cumulative culture, is that they do not control for individual learning. Without this control it is difficult to conclude whether or not these experiments have demonstrated true cumulative culture: given the same length of time and the ability to successively modify their artefact (e.g. a paper aeroplane), could an individual have invented a product that performed as well (e.g. flew as far) as those found in these transmission experiments?
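The role inheritance plays in cumulative improvement can be caricatured in a few lines (again, my own toy model with invented numbers, not Caldwell & Millen's design): each generation inherits the previous design, possibly degraded by imperfect transmission, and then improves it by trial and error. Lowering transmission fidelity sharply caps how much culture can accumulate.

```python
import random

def innovate(score, attempts, rng):
    """Trial-and-error innovation: keep a random tweak only if it helps."""
    for _ in range(attempts):
        score = max(score, score + rng.gauss(0, 1))
    return score

def chain_score(fidelity, generations=10, attempts=20, seed=3):
    """Each generation inherits a (possibly degraded) copy of the previous
    design, then innovates on top of it."""
    rng = random.Random(seed)
    score = 0.0
    for _ in range(generations):
        score = fidelity * score          # loss during transmission
        score = innovate(score, attempts, rng)
    return score

lossless = chain_score(fidelity=1.0)
lossy = chain_score(fidelity=0.5)
print(lossless, lossy)
```

The lossy chain settles at an equilibrium where each generation's gains merely offset what is lost in transmission, which is one way of seeing why high-fidelity social learning is held to be a precondition for cumulative culture.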

The Closed Group Method

One method that controls for individual learning, but does not simulate replacement, is the closed group:

Individuals within a group repeatedly engage in a task or game over the course of the experiment, and the experimenter can manipulate the opportunities for cultural transmission (i.e. who can view and copy other participants’ behaviour and when) within the group (figure 3). This method is useful for simulating under controlled conditions the various cultural transmission biases modelled in the cultural evolution literature concerning ‘who’ people copy, such as conformity or prestige bias, as well as testing cultural evolutionary hypotheses regarding the conditions under which cultural transmission is predicted to be employed relative to individual learning (‘when’ questions).

There are several advantages to using this method over the other two mentioned. For instance, it is far quicker to complete these experiments than to have to account for the replacement of participants. What’s most relevant, however, is that in contrast to the general body of literature pertaining to the former methods, the closed-group studies are often “explicitly designed to test the assumptions and findings of existing theoretical models of cultural evolution”.

Mesoudi & O’Brien (2008) applied the closed-group method to experimentally simulate the cultural transmission of arrowhead designs. Specifically, they tested a hypothesis put forward by Bettinger & Eerkens (1999) in which the variation of Great Basin projectile points between Nevada and California was the consequence of two different transmission modes — indirect bias and guided variation. Looking at various design features, such as weight, width and length, Bettinger & Eerkens concluded that those points found in eastern California were the products of guided variation, due to the poor correlation between points within the group, whereas those found in central Nevada were subject to indirect bias, given their highly correlated features.

In the experiment, participants designed and tested their own virtual arrowheads, with certain types of arrowheads contributing to greater pay-off than others (you can play the game here). Modifying arrowhead designs could be achieved through two methods: either trial-and-error individual learning (guided variation) or by copying the most successful member of the group (indirect bias). Mesoudi & O’Brien found that during individual learning the arrowheads displayed an increasing amount of diversity. Those participants who could copy the most successful individuals eventually ended up with more homogeneous arrowhead designs.

But why did the arrowheads in Nevada converge towards a uniform design, in contrast to those found in California? One possible explanation is that the environment in prehistoric Nevada was harsher than the Californian environment. This rests on the assumption that social learning is more successful (adaptive) when individual learning is costly, with the Nevadan environment imposing a greater cost on experimentation, restricting the diversity, and increasing the reliance on indirectly biased cultural transmission.

However, as the authors point out, all of these conclusions are dependent on the shape of the fitness function underlying pay-offs of differing arrowhead designs:

Mesoudi & O’Brien (2008) assumed a multimodal adaptive landscape underlying arrowhead fitness, with multiple locally optimal designs (‘peaks’ in the landscape). Consequently, during periods of individual learning, different participants converged on different peaks in this adaptive landscape, thus maintaining within-group diversity in arrowhead designs. During periods of cultural learning, different participants converged on the high-fitness peak found by the most successful group member, thus reducing diversity and increasing overall group fitness. However, Mesoudi (2008b) showed that when the adaptive landscape is unimodal — with a single peak and a single optimal arrowhead design — then individual learners easily converge on this single peak and perform just as well as the cultural learners, thus eliminating the adaptive advantage of cultural learning.
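The difference between the two learning modes on a multimodal landscape is easy to see in a toy simulation (my own sketch; the landscape shape and numbers are invented for illustration, not taken from Mesoudi & O’Brien):

```python
import random

def fitness(x):
    """Toy multimodal landscape over integer 'designs' 0-99: a locally
    optimal design at x = 20 (pay-off 60) and the global optimum at
    x = 80 (pay-off 100), separated by a low-fitness valley."""
    return max(60 - 3 * abs(x - 20), 100 - 3 * abs(x - 80), 0)

def individual_learning(x, rng, steps=200):
    """Guided variation: keep random one-step modifications that do not
    lower pay-off (a simple hill climber)."""
    for _ in range(steps):
        trial = x + rng.choice([-1, 1])
        if fitness(trial) >= fitness(x):
            x = trial
    return x

rng = random.Random(42)
group = [individual_learning(rng.randrange(100), rng) for _ in range(10)]
# Indirect bias: everyone copies the most successful member, collapsing
# the group's diversity onto a single design.
best_design = max(group, key=fitness)
group = [best_design] * len(group)
```

Individual learners end up on whichever peak is nearest to their starting design, so the group stays diverse; once copying is allowed, everyone converges on the best member's peak. On a unimodal landscape the hill climbers would all find the single optimum on their own, which is exactly why Mesoudi (2008b) found no adaptive advantage for cultural learning there.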

The shape of the fitness landscape, then, is crucial in determining the benefits, and adoption, of a particular cultural learning strategy. For instance, point designs might display functional tradeoffs across a range of factors: accuracy, range, durability, killing power etc. Cheshier & Kelly (2006) examined this problem, and found that “thin, narrow points have greater penetrating power, but wide, thick points create a larger wound that bleeds more easily”. As such, we end up with a situation where there are several locally optimal point designs: one may maximise penetrating power while another maximises bleeding. To solve this problem, arrowhead designs, among many other cultural artefacts, are likely to be compromises between multiple functions and requirements.

I think the central point emerging from all of these experiments is an obvious one: that we need to build up a body of mutually supporting evidence (or in some cases, we need to challenge long-standing assumptions from one area to see if they hold). This is echoed by the authors in their conclusion:

While mathematical models in the gene-culture coevolution/cultural evolution tradition have produced invaluable insights into the processes of cultural change, laboratory experiments are needed to test the assumptions and findings of these models with actual people. Similarly, while historical, ethnographic and archaeological studies of cultural evolution… are invaluable in providing real-world data regarding cultural change, laboratory experiments offer a degree of control and manipulation that is impossible to achieve with naturalistic studies. Of course, laboratory experiments also have their shortcomings, most obviously deficits in external validity resulting from the simple tasks and artificial laboratory settings involved. However, when experiments are used in conjunction with other methods, as part of a unified science of cultural evolution (Mesoudi et al. 2006b), then a better understanding of cultural phenomena can be attained than when a single method is used alone.

Citation: Mesoudi, A & Whiten, A (2008). The multiple roles of cultural transmission experiments in understanding human cultural evolution. Philosophical Transactions of the Royal Society B 363, 3489–3501. DOI: 10.1098/rstb.2008.0129.

• Category: Science 

Over at Babel’s Dawn, Edmund Blair Bolles has written several blog posts about the recent Evolang 2010 conference. They’re all worth reading, just to get a gist of the varying approaches taken to language evolution, with Bolles singling out talks by Morten Christiansen and Terrence Deacon as being particular highlights. Not too surprisingly, Deacon is discussing his latest idea (which I mentioned here) about relaxed selection. In Bolles’ own words:

The strength of Deacon’s presentation was that it described a mechanism for the brain changes that support language. The old view that language functions are confined to a few regions like Broca’s area and Wernicke’s area, or even the left hemisphere can no longer stand. Language processing involves complex coordination between multiple systems. But the modern human brain is a relatively recent acquisition. How did all that complexity evolve and become coordinated?

Deacon proposes a three-phase scenario:

  1. Standard primate brain in which midbrain areas (older parts of the brain) control vocal emotional communications.
  2. A duplication of a section of the genome leads to “relaxed selection” and extensive cross talk between many cerebral cortical systems (newer parts of the brain).
  3. “Unmasked selection” fixes new functional coordination and drives the brain’s anatomical reorganization.
• Category: Science 

It is well documented that Thomas Robert Malthus’ An Essay on the Principle of Population greatly influenced both Charles Darwin’s and Alfred Russel Wallace’s independent conceptions of the theory of natural selection. In it, Malthus puts forward his observation that the finite nature of resources is in conflict with the potentially exponential rate of reproduction, leading to an inevitable struggle between individuals. Darwin took this basic premise and applied it to nature, as he notes in his autobiography:

In October 1838, that is, fifteen months after I had begun my systematic inquiry, I happened to read for amusement Malthus on Population, and being well prepared to appreciate the struggle for existence which everywhere goes on from long-continued observation of the habits of animals and plants, it at once struck me that under these circumstances favourable variations would tend to be preserved, and unfavourable ones to be destroyed. The results of this would be the formation of a new species. Here, then I had at last got a theory by which to work.

The interaction of demographic and evolutionary processes is thus central in understanding Darwin’s big idea: that exponential growth will eventually lead to a large population, and in turn will generate competition for natural selection to act on any heritable variation which confers a greater fitness advantage. Under these assumptions we are able to interpret the evolutionary record of most species by appealing to two basic causal elements: genes and the environment. As we all know, in most cases the environment generates selection pressures to which genes respond. For humans, however, the situation becomes more complicated when we consider another basic causal element: culture. The current paper by Richerson, Boyd & Bettinger (2009) offers one way to view this muddied situation by delineating the demographic and evolutionary processes through the notion of time scales:

The idea of time scales is used in the physical environmental sciences to simplify problems with complex interactions between processes. If one process happens on a short time scale and the other one on a long time scale, then one can often assume that the short time scale process is at an equilibrium (or in some more complex state that can be described statistically) with respect to factors governed by the long scale process. If the short time scale and long time scale interact, we can often imagine that at each time step in the evolution of the long time scale process, the short time scale process is at “equilibrium.” A separation of time scales, if justified, makes thinking about many problems of coupled dynamics much easier.

The potential problem emerges when you consider the rapid pace of cultural evolution: is there even a separation between demographic and cultural evolutionary time scales? From the data available to us, it seems that throughout most periods of human history there is a time lag. For instance, the Acheulean lasted for around a million years, and even during the Holocene the first states emerge approximately 5,000 years after the origins of agriculture. However, as the authors note, during the modern Industrial Revolution, large scale changes in culture occurred on the same time scale as population growth rates. This also leaves room for the potential situation of culture evolving even more rapidly than population growth. From these observations they argue the most important factor on time scales of a millennium or greater is the rate of intensification by innovation, not population growth:

Thus in the conventional Darwinian picture population pressure plays an exceedingly important role but on a short time scale. The struggle for existence can be taken for granted. Evolution plays out as adaptive innovations on the long time scale increase the carrying capacity for the environment for the population in question […] For example, we do not expect to see any systematic evidence of increased population pressure immediately before major innovations. Population growth is likely to result from innovations, not the other way around, on the time scales that we normally observe in the archaeological record.
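The separation of time scales in this passage can be sketched as a simple difference-equation model (my own illustrative sketch, with invented rates: r is the fast, short-time-scale growth rate, while the much smaller innovation rate governs the slow growth of the carrying capacity K):

```python
def simulate(years, r=0.05, k0=1000.0, innovation_rate=0.0005):
    """Logistic population growth (fast process) toward a carrying
    capacity K that itself grows slowly through innovation (the
    long-time-scale process). All rates are invented for illustration."""
    n, k = k0, k0
    history = []
    for _ in range(years):
        k *= 1 + innovation_rate   # slow process: innovations raise K
        n += r * n * (1 - n / k)   # fast process: population tracks K
        history.append((n, k))
    return history
```

Because r greatly exceeds the innovation rate, the population sits at a quasi-equilibrium just below K at every step: the struggle for existence is “taken for granted”, and population growth follows innovation rather than preceding it, just as the authors argue for the archaeological record.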

Richerson et al use these links between demography and innovation rates to consider three periods of human history: the late Pleistocene, the Holocene, and modern times. The period I’m going to focus on is the Pleistocene. Here, large-brained hominins existed in Africa and west Eurasia for approximately 150,000 years with relatively slow rates of technical innovation. Then, following the last glacial period around 50,000 years ago, significant modernisation began to take place – albeit limited to certain geographical locations. Prior to this we do see instances of complex tools and symbolic inventions in Africa, yet these cultural advances were frequently followed by retreats. To help explain these patterns, Richerson et al ask the following:

What limited “progress” at different periods of our evolution? Was environmental change leading the trajectory by means of a more or less monotonic increase in selection for larger brains and increased cultural sophistication? Or were genes or culture slow to respond to selection pressures that were exerted from the beginning of the Pleistocene or even earlier?

One potential solution lies in genetics: that despite having big brains, our ancestors simply lacked some crucial cognitive ability. After all, we know brain size is not necessarily the overriding feature of complex cognition. It could come down to differences in development, neuroanatomical organisation, cortical gyrification, cytoarchitecture etc. We know in the cases of transcription factor-encoding genes, like egr1, social information can lead to changes in brain-gene expression, brain function, and social behaviour throughout the lifetime of an organism. Genes also influence the social behaviour of an individual by affecting brain development and physiology. A commonly cited candidate is foxp2: it’s implicated in underlying many socially embedded behaviours, with some proposing the human variant underwent a relatively recent selective sweep around 42,000 BP. This is certainly congruent with claims by the likes of Richard Klein, in which this genetic change took place approximately 50,000 BP.

These issues aside, Richerson et al refer to environmental conditions as being the primary rate-limiting factor on cultural innovation during the Pleistocene. Specifically, the authors refer to instances of high-frequency high-amplitude climate variation known as Dansgaard-Oeschger cycles:

The apparent intensification of the Dansgaard-Oeschger cycles over time may have driven much of the evolution in human cultures in the last 250,000 years. The very intense Dansgaard-Oeschger cycles after 60,000 years ago are a potential explanation for the spread of modern humans out of Africa and the evolution of mode 4 toolmakers in western Eurasia. But why would cold, dry, variable environments have favored increases in the cultural sophistication of humans?

At first, it may appear that fluctuating and variable environments would impede rather than favour human populations. When you factor in culture, however, our species is in a unique situation to exploit an environment far more rapidly than other competing carnivores. This competition may well explain why humans were found at such low levels prior to 50,000 years ago: even with a certain degree of technical sophistication, humans were unable to compete with large animals that are specially adapted to hunting. We can contrast this situation with contemporary cheetah and wild dog populations: competition with larger predators, such as lions, has forced these smaller predators from habitats abundant in medium and large herbivores. As a consequence, both cheetahs and wild dogs are found at relatively low population levels, displaying low levels of genetic diversity – a feature they share with us.

Unlike cheetahs and wild dogs, one suggestion for humans being able to overcome their low population levels, and move out of their narrow niche, is because our cultural and behavioural sophistication allows us to navigate these complex environmental situations:

Humans could probably have tracked the ever-changing kaleidoscope of large animal prey more easily than their competitors (lions, dogs, wolves, and hyenas). Humans could find the dynamic, ephemeral situations where an herbivore population was temporarily out of equilibrium with their prey and exploit the windfall before our competitors could figure out the rapidly changing ecosystems.

Social organisation and weapon technology are also novel ways to reduce competition. For instance, hunting or trapping the predators themselves would be an active way of getting rid of them. Generally though, the authors provide three instances where hominins produced adaptive solutions to the challenges presented by these dynamic glacial environments: 1) mode 3 toolmakers were able to routinely hunt herbivores, and be marginally successful, even when in competition with other predators; 2) increasing social complexity could overcome the challenge of uncertainty that’s inherent to a noisy environment. Here, the larger and denser West Eurasian populations of the Upper Paleolithic may have discovered a better solution to food security problems that eluded Middle Paleolithic people; 3) the ability to maintain a cultural evolutionary system that is responsive to intense millennial and submillennial scale variation.

Based on these points, Richerson et al build up a scenario whereby prior to 50,000 years ago human populations were kept small, probably due to competition with large predators. For humans, another consequence of having a small population is the Tasmanian Effect (which I discussed here): where complex technologies may be lost in small populations by chance. When the Dansgaard-Oeschger cycles began to increase in frequency, culture gave humans an adaptive advantage in a situation their competitors could not exploit. Still, the palimpsest of mode 3 and mode 4 industries at different locations and points in history was subject to highly variable conditions. When these were favourable, hominin populations could reach sizes capable of sustaining certain levels of sophistication. However, as the authors note:

If environments remained poor enough for long enough, a population that had achieved Upper Paleolithic complexity might suffer a Tasmanian-style loss of complexity and drop back to the Middle Paleolithic equilibrium. This sort of dynamic is sometimes called a hysteresis loop. Rather than reacting directly to an environmental change, a population will have a strong tendency to remain either large or small. Given a sufficiently large and persistent increase in K [carrying capacity], it will jump to a higher equilibrium, where it will persist under deteriorating environmental conditions under which the high equilibrium can be sustained but cannot be attained by a population at the low equilibrium. Time lags will be built into the cultural system. Complex elements of technology will not be gained or lost instantly.
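The hysteresis loop described here can be captured in a few lines (a toy sketch with invented thresholds, not the authors’ model): whether a population sustains Upper Paleolithic complexity depends not just on its current size but on the state it is coming from.

```python
def complexity_trajectory(populations, attain=2000, retain=800):
    """Bistable cultural complexity with hysteresis: a population must
    exceed `attain` to reach Upper Paleolithic complexity, but keeps it
    until falling below `retain` (thresholds invented for illustration)."""
    state = "middle"
    states = []
    for n in populations:
        if state == "middle" and n > attain:
            state = "upper"       # jump to the high equilibrium
        elif state == "upper" and n < retain:
            state = "middle"      # Tasmanian-style loss of complexity
        states.append(state)
    return states
```

A population of 1,500 in this toy run retains Upper Paleolithic complexity if it has already attained it, but cannot attain it from the Middle Paleolithic equilibrium; that asymmetry is the hysteresis, and it builds time lags into the cultural system.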

Richerson et al also refer to another process that might contribute to the palimpsest mixture of mode 3 and mode 4 tool traditions in Africa. Given their larger size and greater levels of sophistication, human populations could have overhunted. The subsequent collapse of prey populations would therefore cause human populations to shrink, which in turn may have reverted their production capabilities to mode 3 tools. This then allows the prey populations to recover. In west Eurasia, though, hominin populations were quite capable of maintaining mode 4 tool kits for long periods during the upper Paleolithic. One explanation for this is their location: they were situated on the maritime end of a huge Mammoth Steppe biome:

Both Neanderthals and anatomically modern humans lived in central Siberia, at least during the favorable interstadial periods, but apparently never penetrated the Beringian region… [the] fuelwood shortages in the Verkhoyansk mountains on the western boundary of Beringia formed an impenetrable barrier to human settlement until substantial climate warming… Big game animals depended on heavy fur, not fires, as protection from cold and probably spread readily across the Verkhoyansk barrier. Thus Mammoth Steppe hunters would have had what amounted to a large natural protected reserve in Beringia.

Human populations near these natural reserves could exist at high levels because they did not drive down the populations of Mammoths, and other large game, to low levels. This is compared to Africa, South and East Asia and Australia – all of which consist of warmer climates and no natural refuges for large game. West Eurasia then, seems to be in a unique situation at this point in history, with human populations being able to maintain a large carrying capacity and subsequently sustain mode 4 technologies. Of course, there are alternative explanations as to why hominins were able to maintain a complex culture during the particularly inhospitable parts of the Dansgaard-Oeschger event. One example is that an increased frequency of these cycles prior to the expansion of modern humans selected genes allowing for higher rates of cultural innovation (in contrast to Richard Klein’s proposed mutation).

Aside from the issues raised, the maximum rate of innovation is not necessarily dependent on population density or size. Richerson et al point to several rate-limiting factors, including the possibility of delayed social innovation retarding the rate of technical progress. The gene-culture coevolution process is another instance where the need for major genetic evolution is an impediment to the rate of cultural evolution. And what about the possibility of innovations occurring in punctuated bursts? That is, if large technical revolutions are the basis for change, then the assumption of a smooth innovation rate is incorrect. Their last point, and what I think is an important one, argues the number of individuals may not be the most important demographic factor contributing to the Tasmanian Effect. Rather, it’s the ratio of old to young adult individuals:

Caspari and Lee (2006) used dental wear to roughly estimate the ratio of old to young adult individuals in hominin fossil death assemblages from the Australopithecines to the Upper Paleolithic. Slight increases are evident at each major change of taxa with one major exception: Upper Paleolithic people had an old to young adult ratio of about 2.1, whereas the European Neanderthals had a ratio of only 0.35. In Southwest Asia, where Neanderthals and anatomically modern humans coexisted using Mousterian technology, the small dental sample suggests that both populations had an old to young ratio of about 1… Caspari and Lee suggest that a cultural rather than genetic change was responsible for this difference. The changes are reciprocal in that older adults can accumulate and transmit more culture than young adults and can accumulate more individually acquired knowledge. Caspari and Lee’s analysis lends weight to the idea that large-brained hominins of the late Pleistocene had bi- or multi-stable population dynamics.

All of these are potentially salient factors, but as the authors themselves state:

Our objectives here are limited. We cannot provide a thorough review of the literature on paleodemography, paleoecology, and paleoanthropology. We realize that many elements of our scenarios rest on controversial evidence if not rank speculation. We do hope to clarify the relationship between demographic and cultural evolutionary processes so that we can formulate better hypotheses about several of the puzzling aspects of the paleoanthroplogical record as we currently understand it.

Citation: Richerson PJ, Boyd R, & Bettinger RL. (2009) Cultural innovations and demographic change. Human Biology, 81(2-3), 211-235. DOI: 10.3378/027.081.0306

• Category: Science 

For those of you more interested in listening than reading, the journal Language Learning has a load of podcasts about language as a complex adaptive system. If you fancy some reading, here is the position paper by the Five Graces group. Below is the abstract:

Language has a fundamentally social function. Processes of human interaction along with domain-general cognitive processes shape the structure and knowledge of language. Recent research in the cognitive sciences has demonstrated that patterns of use strongly affect how language is acquired, used, and changes over time. These processes are not independent from one another but are facets of the same complex adaptive system (CAS). Language as a CAS involves the following key features: The system consists of multiple agents (the speakers in the speech community) interacting with one another. The system is adaptive, that is, speakers’ behavior is based on their past interactions, and current and past interactions together feed forward into future behavior. A speaker’s behavior is the consequence of competing factors ranging from perceptual constraints to social motivations. The structures of language emerge from interrelated patterns of experience, social interaction, and cognitive mechanisms. The CAS approach reveals commonalities in many areas of language research, including first and second language acquisition, historical linguistics, psycholinguistics, language evolution and computational modeling.

• Category: Science 

The Guardian has a great extract from Alex Bellos’s new book Alex’s Adventures in Numberland. Besides sounding like the title to a mathematician’s experimentation with LSD, the book dedicates a section (the extract in the Guardian) to the work of a linguist, Pierre Pica, and his discovery that the Munduruku tribe only count up to five. Although even this claim is dubious:

When there was one dot on the screen, the Munduruku said “pug“. When there were two, they said “xep xep“. But beyond two, they were not precise. When three dots showed up, “ebapug” was said only about 80% of the time. The reaction to four dots was “ebadipdip” in only 70% of cases. When shown five dots, “pug pogbi” was managed only 28% of the time, with “ebadipdip” given instead in 15% of answers. In other words, for three and above the Munduruku’s number words were really just estimates. They were counting “one”, “two”, “three-ish”, “four-ish”, “five-ish”. Pica started to wonder whether “pug pogbi“, which literally means “handful”, even really qualified as a number. Maybe they could not count up to five, but only to four-ish?

The whole article reminded me of another Amazonian tribe, the Pirahã, and how they don’t even have a numerical system. In fact, many of the Amazonian tribes and languages are throwing up some interesting findings, like a restricted numerical system and the absence of both quantifiers and grammatical tense. For those of you interested, I’d recommend reading Dan Everett’s book (Don’t Sleep There Are Snakes) on his experiences with the Pirahã.

• Category: Science 

I’m always interested in ways of using social networking sites as massive pools of data for researchers to mine. So on that note check out The Adventures of Auck‘s post on Cultural Variation and Social Networks. The post is more of a rumination, accompanied by some simple statistical analyses, than a detailed exposition. But it does highlight the potential of using social networking sites (he uses Twitter) to investigate a specific question, which in this case is: Do bilingual communities have different social network structures to monolingual communities? He’s got some pretty intriguing results, and like him I’m not sure what to make of them, but I certainly hope this is something he’ll develop in the future.

On another note, the blog also examines the surprising relationship between language evolution and Acacia trees.

• Category: Science 

For some time now, evolutionary biologists have used phylogenetics. It is a well-established, powerful set of tools that allow us to test evolutionary hypotheses. More recently, however, these methods are being imported to analyse linguistic and cultural phenomena. For instance, the use of phylogenetics has led to observations that languages evolve in punctuational bursts, explored the role of population movements, and investigated the descent of Acheulean handaxes. I’ve followed the developments in linguistics with particular interest; after all, tracing the ephemeral nature of language is a daunting task. The first obvious road block is that prior to the invention of writing, the uptake of which is limited in geography and history, language leaves no archaeological record for linguists to examine. One particular note I’d like to make is that when Charles Darwin first formulated his theory of natural selection, he took inspiration from linguistic family trees as the basis for his sketch on the evolutionary tree of life. So it seems rather appropriate that phylogenetic approaches are now being used to inform our knowledge regarding linguistic evolution.

Like many other attempts applying evolutionary thinking in culture, phylogenetic approaches are, at times, met with contempt. This stems from assertions that cultural evolution and biological evolution differ greatly in regards to the relative importance of horizontal transmission, as evinced in these two quotes:

The course of organic evolution can be portrayed properly as a tree of life, as Darwin has called it, with trunk, limbs, branches, and twigs. The course of development of human culture in history cannot be so described, even metaphorically. There is a constant branching-out, but the branches also grow together again, wholly or partially, all the time. A branch on the tree of life may approach another branch; it will not normally coalesce with it. The tree of culture, on the contrary, is a ramification of such coalescences, assimilations, or acculturations – Alfred Kroeber (1948, pg. 138).

Human cultural evolution proceeds along paths outstandingly different from the ways of genetic change. Trees are correct topologies of biological evolution. In human cultural evolution, on the other hand, transmission and anastomosis are rampant. Five minutes with a wheel, a snowshoe, a bobbin, or a bow and arrow may allow an artisan of one culture to capture a major achievement of another – Stephen Jay Gould (1987, pg. 70).

Both Gould and Kroeber provide an interesting entry point into some of the criticisms of phylogenetics. Cladistic (phylogenetic) theories approach the problem of modelling culture from a historical perspective: languages, cultures and populations are primarily derived from a parent group. A visual representation of this would be a bifurcating tree. However, what the above quotes emphasise is a rhizotic (reticulate) approach, in which the focus is more geared towards a non-treelike representation of transmission: here, linguistic and cultural features are shaped by several different, antecedent groups. In considering these different mechanisms of transmission we are faced with two broad themes. The first concerns the relative influence of horizontal and vertical forms of transmission. Is there an overarching form of transmission that dominates the variation found in cultural and linguistic features? Or do we need to examine this on a case by case basis? So, for instance: some cases are dominated by vertical transmission, whilst other cases are dominated by horizontal transmission. Secondly, even if there is a reasonable level of horizontal influence, does this invalidate the use of cultural phylogenies?

Addressing these issues are two papers. The first provides an overview of the relative importance of both horizontal and vertical intergroup transmission in human culture, and then applies these findings to chimpanzee cultural diversity. In the second, the authors test the robustness of phylogenetic inferences in regards to horizontal transmission.

The first paper by Lycett, Collard & McGrew (2009) uses data from long-term field studies to investigate behavioural differences among groups of wild Pan troglodytes. A revealing aspect of this paper, and testament to the power of phylogenetic methods, is their use of cladistics to shed light on three sets of hypotheses: 1) whether genetics explains the behavioural diversity seen in chimpanzees; 2) “the importance of vertical intergroup transmission in the evolution of chimpanzee diversity” (my emphasis); 3) whether chimpanzee culture is adaptive. I’m not going to discuss points (1) and (3), mainly due to limitations of space, time and relevance, but suffice it to say that, having established genetics can’t account for chimpanzee behavioural variation, the authors move on to investigating the role of vertical intergroup transmission. To do this, they carried out two analyses:

First, we reasoned that if vertical intergroup transmission has been the dominant process, the multiregion MP cladogram should be statistically indistinguishable from a cladogram in which the eastern and western populations form separate clades. Next, we first used the RI [Retention Index] to assess how tree-like are patterns in comparative samples of human cultural data sets and biological data sets. Given that the biological data sets can be confidently assumed to have been structured by speciation, which is a branching process, our rationale was that if the human cultural and the biological RIs are not significantly different, then it is reasonable to conclude that the human cultural data sets have been structured by vertical intergroup transmission. We also reasoned that if the chimpanzee cultural RIs fall within the range of human cultural RIs, then whichever process is found to have structured the human cultural data sets is likely to have structured the chimpanzee data sets.

The first analysis revealed no significant differences between the multiregion MP cladogram and a cladogram in which eastern and western populations form separate clades. Furthermore, the returned RIs show the human cultural data sets are not significantly different from the biological data sets. In fact, the authors note that:

[…] the fit between the bifurcating tree model and the human cultural data sets is little different from the fit between the bifurcating tree model and the biological data sets. Not only are the averages similar, but also the ranges are comparable. The RIs for 25 human cultural data sets range from 0.42 to 0.80. The mean RI for the human cultural data set is 0.60. The RIs for the biological data sets range from 0.35 to 0.94. Their mean RI is 0.61. Thus, on average, the human cultural data sets appear to be no more reticulate than the biological data sets.

Thus, based on the aforementioned data sets, it appears vertical intergroup transmission is more salient in generating human cultural diversity than horizontal intergroup transmission. This also appears to be the case for the chimpanzee data sets, with both the multiregion chimpanzee cladogram (RI: 0.44) and the regional chimpanzee cladogram (RI: 0.53) falling within the range of RIs produced by the human data sets; indeed, the regional chimpanzee RI is close to the human mean. All of this suggests a tree-like structure for both chimpanzee and human cultural data sets, and supports the notion that cultural behaviour is primarily the product of vertical intergroup transmission. Of course, as the authors note, there are individual human data sets where horizontal transmission is the dominant evolutionary process; one example is the transmission of basketry traditions amongst Californian Indians. Still, the dominance of horizontal transmission appears to be the exception rather than the rule. Time will tell whether these observations hold up to further scrutiny, as more experiments into the transmission of cultural variants allow for larger analyses.
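For readers unfamiliar with the statistic, the retention index can be sketched in a few lines of Python. The per-character step counts below are hypothetical, purely to show the arithmetic; only the human range (0.42-0.80) and the two chimpanzee RIs come from the paper discussed above.

```python
def retention_index(observed, minimum, maximum):
    """RI = (G - S) / (G - M), summed over all characters in the matrix:
    S = observed state changes on the tree, M = minimum possible changes
    (number of states - 1), G = maximum changes (on a 'star' tree)."""
    g, s, m = sum(maximum), sum(observed), sum(minimum)
    if g == m:
        raise ValueError("uninformative character matrix")
    return (g - s) / (g - m)

# Hypothetical step counts for a small four-character matrix.
obs = [2, 3, 1, 4]
mins = [1, 1, 1, 1]
maxs = [4, 5, 3, 6]
ri = retention_index(obs, mins, maxs)

# The chimpanzee RIs reported above fall inside the human cultural range:
human_range = (0.42, 0.80)
for chimp_ri in (0.44, 0.53):
    assert human_range[0] <= chimp_ri <= human_range[1]
```

An RI of 1.0 indicates a perfectly tree-like data set, while values near 0 indicate rampant homoplasy (or, in the cultural case, borrowing).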

The second paper by Greenhill, Currie & Gray (2009) asks two specific questions relating to phylogenetics and culture: 1) What are the effects of different levels of horizontal transmission on the accuracy of phylogenetic estimates? 2) Are the problematic levels of horizontal transmission actually representative of real situations? Another major point in this paper is its use of Bayesian inference in phylogenetics. Basically, a likelihood function is combined with a prior to generate a posterior distribution over the parameters (the tree topology and a model of evolution). This contrasts with maximum parsimony which, besides being non-parametric, is not statistically consistent.
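To caricature the Bayesian approach in a few lines (the three topologies and their likelihood values below are invented for illustration, not taken from the paper): the posterior over candidate trees is simply prior times likelihood, renormalised.

```python
# Toy Bayesian update over three candidate tree topologies.
# The likelihood values stand in for P(data | tree, model) and are made up.
priors = {"((A,B),C)": 1 / 3, "((A,C),B)": 1 / 3, "((B,C),A)": 1 / 3}
likelihoods = {"((A,B),C)": 0.020, "((A,C),B)": 0.004, "((B,C),A)": 0.001}

# Bayes' rule: posterior ∝ prior × likelihood.
unnorm = {t: priors[t] * likelihoods[t] for t in priors}
z = sum(unnorm.values())
posterior = {t: p / z for t, p in unnorm.items()}

# Posterior mass concentrates on the topology that best explains the data.
best = max(posterior, key=posterior.get)
```

In real Bayesian phylogenetics the tree space is far too large to enumerate, so the posterior is sampled with MCMC, but the underlying logic is the same.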

Using computer models, the authors simulate the evolution of languages under a natural model of language change:

First, we generated data by simulating the evolution of linguistic traits on two different tree topologies under varying degrees of horizontal transmission. Then, we evaluated how horizontal transmission affects the ability of these methods that do not account for horizontal transmission to recover the ‘true’ tree topology. Finally, we explored how borrowing affects inferences taken from the tree structure by attempting to estimate the age at the root of the trees, and how this varies from the root age of the true trees.

The general conclusion of the paper is that phylogenetic inference is surprisingly robust, even when high levels of borrowing are introduced. This is based on a local borrowing scenario (where horizontal transfer takes place between geographical neighbours) over a range of 0-30 per cent unidentified borrowing. Under these conditions, there are few differences between the true and estimated tree topologies. Discrepancies do arise in the root-time estimates, with increased borrowing tending to yield younger age estimates. This is important if your hypothesis depends on the phylogenetic dating of historical events:

For example, there has been considerable controversy over the use of phylogenetic methods to infer the age of Indo-European language family […] Gray & Atkinson support an older farming-based dispersal from Anatolia ca 8500 BP rather than the ‘Kurgan’ hypothesis that dates this family to ca 6000 years BP. Our results suggest that if unidentified borrowing has affected these divergence time estimates, then the real age may be older than that suggested by Gray & Atkinson, making the Kurgan hypothesis even less probable.
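This downward bias on age estimates is easy to reproduce in a toy simulation, far cruder than the authors' model. The sketch below assumes two neighbouring lineages with invented per-trait innovation and borrowing probabilities: borrowing inflates the shared traits, which a dating method would read as a younger divergence.

```python
import random

def simulate_pair(n_traits=500, steps=100, mu=0.005, borrow=0.0, seed=1):
    """Evolve two neighbouring lineages from a common ancestor.
    Each trait mutates to a fresh state with prob `mu` per step; otherwise,
    with prob `borrow`, a lineage copies the neighbour's state instead.
    Returns the fraction of traits the two lineages end up sharing."""
    rng = random.Random(seed)
    a = list(range(n_traits))   # ancestral states
    b = list(a)
    counter = n_traits          # source of fresh (unique) innovated states
    for _ in range(steps):
        for lineage, other in ((a, b), (b, a)):
            for i in range(n_traits):
                if rng.random() < mu:
                    counter += 1
                    lineage[i] = counter       # innovation
                elif rng.random() < borrow:
                    lineage[i] = other[i]      # horizontal transfer
    return sum(x == y for x, y in zip(a, b)) / n_traits

# Borrowing inflates apparent similarity, which would be read as a
# younger divergence when the tree is dated from shared traits.
no_borrow = simulate_pair(borrow=0.0)
heavy_borrow = simulate_pair(borrow=0.2)
assert heavy_borrow > no_borrow
```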

They also explore how the shape of tree topologies offers different levels of robustness. For instance, a tree with a ‘balanced’ topology, in which most nodes reflect an equal number of offspring cultures and have relatively long internal branches, is more robust to the effects of borrowing than a tree composed of shorter internal branches and a chained pattern of descent (an unbalanced topology). Lastly, some types of borrowing will adversely affect the trees. The authors identify two major categories of borrowing: non-systematic and systematic. Simply put, in a non-systematic scenario borrowing increases the amount of noise found in the data, but not enough to introduce any systematic bias. An example here would be the English language borrowing the word ‘taboo’ from Tongan, where the influence is relatively minute in shaping the historical trajectory of English. A systematic scenario, then, is one in which systematic biases are introduced into the data. An extreme example of this is the Oceanic language Yapese, which has several different sources of vocabulary. As the authors note:

This systematic borrowing will perturb the topology by drawing the interacting languages together, making them appear to be more similar. This will have the further effect of making any time-depth estimates shallower. However, these types of borrowing tend to occur within a small subset of the taxa being examined and will not necessarily affect other parts of the tree or the broader scale inferences.
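Returning briefly to the ‘balanced’ versus ‘unbalanced’ topologies mentioned above: the distinction can be quantified with a standard imbalance statistic. The paper itself doesn't supply code, but a Colless-style index over trees encoded as nested tuples captures the idea.

```python
def colless(tree):
    """Colless imbalance: sum over internal nodes of |left leaves - right leaves|.
    A tree is a nested pair-tuple; leaves are strings."""
    def count(node):
        if isinstance(node, str):
            return 1, 0                    # (leaf count, imbalance so far)
        left_n, left_imb = count(node[0])
        right_n, right_imb = count(node[1])
        return left_n + right_n, left_imb + right_imb + abs(left_n - right_n)
    return count(tree)[1]

balanced = (("A", "B"), ("C", "D"))        # equal splits throughout
chained = ((("A", "B"), "C"), "D")         # caterpillar / unbalanced
assert colless(balanced) < colless(chained)
```

A perfectly balanced tree scores 0; the more caterpillar-like the descent pattern, the higher the index, and (per the paper's results) the more vulnerable the topology is to borrowing.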

I guess the general points to take away from this post are: 1) do not assume horizontal transmission is necessarily dominant in shaping culture; and 2) even a certain level of reticulation does not necessarily invalidate a phylogenetic approach to investigating cultural and linguistic evolution.


Lycett, S.J., Collard, M., & McGrew, W.C. (2009). Cladistic analyses of behavioural variation in wild Pan troglodytes: exploring the chimpanzee culture hypothesis. Journal of Human Evolution, 57, 337-349. DOI: 10.1016/j.jhevol.2009.05.015

Greenhill, S.J., Currie, T.E., & Gray, R.D. (2009). Does horizontal transmission invalidate cultural phylogenies? Proc. R. Soc. B, DOI: 10.1098/rspb.2008.1944

• Category: Science 

When examining the dispersal of Pleistocene hominins, one of the more fascinating debates concerns the patterns of biological and technological evolution in East Asia and other regions of the Old World. One suggestion emerging from palaeoanthropological research places a demarcation between these two regions in the form of a geographical division known as the Movius Line. The observations that initially motivated the Movius Line were of differing technological patterns, namely the absence of Acheulean handaxes and Levallois core traditions in East Asia.

Since Hallam L. Movius’ initial proposal, the recent discovery of handaxes within East Asia has led to suggestions that the Movius Line is in fact obsolete. A recent paper by Stephen Lycett & Christopher Norton suggests this may not be the case, highlighting three central points from a growing body of research: 1) “several morphometric analyses have identified statistically significant differences between the attributes of specific biface assemblages from east and west of the Movius Line”; 2) “The number of sites from which handaxes have been recovered in East Asia tend to be geographically sparse compared with many regions west of the Movius Line”; 3) “‘handaxe’ specimens tend only to comprise a small percentage of the total number of artefacts recovered, a situation that contrasts with many classic Acheulean sites in western portions of the Old World, where bifacial handaxes may dominate assemblages in large numbers”.

In light of these developments, the current paper combines cultural transmission theory and demography to produce a “generalised model for Palaeolithic technological evolution during the Pleistocene”. Cultural transmission is generally assumed to underlie technological traditions, where a tradition may be taken as “a particular behaviour (e.g., tool manufacture and use) that is repeated over generations, and is learned and passed on between individuals via a process of social interaction”. Furthermore, we now also know such forms of transmission can be modelled in an analogous manner to genetic transmission. In fact, the parallels between genetic and cultural transmission extend to the point where “factors known to structure patterns of genetic variation and transmission (e.g., drift, selection, dispersal and demography) must also be taken into account when examining patterns of cultural variation across space and time”.

As in population genetics, Lycett & Norton refer to effective population size (Ne), except instead of meaning the number of individuals actively involved in passing on genetic material, they use the term to mean “the number of skilled practitioners of a given craft tradition involved in passing on those skills to subsequent generations via social transmission”. As such, small effective populations are more subject to stochastic factors, such as drift, in shaping which cultural variants are passed on to future generations. By contrast, large populations are less likely to be impacted by stochastic sampling effects, allowing a greater number of innovations to diffuse throughout the population. This is supported via mathematical modelling, which shows:

[…] that a decrease in effective population size (Ne) may lead to a loss of pre-existing socially transmitted cultural elements… [therefore] the greater the number of models, the more choice is available for selecting the best (i.e. most skilled) models from which to copy. That is, in larger populations, cumulative cultural learning is possible because the effect of having a larger number of models from which to pick the most skilled, exceeds the losses resulting from imperfect copying of that skill. Hence, the chance of copying the most skilled elements of a given practice correlates directly with the number of models from which to copy.
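A toy neutral-drift simulation, much cruder than the authors' actual model, is enough to see the role of effective population size: if every learner simply copies a random member of the previous generation, small populations purge cultural variants far faster than large ones.

```python
import random

def variants_surviving(n_learners, n_variants=20, generations=50, seed=2):
    """Neutral drift: each generation, every learner copies the variant of a
    randomly chosen member of the previous generation. Returns how many
    distinct variants remain after the given number of generations."""
    rng = random.Random(seed)
    population = [rng.randrange(n_variants) for _ in range(n_learners)]
    for _ in range(generations):
        population = [rng.choice(population) for _ in range(n_learners)]
    return len(set(population))

small = variants_surviving(n_learners=10)
large = variants_surviving(n_learners=1000)
assert small <= large   # drift purges variants faster in small populations
```

This is the ‘cultural founder effect’ in miniature: nothing about the variants themselves changes, only the number of people transmitting them.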

An example of these demographic conditions in action comes from the Tasmanian Islanders. When Tasmania became isolated from Australia approximately 10-12 thousand years ago, the newly established islanders appear to have lost, or never developed, the ability to manufacture a range of technologies, including fishing spears, cold-weather clothing and boomerangs. Describing this as a ‘cultural founder effect’, Lycett & Norton use these observations to build their model around three parameters: population size, density and social interconnectedness:

Social interconnectedness reflects the likelihood of encountering a given craft skill and the regularity of such encounters. Social interconnectedness is thus somewhat proportional to the parameters of effective population size (i.e. number of skilled craft practitioners) and population density (i.e. probability of encounter due to degree of aggregation).

By varying these population conditions, Lycett and Norton use their model to show how different stages of lithic technologies may be sustained (see figure below, taken from Lycett & Norton (2010)). Crucially, demographic levels may decrease and lead to a situation where sustaining already-created technological innovations is not tenable over the long term. In providing a null model, they remain neutral as to the effects of cognitive and biomechanical evolution on the emergence and disappearance of technological patterns. In fact, the demographic processes described (larger populations and greater density) mirror those conditions under which human biological evolution may have accelerated. That is, human population structures are advantageous for the spread of novel technologies, cultural variants and adaptive genes. This may have implications for why we are seeing growing evidence of gene-culture coevolution in modern human populations.

Although it’s difficult to estimate the demographic parameters of ancient hominin populations, it is fairly certain that demographic levels were relatively higher within Africa than in other parts of the world during the Early to Middle Pleistocene. When coupled with evidence showing the earliest First Appearance Dates (FADs) for Mode 1, Mode 2 and Mode 3 technologies in Africa, the demographic model presented is certainly congruent with the assertion that there is a definitive link between the spread of technological innovations and sustained population growth. Thus, if hominin populations were temporally and spatially discontinuous, then they simply lacked the necessary population conditions to maintain a certain degree of technological sophistication. By extension, the relative absence of bifacial implements in East Asia is perhaps a demographic problem.

As mentioned earlier, we do see some evidence for Acheulean technology in East Asia. But this is not a refutation of the model. Rather, these discoveries fit comfortably within it, especially as the technology occurs at nothing like the levels found west of the Movius Line. That is, even if Acheulean technologies were imported into East Asia, the demographic conditions would have been unfavourable for maintaining them, and we would expect to see Acheulean technology at relatively low densities there. As the authors note:

One further observation that is not often noted is that the artefact density of most of the Early Palaeolithic sites in East Asia is also usually very low. For instance, in Fangniushan and Chenshan, two Middle Pleistocene open-air sites in central-east China, the artefact densities are less than one per m³.

Of course, artefact density may be influenced by a whole host of factors. Raw material availability is one example. More broadly, the markedly different technological patterns observed may not be causally related to demography at all. We could find evidence for sharp cognitive differences between hominin populations east and west of the Movius Line. However, the authors predict that these explanations will become increasingly problematic if “site densities and the chronological distribution of sites in East Asia […] continue to differ from those in the west”.

But this is what’s great about null models: they are testable. In this case, the model predicts that:

[…] evidence for demographic levels in East Asia will be found to be significantly different from those in many parts of western Eurasia and Africa during the Early and Middle Pleistocene. Here, we have hinted at some of the currently available evidence that suggests this may have been the case. What is now urgently needed are more sophisticated means than we have provided here of assessing Pleistocene demographic parameters in the key regions east and west of the Movius Line.

So it’s still very much an open question as to why we find differing technological patterns east and west of the Movius Line. But demography certainly seems like a good candidate.

Citation: Lycett, S.J & Norton, C.J. A demographic model for Palaeolithic technological evolution: The case of East Asia and the Movius Line. Quaternary International, 211 (1-2), 55-65 DOI: 10.1016/j.quaint.2008.12.001.

• Category: Science 

Not only do dolphins have the ability to use marine sponges as foraging tools, they now also emit chi, according to a BBC article:

Humans do seem to feel a sense of kinship with dolphins, intelligent, playful, talkative creatures that they are. And separate research shows people feel the benefit from getting up close and personal with dolphins, says Dr Dobbs. This is because dolphins are thought to emanate “chi” – the essential life force in Chinese medicine – and the basis of various therapies for clinical depression, autism and brain damage.

I’m eagerly awaiting the discovery that dolphins rearrange the ocean floor according to the principles of feng shui.

• Category: Science 

Terrence Deacon and Ursula Goodenough have written a great article on the evolution of symbolic language. I’m mentioning it because they make two particularly interesting points. First point:

Language is in effect an emergent function, not some prior function that just required fine-tuning. Our inherited (“instinctive”) vocalizations, such as laughter, shrieks of fright, and cries of anguish, are under localized, mostly subcortical, neurological control, as are analogous instinctive vocalizations in other animals. By contrast, language depends on a widely dispersed constellation of cortical systems. Each system is also found in other primate brains, where they engage in other functions; their collective recruitment for language was apparently driven by the fact that their previously evolved functions overlapped with particular processing demands necessitated by language. Old structures came to perform unprecedented new tricks.

Using their own interpretations of previous research into birdsong, they also claim a relaxation of selection pressures may have played a role in the emergence of human language:

This reduction of emotional and contextual constraint on sound production opens the door for numerous other influences to play a role, allowing many more brain systems to participate in vocal behavior, including socially acquired auditory experience. In fact, such freedom from constraint is an essential precondition for being able to correlate learned vocal behaviors with the wide diversity of objects, events, properties, and relationships that language is capable of referring to. Hence an evolutionary de-differentiation process, while clearly not the whole story, may be a part of the story for symbolic language evolution.

• Category: Science 

On the basis of recent genome-wide association research, a review by Plomin et al. (2009) predicts that, in line with R.A. Fisher’s reconciliation of Mendelian inheritance and quantitative genetics, investigations “on polygenic liabilities will eventually lead to a focus on quantitative dimensions rather than qualitative disorders”. Basically, they propose a shift in thinking: moving away from medical diagnoses and towards a broader level of analysis using quantitative traits.

By doing this, we should begin to get a better understanding of pleiotropic relationships and quantitative traits. As the authors highlight using the example of the fat mass and obesity-associated (FTO) gene:

Although medical diagnoses (such as obesity) provide a convenient pragmatic framework for the initial discovery of genetic variants, in scientific terms there are no real ‘genes for disorders’. On the contrary, the genetic variants that are implicated in complex traits are associated with quantitative traits at every level of analysis. Thinking and researching quantitatively will provide a much richer picture of the complex biological pathways that lead from genes to disorders and will help us to generate biologically meaningful models of disease aetiology.

They also discuss the possibility of using weighted sets of variants to compute a polygenic risk score, which refers “to the set of multiple DNA variants that are associated with a disorder”. From here, a particularly salient point is raised about an inherent limitation of traditional case-control studies: control subjects are normally chosen on the basis of not having the disorder in question, even though their phenotypic scores may be close to those of the actual cases. That is, if we characterise disorders in terms of quantitative traits, then on a normal distribution some members of the control group fall very close to the tail occupied by the cases. To enhance the statistical power of these studies, the authors propose two alternative designs: either contrast both ends of the distribution (the actual cases versus what they dub ‘super controls’), or assign each participant their own phenotypic score and study the entire distribution (what I dub ambitious).
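For concreteness, a polygenic risk score is just a weighted sum of risk-allele counts across the implicated variants. The SNP names and weights below are entirely hypothetical; the point is only to show why contrasting cases with ‘super controls’ maximises the separation on the score.

```python
# Hypothetical per-variant effect weights (stand-ins for GWA effect sizes).
weights = {"snp1": 0.30, "snp2": 0.15, "snp3": 0.05}

def polygenic_score(genotype):
    """genotype maps SNP name -> risk-allele count (0, 1, or 2)."""
    return sum(weights[s] * genotype.get(s, 0) for s in weights)

case = {"snp1": 2, "snp2": 1, "snp3": 2}
near_case_control = {"snp1": 2, "snp2": 1, "snp3": 1}   # 'control' near the cases
super_control = {"snp1": 0, "snp2": 0, "snp3": 0}       # far tail of the distribution

# Cases vs super controls gives a larger score separation than cases vs
# controls who happen to sit just under the diagnostic threshold.
assert polygenic_score(case) - polygenic_score(super_control) > \
       polygenic_score(case) - polygenic_score(near_case_control)
```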

Still, there are limitations to this approach, namely: “for most disorders, we do not know what the relevant quantitative traits are”.

Here’s the abstract:

After drifting apart for 100 years, the two worlds of genetics – quantitative genetics and molecular genetics – are finally coming together in genome-wide association (GWA) research, which shows that the heritability of complex traits and common disorders is due to multiple genes of small effect size. We highlight a polygenic framework, supported by recent GWA research, in which qualitative disorders can be interpreted simply as being the extremes of quantitative dimensions. Research that focuses on quantitative traits – including the low and high ends of normal distributions – could have far-reaching implications for the diagnosis, treatment and prevention of the problematic extremes of these traits.

Citation: Plomin, Haworth & Davis. Common disorders are quantitative traits. Nature Reviews Genetics, 2009; 10, 872–878. DOI: 10.1038/nrg2670.

Hat-tip F1000.

• Category: Science 