Preserving Cultural Traditions
Note: The image above shows the global geocoded tone of New York Times content.
Jean-Baptiste Michel holds joint academic appointments at Harvard University (FQEB Fellow) and at Google (Visiting Faculty).
His research focuses on using large volumes of data to better understand the world around us, from the way diseases progress in patients over years to the way cultures change in human societies over centuries.
With his colleague Erez Lieberman Aiden, Jean-Baptiste is a Founding Director of Harvard’s Cultural Observatory, where their research team pioneers the use of quantitative methods for the study of human culture, language and history.
His research was featured on the covers of Science and Nature, on the front pages of the New York Times and the Boston Globe, in The Economist, Wired and many other venues. The online tool he helped create — ngrams.googlelabs.com — was used millions of times to browse cultural trends. Jean-Baptiste is an Engineer from Ecole Polytechnique (Paris), and holds an MS in Applied Mathematics and a PhD in Systems Biology from Harvard.
Question from Karim
Jean-Baptiste, what is “Culturomics” and how is this field of study providing scientists with insights into the evolution of human behavior and thought?
Response from Jean-Baptiste
Karim, although the world is extraordinarily complex, science has been remarkably successful in helping us understand it in simple terms.
Science captures organizing principles behind otherwise puzzling phenomena. Physics revealed that light and electricity are both waves that propagate according to the same equations. Biology explains that all populations of living organisms evolve under natural selection.
The news is, just as science has changed our view of the inert and the living world, it is about to change our view of human society and history.
The historical record is being digitized. The books we’ve written throughout history are being stored in silico: about 20 million of the world’s 130 million books have been preserved in this way.
The art we have produced, the maps we have drawn, the music we sang: an enormous effort is under way to save them on hard drives.
This effort, led by companies like Google and by governments, provides scientists with an incredibly detailed way to make measurements about history and society. These data are a fossil record of our human culture. Just as archeologists do, we can measure the way our culture has changed. We can measure historical trends. And with these measurements, we can discover phenomena we didn’t know existed.
We can make science.
Culturomics is the first scientific attempt to leverage these data for scientific understanding. We’ve used 5 million digitized books to capture the yearly changes in cultural trends over 200 years. If words in these books were lined up in 12pt font, the result would stretch to the moon and back ten times over. To make sense of such big data, we deployed simple yet powerful mathematical tools derived, in particular, from genomics (the large-scale study of genomes). So, we called this method culturomics: the large-scale study of cultures.
At the heart of culturomics, in its present incarnation, lie cultural trends. For any word or short phrase, we can plot the frequency with which it was used in the book record, for every year going back at least two centuries. We observe the ebb and flow of words and the concepts they represent. There is an app for this: books.google.com/ngrams. It turns out that these trends contain a remarkable amount of information about history and cultures. We can measure how fast memory of the past decays, and see to what extent this decay is accelerating.
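To make the idea of a yearly usage frequency concrete, here is a minimal sketch in Python. It is an illustration only, not the method behind the tool mentioned above: the toy corpus, the function names (tokenize, ngrams, yearly_frequency) and the choice to normalize by the number of same-length word sequences in each year are all assumptions made for the example.

```python
import re


def tokenize(text):
    # Lowercase the text and keep runs of letters, digits and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    # All consecutive word sequences of length n.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def yearly_frequency(corpus, phrase):
    """Relative frequency of `phrase` per year.

    `corpus` maps a year to a list of texts published that year
    (a hypothetical structure chosen for this sketch).
    Assumption: the frequency is the number of occurrences divided by
    the total number of word sequences of the same length in that year.
    """
    target = tuple(tokenize(phrase))
    n = len(target)
    if n == 0:
        raise ValueError("phrase must contain at least one word")
    result = {}
    for year in sorted(corpus):
        hits = 0
        total = 0
        for text in corpus[year]:
            grams = ngrams(tokenize(text), n)
            total += len(grams)
            hits += sum(1 for gram in grams if gram == target)
        result[year] = hits / total if total else 0.0
    return result


# Toy data, invented purely for illustration.
toy_corpus = {
    1900: ["the telegraph is fast", "send a telegraph today"],
    1950: ["the telephone is fast", "the telegraph is old"],
}

trend = yearly_frequency(toy_corpus, "telegraph")
for year, freq in trend.items():
    print(year, freq)
```

Plotting such a dictionary year by year gives the kind of curve described above. Dividing by the yearly total matters because the number of books published changes greatly over time, so raw counts alone would mostly reflect the size of the record rather than how common a word was.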
We observed the dynamics of language change. We discovered that censorship and suppression leave distinctive marks on the cultural record that one can detect systematically.
There is a bright future for the scientific understanding of human history, society and culture, based on quantitative data such as those the culturomic method uses. I strongly believe this science will reveal deep organizing principles about human society, culture and history. It will permanently change the way we look at ourselves, at our past, and at our future.