Google book database opens window on culture
Google has launched a searchable database of 500 billion words contained in 5.2 million books published between 1500 and 2008 — billed by The Atlantic magazine as the "time suck of the day."
The database of books in English, French, Spanish, German, Russian, and Chinese lets users search words and short phrases to get a handle on how certain terms have waxed and waned in popularity over time.
Called Ngram, it represents about 10 per cent of all books ever published and is a joint project with Harvard University researchers.
While it was designed as a tool for academics and other researchers, any user can search for a string of up to five words.
"The goal is to give an eight-year-old the ability to browse cultural trends throughout history, as recorded in books," Erez Lieberman Aiden, a junior fellow at the Society of Fellows at Harvard who helped build the data set, told the New York Times.
A quick search reveals that the word "women" appears much less frequently than the word "men" from 1800 until the 1970s. By the mid-1980s the trend has reversed, and it reverses again after 2000.
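For readers curious how a search like this could work in principle, here is a minimal Python sketch. It is not Google's or Harvard's code: the function names, the toy corpus, and the choice to divide each count by the total number of same-length phrases in that year are all assumptions made for illustration. Only the five-word limit comes from the description above.

```python
from collections import Counter

MAX_N = 5  # the database accepts phrases of up to five words


def ngrams(text, n):
    """Return every run of n consecutive words in text (lowercased, split on whitespace)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def yearly_frequency(corpus, phrase):
    """Map each year to the share of same-length phrases that match `phrase`.

    corpus: dict of year -> list of book texts (a hypothetical stand-in for real data).
    """
    phrase = " ".join(phrase.lower().split())
    n = len(phrase.split())
    if not 1 <= n <= MAX_N:
        raise ValueError(f"phrase must be 1 to {MAX_N} words")
    result = {}
    for year, texts in sorted(corpus.items()):
        counts = Counter()
        for text in texts:
            counts.update(ngrams(text, n))
        total = sum(counts.values())
        # Assumed normalization: matches divided by all n-word phrases that year.
        result[year] = counts[phrase] / total if total else 0.0
    return result


# Toy data, invented for illustration only.
toy_corpus = {
    1900: ["the men spoke and the men left"],
    2000: ["the women spoke and the men left"],
}

for word in ("men", "women"):
    print(word, yearly_frequency(toy_corpus, word))
```

Dividing by the yearly total matters because far more books were published in recent centuries than in earlier ones, so raw counts alone would make almost every word look as if it were growing in popularity.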
The Harvard scientists used the database to show how it might allow researchers to conduct a quantitative analysis of culture and language. They published their findings in the journal Science on Friday.