New tool engineered in Edmonton mines Twitter for health trends
'Sometimes it's garbage, but sometimes you can use the tweets as nuggets to actually inform yourself'
If you've ever tweeted about your pesky cough or violent stomach flu, you're not alone.
So many people are sharing information about their medical issues in 280 characters or fewer that public health professionals are turning to Twitter as a new diagnostic tool.
A team of computing scientists at the University of Alberta has created a new data-mining program that tracks Twitter in real time in an attempt to better understand emerging health trends in Alberta and across the country.
"It's funny what people post on Twitter," said Osmar Zaiane, a professor in the U of A's department of computing science, who helped develop the technology.
"They can write things about what they see in the street, they can write about their religious views, their political views, whatever.
"But they can also write other things. Sometimes it's garbage, but sometimes you can use the tweets as nuggets to actually inform yourself about the health of the population."
The program, called Grebe, gathers aggregated data from the social media site. It trawls through millions of tweets, looking for any posts that could signal changes in public health.
The ongoing monitoring program has analyzed more than 18 million tweets so far.
"We called it Grebe," Zaiane said in an interview Wednesday with CBC Radio's Edmonton AM. "The Grebe, as you know, is a duck here in Alberta. It's a freshwater duck that dives in murky water and we are diving in this murky Twitter world."
Deciphering the signals
Zaiane said his team was tasked with creating Grebe by the Alberta Real Time Syndromic Surveillance Network (ARTSSN), a public-health surveillance project managed by Alberta Health Services.
"They were interested in identifying some signals in Twitter," Zaiane said. "We worked with them and designed some machine-learning techniques to teach the computer to recognize, out of this mess, which messages are relevant to health."
The project began in Edmonton, then expanded across the province. It has since been applied to all Canadian provinces, and researchers are now collaborating with the Public Health Agency of Canada and the Centers for Disease Control and Prevention in the United States.
Once the program is perfected, it will be made publicly available. All the aggregated data will be open source.
The 'big picture'
The program relies on machine learning to identify six different dimensions of health — physical, emotional, occupational, social, spiritual, and intellectual — as well as the emotions expressed in each tweet and the relevant location.
Tweets where users describe their symptoms can be useful in early detection of outbreaks or disease clusters, Zaiane said. By using location data, the program can also help pinpoint geographical trends — helping to identify if a certain community is struggling with a specific health issue.
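The idea of flagging health-related tweets and grouping them by location can be sketched in a few lines of Python. This is only an illustration, not Grebe's actual method: a hypothetical keyword list stands in for the trained machine-learning model, and the tweet records, field names and function names are invented for the example.

```python
from collections import Counter

# Hypothetical keyword list standing in for a trained classifier.
# Grebe itself uses machine learning; this is only a rough stand-in.
SYMPTOM_KEYWORDS = {"cough", "flu", "fever", "the runs"}


def is_health_related(text: str) -> bool:
    """Return True if the text mentions any of the assumed symptom keywords."""
    lowered = text.lower()
    # Plain substring matching: simple, but it would also match
    # unrelated words such as "influence".
    return any(keyword in lowered for keyword in SYMPTOM_KEYWORDS)


def count_by_location(tweets):
    """Count health-related tweets per location."""
    counts = Counter()
    for tweet in tweets:
        if is_health_related(tweet["text"]):
            counts[tweet["location"]] += 1
    return counts


# Invented sample records; the first echoes the example Zaiane gives.
sample_tweets = [
    {"text": "I'm not going to work today because I have the runs", "location": "Edmonton"},
    {"text": "This cough will not go away", "location": "Edmonton"},
    {"text": "Great hockey game tonight", "location": "Calgary"},
    {"text": "Stomach flu is the worst", "location": "Calgary"},
]

counts = count_by_location(sample_tweets)
for location, total in counts.most_common():
    print(location, total)
```

A sudden rise in one community's count, compared with its usual level, is the kind of signal an epidemiologist might then check against hospital records.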
Those reports can help enhance the records already being kept by hospitals and health providers, Zaiane said.
"Somebody may say, 'I'm not going to work today because I have the runs.' That is relevant to health and is captured so we know where it was sent and when," he said.
"That can be used by epidemiologists, they cross reference that information and then they can have a big picture of what is going on."
It's not a perfect diagnostic tool, Zaiane said. Only a small segment of the population is represented on Twitter, but the site remains a rich and previously unmined source of health data.
"A lot of it can be noise but a lot of it can confirm other signals they get from other channels," he said. "And Twitter is very fast … it's a very quick signal, if you get it right."