From virtual care apps to AI algorithms: the trouble with data collection in healthcare

Tech is changing the way we collect health data. What does the explosion of virtual healthcare services mean for patient data privacy? And what role will data-driven AI play in the future of medicine?

'Under most Canadian legislation, once you de-identify data, you can essentially do whatever you want with it'

Tech plays a role in every part of the healthcare experience, but what happens with all of the data that's collected along the way? (Rocketclips, Inc. / Shutterstock)

The number of telehealth apps spiked during the pandemic to meet a new demand for virtual healthcare services — but this convenience may come at the cost of data privacy.

Through virtual care platforms, patients can access physicians and healthcare services through an app or website, instead of visiting a physical clinic for an in-person appointment.

Many of the companies that run virtual care platforms collect, share and use personal information uploaded by patients, according to a 2022 report.

Funded by a grant from the Office of the Privacy Commissioner of Canada, the report's research team interviewed 18 people affiliated with the platforms, including employees, consultants, contractors and academics.

"We asked them questions about how the platform gathered data, how they had it, what they did with the data, what they saw as the benefits of these different data uses, and what kind of concerns they had," Dr. Sheryl Spithoff told Spark host Nora Young.

Spithoff, a family physician at Women's College Hospital and an assistant professor in the department of family and community medicine at the University of Toronto, was the project's lead researcher.

Dr. Sheryl Spithoff is a family physician at Women’s College Hospital and an assistant professor in the Department of Family and Community Medicine, University of Toronto. (UofT)

According to Spithoff, while virtual care platforms said they did not use personal health data — classified as information collected during a meeting with a physician — they did use patients' registration and user data.

Some companies collected registration data, including names, email addresses and phone numbers, for internal marketing. Some user data, including IP addresses and cookie histories, was also collected and shared with large analytics companies like Facebook and Google.

While this information may seem devoid of personal health details, Dr. Spithoff said it's not that simple.

"Some of these virtual care platforms only provide one type of service. So some are focused on HIV prevention services, some are focused on mental health services," she said. "When this is being shared with an analytics company, they have insight into the nature of someone's health concern."

Even the sharing of de-identified information carries a risk of harm, Spithoff said.

"Right now, under most Canadian legislation, once you de-identify data, you can essentially do whatever you want with it," she said.

"[But] even when identifiers like names and postal codes and dates of birth are removed, these data are often used to create algorithms, artificial intelligence systems, automatic decision making systems — and these can incorporate social biases."

Privacy concerns aside, telemedicine is meeting a demand and filling a gap. With primary healthcare providers, physicians and nurses in short supply, Dr. Spithoff says "virtual care is an important solution."

"There's definitely changes that we need to make to ensure that privacy is protected — and one of them would be clearly defining all the data that's collected through these platforms as personal health information gathered in the context of providing a health service that deserves the same protections." 

From analog to AI

The beginning of data collection was "incredibly revolutionary for medicine," historian Caitjan Gainty told Spark. 

Before the amassing of health data began in the 20th century, doctors would diagnose and treat patients based on their own anecdotal experience. 

"Whereas after this kind of data compilation really starts to happen, you start to see the kind of sharing of information about how to diagnose and how to treat," said Gainty, a senior lecturer in the history of science, technology and medicine at King's College London. 

"It means that there's a higher standard of care sort of across the board for everybody who's diagnosed with a particular kind of disease."

Caitjan Gainty is a Senior Lecturer in the History of Science, Technology and Medicine at King's College London. (King's College London)

Gainty wrote about the history of data collection in medicine, and the role machine learning has played, for The Conversation.

Medicine's use of artificial intelligence (AI) applications exploded in the 2010s, according to Gainty. 

"What it has done, I think really successfully, is do a lot of the kind of information management that we used to do in this very analog sort of way, much more quickly and much more effectively than humans can," she said. 

Recent uses of AI in healthcare range from developing long COVID treatments to identifying tumors to determining lung cancer recurrence.

Beyond these developments, proponents of using AI in healthcare are also looking at machine learning as a tool for solving the problem of how to provide personalized medicine. 

"This idea that we are all individuals and our bodies need individual things: can machine learning help us to understand individuals, and then be able to tailor diagnoses and also therapies to individual bodies," explained Gainty.

"I think that's one of the things that people are really hopeful [AI] will potentially help us to resolve within healthcare, but it can only be as good as the data sets that it has to work with."

When data sets contain societal and cultural biases, diagnoses and treatments suggested by data-driven AI will replicate those biases, Gainty said.

Gainty says we may need to rethink the way data is used in healthcare if the goal is to create a more individualized model of care. 

"I think there's a lot to be said for the way that data has helped to make modern medicine very, very successful across the board.

"But it's not going to be the thing that turns the corner and allows us to attend to everybody's individual health needs completely successfully all the time."


McKenna Hadley-Burke is an associate producer for CBC Radio. She previously worked as a reporter for Cabin Radio in Yellowknife, NT.