Ugh, you’re not going to sign up for Project: Baseline, are you? That new, 10,000-person health study Google’s putting together? Well, OK, not Google, but Verily. Which used to be Google Life Sciences, and is part of Alphabet, the company that used to be called Google but now owns Google. (So, Google.)
You are? You’re going to apply? To wear the special new watch that monitors (but doesn’t tell you) your heart rate? To put the sensor under your mattress that can tell when you’re … er … you know … sleeping? To answer all the questions on your phone and report to one of the three groups Google is working with at Duke, or Stanford, or the ritzy private clinic in Los Angeles once a year for blood tests and imaging and genome scans and and and?
Yeah, you totally are. You figure, eh, Google already knows everything about me—from the email and the photos and the address book and oh, right, the record of everything you’ve ever looked for on the internet. (Even that one time you used Bing, just to see. Google knows about that, too.) So what if your heart rate goes into the dossier, too?
But when it comes to big, ambitious science conducted on human beings, you have to ask: Who benefits? Is it you? Other people? Or the people collecting the data?
Studies on the health of large populations have been some of medicine’s great successes. The Framingham Heart Study began in that town in Massachusetts in 1948 with 5,209 people, and expanded to their children and grandchildren, thousands more, using biennial medical exams to establish that smoking, high blood pressure, and lack of exercise caused heart and lung problems. Iterations of the Nurses’ Health Study, which began in 1976, have looked at the effects of diet and physical activity on health in over a quarter of a million people.
Studies like these are supposed to anonymize the data they collect, de-identifying it so that the useful parts—blood glucose levels, weight, heart rate, whatever—are still there, but things like your name and your address are gone. That was a lot easier in 1948 or 1976. Today, with efforts like biobanks full of tissue samples, or Iceland’s sequencing of the genes of every single one of its citizens, anonymity is harder to come by. Overlaps among big databases, like voter registration or census data alongside personal health information, mean that a good coder can often de-anonymize that stuff. That’s potentially bad for the people in the database, but good news for groups who want to monetize it.
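How does a “good coder” pull that off? The classic move is a linkage attack: join the “de-identified” records to a public database on the quasi-identifiers both share—ZIP code, birthdate, and sex are famously enough to single out most Americans, as Latanya Sweeney showed. Here’s a toy sketch; every name, record, and dataset in it is invented for illustration, not real Baseline or voter data:

```python
# Toy linkage attack: re-attach names to "anonymized" health records
# by joining on quasi-identifiers. All data below is made up.

# "De-identified" health records: names stripped, quasi-identifiers kept.
health_records = [
    {"zip": "02138", "birthdate": "1962-07-31", "sex": "F", "diagnosis": "hypertension"},
    {"zip": "02139", "birthdate": "1985-01-02", "sex": "M", "diagnosis": "asthma"},
]

# Public voter roll: the same quasi-identifiers, with names attached.
voter_roll = [
    {"name": "A. Resident", "zip": "02138", "birthdate": "1962-07-31", "sex": "F"},
    {"name": "B. Neighbor", "zip": "02140", "birthdate": "1990-06-15", "sex": "M"},
]

QUASI_IDS = ("zip", "birthdate", "sex")

def reidentify(health, voters):
    """Join the two datasets on shared quasi-identifiers."""
    index = {tuple(v[k] for k in QUASI_IDS): v["name"] for v in voters}
    hits = []
    for record in health:
        key = tuple(record[k] for k in QUASI_IDS)
        if key in index:  # a match re-attaches a name to a diagnosis
            hits.append((index[key], record["diagnosis"]))
    return hits

print(reidentify(health_records, voter_roll))
# One match: "A. Resident" is now linked to a hypertension diagnosis.
```

The point isn’t that this exact join happens to Baseline data—it’s that “we removed your name” is a much weaker promise than it sounds once enough other databases exist.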
When you sign up for Baseline, you’re going to sign a consent form. Google hasn’t shared it. You’ll certainly consent to the use of your anonymized data for research. And maybe for eventual sale to clients, commercial and academic. And you’ll sign into Baseline with your Google account. “Consent itself does describe the different data types that are being collected,” says Jessica Mega, Verily’s chief medical officer. “Importantly, it does draw a distinction between the information that’s going to be a part of the Baseline study and what exists in someone’s Google account.” Which is to say, under the current agreement, Baseline’s researchers aren’t going to sync your genome with your Gmail.
Possibly you have internalized that you basically have no privacy online anyway. The Baseline FAQ says participants get to see some of their data, like from the labs and the surveys, along with (potentially) some of the collective results. Maybe it’s worth it for the public health benefits. Ten thousand people give up a little and the world gains a lot.
The Deep Dive
That’s what the researchers are hoping, at least. Google has been working with teams at Stanford and Duke for three years to figure out what Baseline will measure. “The more we understand across populations, the more we can do for science,” Mega says.
Stanford and Duke will get the data first—and then after two years it’ll be open to other qualified medical researchers. (They’ll establish their own ethics rules and consent for their work.) “It’s not a hypothesis-driven study,” says Sam Gambhir, a Stanford cancer researcher who helped design the project. “It’s a study to acquire a longitudinal dataset.” If it works, it’ll be some serious data: blood, genome, urine, tears, activity via wearable, heart, sleep, state of mind. “No one has done this deep a dive into each and every one of 10,000 individuals,” Gambhir says.
That size and depth explain why Google is involved. Even if Gambhir had come up with the idea on his own and somehow gotten federal funding (to the tune of $100 million, some reports say), he says, “we’d still have to find someone like a Verily or Alphabet to work with because of the large data structure needs and interactivity between participants and the internet.”
Google remains a for-profit commercial entity. It’s going to want something to sell. “It’s the tools and technology to create something like this,” Mega says. “We’ve needed to create new tools like the Verily Study Watch, which has a number of different sensors. We’re creating platforms that pull in data that comes not only from sensors but from classical assessments and molecular assessments.”
Still, though … Google’s strength is data. So, maybe this is relevant: Last year DeepMind, an English artificial intelligence company that’s also part of Alphabet (so, Google), cut a deal with the National Health Service to share data in return for an app and AI brainpower to treat acute kidney injury. But the privacy details weren’t handled well. Google didn’t get full consent for data sharing, and it wasn’t clear how else the company might use the patient data. “At the moment it’s just, get in there, gain the first mover advantage, and build networks and knowledge about illness and disease,” says Julia Powles, a researcher at Cornell who has written about DeepMind and the NHS. “It’s Alphabet, the most powerful company in the world. They can afford to do that for a while, and that looks indistinguishable from the public interest.”
And after that? Well, maybe this is also relevant: One of the trickiest parts of studies of populations is building the population. Even with 10,000 people on board, how do you make sure they’re representative of the wider world? Too many men, too many old people, too many anything and your results will suck. For example, because of Baseline’s use of “liquid biopsies” and other tests, subjects have to be able to get to Palo Alto, Durham, or Los Angeles. That’s a potential geographic bias. “Does the wearable itself change the behavior of the individual? There are some questions that we’ll answer by having subgroups within the 10,000. Looking at buckets of individuals allows us to reflect the US population,” Gambhir says. “But there will always likely be some bias because of the kinds of people that might sign up for this.”
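The “buckets of individuals” idea is easy to sanity-check numerically. A standard first pass is a goodness-of-fit test of the cohort’s demographics against target population shares; the category labels and counts below are invented for the sketch, not Baseline’s actual enrollment:

```python
# Toy demographic-balance check for a 10,000-person cohort.
# Target shares and sample counts are hypothetical.
target_shares = {"18-34": 0.30, "35-54": 0.33, "55+": 0.37}
sample_counts = {"18-34": 4200, "35-54": 3300, "55+": 2500}

n = sum(sample_counts.values())

# Chi-square goodness-of-fit statistic: how far the observed counts
# sit from what the target shares predict. Big number = biased cohort.
chi_sq = sum(
    (sample_counts[g] - n * p) ** 2 / (n * p)
    for g, p in target_shares.items()
)

for g, p in target_shares.items():
    print(f"{g}: sample {sample_counts[g] / n:.2f} vs target {p:.2f}")
print(f"chi-square = {chi_sq:.1f}")
```

Here the younger bucket is badly overrepresented—exactly the “kinds of people that might sign up for this” problem Gambhir is describing—and the study’s fix is the same in spirit: compare subgroup counts against population targets and recruit to close the gaps.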
That right there is fascinating. Google is taking great care to have a sample that’ll yield statistically significant results. It’ll always, by definition, be a sample of people who use Google. And that’s … almost everyone. “You don’t need to be a part of their study group for them to know a lot of stuff about you. As soon as they get the links from the guy down the street who is a volunteer, they have the links for you,” Powles says. “They already have half of it, 10 years of internet history, searches and anxieties and everything you look up about health matters. You connect that to a significant population sample study that combines genetic and medical data? Shit.”
So, you still want to get on board? Of course you do. Because you aren’t just the target for study. You’re also the audience. If you are the kind of person who wants to put on the watch, sleep on the coil, and transmit the data, you’re also the kind of person whose health status Baseline can eventually improve. You’re the subject and the object—customer and product, all at once.