ResearchKit is Awesome – But is More Data Always Better?


Let me start by saying that I think Apple’s ResearchKit is great, and can enable us to do great things. However, a lot of the hype I’m hearing has me a little concerned, because it mainly centers on the idea that getting more data is the key to transforming the Healthcare Industry and the lives of patients.

As a person who does a lot of work in the Healthcare Industry, I see lots of data from clinical studies that is intended to inform us of how best to help patients. The traditional clinical study is designed to inform medical researchers of the efficacy of new drugs or therapy alternatives. To that end, it makes sense that traditional protocols work well to meet traditional needs. Getting access to patients for current research is no small thing, and ResearchKit is already doing a great job of that, as evidenced here.

The issue I have is that with the increased blurring of the lines between Healthcare and Lifestyle products, traditional research protocols can lead us to collect data that is less relevant to the decisions that need to be made. Making data relevant to those decisions is the type of transformation I’m talking about, and to that end, more of the current type of data is not the answer.

Claro Partners did some excellent work in defining the Personal Data Economy. One of the most useful frameworks they developed as a result of their work was a ladder that describes the hierarchy of value of data.

Data Ladder

The ladder shows that the real value of data lies in interpreting it to draw meaningful conclusions. As you work your way up the ladder, unstructured data is organized into useful information that gives the data context. When those informational contexts are organized in such a way that they become meaningful knowledge, we can take action. Finally, we can apply our intelligence to predict likely outcomes based on the same original data.

Most clinical studies are designed as a framework around a hypothesized outcome. The study parameters provide the structure for the actionable meaning, and the statistical analyses chosen will put the data into the right context to support or refute the hypothesized outcome. Data sources are selected based on their fit with the framework. In that case, more data is usually better.

“More data is not better if what you need is a better way to make sense of the data.”

However, when we are developing products and services that need to best fit into a patient’s lifestyle, the existing clinical frameworks are less useful. In this case, data sources are selected because we want to understand the behaviors that drive the actions; they usually do not fit into a known framework. The context and meaning are constructed to fit the data presented. Rather than supporting or refuting a hypothesized outcome, the data is the foundational step in the discovery of a new outcome. Traditional clinical protocols aren’t designed to do that. In that setting, more data is of less value than tools that can discern patterns in unfamiliar data.

In theory this all makes sense. In practice, however, I often see very different behavior driving research and decision-making. The key to a truly game-changing application of ResearchKit will be twofold. The first part will be to create tools to find new patterns in the data collected. The second (and in my opinion more powerful) will be to enable quantitative analysis of information that can currently only be interpreted via qualitative means.

Achieving those two goals will enable the ResearchKit platform to help us truly change the game: transforming Big Data into Big Intelligence.