The use of computer algorithms to differentiate patterns from noise in data is now commonplace due to advances in artificial intelligence (AI) research, open-source software such as scikit-learn, and large numbers of talented data scientists streaming into the field. There is no question that competency in computer science, statistics, and information technology can lead to a successful AI project with useful outcomes. However, this recipe for success is missing a piece that has important implications in some domains. It’s not enough to teach humans to think like AI. We need to teach AI to understand the value of humans.
Consider a recent peer-reviewed study from Google and several academic partners to predict health outcomes from the electronic health records (EHR) of tens of thousands of patients using deep learning neural networks. Google developed special data structures for processing data, had access to powerful high-performance computing, and deployed state-of-the-art AI algorithms for predicting outcomes such as whether a patient would be readmitted to the hospital following a procedure such as surgery. This was a data science tour de force.
Although Google’s top-level results in this study claimed to beat a standard logistic regression model, there was a meaningful distinction buried in the fine print. While Google beat a standard logistic regression model based on 28 variables, its own deep learning approach only tied a more detailed logistic regression model built from the same data set the AI had used. Deep learning, in other words, was not necessary for the performance improvement Google claimed. In this example, the AI did not meet expectations.
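The point is not that deep learning is useless, but that a carefully specified logistic regression is often a strong baseline. As a rough illustration of how little machinery such a baseline requires, the sketch below fits a logistic regression by plain gradient descent on a hypothetical toy dataset (invented for illustration; it has nothing to do with the study's EHR data):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression weights by per-example gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(w, b, xi):
    """Classify as 1 when the predicted probability reaches 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0

# Hypothetical toy data: two engineered features per "patient"
# (e.g. a severity score and a comorbidity count); label = readmitted or not.
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
y = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(X, y)
accuracy = sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(y)
```

The real work in such a model is in the feature engineering, which is exactly where domain expertise enters.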
The Limits of Deep Learning
So, what was missing from the Google study?
To answer this question, it is important to understand the healthcare domain and the strengths and limitations of patient data derived from electronic health records. Google’s approach was to harmonize all the data and feed it to a deep learning algorithm tasked with making sense of it. While technologically advanced, this approach purposefully ignored expert clinical knowledge which could have been useful to the AI. For example, income level and zip code are possible contributors to how someone will respond to a procedure. However, these factors may not be useful for clinical intervention because they can’t be changed.
Modeling the knowledge and semantic relationships between these factors could have informed the neural network architecture thus improving both the performance and the interpretability of the resulting predictive models.
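One lightweight way to encode this kind of clinical knowledge is as feature metadata that the modeling pipeline consults before training. The sketch below is a minimal illustration, with invented feature names, groups, and actionability flags (none of these come from the Google study):

```python
# Hypothetical clinician-curated metadata about EHR-derived features.
FEATURE_KNOWLEDGE = {
    "blood_pressure":   {"group": "vitals",        "actionable": True},
    "hba1c":            {"group": "labs",          "actionable": True},
    "medication_count": {"group": "treatment",     "actionable": True},
    "income_level":     {"group": "socioeconomic", "actionable": False},
    "zip_code":         {"group": "socioeconomic", "actionable": False},
}

def actionable_features(knowledge):
    """Keep only features a clinician could act on for intervention."""
    return sorted(name for name, meta in knowledge.items() if meta["actionable"])

def features_by_group(knowledge):
    """Group features semantically, e.g. to define one sub-network per group."""
    groups = {}
    for name, meta in knowledge.items():
        groups.setdefault(meta["group"], []).append(name)
    return groups
```

A pipeline could use the first function to restrict a model intended for clinical intervention to actionable inputs, and the second to shape a neural network architecture around semantically related feature groups.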
What was missing from the Google study was an acknowledgement of the value humans bring to AI. Google’s model would have performed more effectively if it had taken advantage of expert knowledge only human clinicians could provide. But what does taking advantage of human knowledge look like in this context?
Taking Advantage of the Human Side of AI
Human involvement with an AI project begins when a programmer or engineer formulates the question the AI is to address. Asking and answering questions is still a uniquely human activity and one that AI will not be able to master anytime soon. This is because question asking relies on depth and breadth of knowledge and on the ability to synthesize knowledge of different kinds. Further, question asking relies on creative thought and imagination. One must be able to imagine what is missing or what is wrong from what is known. This is very difficult for modern AIs to do.
Another area where humans are needed is knowledge engineering. This activity has been an important part of the AI field for decades and is focused on presenting the right domain-specific knowledge in the right format to the AI so that it doesn’t need to start from scratch when solving a problem. Knowledge is often derived from the scientific literature which is written, evaluated, and published by humans. Further, humans have an ability to synthesize knowledge which far exceeds what any computer algorithm can do.
One of the central goals of AI is to generate a model representing patterns in data which can be used for something practical like prediction of the behavior of a complex biological or physical system. Models are usually evaluated using objective computational or mathematical criteria such as execution time, prediction accuracy, or reproducibility. However, there are many subjective criteria which may be important to the human user of the AI. For example, a model relating genetic variation to disease risk might be more useful if it included genes with protein products amenable to drug development and targeting. This is a subjective criterion which may only be of interest to the person utilizing the AI.
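Such a subjective criterion can be applied as a simple post-processing step on a model's output. The sketch below re-ranks genes from a hypothetical risk model by whether their protein products are known drug targets; the gene names, importance scores, and target set are all invented for illustration:

```python
# Hypothetical output of a genetic risk model: (gene, importance) pairs,
# sorted by the model's own importance score.
model_importance = [("GENE_A", 0.91), ("GENE_B", 0.85),
                    ("GENE_C", 0.60), ("GENE_D", 0.40)]

# Hypothetical set of genes whose protein products are druggable targets.
druggable = {"GENE_B", "GENE_D"}

def prioritize(importance, targets):
    """Put druggable genes first, preserving the model's ranking within each tier."""
    hits = [pair for pair in importance if pair[0] in targets]
    misses = [pair for pair in importance if pair[0] not in targets]
    return hits + misses

ranked = prioritize(model_importance, druggable)
```

The objective metric (importance score) is untouched; the human-supplied criterion simply reorders the results to match what the user actually needs.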
Finally, the assessment of the utility, usefulness, or impact of a deployed AI model is a uniquely human activity. Is the model ethical and unbiased? What are the social and societal implications of the model? What are the unintended consequences of the model? Assessment of the broader impact of the model in practice is a uniquely human activity with very real implications for our own well-being.
While integrating humans more deliberately in AI applications is likely to improve the chances of success, it is important to keep in mind that this could also reduce harm. This is particularly true in the healthcare domain, where life and death decisions are increasingly being made based on AI models such as the ones that Google developed.
For example, the bias and fairness of AI models can lead to unforeseen consequences for people from disadvantaged or underrepresented backgrounds. This was pointed out in a recent study showing that an algorithm used for prioritizing patients for kidney transplants under-referred 33% of Black patients. This could have an enormous impact on the health of those patients on a national scale. This study, and others like it, have raised awareness of algorithmic biases.
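Auditing a deployed model for this kind of disparity can start with a very simple calculation: for each group, the fraction of patients who needed a referral but did not get one. A minimal sketch with invented counts (these numbers do not reproduce the cited study's data):

```python
def under_referral_rate(should_refer, referred):
    """Fraction of patients who needed referral but were not referred.

    should_refer, referred: parallel lists of 0/1 flags per patient.
    """
    missed = sum(1 for need, got in zip(should_refer, referred) if need and not got)
    needed = sum(should_refer)
    return missed / needed if needed else 0.0

# Hypothetical audit data for two patient groups.
group_a_rate = under_referral_rate([1, 1, 1, 1], [1, 1, 1, 0])  # 1 of 4 missed
group_b_rate = under_referral_rate([1, 1, 1, 1], [1, 0, 0, 1])  # 2 of 4 missed
disparity = group_b_rate - group_a_rate
```

A non-zero disparity is not proof of unfairness on its own, but it is the kind of group-wise check a human reviewer should demand before a model influences care decisions.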
As AI continues to become part of everything we do, it is important to remember that we, the users and potential beneficiaries, have a vital role to play in the data science process. This is important for improving the results of an AI implementation and for reducing harm. It is also important to communicate the role of humans to those hoping to get into the AI workforce.