Trust is the biggest obstacle to making AI work in healthcare – and we need to tackle it now

Mike David Smith
Jan 16, 2019


Medicine is at an historic crossroads. Artificial Intelligence (AI) is already revolutionising industries in ways that are making the health sector look lethargic. Yet there are valid ethical and patient-focused reasons that justify the unwillingness of doctors to embrace AI. How do we step into the future while respecting the traditions of the past that have made medicine the most trusted profession?

I am a GP and every day I think of novel ways we could be using AI in healthcare to make me better at my job and deliver better care. However, each time I come up against the same hurdles when contemplating how to turn these daydreams into reality.


Most recently, I have been thinking about cancer. Cancer scares us all. Its insidious nature, which makes it hard to find until too late, and its seeming ability to strike anyone at any time make it every doctor’s nemesis. But actually we know it isn’t entirely random. Cancer, for the most part, strikes predictably in response to risk factors: age, smoking, obesity, pollution, genetics, sun exposure…the list goes on. The challenge is finding a way to collate all of those risk factors in a meaningful way, and then linking them to a patient’s current symptoms to decide whether this is or is not cancer.

The role of a GP is to try to navigate that challenge, and on the whole we aren’t bad at it. But we could be better. In a 10-minute consultation we have to digest the existing medical record for a patient, collect information from the patient about their symptoms, examine them and then assimilate that information into a decision – which then needs to be communicated back! Not easy.

The hardest part is often the first step – digesting a medical record. For a typical 70-year-old, the medical record will run to hundreds of pages. A lot of the really important stuff is flagged so it can be read quickly – smoking history, past diagnoses of cancer or heart disease, etc. – but lots of really important but less dramatic data points are buried within hospital clinic letters (in PDF format) from 10 years prior. If you are lucky, the doctor you see has been around for a while and knows you well, so will remember those esoteric facts about you. These days, though, it is more common to see a different doctor every time, so that memory is lost.

Big Data and AI

Artificial Intelligence is the natural solution to this problem. Modern Natural Language Processing (NLP) algorithms can digest “free text” information into a form a computer can understand. In milliseconds, a powerful computer could read every piece of information about you and know your medical history to a depth no doctor ever could, then stand ready to help out with that data.

Tools like QCancer have already been produced that take coded data (numbers and important diagnoses mostly) about a patient and use that to estimate their risk of having certain cancers. However, the utility of this tool is hindered by the fact that it works on only part of the picture. It only sees a tiny fraction of your medical record. Most of the data in your record was entered as “free text” that is unreadable to this tool. That means that any result we get from QCancer is dependent on both the quality and the consistency of coding, which are often poor and variable.

A recent paper demonstrated that when NLP was used to analyse the full medical record, the predictive value of haematuria (blood in the urine), a common “red flag” symptom, dropped markedly compared with coded data alone. The author hypothesised that this is because when a GP sees a patient who they think has cancer, say an 80-year-old man with blood in his urine, they will code the term “haematuria”, whereas if they see a 22-year-old woman with a urinary tract infection who has blood in her urine, they will code the term “UTI”. An algorithm that sees only the coded data concludes that haematuria often means cancer, when the reality is that haematuria often does not mean cancer!
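To make the coding effect concrete, here is a minimal sketch in Python. Every number in it is invented purely for illustration and none comes from the paper mentioned above; the `Group` class and the `ppv` helper are hypothetical names of my own. It compares the predictive value of haematuria when only coded entries are counted with the value when free-text mentions are included too.

```python
from dataclasses import dataclass


@dataclass
class Group:
    label: str
    patients: int            # patients presenting with blood in their urine
    with_cancer: int         # of those, how many turned out to have cancer
    coded_haematuria: bool   # True if the GP coded "haematuria" rather than, say, "UTI"


def ppv(groups):
    """Share of patients with the symptom who turned out to have cancer."""
    total = sum(g.patients for g in groups)
    cancers = sum(g.with_cancer for g in groups)
    return cancers / total


# Invented counts, chosen only to show the direction of the effect.
groups = [
    Group("cancer suspected, coded as haematuria", 100, 10, True),
    Group("coded as UTI, blood mentioned only in free text", 900, 1, False),
]

# What a tool sees if it can only read coded data.
coded_only = ppv([g for g in groups if g.coded_haematuria])

# What a tool sees if it can also read the free text.
all_records = ppv(groups)

print(f"Coded data only: {coded_only:.1%}")
print(f"Coded plus free text: {all_records:.1%}")
```

With these made-up figures the coded-only estimate is roughly nine times higher than the estimate from the whole record, simply because GPs choose the code after they have already formed a suspicion.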

Using NLP to analyse medical records to develop decision support tools to risk profile patients for cancer could offer a step change in cancer diagnostics.

Technically speaking, the system could be developed in a matter of weeks. If a big data firm were handed all of the UK general practice records, they could quickly develop algorithms to process records and then use machine learning to crunch the data to produce risk profiles that could then be applied to medical records on-the-fly in clinical practice. Sadly, it’s not that simple.

QCancer was developed using the General Practice Research Database. This database contains the coded medical records of 4.8 million UK patients. Using the database, researchers can pick a specific diagnosis of interest – say kidney cancer – then look at the records of all the patients with that diagnosis, and at what symptoms and parameters those patients had in common in the run-up to their diagnosis. Comparing that to matched patients – those without cancer of a similar age, height, weight, etc. – we should be able to say that if you turn up to your GP with blood in your urine, age 80, without any history of kidney stones, that increases your chance of having cancer substantially.
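The comparison described above can be sketched as a simple odds ratio. This is only an illustration of the general idea of comparing patients with a diagnosis against matched patients without it, not the actual QCancer methodology; the counts are invented and `odds_ratio` is a hypothetical helper.

```python
def odds_ratio(cases_with, cases_without, controls_with, controls_without):
    """How much more common a symptom was among patients later diagnosed
    (cases) than among matched patients without the diagnosis (controls)."""
    odds_in_cases = cases_with / cases_without
    odds_in_controls = controls_with / controls_without
    return odds_in_cases / odds_in_controls


# Invented counts: 100 patients with kidney cancer and 100 matched patients
# without it, split by whether haematuria was recorded before diagnosis.
haematuria_or = odds_ratio(
    cases_with=40, cases_without=60,
    controls_with=5, controls_without=95,
)

print(f"Odds ratio for haematuria: {haematuria_or:.1f}")
```

An odds ratio well above 1 says the symptom was recorded far more often in the patients who went on to be diagnosed, which is the kind of signal a risk tool is built from.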

Sensitive data

Crucially, because only coded data is included, it would be quite hard for your record to be identified were that data to be leaked. Not impossible of course, but knowing that you are a 5’10’’ man, who is 47, who has diabetes and high blood pressure diagnosed in 2010 still doesn’t narrow it down enough to easily identify a person. Even if it did, with the exception of a few diagnoses – HIV being the most notable – the headline medical diagnoses you suffer with and the drugs you take aren’t all that sensitive.

To overcome the problem of using only coded data – and fully embrace AI and NLP – we would need the full medical records of those millions of patients. Those full records are an order of magnitude more sensitive. Take this example of a fictional, but not unlikely, consultation record:

Low Mood

History: Struggling with mood this past 6 months. Sleep poor. More snappy. All started after discovering wife had an affair with close friend. Have decided to work on marriage for sake of family but unable to move past this. Erectile dysfunction has emerged as a problem, libido generally low also. Has had fleeting thoughts of suicide – driving car off bridge – but would never act on account of kids (3 and 5). Stressful job – architect – but manager supportive, has had some time off…

Combining this single entry with the fact that the man is 38, lives in the SE2 postcode (needed for better risk profiling), and is 5’10’’, you can already start figuring out who he is. And when you find him, you know things about him (and his wife) that are deeply personal and highly sensitive.
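A small sketch shows why each extra detail matters. The six records below are entirely invented, and the field names and the `candidates` helper are hypothetical; the point is only to show how quickly the pool of people a leaked record could belong to shrinks as details are combined.

```python
# Entirely invented records, for illustration only.
records = [
    {"age": 38, "sex": "M", "height_cm": 178, "postcode": "SE2", "job": "architect"},
    {"age": 38, "sex": "M", "height_cm": 178, "postcode": "SE9", "job": "teacher"},
    {"age": 38, "sex": "M", "height_cm": 170, "postcode": "SE2", "job": "nurse"},
    {"age": 38, "sex": "F", "height_cm": 178, "postcode": "SE2", "job": "architect"},
    {"age": 47, "sex": "M", "height_cm": 178, "postcode": "SE2", "job": "driver"},
    {"age": 38, "sex": "M", "height_cm": 178, "postcode": "SE2", "job": "teacher"},
]


def candidates(records, **known):
    """Count the records that match every detail an attacker already knows."""
    return sum(
        all(record[field] == value for field, value in known.items())
        for record in records
    )


# Coded-style details only: age, sex and height (5'10'' is about 178 cm).
coded_style = candidates(records, age=38, sex="M", height_cm=178)

# Add the postcode needed for better risk profiling.
with_postcode = candidates(records, age=38, sex="M", height_cm=178, postcode="SE2")

# Add one detail that only appears in the free text of the consultation.
with_free_text = candidates(
    records, age=38, sex="M", height_cm=178, postcode="SE2", job="architect"
)

print(coded_style, with_postcode, with_free_text)
```

In this toy data the pool of possible matches falls from three people to two to one. Real populations are far larger, but free-text records also carry far more identifying detail than a single job title.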


Doctors have a duty of confidence to their patients. Not revealing the content of consultations to anyone beyond those who need to know is essential to establish trust. Without that trust, patients will be selective about what they confide in their doctor. Without that openness, making a diagnosis and providing effective care is much more difficult – ultimately harming patients.

In a world of data leaks and Cambridge Analytica, doctors have a duty to protect their patients from those kinds of harm, not least to prevent the secondary harm caused by loss of trust.

How then do we embrace technology – in this case in the form of AI applied to big data to improve cancer diagnosis – while still protecting patients and maintaining their trust?


The answer has to be consent. Not implied consent, whereby being registered with a GP means we share your data for research. Nor an Apple Terms & Conditions style of consent, where you sign a 100-page disclaimer of legalese every time you see a GP and that declaration is buried inside it. Patients need to understand what their data actually is (most patients have never seen their medical records); how we intend to use it – applied to detect cancer using algorithms rather than humans; what might go wrong – data leaking; and how that is being mitigated.

Projects such as the Great North Care Record are already underway developing models of consent that allow patients to control their data. These projects need to be funded heavily as they are integral to the future of healthcare.

Progress will be slow. Glacially slow by the standards of Silicon Valley. We cannot “move fast and break things”, because these “things” are people’s lives. However, I believe that most people would “donate” their data to medical research with the right assurances. Big tech and healthcare can then collaborate with full confidence to step up and fight cancer.



Mike David Smith

Doctor working in North East England with a keen interest in technology