Ryan Geiser, PhD student, Yusuf Hamied Department of Chemistry
30 November 2022
Accelerate spark data science residency
Just weeks before losing my grandfather to Alzheimer’s, I took a 23andMe genetic test that showed I had a slightly increased risk of developing the same debilitating disease, which affects 57.4 million people around the globe.
There is still no treatment or way to prevent Alzheimer’s, which causes memory loss, confusion, and problems with language and understanding that worsen over time until the symptoms rob individuals of their independence. Having been affected personally by the disease and having seen the challenges facing dementia patients at the hospital where I worked, I wanted to better understand the underlying causes of Alzheimer’s, the most common form of dementia, by contributing to research in the field.
Part of my work is focused on better understanding the folding of proteins associated with Alzheimer’s. According to a leading hypothesis, the disease is caused by the abnormal misfolding or the build-up of proteins like amyloid and tau in and around brain cells, which leads to a decrease in neurotransmitters that, over time, causes different areas of the brain to shrink, with devastating results.
It can be helpful to compare folding proteins to origami. The paper must be folded into a particular structure to make a specific origami shape. In a cell, proteins are supposed to fold in a specific way, so each protein can carry out a certain function, with sticky spots within the protein holding the structure in place. Alzheimer’s disease develops after some proteins in the brain fold incorrectly, leaving sticky spots exposed on the outside of the protein. These toxic species can damage cell membranes and create a cascade that recruits other proteins to clump together, causing a build-up that stops nutrients from reaching neurons. Without these nutrients, brain cells are destroyed.
I joined Cambridge’s Centre for Misfolding Diseases (CMD) in 2017, and my research focuses on finding existing medications that go beyond just breaking up the clumps or stopping them from forming in the first place. This focus follows two recent developments in the field. First, preventing the clumps from forming is difficult because we don’t know what causes the misfolding process to begin, and research shows it might start decades before symptoms appear. Second, recent work shows that even when the clumps are dispersed, symptoms are not alleviated.
I use population-based longitudinal study data collected over several decades at the Cambridge Department of Public Health and Primary Care to identify drugs that might be able to slow or stop the progression of Alzheimer’s disease. The rich data includes information such as what drugs patients were taking, for how long, how their brains looked, and what their lifestyle was like, so I needed a way to sort through it.
I signed up for the Accelerate Programme for Scientific Discovery to learn how to use AI to organise, search through, and analyse this vast amount of data. I learned essential skills, such as how to clean data, which allowed me to expand and redefine my project. This was particularly helpful during Covid, as it permitted my team to shift our focus to computational work. I think the pandemic has accelerated the move to AI and other computational analysis methods in the lab, so I was lucky to learn the essential skillsets to do this work.
These data analysis methods have helped us identify four calcium-channel blocker drugs that may have the potential to be repurposed for treating dementia. My team is analysing these drugs further in the lab, not least because one lesson I learned in the Accelerate Programme is to fact-check what the AI suggests. While there are no conclusive results yet, I hope this work could one day lead to repurposed drugs to treat Alzheimer’s and related diseases.
Being part of the team that comes up with a treatment for these diseases is the dream, but the programme has also opened my eyes to using AI to tackle other massive datasets. We’re living in a world that generates an enormous amount of data. Wearable devices are collecting data from patients, and DeepMind, a subsidiary of Alphabet, has just released the AlphaFold Protein Structure Database to share predicted structures for nearly all catalogued proteins known to science with researchers. The programme has given me the confidence to draw upon such massive datasets, which, with the proper techniques, have the potential to dramatically increase our understanding of biology.
I am set to complete my PhD later this year, and while I am unsure of what comes next, I plan on using AI. I’ll have to see what unfolds!
Ryan took part in our Data Science Residency in 2021. You can find details about the course here. Please get in touch by emailing accelerate-science@cst.cam.ac.uk if you are interested in attending a future Residency.