UC San Diego: Pre-med Student Melds Matrix and Medicine
Gordon Ye, a second-year pre-med student at the University of California San Diego. Ye uses computational techniques in research aimed at better understanding medicine and biology.
Nineteen-year-old Gordon Ye is a second-year pre-med student at the University of California San Diego, with studies focused on computer science and biomedical computation; however, that last part wasn’t something he figured he would pursue as a career.
“I was mostly interested in computer science and technology when I was younger. With my first laptop, I swapped out the hard drive and it was fascinating to me how something so simple and seamless on the outside could be so complex on the inside; there’s this juxtaposition of chaos and order,” said Ye. “Every component had a purpose, with everything in a specific place for a specific reason.”
It wasn’t until his senior year of high school that Ye applied his fascination with computers to medicine and realized how the two aren’t so different.
“Humans and computers are related,” said Ye. “Everything inside of our bodies, underneath the skin, serves a purpose, just like the components of a computer; however, humans have the unique ability, unlike a machine, to heal themselves. A computer will always need some sort of intervention to address breakdowns.”
Now, Ye is interested in using computers to help the body heal when it cannot do so alone.
During his first year of college, Ye contacted Dr. Sidney Zisook, Distinguished Professor of Psychiatry and director of the Department of Psychiatry Residency Training Program at UC San Diego, and Judy Davidson, a research scientist at UC San Diego.
“I wanted to use computational techniques to better understand medicine and biology as it could provide greater implications than working one on one with each person,” said Ye. “I reached out to Dr. Zisook and Dr. Davidson because I wanted to help with their research on depression and suicide prevention in nurses and physicians.”
Word clouds from an analysis of nurse suicide cases. These word clouds depict six “themes” or categories that the natural language processing analysis identified in the death investigation narratives.
As part of his work, Ye applied natural language processing (NLP) to the research team’s dataset, which included qualitative information on patients such as lifestyle choices, home life and work. NLP is a subfield of artificial intelligence that equips computers to process human language and build language models that find patterns in it.
“Gordon transforms qualitative information that is usually ignored because it’s hard to organize and analyze,” said Zisook. “In our research looking at suicide risk factors for physicians, Gordon was able to confirm findings that female physicians are at a higher risk for suicide than other non-physician females and that the additional risk was in part related to work stress and mental health problems. Without the help of Gordon, we might not have found that information so easily.”
Zisook added that attending to such known risks by providing a more supportive work environment and making mental health care more readily available could go a long way toward preventing future suicides.
The NLP techniques Ye uses work by leveraging frequently appearing words or phrases, such as “history of depression,” “family,” “work,” “attempt,” and “alcohol,” to discover hidden topics or themes in large sets of death investigation narratives. From there, researchers look over the topics to see if they also make sense from a clinical standpoint and then develop hypotheses based on the data.
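The first step described above, surfacing words and phrases that recur across narratives, can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the article does not name the team's algorithm or tools, the three sentences are invented placeholders rather than real narratives, and the names `STOPWORDS`, `NARRATIVES`, `tokenize` and `document_frequency` are hypothetical. A full topic model would go further and group co-occurring terms into themes.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real pipeline would use a much fuller one.
STOPWORDS = {"a", "an", "and", "at", "in", "of", "the", "to", "with"}

# Invented placeholder sentences, NOT real investigation narratives.
NARRATIVES = [
    "History of depression and stress at work.",
    "Family reported stress at work.",
    "History of depression and alcohol use.",
]


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def document_frequency(narratives, max_n=3):
    """Count how many narratives contain each word or phrase (1 to max_n words).

    Phrases that start or end with a stopword are skipped, so
    "history of depression" is kept but "of depression" is not.
    """
    counts = Counter()
    for text in narratives:
        tokens = tokenize(text)
        seen = set()  # each phrase counts once per narrative
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                seen.add(" ".join(gram))
        counts.update(seen)
    return counts


if __name__ == "__main__":
    counts = document_frequency(NARRATIVES)
    # Show words and phrases that appear in more than one narrative.
    for phrase, freq in sorted(counts.items()):
        if freq > 1:
            print(f"{freq}  {phrase}")
```

Phrases shared across narratives, such as “history of depression” in this toy input, are the kind of raw material a topic model would then cluster into themes for researchers to review.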
“We use NLP and a thorough manual review process to determine what was found, and if it confirms our research or sheds light on things that we should look into further,” said Ye. “For psychiatric research focused on suicide prevention, this is especially important because every person’s case is different. Everyone has unique life experiences and it’s these life experiences combined with our biology that can contribute to heightened risk or even protect us from developing them.”
According to Ye, psychiatric and mental health research focused only on genetics and genomics gives an incomplete picture because it doesn’t take into account the life of the individual, including their experiences and stressors. On the other hand, research focused solely on life experiences while ignoring the biology is also one-sided. Ye believes the two approaches complement each other and that looking at both is essential for a future of precision medicine in psychiatry.
“I wanted to combine everything from our genes to our emotions to better understand what is going on,” said Ye. “If you don’t look at this entire picture, it’s incomplete and leads to treatment or drugs and procedures that may not be as effective or applicable to each person.”
Since his work with Zisook and Davidson, Ye has gone on to help with three other published research studies utilizing statistical and natural language processing techniques. In February 2021, Ye was first author of the paper “Physician Death by Suicide in the United States: 2012-2016,” published in the Journal of Psychiatric Research.
“One day, I got a phone call out of the blue from a student. He said my name is Gordon Ye, I’m interested in your work on suicide prevention and I’d like to work for you,” said Davidson. “And from day one, he just devoured the work. With his help, we were able to take qualitative paragraphs from medical examiners and police reports—data that had never been analyzed before—to support our research.
“I was amazed at the innate ability he had as a first-year student to approach these datasets that were so large they could crash a normal computer. I cannot wait to see what he will do in the future. I believe Gordon will change the field of medicine.”
Ye still has a couple of years before he graduates from UC San Diego. Afterward, he plans to pursue a joint M.D./Ph.D. degree and work at the intersection of biomedical informatics and psychiatry as a professor or researcher.
A map depicting the six themes identified from natural language processing analysis. This map helps the team visually see how different the themes are, and explore words or phrases that are unique to each one.
“I hope to be involved in research that has societal benefits,” he said. “It’s not just about training high-performing machine learning models in tightly controlled lab settings. I want to use data in a way that will help people in the real world.
“Right now, there’s a disconnect between our scientific community and the general population; science has become inaccessible in some ways and I want to help bridge that gap to ensure science and medicine are used to serve all individuals and never in a way that reinforces systemic health inequities.”