Data science and genetics are closely linked and have been for some time. But now, data science is playing an even larger role in genetics, a trend that is prompting researchers to look hard at their ethical responsibilities, says Chiara Sabatti, a professor of biomedical data science and statistics at Stanford University.
As is the case in many other fields, geneticists have access to much more data than in the past, and because it is digitized, it can be mined. “Scientists rely on statisticians to mine this data and help them formulate hypotheses,” Sabatti said during an interview recorded for this year’s Women in Data Science podcast at Stanford. Understanding and interpreting this data correctly will become increasingly important for the public good as the tension between accessibility and privacy continues to grow, she noted.
Because there is such a wealth of data, a single dataset can suggest thousands of hypotheses, far more than researchers could realistically explore. Data scientists need to determine which of the hypotheses drawn from the data are worth pursuing, says Sabatti. And that means developing new tools “to be able to confidently say to the scientist, ‘these are the hypotheses that you should follow up.’”
Sabatti voiced her concerns about the public’s confidence in science. “I am really worried that as scientists we contribute to this by putting forward results that are not as solid as they should be,” she says. “The idea that data speaks by itself is an illusion. It's very important for us to find a way to communicate to the general public what are the challenges of the data analysis.” This is particularly true in genetics, especially in light of increasing fascination with commercial DNA testing, says Sabatti. “I think the public is not aware of all the consequences of putting their data, genetic or not, online and available for mining. I think it's up to us as scientists to try to communicate clearly what it is that we can do with this data and what are the opportunities that come from data sharing,” she says.
Beyond genetics, Sabatti cited the need for “algorithmic fairness,” a new concept that seeks to eliminate biases and contribute to a more equitable understanding of data. She is also hopeful about the next generation of statisticians.
“I actually look at this field in a very optimistic view. I am amazed by the intelligence and the knowledge of young people coming into it. I cannot keep up with my students or the students in other people's labs. There is a lot of energy, and there's going to be a lot of interesting knowledge that comes out of this investigation,” she says.