Data science and genetics are closely linked and have been for some time. But now, data science is playing an even larger role in genetics, a trend that is prompting researchers to look hard at their ethical responsibilities, says Chiara Sabatti, a professor of biomedical data science and statistics at Stanford University.
As in many other fields, geneticists have access to far more data than in the past, and because it is digitized, it can be mined. “Scientists rely on statisticians to mine this data and help them formulate hypotheses,” Sabatti said during an interview recorded for this year’s Women in Data Science podcast at Stanford. Interpreting this data correctly will become increasingly important for the public good as the relationship between accessibility and privacy continues to evolve, she noted.
With such a wealth of data, a single study can suggest thousands of hypotheses, far more than researchers could realistically explore. Data scientists need to determine which of the hypotheses drawn from the data are worth pursuing, says Sabatti. And that means developing new tools “to be able to confidently say to the scientist, ‘these are the hypotheses that you should follow up.’”
Sabatti voiced her concerns about the public’s confidence in science. “I am really worried that as scientists we contribute to this by putting forward results that are not as solid as they should be,” she says. “The idea that data speaks by itself is an illusion. It's very important for us to find a way to communicate to the general public what are the challenges of the data analysis.” This is particularly true in genetics, especially in light of increasing fascination with commercial DNA testing, says Sabatti. “I think the public is not aware of all the consequences of putting their data, genetic or not, online and available for mining. I think it's up to us as scientists to try to communicate clearly what it is that we can do with this data and what are the opportunities that come from data sharing,” she says.
Beyond genetics, Sabatti cited the need for “algorithmic fairness,” a new concept that seeks to eliminate biases and contribute to a more equitable understanding of data. She is also hopeful about the next generation of statisticians.
“I actually look at this field in a very optimistic view. I am amazed by the intelligence and the knowledge of young people coming into it. I cannot keep up with my students or the students in other people's labs. There is a lot of energy, and there's going to be a lot of interesting knowledge that comes out of this investigation,” she says.
Fatima Abu Salem | Applying data science for the public good in Lebanon
Louvere Walker-Hannon | Gaining skills and overcoming barriers to a career in data science
Menglin Cao | Data science in fintech and financial services
Karen Hao | Covering AI and Ethics Washing in the Tech Industry
Cecilia Aragon | Aerobatic Pilot, Author and Data Scientist
Kristian Lum | Applying Statistics to Promote Fairness and Transparency
Lillian Carrasquillo | Using Human-Centric Data Science at Spotify
Femke Vossepoel | Applying Data Assimilation Tools to COVID Forecasting Models
Francesca Dominici + Rachel Nethery | Using Data Science to Study Air Pollution Effect on COVID-19 Outcomes
Manisha Desai | The Importance of Data Integrity in COVID-19 Clinical Trials
Newsha Ajami | Improving Urban Water Systems Through Data Science, Public Policy and Engineering
Andrea Gagliano | The Intersection of Arts and Technology
Ya Xu | Using Data To Create Economic Opportunities For All Members Of Global Workforce
Susan Athey | Bringing an Economist’s Perspective to Data Science
Montse Medina | Lessons Learned Building a Data Science Startup
Bonus Episode: Margot Gerritsen | How to Get More Women Into Data Science
Timnit Gebru | Advocating for Diversity, Inclusion and Ethics in AI
Christiane Kamdem + Lama Moussawi | WiDS Ambassadors Bring Education and Role Models to their Communities
Sherrie Wang | Applying Machine Learning to Solve Global Food Security Challenges
Marzyeh Ghassemi | Applying Machine Learning to Understand and Improve Health