Professors received a grant earlier this month for a summer program that will equip students from underrepresented backgrounds at GW and other local universities with data skills to combat health and climate-related disparities.
The program, a four-week bootcamp at GW led by statistics professors Huixia Judy Wang and Tatiyana Apanasovich, will teach data science students from GW, Trinity Washington University and the University of the District of Columbia how to compile, interpret and apply data to the fields of health equity and climate resilience. Apanasovich said the program, for which officials will announce applications on the Department of Statistics website this week, will help GW bolster diversity in the field of data science by collaborating with local universities that enroll many minority students.
“We have a lot of hands-on activities, plus interaction with experts in the area of public health and climate resilience,” Apanasovich said. “Specifically, we are interested in looking into vulnerable communities, racial and socioeconomic and how they respond to climate change and, again, specifically health disparities.”
The University of the District of Columbia is a historically Black public university in Northwest D.C. Trinity Washington University, located in Northeast D.C., is a women’s and predominantly Black and Hispanic university serving a majority of students from low-income households.
The program is funded by the 2024 Network Challenge of the Public Interest Technology University Network, a partnership of 59 institutions that fund public interest projects analyzing the social, legal and economic effects of technology. GW joined the cohort in 2020 and has since led four programs funded by PIT-UN grants, which focused on ethical applications of technology and coding, according to PIT-UN’s website.
Apanasovich and Wang said they will select 21 rising sophomores, juniors and seniors from the trio of partnering schools to attend the program. Eligible students must have taken at least one statistics course and express an interest in public interest technology in their application, and accepted students will receive a stipend, provided by the PIT-UN grant, that covers the cost of the program, Apanasovich said in an email.
Apanasovich said students will analyze climate-related and health disparity data from public databases, like those of the Centers for Disease Control and Prevention, and from other government-published sources. Students will also learn from GW data science professors and graduate students who are pursuing degrees in data science, and professors at Trinity, who are GW alumni, will mentor students in the program and help organize activities and lectures, she said.
Apanasovich said the University hopes to continue the bootcamp program in the future, but the current grant is just for this summer. She said if it is “successful,” they will apply for more funding for future years.
“We talked to our colleagues from these universities, and they want to establish data science programs,” Apanasovich said. “They want to train their students and give them data science skills, so I think that that collaboration can be really beneficial to them, to help them to establish their own data science programs.”
Wang said the program will prioritize recruiting students from diverse racial and socioeconomic backgrounds. She said they want to admit students from underrepresented groups because the bootcamp aims to address health disparities, particularly in communities of color and of lower economic status that are sometimes hit harder by health issues, like infectious diseases.
“We’re going to train students to just do that, and with a specific focus on health disparities because definitely it is evident from death rates you know rates of specific disorders to the mentioned asthma, strokes, that there is a difference due to the socioeconomic status, or race, geographical difference,” Apanasovich said. “Some of this also negatively affected by climate change, so the most like the short term effects of different fires. So California, Oregon and kind of moving towards us, East, they experience that quality of air.”
Wang said the analysis of health disparities can help policymakers decide how to allocate resources in the future and assess the impacts of climate change on people, and the bootcamp will equip students with the skills to do this work later in their careers.
Data scientists said the bootcamp’s inclusion of students from varying socioeconomic and racial backgrounds is beneficial because the field of data science currently lacks diversity in the workforce across racial and ethnic groups. The workforce in fields centered around technology is around 60 percent white and 52 percent male, which makes it less diverse than the overall private sector.
Improving equity in data science also involves examining the data that informs researchers’ models, which can affect the algorithmic biases of their results. For example, some data models have predicted that convicted criminals are more likely to reoffend based on factors like prior encounters with police, unemployment or being from an over-policed neighborhood, according to the International Institute of Analytics. Without consideration of socioeconomic and racial factors, like racist policing policies that target people of color, such data models recommend longer prison sentences for people of color, the institute reported.
James Spall, a professor of mathematics and statistics and the co-chair of the data science program at Johns Hopkins University, said technological advancements have broadened the scope of topics covered by data science over the last several decades, which has created a greater need for more people of diverse backgrounds in the field.
“I obviously see in my own experience with interacting with students that there is a tremendous need for a more diverse student base,” Spall said.
Prince Afriyie, the director of the residential master’s in data science at the University of Virginia, said the data science field “breaks the walls” of traditional academia, with collaborations between universities in data science bringing different perspectives together.
“We need people to be able to use data to tell stories without biases,” Afriyie said. “If I want to tell a story about a particular population, then I want to be able to show that. I want to be confident that I can also relate to that population, and I can do justice to the story. I can’t overemphasize how diverse backgrounds in data science are important.”
Ambuj Tewari, a professor of statistics and the director of the master’s in data science at the University of Michigan, said modern fields, like data science, are subject to social biases rooted in historical forces, like slavery and colonialism, because of the “garbage in, garbage out” principle. This principle suggests that in artificial intelligence and data science, poor or unnecessary data produces poor output, but Tewari said this sorting process can amplify social biases due to perceptions about the quality of data based on the communities it is collected from.
“Thankfully a lot of researchers, especially young researchers, are interested in ensuring that data science and AI tools are used with human values such as fairness and justice in mind,” Tewari said in an email. “Beyond that, we also need to ensure that the next generation of data science leaders come from all segments of society.”