Dora Zhao: auditing artificial intelligence for bias
by Sharon Adarlo, Center for Statistics and Machine Learning
Dora Zhao, 21, Class of 2021
Dora Zhao is majoring in computer science and is slated to earn two certificates: Statistics and Machine Learning from the Center for Statistics and Machine Learning (CSML) and Asian American Studies.
In her undergraduate research, Zhao has been delving into the intersection of racial and gender bias and artificial intelligence, work that has taken shape in two separate projects: one for her CSML independent project and another for her senior thesis.
In her CSML independent project, which she completed in her junior year, Zhao audited facial recognition software to determine if such systems were able to accurately gauge emotion in different demographics. Olga Russakovsky, assistant professor in the Computer Science Department, acted as her advisor.
“The use of artificial intelligence for decision-making, like choosing who to hire or determining recidivism rates, is becoming more commonplace. However, ubiquity does not mean that AI is a perfected technology,” said Zhao, explaining why she tackled this project.
Auditing applications is important, particularly when they are being used in surveillance of already marginalized communities, Zhao added.
She tested commercially available facial recognition software to measure how accurately it recognized emotion across different demographic groups.
She started the project by looking at data in AffectNet, a massive trove of more than a million facial images annotated for emotional expressions such as fear, disgust, sadness, anger and happiness. She augmented the dataset so that it would also note the race and gender of each face. She then used this augmented dataset to test various facial recognition software programs, all of which performed worst at classifying the emotions of darker-skinned males compared to other demographic groups. In general, the programs performed poorly on darker-skinned people, whether male or female.
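The core of such an audit is disaggregating a classifier's accuracy by demographic group rather than reporting a single overall number. A minimal sketch of that comparison is below; the data, group labels, and function name are illustrative stand-ins, not Zhao's actual code or the commercial systems' output:

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Compute emotion-classification accuracy disaggregated by
    demographic group. Each record is (group, true_label, predicted_label)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, true_label, predicted in records:
        total[group] += 1
        if predicted == true_label:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}

# Hypothetical toy predictions standing in for a real system's output
records = [
    ("lighter-skinned female", "happy", "happy"),
    ("lighter-skinned female", "sad", "sad"),
    ("darker-skinned male", "happy", "sad"),
    ("darker-skinned male", "angry", "angry"),
]
print(accuracy_by_group(records))
```

A gap between the per-group numbers, rather than the overall average, is what signals the kind of disparity Zhao's audit looked for.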
Her senior thesis (also with Russakovsky) tackles racial bias in image captioning techniques.
“I was motivated to work on this because of an infamous incident in 2015 when the Google Photos app tagged images of Black people as gorillas. That’s horrible because it’s a very damaging stereotype,” Zhao said. “And so that really drew me into examining image captioning.”
Existing work on biases in image captioning techniques has mainly focused on how gender biases are replicated and amplified, as well as on methods for mitigating those biases, said Zhao. For racial bias, she decided to first look at existing datasets, in particular Microsoft Common Objects in Context (MSCOCO), a dataset often used for image captioning. Zhao collected annotations of people's race and gender in MSCOCO's subset of images containing people, and will use these annotations to see if there are hidden biases within the dataset itself.
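One simple way such annotations can reveal hidden dataset bias is by tallying how demographic attributes are distributed across the annotated images. The sketch below is purely illustrative; the annotation schema and function name are assumptions, not the actual format Zhao used:

```python
from collections import Counter

def annotation_distribution(annotations, attribute):
    """Tally the relative frequency of each value of a demographic
    attribute across a set of per-image annotations (hypothetical schema)."""
    counts = Counter(a[attribute] for a in annotations if a.get(attribute))
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}

# Hypothetical annotations like those collected for a dataset's person images
annotations = [
    {"image_id": 1, "gender": "female"},
    {"image_id": 2, "gender": "male"},
    {"image_id": 3, "gender": "male"},
    {"image_id": 4, "gender": "male"},
]
print(annotation_distribution(annotations, "gender"))
```

A heavily skewed distribution in the training data is one mechanism by which captioning models can learn and amplify demographic biases.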
The CSML certificate program strengthened the skills she needed to study these bias issues in technology and deepened her understanding of them, Zhao said.
“I think it's been really great in helping me establish a foundation for tackling these questions in my research. I wouldn't have been able to do it without even having this basic understanding of statistics and machine learning, which underlies everything that I do,” she said.
After graduation, Zhao said she will either go to graduate school, with the eventual goal of entering academia, or become a product manager at Coinbase, a cryptocurrency firm.
Zhao is a member of the Cap and Gown Club and the Asian American Student Association.
Zhao enjoys cooking, baking pies, watching sitcoms and hanging out with her friends and family.