Hi Favohem,
I don't have any personal experience working with social data. You may be able to start with some of the methods that look for bias in numerical and categorical data, such as histogram analysis and screening for features that are strongly correlated with protected-class attributes. The famous example is that, due to historical segregation and redlining, ZIP codes were highly correlated with race.
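To make that concrete, here is a rough sketch of the kind of proxy-feature screen I mean, using Cramér's V to measure association between each categorical feature and a protected attribute. The column names and toy data below are made up purely for illustration:

import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

# Toy data: make zip_code track race closely (the redlining effect).
rng = np.random.default_rng(0)
n = 1000
race = rng.choice(["A", "B"], size=n)
zip_code = np.where(rng.random(n) < 0.9, race, rng.choice(["A", "B"], size=n))
education = rng.choice(["HS", "BA", "MS"], size=n)
df = pd.DataFrame({"race": race, "zip_code": zip_code, "education": education})

def cramers_v(x, y):
    # Cramér's V: strength of association between two categorical columns (0..1).
    table = pd.crosstab(x, y)
    chi2, _, _, _ = chi2_contingency(table)
    r, k = table.shape
    return float(np.sqrt((chi2 / table.to_numpy().sum()) / min(r - 1, k - 1)))

# Features strongly associated with the protected attribute may act as proxies.
for col in ["zip_code", "education"]:
    print(f"{col}: V = {cramers_v(df[col], df['race']):.2f}")

On this toy data zip_code should score high (around 0.9) while education stays near zero, which is the pattern to watch for when hunting for proxy variables.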
I would recommend the book Weapons of Math Destruction by Cathy O'Neil. She covers a number of algorithms and tools developed over the years that have had negative impacts due to various issues, and she suggests ways to avoid the problems she identifies, though I don't recall specific technical solutions.
Additionally, you might want to explore other works that complement Weapons of Math Destruction, such as Automating Inequality by Virginia Eubanks and Race After Technology by Ruha Benjamin. Both books address similar themes of inequality and the impact of technology on marginalized communities, and they offer valuable insights for understanding how social bias can be built into technological systems.
------------------------------
Ian Kerman
Data Science and AI SIG Co-Chair, SLAS
Data Science & AI Solutions Architect, Certara
------------------------------
Original Message:
Sent: 01-29-2025 04:42 AM
From: Favohem Ahaks
Subject: Exploring Bias in AI Models
Hello,
As data science and AI technologies continue to evolve, one pressing concern is the inherent bias in AI models, which can significantly influence decision-making, especially in sensitive areas like hiring, healthcare, and criminal justice. Despite advances in machine learning, bias in training data often leads to skewed predictions that can perpetuate societal inequalities. This issue is especially critical for industries that rely on data-driven decisions, where fairness and transparency are paramount.
I've been diving into this problem and exploring techniques such as re-weighting the data, using fairness-aware algorithms, and applying post-processing adjustments to correct for these biases. However, I've found it challenging to evaluate the effectiveness of these methods across different datasets and domains. Is there a standardized approach or metric that the community uses to evaluate bias and fairness in AI models? I have checked the Data Science and AI Topical Interest Group (TIG) MongoDB guide for reference.
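For concreteness, here is a minimal sketch of the kind of metric I mean, demographic parity difference (the gap in positive-prediction rates between groups); the arrays are toy stand-ins for real model outputs:

import numpy as np

def demographic_parity_difference(y_pred, group):
    # Max gap in positive-prediction rate across groups (0 = perfect parity).
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return float(max(rates) - min(rates))

y_pred = np.array([1, 0, 1, 1, 0, 0, 1, 0])          # model decisions, e.g. hire = 1
group = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])
print(demographic_parity_difference(y_pred, group))   # 0.75 - 0.25 = 0.50

But a metric like this can disagree with, say, equalized odds on the same model, which is exactly why I am asking whether the community has settled on a standard.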
It would be great to hear about others' experiences in mitigating AI bias. Have any of you had success with particular strategies, tools, or frameworks that help monitor and reduce bias during the development of AI systems? Looking forward to your thoughts and recommendations.
Thank you!
------------------------------
Favohem Ahaks
Walmart
California CA
------------------------------