SonyAI’s Post

🔍 What Does It Take to Curate Fair Datasets?

At NeurIPS 2024, our team at Sony AI introduced A Taxonomy of Challenges to Curating Fair Datasets, a paper that dives deep into the complexities of fairness in machine learning data and is part of a growing area of prestigious work by the Ethics Team here at Sony AI.

As ML systems touch more areas of our lives, from healthcare to finance to criminal justice, the need for fair, equitable datasets has never been greater. Fairness in datasets is a complex, multi-dimensional goal that often struggles to move beyond theory. To address this, our researchers interviewed dataset curators to understand the practical obstacles they face, including resource constraints, biases in taxonomy, and ethical challenges in data sourcing.

What were the study’s findings? A detailed taxonomy that categorizes challenges across three core dimensions:
∙ Composition: Capturing a broad range of perspectives and experiences.
∙ Process: Ensuring ethical practices in data annotation and transparency.
∙ Release: Providing clear documentation for responsible dataset use.

Building fair datasets requires systemic change, not just individual efforts.

📖 Read our blog to dive deeper into the insights: https://bit.ly/4ilH7Uq

Wilhelmina Ndapewa Onyothi N.

Software | Data | Innovation | Sustainability

1mo

A hearty shoutout to the SonyAI team for some fire outputs. I have two concerns:

1. Recommendations for Enabling Fair Dataset Curation
* The minimum wage varies across SSA countries and is significantly lower than in the Global North, with the exception of a handful of African countries where it is either comparable to the Global North or varies by industry.
* Minimum wages in Northern countries are complemented by a level of social security that is either non-existent in the South or not on the scale of the North.
* Therefore, a minimum wage is not a liveable wage, and if data quality is to be ensured (where curation and annotation work is directed to the South), we must compensate accordingly and move towards liveable wages, perhaps by revising our labour laws for this new industry.

2. Discussions and Conclusions
* There are individuals at the grassroots level across the Global South actively curating datasets and pioneering ML models for their societies. They are often very resource constrained (financially, computationally), but they persevere nonetheless. It would be interesting to see how their views contribute to your future work. That would be a really fair starting point 🙂
