an AI resume screener had been trained on CVs of employees already at the firm, giving people extra marks if they listed “baseball” or “basketball” – hobbies that were linked to more successful staff, often men. Those who mentioned “softball” – typically women – were downgraded.
Marginalised groups often “fall through the cracks, because they have different hobbies, they went to different schools”
I don’t think you know how LLMs are trained then. It can become racist by mistake.
As an example, say there are 100,000 white people and 50,000 black people in a society. The statistics show that twice as many white people have been hired as black people. What does this tell you?
Obvious! There are also twice as many white people to begin with, so black and white people are hired at the same rate! But what does the AI see?
It sees twice as many white hires. And then it can lean towards doing the same.
You see how the data was in no way racist to begin with, but the outcome ends up that way as a consequence of something completely different.
TL;DR: People are still racist, but that’s not always why the AI is.
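To make the arithmetic in that example concrete, here is a minimal sketch comparing raw hire counts with per-group hire rates. The population sizes come from the example; the hire counts (10,000 and 5,000) and the function names are assumptions made up purely for illustration.

```python
# Populations are taken from the example in the comment above.
# The hire counts are hypothetical, chosen so both groups have the same rate.
population = {"white": 100_000, "black": 50_000}
hires = {"white": 10_000, "black": 5_000}


def hire_rates(hires, population):
    """Return hires per capita for each group."""
    return {group: hires[group] / population[group] for group in population}


def count_ratio(hires, a, b):
    """Return the ratio of raw hire counts between group a and group b."""
    return hires[a] / hires[b]


rates = hire_rates(hires, population)
raw_ratio = count_ratio(hires, "white", "black")
rate_ratio = rates["white"] / rates["black"]

# A model that only sees raw counts sees a skew towards one group...
print(f"raw count ratio: {raw_ratio:.2f}")
# ...while the per-capita rates show both groups hired equally often.
print(f"hire rates: {rates}")
print(f"rate ratio: {rate_ratio:.2f}")
```

The point is only that the raw count ratio and the rate ratio can disagree; which of the two a trained model effectively picks up depends on how the data is presented to it.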
The bias is really introduced at the design stage. Designers should be aware of demographic differences and incorporate that into the model to produce something more balanced. It’s far from impossible to design models that do not become biased in this way, even from biased data, although that is not to say it’s easy.
you are right, i don’t know how LLMs are trained, but ironically, this is a perfect example of a minority being privileged by a system, and racism is still very much involved.
an important assumption you have to consider: in your example, why did the AI know what race people are in the first place? it seems a small consideration but it’s so wildly significant.
the modern understanding of race was not present throughout all of history, and only arose in the 17th century. without getting into the weeds, the fact that your fictional AI can distinguish between whiteness and non-whiteness already means it was designed by someone who understands those structures, and let them slip into the AI itself.
a perfectly well-meaning and anti-racist designer would prevent the AI from even recognizing race at all costs, both directly by sanitizing training data to remove race from the inputs, and indirectly by noting correlations with other data (such as sports, in this article) and controlling for that.
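a rough sketch of what those two steps could look like, assuming a toy list of applicant records. everything here is hypothetical: the field names, the data, the `strip_protected` and `proxy_values` helpers, and the 0.9 threshold are all invented for illustration, and a real audit would need far more data and proper statistical tests.

```python
from collections import Counter

# Hypothetical toy records; field names and values are invented for illustration.
applicants = [
    {"group": "a", "hobby": "baseball", "years_experience": 5},
    {"group": "a", "hobby": "baseball", "years_experience": 3},
    {"group": "a", "hobby": "chess", "years_experience": 4},
    {"group": "b", "hobby": "softball", "years_experience": 5},
    {"group": "b", "hobby": "softball", "years_experience": 2},
    {"group": "b", "hobby": "chess", "years_experience": 6},
]

PROTECTED = "group"


def strip_protected(record, protected=PROTECTED):
    """Direct step: drop the protected attribute from a record."""
    return {key: value for key, value in record.items() if key != protected}


def proxy_values(records, feature, protected=PROTECTED, threshold=0.9):
    """Indirect step: flag feature values held almost entirely by one group.

    A value is flagged when at least `threshold` of the records carrying it
    belong to a single protected group, which makes it a likely proxy.
    """
    totals = Counter(r[feature] for r in records)
    by_group = Counter((r[feature], r[protected]) for r in records)
    flagged = set()
    for (value, _group), count in by_group.items():
        if count / totals[value] >= threshold:
            flagged.add(value)
    return flagged


# Rows the model would actually train on, with the protected field removed.
training_rows = [strip_protected(r) for r in applicants]

# Hobby values that still leak the protected attribute and need controlling for.
flagged = proxy_values(applicants, "hobby")
print(sorted(flagged))
```

note that finding the proxies still requires the protected attribute at audit time, even though it is stripped from what the model sees.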
I suppose it depends on how you define “by mistake”. Your example is an odd bit of narrowing the dataset, which I would certainly describe as an unintended error in the design. But the original is more pertinent: it wasn’t intended to be sexist (etc.). But since it was designed to mimic us, it also copied our bad decisions.
Oh there is so much racist data that the AI is being trained on.
Your example is a simple one. But there is also discrimination based on names, for instance, so Johns are hired more often than Quachins are, and that is done by people, before it ever gets to the AI.