It seemed a bit strange to me to write about the bias in English when I have also been aware of the linguistic diversity of the Internet for some time. I left that out because I was not up to date on the latest data regarding languages and the people connecting to the Internet. As luck would have it, I just found it here in the form of a spreadsheet, updated as of this month of this year.
It shows promise. We went from 64% of humans connected to 67% in one year. More languages from the continent of Africa are represented. Information like this reveals an implicit bias that most people are not aware of – the invisible 33%.
Our framing on the Internet tends to neglect them. We have a tendency to believe that everyone is connected. We’re not.
What’s more, that simple bit of information also demonstrates that training a large language model or an AI that leaves 33% of humanity out should give us pause. It won’t, but it should. 33% of humanity can’t access the Internet, so their cultures and languages aren’t represented.
But technology waits for no one, because tech companies wait for no one, because they need us to keep buying technology.