Come here, y'all, and let me tell you how this gets so much worse. So, I think it was either in 2017 or 2018 that a bunch of scholars demonstrated that BERTs and GPTs deployed in automated moderation labeled tweets from Black users as "rude" 50% more often than tweets from white users.
For instance, I received a three-day ban instead of a total ban on Reddit for posting the following after the first assassination attempt: “I was extremely disappointed to hear that the person who attempted to assassinate the 🍊🤡 failed in his endeavour.”
I think the trick here is to blast them so eloquently that BSky can’t place a rude label on you. Replies like that are what I constantly strive for.
The authors of the study concluded that the nature of the LLMs (and this is pre-Stochastic Parrots) meant that they could not understand the context of the speech used and were consequently making moderation decisions on the basis of what we in philosophy would call "received norms" of conduct.