Giada Pistilli
@giada.bsky.social
Principal Ethicist at Hugging Face • Philosophy Ph.D. at Sorbonne Université
442 followers · 118 following · 111 posts
Some key findings: beyond refusal rates, our experiments using CIVICS show that LLMs respond in diverse ways to sensitive topics, e.g., immigration, LGBTQI rights, and social welfare all triggered varied reactions.
We also encountered significant variation in cultural bias among different open-weight models. Refusals to respond to prompts on LGBTQI rights and immigration varied widely, suggesting that models from different cultural contexts differ in their sensitivity and ethical considerations.