Ana Marasović
@anamarasovic.bsky.social
Asst prof @ University of Utah · NLP, XAI · she/her 🇭🇷
177 followers · 75 following · 116 posts
In the TMLR paper arxiv.org/abs/2402.14897 arxiv.org/abs/2307.13702) with open-weights models; yay, but...
Chain-of-Thought Unfaithfulness as Disguised Accuracy
Understanding the extent to which Chain-of-Thought (CoT) generations align with a large language model's (LLM) internal computations is critical for deciding whether to trust an LLM's output. As a...
...after accounting for a model's bias toward certain answer choices, we show that Lanham et al. (2023)'s unfaithfulness metric drops significantly for smaller, less-capable models; so what?