Honestly, I'm surprised by how chill everyone is when I ask to opt out of facial recognition, which has been the default at SLC airport security and gates for at least a year
I left my apartment and was through security in 40 minutes yesterday! No issues with my flights either. Flew with Delta
🤖 #bskai
In the TMLR paper arxiv.org/abs/2402.14897, we replicate the 📉📈 relationship between model size and the final faithfulness measure in Lanham et al. (2023; arxiv.org/abs/2307.13702) with open-weights models; yay, but...
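For concreteness, here's a minimal sketch of that measure as I'd summarize it: how often the model lands on the same answer with and without its CoT (simplified; see both papers for the exact setup). Inputs are lists of parsed answer letters, and the toy values at the bottom are made up for illustration, not real model outputs.

```python
def unfaithfulness(answers_with_cot, answers_without_cot):
    """Lanham et al. (2023)-style score: the fraction of examples where the
    model's final answer is the SAME with and without its chain of thought.
    A high value suggests the CoT played no causal role (post-hoc reasoning)."""
    assert len(answers_with_cot) == len(answers_without_cot)
    same = sum(a == b for a, b in zip(answers_with_cot, answers_without_cot))
    return same / len(answers_with_cot)

# Toy usage with made-up multiple-choice answers (not real outputs):
print(unfaithfulness(["A", "C", "B", "D"], ["A", "C", "B", "A"]))  # -> 0.75
```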
In any case, I'm excited about measurements of CoT faithfulness grounded in model internals, instead of those produced by intervening only on inputs/outputs. That's hard, but hey, at least there is more groundbreaking research to be done!!
Is it likely that...
...accuracy is strongly indicative of the intricate concept of CoT faithfulness?
...accurate models produce very unfaithful, but also very plausible, CoTs, i.e., that they actually reason differently from people in exactly the cases where they solve the task well?
Well, eliminating inverse scaling led us to check the correlation between accuracy and Lanham et al. (2023)'s unfaithfulness, and it turns out the two are correlated, which we believe is an issue. I'll discuss why next ⬇️
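The check itself is simple: one accuracy number and one unfaithfulness number per model, then a correlation across models. A sketch (the numbers below are placeholders, not our results):

```python
# Correlate per-model accuracy with Lanham-style unfaithfulness.
# The numbers below are placeholders, NOT results from the paper.
from scipy.stats import pearsonr

accuracy = [0.31, 0.45, 0.58, 0.71, 0.82]  # one entry per model size
unfaith  = [0.22, 0.35, 0.51, 0.66, 0.79]  # same models, same order

r, p = pearsonr(accuracy, unfaith)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")
# A strong positive r would mean the "unfaithfulness" metric largely tracks
# accuracy, i.e., it may be disguised accuracy rather than faithfulness.
```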
...after accounting for a model's bias toward certain answer choices, we show that Lanham et al. (2023)'s unfaithfulness drops significantly for smaller, less-capable models; so what?
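To make "accounting for bias" concrete: a model that almost always answers "(A)" will agree with itself across prompts for reasons that have nothing to do with its CoT. One way to correct for that (a simplified, kappa-style sketch; see the paper for the exact procedure) is to subtract the agreement you'd expect from the model's answer distribution alone:

```python
from collections import Counter

def chance_agreement(answers_cot, answers_direct):
    """P(two independent draws match), from each condition's answer frequencies."""
    n = len(answers_cot)
    freq_cot = Counter(answers_cot)
    freq_dir = Counter(answers_direct)
    return sum((freq_cot[a] / n) * (freq_dir.get(a, 0) / n) for a in freq_cot)

def corrected_unfaithfulness(answers_cot, answers_direct):
    """Kappa-style normalization: 0 = no agreement beyond what the model's
    answer-choice bias predicts, 1 = perfect agreement."""
    n = len(answers_cot)
    observed = sum(a == b for a, b in zip(answers_cot, answers_direct)) / n
    expected = chance_agreement(answers_cot, answers_direct)
    return (observed - expected) / (1 - expected)

# Toy usage: a heavily "A"-biased model (made-up answers).
cot    = ["A", "A", "A", "B", "A", "A"]
direct = ["A", "A", "A", "A", "A", "C"]
print(corrected_unfaithfulness(cot, direct))
# Raw agreement is 0.67, but the bias alone predicts 0.69, so the
# corrected score is ~ -0.09: the apparent "unfaithfulness" was bias.
```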