Ana Marasović
@anamarasovic.bsky.social
Asst prof @ University of Utah · NLP, XAI · she/her 🇭🇷
177 followers · 75 following · 116 posts
anamarasovic.bsky.social

Honestly, I'm surprised by how chill everyone is when I ask to opt out of facial recognition, which has been the default at SLC airport security and gates for at least a year.

anamarasovic.bsky.social

I left my apartment and finished security in 40 minutes yesterday! No issues with flights either. Flew with Delta.

anamarasovic.bsky.social

In any case, I'm excited about measurements of CoT faithfulness grounded in model internals, instead of those produced by intervening only on inputs/outputs. That's hard, but hey, at least there is more groundbreaking research to be done!!

anamarasovic.bsky.social

Is it likely that...
...accuracy is strongly indicative of the intricate concept of CoT faithfulness?
...accurate models produce very unfaithful, but also very plausible CoTs, meaning that they actually reason differently from people in all cases where they solve the task well?

anamarasovic.bsky.social

Well, eliminating inverse scaling led us to check the correlation between accuracy and Lanham et al. (2023)'s unfaithfulness, and it turns out the two are correlated, which we believe is an issue. I'll discuss why next ⬇️

anamarasovic.bsky.social

...after accounting for a model's bias toward certain answer choices, we show that Lanham et al. (2023)'s unfaithfulness drops significantly for smaller, less-capable models. So what?
