Emily M. Bender
@emilymbender.bsky.social
5.6k followers · 191 following · 800 posts

I feel very vindicated for not making time to answer journalists' queries about papers that "prove" things based on hypothesized graphs and fabricated data.

Screenshot: The researchers started by assuming that there exists a hypothetical bipartite graph that corresponds to an LLM’s behavior on test data. To explain the change in the LLM’s loss on test data, they imagined a way to use the graph to describe how the LLM gains skills.

Take, for instance, the skill “understands irony.” This idea is represented with a skill node, so the researchers look to see what text nodes this skill node connects to. If almost all of these connected text nodes are successful — meaning that the LLM’s predictions on the text represented by these nodes are highly accurate — then the LLM is competent in this particular skill. But if more than a certain fraction of the skill node’s connections go to failed text nodes, then the LLM fails at this skill.

Source: https://www.quantamagazine.org/new-theory-suggests-chatbots-can-understand-text-20240122/
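The rule the article describes can be stated concretely: a skill node counts as acquired when the fraction of its connected text nodes on which the model failed stays below some threshold. A minimal sketch of that rule, assuming a dict-of-sets representation of the bipartite graph (the function name, data layout, and threshold value are illustrative, not from the paper):

```python
# Sketch of the skill-competence rule described in the screenshot above:
# a skill node is linked to text nodes; the skill is "acquired" when the
# fraction of failed text nodes among its neighbours is at most a chosen
# threshold. All names and the default threshold are assumptions.

def skill_is_acquired(skill, edges, failed_texts, max_fail_fraction=0.1):
    """Return True if the skill's failed-neighbour fraction is small enough.

    edges: dict mapping each skill node to the set of text nodes it touches.
    failed_texts: set of text nodes where the model's predictions failed.
    """
    neighbours = edges[skill]
    if not neighbours:
        return False  # no connected texts, so no evidence of competence
    failed = len(neighbours & failed_texts)
    return failed / len(neighbours) <= max_fail_fraction

# Hypothetical usage: one failed text out of four, threshold 0.3
edges = {"understands irony": {"t1", "t2", "t3", "t4"}}
skill_is_acquired("understands irony", edges, {"t1"}, max_fail_fraction=0.3)
```

With one failure out of four neighbours (a 0.25 fail fraction) and a 0.3 threshold, the call returns True; the same skill with every neighbour failed would return False.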
Screenshot: The team also automated the process by getting GPT-4 to evaluate its own output, along with that of other LLMs. Arora said it’s fair for the model to evaluate itself because it doesn’t have memory, so it doesn’t remember that it was asked to generate the very text it’s being asked to evaluate. Yasaman Bahri, a researcher at Google DeepMind who works on foundations of AI, finds the automated approach “very simple and elegant.”

Source: https://www.quantamagazine.org/new-theory-suggests-chatbots-can-understand-text-20240122/
Screenshot: Nonetheless, Hinton thinks the work lays to rest the question of whether LLMs are stochastic parrots. “It is the most rigorous method I have seen for showing that GPT-4 is much more than a mere stochastic parrot,” he said. “They demonstrate convincingly that GPT-4 can generate text that combines skills and topics in ways that almost certainly did not occur in the training data.” (We reached out to Bender for her perspective on the new work, but she declined to comment, citing a lack of time.)

Source: https://www.quantamagazine.org/new-theory-suggests-chatbots-can-understand-text-20240122/

jdp23.bsky.social

"very elegant" 😱

kaiavintr.bsky.social

I guess with so many people doing research or "research" on black-box LLMs, this sort of thing has become normalized

insortediaboli.bsky.social

ooooh ive got some nose rubbing to do at work, got a link

haredurer.bsky.social

Program sez it’s not a parrot.

qpheevr.bsky.social

“…citing a lack of time [for this bullshit]”

lepcyrus.bsky.social

"simple & elegant" yeah so... wrong then, like everything else that's simple & elegant in data sciences

checarina.bsky.social

lmao. lol. rofl
