Can LLMs be used in science? That's a scientific question. We use an LLM on our data, and we verified it by comparing its output to human raters. Both, btw, are black boxes... We don't know how either works. But we care about human judgement, and for this task the LLM matches the human raters.
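For anyone wondering what "comparing to human raters" could look like in practice, here is a minimal sketch using Cohen's kappa, which corrects raw agreement for chance. This is an assumption about the kind of check meant, not the poster's actual method; the function name `cohens_kappa` and the labels are made up for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two raters on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Fraction of items on which both raters gave the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels, invented for this example only.
human = ["pos", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]
llm = ["pos", "neg", "pos", "neg", "pos", "neg", "neg", "neg"]

kappa = cohens_kappa(human, llm)
print(f"kappa = {kappa:.2f}")
```

A kappa near 1 means the LLM and the human rater agree far more often than chance would predict; a value near 0 means the match is no better than chance.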
“We care about human judgment so we are using a machine to simulate it”
If we're talking quantitative summaries of data, humans are basically just selecting mean vs. median and handing things off to a deterministic algorithm, which isn't a black box at all and can be tested for accuracy, while LLMs are confidently unreliable at adding five-digit numbers together.
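To make the "deterministic and testable" point concrete, a small sketch using Python's standard `statistics` module. The five-digit values are invented for illustration; the point is only that the same input always gives the same answer, and the answer can be checked by hand.

```python
import statistics

# Hypothetical five-digit values, invented for this example only.
values = [12345, 23456, 34567, 45678, 98765]

total = sum(values)
mean_value = statistics.mean(values)
median_value = statistics.median(values)

# Re-running gives identical results, so the output can be verified once
# and trusted afterwards.
assert statistics.mean(values) == mean_value
assert statistics.median(values) == median_value

print(total, mean_value, median_value)
```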
Heard a great talk by Henning Hermjakob on the Reactome project exploring LLMs, finding that they didn’t work for some tasks but could be great for others. As long as you’ve got a testable hypothesis, I don’t see why they couldn’t work.