
Mark Riedl

@markriedl.bsky.social

AI for storytelling, games, explainability, safety, ethics. Professor at Georgia Tech. Associate Director of ML Center at GT. Time travel expert. Geek. Dad. he/him

724 followers · 74 following · 308 posts

markriedl.bsky.social · Mar 6, 2024 1:45pm

It would appear that an entire industry of evaluating LLMs is springing up. Here is another company that has put LLMs to the copyright test and found them to be problematic: www.cnbc.com/2024/03/06/g...

Researchers tested leading AI models for copyright infringement using popular books, and GPT-4 performed worst

Patronus AI on Wednesday released research showcasing how often leading AI models produce copyrighted content.

markriedl.bsky.social · Mar 6, 2024 1:46pm

I feel like we are now at the stage where we have black-box evaluators evaluating black-box LLMs, so it is super hard to know how seriously to take these reports. We need transparency on the evaluation methods, which we are more likely to get from academic labs.
