
Stella Biderman

@stellaathena.bsky.social

I study large language models

193 followers · 11 following · 10 posts

stellaathena.bsky.social · Dec 8, 2023 6:51pm

100%. It's also noteworthy that the decision not to release any details about the training data or model architecture allows them to avoid citing the work in that space, which has been disproportionately done by non-profit and academic researchers.

mmitchell.bsky.social · Dec 7, 2023 5:55pm

The recent Google "Gemini" work doesn't cite model cards or datasheets where they are clearly relevant. Allow me to take this moment to talk about how to fix patterns of exclusion in tech, which disproportionately affect (e.g.) women: 1/

stellaathena.bsky.social · Nov 8, 2023 12:31am

Apropos of "Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models," this meme has been making the rounds in the EleutherAI Discord. arxiv.org/abs/2311.00871

stellaathena.bsky.social · Oct 21, 2023 1:42am

- Training jointly on code and NL
- GPT-NeoX (math- and code-aware) tokenizer
- 4-bit quant and qLoRA
- RWKV
- Instruction-tuning
- Pre-layer norm
- VQGAN
- CLIP-Guidance
- Latent diffusion
- Decision transformers

and more!

stellaathena.bsky.social · Oct 21, 2023 1:30am

It's really wild when people say stuff like "academia doesn't matter any more, only the big labs with the most money do." Recent inventions by non-profit researchers have brought massive improvements in large scale models:

- Alibi
- Scaled RoPE
- Flash Attention
- Parallel attention and MLP layers

stellaathena.bsky.social · Sep 30, 2023 5:41pm

They already have massive amounts of data, and have the legal departments and funding to defend themselves in court. Lawsuits against researchers, even meritless ones, are expensive and time-consuming. They have a strong chilling effect. Meanwhile, companies will continue to plunder for private gain.

stellaathena.bsky.social · Sep 30, 2023 5:34pm

The current trajectory of the world is tending strongly towards non-transparency, in large part because people are afraid of what will happen. Smaller orgs, non-profits, and researchers are the ones held back by this, not the large corporations.

stellaathena.bsky.social · Sep 30, 2023 5:32pm

You cannot have it both ways. Either you care about transparency or you go after 15-person research non-profits for doing open and public research. Obviously I think we'll be legally and socially vindicated one day, but in the meantime I'm concerned about the chilling effect of events like this.

stellaathena.bsky.social · Sep 29, 2023 2:29pm

Very reasonable frustration with the exploitation of labor for commercial profit is being wrongly directed against research organizations. There's no clearer sign of this than the group responsible for taking the Pile down calling for greater transparency: rettighedsalliancen.com/the-books3-c...

stellaathena.bsky.social · Sep 29, 2023 2:29pm

Transparency is a key part of both scientific research and ethical development and deployment of AI technologies. Without transparency into training data we cannot know whose information and ideologies are being encoded in ML systems. Unfortunately, this work is increasingly hard to do.

stellaathena.bsky.social · Sep 29, 2023 2:28pm

This is your daily reminder that only three orgs have ever trained an LLM and released the model and full data: EleutherAI, BigScience (non-OS license), and Together Computer. Small orgs like these make science possible in the face of industry power.
