Stella Biderman
@stellaathena.bsky.social
I study large language models
193 followers, 11 following, 10 posts
stellaathena.bsky.social

100%. It's also noteworthy that the decision not to release any details about the training data or model architecture allows them to avoid citing the work in that space, which has been disproportionately done by non-profit and academic researchers.

0
stellaathena.bsky.social

Apropos of "Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models," this meme has been making the rounds in the EleutherAI Discord. arxiv.org/abs/2311.00871

0
stellaathena.bsky.social

- Training jointly on code and NL
- GPT-NeoX (math- and code-aware) tokenizer
- 4-bit quant and qLoRA
- RWKV
- Instruction-tuning
- Pre-layer norm
- VQGAN
- CLIP-Guidance
- Latent diffusion
- Decision transformers
and more!

0
stellaathena.bsky.social

It's really wild when people say stuff like "academia doesn't matter any more, only the big labs with the most money do." Recent inventions by non-profit researchers have brought massive improvements in large-scale models:
- ALiBi
- Scaled RoPE
- Flash Attention
- Parallel attention and MLP layers

2
stellaathena.bsky.social

They already have massive amounts of data, and they have the legal departments and funding to defend themselves in court. Lawsuits against researchers, even meritless ones, are expensive and time-consuming. They have a strong chilling effect. Meanwhile, companies will continue to plunder for private gain.

0
stellaathena.bsky.social

The current trajectory of the world is tending strongly towards non-transparency, in large part because people are afraid of what will happen. Smaller orgs, non-profits, and researchers are the ones held back by this, not the large corporations.

1
stellaathena.bsky.social

You cannot have it both ways. Either you care about transparency or you go after 15-person research non-profits for doing open and public research. Obviously I think we'll be legally and socially vindicated one day, but in the meantime I'm concerned about the chilling effect of events like this.

1
stellaathena.bsky.social

Very reasonable frustration with the exploitation of labor for commercial profit is being wrongly directed against research organizations. There's no clearer sign of this than the group responsible for taking the Pile down calling for greater transparency: rettighedsalliancen.com/the-books3-c...

1
stellaathena.bsky.social

Transparency is a key part of both scientific research and ethical development and deployment of AI technologies. Without transparency into training data we cannot know whose information and ideologies are being encoded in ML systems. Unfortunately, this work is increasingly hard to do.

1
stellaathena.bsky.social

This is your daily reminder that only three orgs have ever trained an LLM and released the model and full data: EleutherAI, BigScience (non-OS license), and Together Computer. Small orgs like these make science possible in the face of industry power.

1