Alexander Doria
@dorialexander.bsky.social
LLM for the commons. Cofounder Pleias
534 followers · 238 following · 106 posts
dorialexander.bsky.social

Experimenting with synthetic literature using a freshly fine-tuned Llama 8B: Plato's Republic as a film noir works surprisingly well.

1
dorialexander.bsky.social

Announcing that we are on our way to solving a long-standing issue in document processing: the correction of OCR mistakes. Pleias publishes the largest dataset to date with automated OCR correction: 1 billion words in English, French, German and Italian. huggingface.co/datasets/Ple...

1
dorialexander.bsky.social

My PhD thesis on financial speculation and mass media in 19th-century France was really not supposed to be a how-to. But as it turns out, things haven’t changed much…

0
Reposted by Alexander Doria
technollama.bsky.social

This meme is too good for Twitter, so I'm just dropping it here.

2
dorialexander.bsky.social

Currently working on an OCR correction model and it accidentally creates hallucinated fiction when the source is *really* noisy.

5
dorialexander.bsky.social

Small announcement: opening Paid Research Internship opportunities at a startup to train open science LLMs. Profiles could be AI-focused (with some familiarity/experience in open source LLM communities) or data-focused (with a DH background). Based in Paris, full remote possible.

2
dorialexander.bsky.social

Announcing the release of marginalia, a Python library to perform corpus analysis and retrieve structured annotations with open LLMs like Mistral Open-Hermes-2.5. github.com/Pleias/margi...

3
Reposted by Alexander Doria
mmitchell.bsky.social

Am feeling really happy and relieved that 3 days into the new year, and the Big AI News is...copyright-respecting creation of cartoon mice! Courtesy of @dorialexander.bsky.social huggingface.co/Pclanglais/M...

Pclanglais/Mickey-1928 · Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

0
dorialexander.bsky.social

Happy new year! And happy public domain day with a major new entry: the original design of Mickey Mouse! For the occasion I’m releasing Mickey-1928, a model on Hugging Face that can generate pictures of Mickey, Minnie and Pete from 1928. huggingface.co/Pclanglais/M...

4
dorialexander.bsky.social

I’m afraid an LLM has been taken hostage by DH people. The situation is very dire: we have reached the point where putting the entire pretraining corpus in TEI is explicitly mentioned.

1