DS
David Smith
@dasmiq.bsky.social
Associate professor of computer science at Northeastern University. Natural language processing, digital humanities, OCR, computational bibliography, and computational social sciences. Artificial intelligence is an archival science.
448 followers, 220 following, 224 posts
dasmiq.bsky.social

The OCR is @archive.org's own Tesseract output, so it won't match previous transcripts. We're thinking about fine-tuning better models and rerunning (as we've done for some of Chronicling America), if anyone has free machines and interest.

shgregg.bsky.social

Hi! I’d be very interested to know more, if you’d be happy to chat via email, Zoom, etc. at some point. DM me if you’re interested.

shgregg.bsky.social

Ah! Thanks for the info.
