Ed Hawkins
@edhawkins.bsky.social
Climate scientist at the National Centre for Atmospheric Science, University of Reading | IPCC AR6 Lead Author | MBE | Warming Stripes: www.ShowYourStripes.info | Weather Rescue citizen science initiative: www.WeatherRescue.org
5.1k followers | 312 following | 205 posts
edhawkins.bsky.social

So, how do we rescue millions of historical weather observations that are not currently available to science, such as those in this table? Surely, ML can do this by now... 🧵

michaelbishop.me

Have you tried something like Tabula? tabula.technology

jeboyt.bsky.social

Write a grant to get them scanned

climateofgavin.bsky.social

What happens if you use the human read answers from previous digitization efforts as training data for a bespoke ML effort?

dnakendra.bsky.social

I've had luck with reading files into R. I haven't used this package, but it claims to be able to do it from PNG: www.r-bloggers.com/2016/11/the-... As an aside, one of my first jobs included typing in water levels from handwritten records from the 40s onward for USGS weirs. Good to know it's still needed :D

The new Tesseract package: High Quality OCR in R

Optical character recognition (OCR) is the process of extracting written or typed text from images such as photos and scanned documents into machine-encoded text. The new rOpenSci package tesseract br...

edhawkins.bsky.social

Amazon Textract

markcnorwich.bsky.social

People > machines. Call for the Zooniverse.

kep.rinvelo.com

OCR seems like a more practical choice than machine learning for the task.

fdonoff.bsky.social

Economic historians have the same problem with tables (no good OCR as far as I know) and solve it by hiring people who do manual entry...

pv-physicist.bsky.social

I would try Transkribus, which was specifically made to transcribe historical documents.
