Mike Mahoney
@mikemahoney218.com
Environmental science and technology. Opinions my own. #rstats, ML, Boston and spatial data. RTs imply causation. any pronouns / 🏳️‍🌈 mm218.dev github.com/mikemahoney218
452 followers · 234 following · 637 posts
mikemahoney218.com

If you're working with bigger data -- make sure you're shipping your pairs out to each thread, and not the whole data frame. That overhead to copy a huge df across threads is a killer. But otherwise, yeah, unfortunately this is the way

mikemahoney218.com

anyway, point being, this way madness lies 😂 4/4

mikemahoney218.com

All of this is single threaded, because the overhead of moving large objects across threads was too high... I've fully considered writing the data into DuckDB and using the spatial extension, to hopefully get better parallelism, but haven't had the focus on this project for it

mikemahoney218.com

and then scan the matrix for distances below a certain threshold. For my needs, I wind up doing a single st_distance call and then processing the matrix in C++ for an ounce more performance. It's gross. One of the funniest comments I've written though, imo: github.com/tidymodels/s...

spatialsample/R/buffer.R at main · tidymodels/spatialsample

Create and summarize spatial resampling objects 🗺. Contribute to tidymodels/spatialsample development by creating an account on GitHub.
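
A minimal sketch of the "one distance matrix, then scan for a threshold" idea from the post above, written in plain Python so it runs standalone. This is not the spatialsample code: the points, the `threshold` value, and the Euclidean `math.dist` call are all assumptions standing in for real geometries and a single st_distance call.

```python
import math

# Toy points standing in for real geometries (assumed data).
points = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0), (10.0, 10.0)]
threshold = 2.0  # assumed cutoff distance

# Build the full pairwise distance matrix once (the "single st_distance call" step).
dist = [[math.dist(p, q) for q in points] for p in points]

# Scan the upper triangle for pairs closer than the threshold.
close_pairs = [
    (i, j)
    for i in range(len(points))
    for j in range(i + 1, len(points))
    if dist[i][j] < threshold
]

print(close_pairs)
```

The matrix is symmetric, so scanning only the upper triangle avoids reporting each pair twice.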

mikemahoney218.com

optimizing this is the most complicated part of spatialsample. The best solution I've found is either parallelizing it yourself (make a list that's like c(1, 2), c(1, 3) etc and use it to index the data frame, then run st_intersects for each pair separately) or compute a single distance matrix 1/
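
A rough sketch of the pair-list approach described above, in plain Python rather than sf so it runs on its own. Bounding boxes stand in for real geometries, and `boxes_intersect` and `check_pair` are hypothetical helpers standing in for a per-pair st_intersects call; none of this is spatialsample's actual code. Each task carries only the two geometries it needs, so a worker would never be handed the full data.

```python
from itertools import combinations

# Toy "geometries": axis-aligned boxes as (xmin, ymin, xmax, ymax) (assumed data).
boxes = [
    (0, 0, 2, 2),
    (1, 1, 3, 3),
    (5, 5, 6, 6),
    (2.5, 0, 4, 1.5),
]

def boxes_intersect(a, b):
    """Hypothetical stand-in for an intersection test on two boxes."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

# Index pairs (0, 1), (0, 2), ... like c(1, 2), c(1, 3) in R, but zero-based.
pairs = list(combinations(range(len(boxes)), 2))

# Each task holds only the two geometries it needs, not the whole list.
tasks = [(i, j, boxes[i], boxes[j]) for i, j in pairs]

def check_pair(task):
    """Run the intersection test for one pair and report its indices."""
    i, j, a, b = task
    return (i, j, boxes_intersect(a, b))

# Run sequentially here; the same tasks could be passed to a worker pool's map().
results = [check_pair(t) for t in tasks]
intersecting = [(i, j) for i, j, hit in results if hit]

print(intersecting)
```

Because every task is self-contained, the per-pair checks are independent and can be distributed without copying the full data to each worker.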

Reposted by Mike Mahoney
chezvoila.com

Alternatives to the traffic light colour scheme 🟢🟡🔴 Your dashboards will look fresh, will be more accessible and your clients will think you're a magician. 📊 👇

mikemahoney218.com

Never once did it occur to me that function had a serious use case

Reposted by Mike Mahoney
spavel.bsky.social

Moe: Oh a RESTful API lah dee dah mr french man

Homer: Well what do you call it?

Moe: An HTTP endpoint

Reposted by Mike Mahoney
davelevitan.bsky.social

This is USGS water gage data. The black dots are "all-time high for this day." The pink outlined ones are above flood stage, dark blue are "much above normal." These stretch across a thousand miles.

mikemahoney218.com

That said, the online resources around it -- like Michelle's workshops!! -- have gotten DRASTICALLY better in the amount of time I've been using it
