dalonso.mas.to.ap.brid.gy · Sep 30, 2024 1:20pm

AI pareidolia: Can machines spot faces in inanimate objects? https://news.mit.edu/2024/ai-pareidolia-can-machines-spot-faces-in-inanimate-objects-0930 "New dataset of “illusory” faces reveals differences between human and algorithmic face detection, links to animal face recognition, and a […]

nafnlaus.bsky.social · Sep 30, 2024 12:40pm

😂 Really, though, good point about noting the "subset" aspect. Like, for any given training dataset, I'm never using the whole dataset, generally just some small part of it.

arxiv-cs-cv.bsky.social · Sep 30, 2024 12:31pm

Songrui Wang, Yubo Zhu, Wei Tong, Sheng Zhong: Detecting Dataset Abuse in Fine-Tuning Stable Diffusion Models for Text-to-Image Synthesis https://arxiv.org/abs/2409.18897

tedunderwood.me · Sep 30, 2024 12:27pm

There is no requirement to list specific works or to list copyright holders. Only the owner of the *dataset*.

tedunderwood.me · Sep 30, 2024 12:26pm

So I imagine what companies are going to do is 1) a proprietary dataset of image material, scraped from the web and labeled by Megacorp; owner: Megacorp. 2) a dataset of text material digitized and cleaned by Megacorp

nafnlaus.bsky.social · Sep 30, 2024 12:22pm

"(1) The sources or owners of the datasets." Any copyright holder can just look at the dataset, see if their work is in it, and then sue anyone who listed that dataset in their disclosure.

anderagakura.bsky.social · Sep 30, 2024 11:06am

New paper from Criteo, Inria & ENSAE on DU-Shapley, a fast and efficient method to estimate Shapley values for dataset valuation by reducing computation while keeping results accurate. Could be useful in advertising and other industries. arxiv.org/pdf/2306.02071

alicele.bsky.social · Sep 30, 2024 10:30am

What we DID test is that the within-patient plasmid genetic distance is significantly different from the between plasmid genetic distance. This points to the plasmids found within each patient not being randomly drawn from the dataset (hence probably spread by conjugation).

alicele.bsky.social · Sep 30, 2024 10:26am

We used a sub-dataset from this paper, www.microbiologyresearch.org/content/jour... Selected sequences from patients who had at least 2 sequences, one of each carried an OXA48 resistance (well known to be plasmid-borne).

Diversity of carbapenemase-producing Enterobacterales in England as revealed by whole-genome sequencing of isolates referred to a national reference laboratory over a 30-month period

Introduction. Increasing numbers of carbapenemase-producing Enterobacterales (CPE), which can be challenging to treat, have been referred to the national reference laboratory in England since the ear...

alicele.bsky.social · Sep 30, 2024 10:24am

Finally this draft is out! The question is: how often do we find the same plasmid in different bacterial hosts in the same patient? Spoiler alert: in this dataset very often!

biorxiv-bioinfo.bsky.social · Sep 30, 2024 3:47am

Plasmid conjugation drives within-patient plasmid diversity https://www.biorxiv.org/content/10.1101/2024.09.27.615342v1

Plasmids are well known vehicles of antimicrobial resistance (AMR) genes dissemination. Through conj