this post was submitted on 21 Jan 2024


Nightshade, the free tool that ‘poisons’ AI models, is now available for artists to use (venturebeat.com)

submitted 8 months ago by throws_lemy@lemmy.nz to c/technology@lemmy.world


[–] Even_Adder@lemmy.dbzer0.com 199 points 7 months ago (3 children)

Reminder that this is made by Ben Zhao, the University of Chicago professor who stole open source code for his last data poisoning scheme.

[–] ramenshaman@lemmy.world 55 points 7 months ago (4 children)

Pardon my ignorance but how do you steal code if it's open source?

[–] Even_Adder@lemmy.dbzer0.com 58 points 7 months ago (4 children)

He took GPLv3 code. GPLv3 is a copyleft license that requires you to share your source code and license your project under the same terms as the code you used. You also can't distribute your project as binary-only or proprietary software. When pressed, they only released the code for their front end, remaining in violation of GPLv3.


[–] MargotRobbie@lemmy.world 19 points 7 months ago* (last edited 7 months ago)

And as I said there, it is utterly hypocritical for him to sell snake oil to artists, allegedly to help them fight copyright violations, while committing actual copyright violations.


[–] SPRUNT@lemmy.world 99 points 8 months ago (11 children)

Is there a similar tool that will "poison" my personal tracked data? Like, I know I'm going to be tracked and have a profile built on me by nearly everywhere online. Is there a tool that I can use to muddy that profile so it doesn't know if I'm a trans Brazilian pet store owner, a Nigerian bowling alley systems engineer, or a Beverly Hills sanitation worker who moonlights as a practice subject for budding proctologists?

[–] Ghostalmedia@lemmy.world 116 points 8 months ago (3 children)

The only way to taint your behavioral data so that you don’t get lumped into a targetable cohort is to behave like a maniac. As I’ve said in a past comment here, when you fill out forms, pretend your gender, race, and age are fluid. Also, pretend you’re nomadic. Then behave erratic as fuck when shopping online - pay for bibles, butt plugs, taxidermy, and PETA donations.

Your data will be absolute trash. You’ll also be miserable because you’re going to be visiting the Amazon drop off center with gag balls and porcelain Jesus figurines to return every week.

[–] Bonehead@kbin.social 37 points 7 months ago (1 children)

Then behave erratic as fuck when shopping online - pay for bibles, butt plugs, taxidermy, and PETA donations.

...in the same transaction. It all needs to be bought and then shipped together. Not only to fuck with the algorithm, but also to fuck with the delivery guy. Because we usually know what you ordered. Especially when it's in the soft bag packaging. Might as well make everyone outside your personal circle think you're a bit psychologically disturbed, just to be safe.

[–] Neato@ttrpg.network 18 points 7 months ago

How? Aren't most items in boxes even in the bags? It's not like they just toss a butt plug into a bag and ship it...right?


[–] Australis13@fedia.io 31 points 8 months ago (1 children)

The browser addon "AdNauseam" can help with that, although it's not a complete solution.

[–] capital@lemmy.world 24 points 7 months ago (4 children)

That and TrackMeNot.

It searches random shit in the background.

https://www.trackmenot.io/


[–] TropicalDingdong@lemmy.world 18 points 7 months ago (2 children)

Is there a similar tool that will “poison” my personal tracked data? Like, I know I’m going to be tracked and have a profile built on me by nearly everywhere online. Is there a tool that I can use to muddy that profile so it doesn’t know if I’m a trans Brazilian pet store owner, a Nigerian bowling alley systems engineer, or a Beverly Hills sanitation worker who moonlights as a practice subject for budding proctologists?

Have you considered just being utterly incoherent, and not making sense as a person? That could work.

[–] SPRUNT@lemmy.world 25 points 7 months ago

According to my exes, yes.


[–] gapbetweenus@feddit.de 70 points 7 months ago (47 children)

The tool's creators are seeking to make it so that AI model developers must pay artists to train on data from them that is uncorrupted.

That's not something a technical solution will work for. We need copyright laws to be updated.

[–] Even_Adder@lemmy.dbzer0.com 22 points 7 months ago (14 children)

You should check out this article by Kit Walsh, a senior staff attorney at the EFF. The EFF is a digital rights group who recently won a historic case: border guards now need a warrant to search your phone.

A few quotes:

First, copyright law doesn’t prevent you from making factual observations about a work or copying the facts embodied in a work (this is called the “idea/expression distinction”). Rather, copyright forbids you from copying the work’s creative expression in a way that could substitute for the original, and from making “derivative works” when those works copy too much creative expression from the original.

Second, even if a person makes a copy or a derivative work, the use is not infringing if it is a “fair use.” Whether a use is fair depends on a number of factors, including the purpose of the use, the nature of the original work, how much is used, and potential harm to the market for the original work.

and

Even if a court concludes that a model is a derivative work under copyright law, creating the model is likely a lawful fair use. Fair use protects reverse engineering, indexing for search engines, and other forms of analysis that create new knowledge about works or bodies of works. Here, the fact that the model is used to create new works weighs in favor of fair use as does the fact that the model consists of original analysis of the training images in comparison with one another.


[–] kromem@lemmy.world 52 points 7 months ago (8 children)

This doesn't work outside of laboratory conditions.

It's the equivalent of "doctors find cure for cancer (in mice)."

[–] bier@feddit.nl 17 points 7 months ago (2 children)

I like that example. Every time you hear about some discovery that x kills 100% of cancer cells in a petri dish, you always have to think: so does bleach.


[–] General_Effort@lemmy.world 44 points 7 months ago (25 children)

Explanation of how this works.

These "AI models" (meaning the free and open Stable Diffusion in particular) consist of different parts. The important parts here are the VAE and the actual "image maker" (U-Net).

A VAE (Variational AutoEncoder) is a kind of AI that can be used to compress data. In image generators, a VAE is used to compress the images. The actual image AI only works on the smaller, compressed image (the latent representation), which means it takes a less powerful computer (and uses less energy). It’s that which makes it possible to run Stable Diffusion at home.

This attack targets the VAE. The image is altered so that its latent representation is that of a very different image, while the image itself still looks roughly the same to humans. Say you take images of a cat and of a dog. You put both of them through the VAE to get their latent representations. Now you alter the image of the cat until its latent representation is similar to that of the dog. You alter it only in small ways and use methods to check that it still looks similar to humans. So, what the actual image maker AI "sees" is very different from the image the human sees.

Obviously, this only works if you have access to the VAE used by the image generator. So, it only works against open source AI; basically only Stable Diffusion at this point. Companies that use a closed source VAE cannot be attacked in this way.
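As a rough illustration of the cat/dog procedure described above, here is a toy sketch in NumPy. It is not Nightshade's code. Everything in it is an assumption made for the example: a random linear map `W` stands in for the VAE encoder, a simple per-pixel budget `eps` stands in for the "still looks similar to humans" check, and `poison` is a hypothetical helper name. The optimisation is plain projected gradient descent.

```python
import numpy as np

def poison(x_src, x_tgt, W, eps=0.05, steps=200):
    """Nudge x_src so its latent moves toward x_tgt's latent.

    W is a toy linear "encoder" (an assumption, not a real VAE).
    Each pixel may change by at most eps, a crude stand-in for a
    perceptual similarity check.
    """
    z_tgt = W @ x_tgt
    # Step size 1/L, where L = 2 * ||W||_2^2 bounds the gradient's Lipschitz constant.
    lr = 1.0 / (2.0 * np.linalg.norm(W, 2) ** 2)
    delta = np.zeros_like(x_src)
    for _ in range(steps):
        residual = W @ (x_src + delta) - z_tgt
        grad = 2.0 * W.T @ residual              # gradient of the squared latent distance
        delta = np.clip(delta - lr * grad, -eps, eps)       # stay within the pixel budget
        delta = np.clip(x_src + delta, 0.0, 1.0) - x_src    # keep pixel values in [0, 1]
    return x_src + delta

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 256)) / 16.0   # hypothetical encoder: 256 "pixels" -> 16 latents
cat = rng.uniform(size=256)             # stand-in for the cat image
dog = rng.uniform(size=256)             # stand-in for the dog image

poisoned = poison(cat, dog, W)

def latent_dist(a, b):
    return float(np.linalg.norm(W @ a - W @ b))

before = latent_dist(cat, dog)
after = latent_dist(poisoned, dog)
max_change = float(np.max(np.abs(poisoned - cat)))
print(f"latent distance to dog: {before:.3f} -> {after:.3f}; max pixel change: {max_change:.3f}")
```

The sketch also shows why access to the encoder matters: the gradient step uses `W` directly, so without the encoder's weights there is nothing to differentiate through.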

I guess it makes sense if your ideology is that information must be owned and everything should make money for someone. I guess some people see cyberpunk dystopia as a desirable future. I wonder if it bothers them that all the tools they used are free (e.g., the method to check whether images look similar to humans).

It doesn’t seem to be a very effective attack but it may have some long-term PR effect. Training an AI costs a fair amount of money. People who give that away for free probably still have some ulterior motive, such as being liked. If instead you get the full hate of a few anarcho-capitalists that threaten digital vandalism, you may be deterred. Well, my two cents.

[–] barsoap@lemm.ee 13 points 7 months ago* (last edited 7 months ago)

So, it only works against open source AI; basically only Stable Diffusion at this point.

I very much doubt it even works against the multitude of VAEs out there. There are not just the ones derived from Stability AI's models but also ones simply intended to be faster (at a loss of quality): TAESD can also encode and has a completely different architecture, so it is very unlikely to be fooled by the same attack vector. Failing that, you can use a simple affine transformation to convert between latent and RGB space (that's what "latent2rgb" is) and compare outputs to know whether the big VAE model got fooled into generating something unrelated. That thing just doesn't have any attack surface; there are several orders of magnitude too few weights in there.

Which means that there's an undefeatable way to detect that the VAE was defeated. Which means it's only a matter of processing power until Nightshade is defeated, no human input needed. They'll of course again train and try to fool the now hardened VAE, starting another round, ultimately achieving nothing but making the VAE harder and harder to defeat.
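The detection idea above (map the latent back to RGB with a tiny affine transform and compare it with the image) can be sketched as follows. This is a toy under stated assumptions: `toy_encode`, `A`, `LATENT2RGB` and the threshold are made up for the example and are not the real Stable Diffusion encoder or its actual latent2rgb factors.

```python
import numpy as np

def block_mean(img, k=8):
    """Downsample an (H, W, 3) image by averaging k x k blocks."""
    h, w, c = img.shape
    return img.reshape(h // k, k, w // k, k, c).mean(axis=(1, 3))

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 4))       # toy RGB -> 4-channel latent mixing (hypothetical)
LATENT2RGB = np.linalg.pinv(A)    # toy 4x3 latent -> RGB map (no bias term here)

def toy_encode(img):
    """Stand-in for the big VAE encoder: downsample, then mix channels."""
    return block_mean(img) @ A

def looks_poisoned(img, latent, threshold=0.1):
    """Flag a latent whose cheap RGB preview disagrees with the image itself."""
    preview = latent @ LATENT2RGB
    error = np.abs(preview - block_mean(img)).mean()
    return bool(error > threshold)

# A dark image and a bright image, so that they are clearly different.
cat = rng.uniform(0.0, 0.3, size=(64, 64, 3))
dog = rng.uniform(0.7, 1.0, size=(64, 64, 3))

honest = looks_poisoned(cat, toy_encode(cat))   # latent matches the image
fooled = looks_poisoned(cat, toy_encode(dog))   # latent belongs to another image
print(honest, fooled)
```

The check has only twelve numbers in it, which is the point made above about attack surface: there is very little for an adversarial perturbation to exploit.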

It's like with Russia: They've already lost the war but they haven't noticed, yet -- though I wouldn't be too sure that Nightshade devs themselves aren't aware of that: What they're doing is a powerful way to grift a lot of money from artists without a technical bone in their body.


[–] Telodzrum@lemmy.world 36 points 8 months ago (10 children)

Fascinating that they develop this tool and then only release Windows and macOS versions.

[–] AceFuzzLord@lemm.ee 12 points 7 months ago (1 children)

To be fair, Windows and macOS are the 2 biggest computer operating systems in the world. It makes a lot more sense to focus on building tools for people using the biggest platforms rather than focus on people using something with a user base fragmented across multiple versions of the same OS.

Though I do agree a version for Linux would be nice. Even if we have the Mac equivalent of Wine, Darling, I don't know enough about it to say whether it's up to the task or not.


[–] cybersandwich@lemmy.world 12 points 7 months ago (5 children)

It's simple math. 97% of the population uses those two operating systems.

There isn't much incentive to go after the 3% of Linux users. You know, the population that loves free and open source software and isn't exactly known for dropping a bunch of cash on software. Not to mention it's a fragmented 3%. Even the Flatpaks, Snaps, and AppImages of the world that were supposed to make devs' lives easier are fragmented across distros.


[–] vsis@feddit.cl 35 points 7 months ago (2 children)

It's not FOSS and I don't see a way to review if what they claim is actually true.

It may just be a way to help differentiate legitimate human-made work from machine-generated work, thus helping AI training models.

Can't demonstrate that either, because its license expressly forbids adapting the software to other uses.

Edit, alter, modify, adapt, translate or otherwise change the whole or any part of the Software nor permit the whole or any part of the Software to be combined with or become incorporated in any other software, nor decompile, disassemble or reverse engineer the Software or attempt to do any such things

sauce: https://nightshade.cs.uchicago.edu/downloads.html

[–] nybble41@programming.dev 17 points 7 months ago (1 children)

The EULA also prohibits using Nightshade "for any commercial purpose", so arguably if you make money from your art—in any way—you're not allowed to use Nightshade to "poison" it.


[–] JATtho@lemmy.world 13 points 7 months ago

I read the article enough to find that the Nightshade tool is under a EULA... :(

Because it definitely is not FOSS, use it with caution, preferably on a system not connected to the internet.

[–] pavnilschanda@lemmy.world 35 points 7 months ago (2 children)

Apparently people who specialize in AI/ML have a very hard time trying to replicate the desired results when training models with 'poisoned' data. Is that true?

[–] Even_Adder@lemmy.dbzer0.com 37 points 7 months ago* (last edited 7 months ago) (7 children)

I've only heard that running images through a VAE just once seems to break the Nightshade effect, but no one's really published anything yet.

You can finetune models on known bad and incoherent images to help them output better images if the trained embedding is used in the negative prompt. So there's a chance that making a lot of purposefully bad data could actually make models better by helping the model recognize bad output and avoid it.


[–] Miaou@jlai.lu 14 points 7 months ago (1 children)

Until they come up with some preprocessing step, or some better feature extractors, etc. This is an arms race, like many others.


[–] mjhelto@lemm.ee 33 points 7 months ago (2 children)

Begun, the AI Wars have.

[–] UnderpantsWeevil@lemmy.world 14 points 7 months ago (1 children)

Excited to see the guys that made Nightshade get sued in a Silicon Valley district court, because they're something something mumble mumble intellectual property national security.

[–] Even_Adder@lemmy.dbzer0.com 27 points 7 months ago (2 children)

They already stole GPLv3 code for their last data poisoning scheme and remain in violation of that license. They're just grifters.


[–] webghost0101@sopuli.xyz 22 points 8 months ago

I bet that before the end of this year this tool will be one of the things that helped improve the performance and quality of AI.

[–] neurogenesis@lemmy.dbzer0.com 22 points 7 months ago

Oily snakes slither such that back and forth looks like production..

[–] Canadian_Cabinet@lemmy.ca 18 points 7 months ago

Ironic that they used an AI picture for the article...

[–] HexesofVexes@lemmy.world 16 points 7 months ago

Ah, another arms race has begun. Just be wary, what one person creates another will circumvent.

[–] M0oP0o@mander.xyz 13 points 7 months ago* (last edited 7 months ago)

They credit AI for making the thumbnail..... The same people who did nothing more than ask ChatGPT to make a picture to represent the article on a tool that poisons AI models to protect people who make pictures for a living from having ChatGPT use their work to make, say, a picture to represent an article on a tool that poisons AI models......

[–] EmperorHenry@discuss.tchncs.de 11 points 7 months ago (13 children)

I hope every artist starts using it.

AI art isn't real art.
