this post was submitted on 04 Jan 2024
358 points (97.6% liked)

Technology


cross-posted from: https://programming.dev/post/8121669

Taggart (@mttaggart) writes:

Japan determines copyright doesn't apply to LLM/ML training data.

On a global scale, Japan’s move adds a twist to the regulation debate. Current discussions have focused on a “rogue nation” scenario where a less developed country might disregard a global framework to gain an advantage. But with Japan, we see a different dynamic. The world’s third-largest economy is saying it won’t hinder AI research and development. Plus, it’s prepared to leverage this new technology to compete directly with the West.

I am going to live in the sea.

www.biia.com/japan-goes-all-in-copyright-doesnt-apply-to-ai-training/

[–] silverbax@lemmy.world 57 points 10 months ago* (last edited 10 months ago) (2 children)

I think this is a difficult concept to tackle, but the main argument I see about using existing works as 'training data' is the idea that 'everything is a remix'.

I, as a human, can paint an exact copy of a Picasso work or any other artist. This is not illegal and I have no need of a license to do this. I definitely don't need a license to paint something 'in the style of Picasso', and I can definitely sell it with my own name on it.

But the question is, what about when a computer does the same thing? What is the difference? Speed? Scale? Anyone can view a picture of the Mona Lisa at any time and make their own painting of it. You can't use the image of the Mona Lisa without accreditation and licensing, but what about a recreation of the Mona Lisa?

I'm not really arguing pro-AI here, although it may sound like it. I've just heard the 'licensing' argument many times and I'd really like to hear what the difference between a human copying and a computer copying is, if someone knows more about the law.

[–] abhibeckert@lemmy.world 48 points 10 months ago* (last edited 10 months ago) (4 children)

Um - your examples are so old the copyright expired centuries ago. Of course you can copy them. And you can absolutely use an image of the Mona Lisa without accreditation or licensing.

Painting and selling an exact copy of a recent work, such as Banksy, is a crime.

… however making an exact copy of Banksy for personal use, or to learn, or copying the style… that’s all perfectly legal.

[–] LWD@lemm.ee 27 points 10 months ago (2 children)

Painting and selling an exact copy of a recent work, such as Banksy, is a crime.

… however making an exact copy of Banksy for personal use, or to learn, or to teach other people, or copying the style… that’s all perfectly legal.

And that was the bait and switch of OpenAI! They sold themselves as being a non-profit simply doing research, for which it would be perfectly legal to consume and reproduce large quantities of data... And then, once they had the data, they started selling access to it.

I would say that that alone, along with the fact that they function as gatekeepers to the technology (One does not simply purchase the model from OpenAI, after all) they are hardly free of culpability... But it definitely depends on the person trying to use their black box too.

[–] abhibeckert@lemmy.world 6 points 10 months ago* (last edited 10 months ago)

Huh? What does being non profit have to do with it? Private companies are allowed to learn from copyrighted work. Microsoft and Apple, for example, look at each other's software and copy ideas (not code, just ideas) all the time. The fact Linux is non-profit doesn't give them any additional rights or protection.

[–] iegod@lemm.ee 4 points 10 months ago (1 children)

They're not gatekeeping llms though, there are publicly available models and data sets.

[–] LWD@lemm.ee 1 points 10 months ago* (last edited 10 months ago) (1 children)

If it's publicly available, why didn't Microsoft just download and use it rather than paying them for a partnership?
(And where at?)

IIRC they only open-sourced some old stuff.

[–] iegod@lemm.ee 1 points 10 months ago

Stable Diffusion is open source. You can run local instances with provided and free training sets to query against and generate your own outputs.

https://stability.ai/

[–] silverbax@lemmy.world 8 points 10 months ago* (last edited 10 months ago) (2 children)

Thanks for your response. I realize I muddied the waters on my question by mentioning exact copies.

My real question is based on the 'everything is a remix' idea. I can create a work 'in the style of Banksy' and sell it. The US copyright and trademark laws state that a work only has to be 10% differentiated from the original in order to be legal to use, so creating a piece of work that 'looks like it could have been created by Banksy, but was not created by Banksy' is legal.

So since most AI does not create exact copies, this is where I find the licensing argument possibly weak. I really haven't seen AI like MidJourney creating exact replicas of works - but admittedly, I am not following every single piece of art created on Midjourney, or Stable Diffusion, or DALL-E, or any of the other platforms, and I'm not an expert in the trademarking laws to the extent I can answer these questions.

[–] abhibeckert@lemmy.world 8 points 10 months ago* (last edited 10 months ago)

Thanks for your response

Always happy to discuss copyright. :-) Our IP laws are long overdue for an overhaul in my opinion. And the only way to make that happen is for as many people as possible to discuss the issues. I plan to spend the rest of my life creating copyrighted work, and I really hope I don't spend all of it under the current rules...

The US copyright and trademark laws state that a work only has to be 10% differentiated from the original in order to be legal to use

The law doesn't say that. The Blurred Lines copyright case, for example, was far less than 10%. Probably less than 1%, and it was still unclear if it was infringement or not. It took five years of lawsuits to reach an unclear conclusion: the first court found it to be infringing, then an appeals panel of judges reached a split decision where the majority of them found it to be non-infringing. So even after five years there wasn't a clear ruling on whether or not it was copyright infringement.

Copyright is incredibly complex and unclear. It's generally best to just not get into a copyright lawsuit in the first place. Usually when someone accuses you of copyright infringement you try to pay them whatever amount of money (in the Blurred Lines case, there were discussions of 50% of the artist's income from the song) to make them go away even if your lawyers tell you you're probably going to get a not guilty verdict.

[–] Silentiea@lemm.ee 2 points 10 months ago

I really haven't seen AI like MidJourney creating exact replicas of works

I don't have a source to cite, but I did read an article that showed a bad faith actor deliberately trying to use AI to copy images directly, and while the results weren't exact replicas, they were reasonable facsimiles of the original, to the extent that if a human had created it without AI, it would have been blatant copyright infringement, despite not being quite identical.

I wish I had the examples on hand to show, but it was months ago, and unfortunately I have not the skills nor time to retrieve it.

[–] tabular@lemmy.world 1 points 10 months ago* (last edited 10 months ago) (1 children)

To be at fault the user would have to know the AI creation they distributed commits copyright infringement. How can you tell? Is everyone doing months of research to be vaguely sure it's not like someone else's work?

Even if you had an AI trained on only public domain assets you could still end up putting in the words that generate something copyrighted.

Companies created a random copyright infringement tool for users to randomly infringe copyright.

[–] red@sopuli.xyz 6 points 10 months ago* (last edited 10 months ago) (1 children)

The same way you can tell if you repainted a Banksy yourself. If you don't realize, and monetize, then you are liable for a copyright lawsuit regardless of the way you created the piece in question.

And if no one can detect similarities beyond influences, then it's not infringing anything.

[–] tabular@lemmy.world -2 points 10 months ago* (last edited 10 months ago) (1 children)

You may recognize a Banksy, but to someone else it's like me saying you ought to know your work is like one from Coinsey: who?

This is exacerbated when people can create creative works via AI, having even less knowledge about your peers who know how to DIY. A potentially life-ruining lawsuit is a bad system to find out you can't monetize something.

[–] red@sopuli.xyz 4 points 10 months ago (1 children)

If only there was some way to find out prior to selling stuff as if you made it. If only. Darn it!

[–] tabular@lemmy.world -1 points 10 months ago (1 children)

I don't understand. If I make something that doesn't mean I'm not infringing someone's works.

[–] red@sopuli.xyz 2 points 10 months ago* (last edited 10 months ago) (1 children)

Point: regardless of HOW it was made, the process of figuring out if it infringes on something is the same. It's still not always easy and due to the shittiness of current IP laws, even long-time professional artists sometimes make mistakes.

In the end it's just about money.

[–] tabular@lemmy.world -1 points 10 months ago (1 children)

I am familiar with SEGA owning a software patent on Crazy Taxi's "arrow above car points where to go" because my interests in creating games happened to lead me to an article stating such.

That seems related to HOW my works are made, to me. I know of no other way to find that out.

[–] pirat@lemmy.world 1 points 10 months ago (1 children)

Like this one in Midtown Madness? Did MS actually have to pay SEGA to do the same thing? Both were originally released in 1999, it seems. I'm unsure which came first, but does it even matter if SEGA managed to get the patent first?

[Image: MS Midtown Madness gameplay screenshot]

[–] tabular@lemmy.world 2 points 10 months ago* (last edited 10 months ago) (2 children)

Midtown Madness 1/2/3 all have the arrows from what I've seen. At least 3 has a part where you pick up people in a taxi.

I am unsure what would happen if Midtown Madness did it first but didn't patent it. Game mechanic patents are not common (I hope...). Perhaps they knew they could win but didn't want to lose money fighting the Microsoft of that time? I can find no mention of MM regarding the Sega v. Fox lawsuit, where Fox privately settled over The Simpsons: Road Rage in 2003.

[–] pirat@lemmy.world 1 points 10 months ago* (last edited 10 months ago) (1 children)

Crazy! I had never thought of this sort of arrow as something that would have a patent. Isn't it pretty common in various other driving/racing games? Maybe not?! MM1 & MM2 definitely had the arrow — I've spent way too many hours fucking around in those as a kid! However, there's no taxi mode in any of them. Sadly, I've never tried MM3, since it was never released for PC, iirc only for Xbox, but the video you shared indeed confirms it had the arrow too, and even a taxi mode! How similar is it to that of Crazy Taxi? I've never played that. At least, SEGA probably doesn't own the patent for the taxi/delivery/ambulance driver game format too?!

[–] tabular@lemmy.world 0 points 10 months ago* (last edited 10 months ago)

Looks basically the same. As you get closer to the destination in Crazy Taxi, the arrow pulses in size and switches colour from green, to orange, to red.

Are you aware that mini-games during loading screens were patented (expired now)?


[–] Mango@lemmy.world -3 points 10 months ago (1 children)

Your example is a dude who paints unsolicited on other people's property. What kind of copyright does a ghost have?

[–] Silentiea@lemm.ee 2 points 10 months ago (1 children)

A surprising amount, though it would potentially be quite difficult to prove.

[–] Mango@lemmy.world 0 points 10 months ago

I should paint some shit on your house and then sue you for displaying it.

[–] theneverfox@pawb.social 8 points 10 months ago (1 children)

Here's the thing... Generative AI had a plagiarism/remix phase. It raised some serious questions about copyright.

It lasted for a matter of weeks.

We're all still stuck up on it, but go to civit.ai

Play with it. Look at what people are creating.

If you're not convinced, put up a bounty for something extremely specific

Art has changed. There's no putting it back in the bottle, this is the tiniest leading edge of the singularity

[–] camelbeard@lemmy.world 1 points 10 months ago

Just a small warning: I just played around with civit. I tried to make some images, and also wanted to try making some NSFW images. Anyway, be really careful what you prompt; I accidentally generated some images with very young people that I never intended.