this post was submitted on 26 Jul 2023
Technology

Thousands of authors demand payment from AI companies for use of copyrighted works

Thousands of published authors are requesting payment from tech companies for the use of their copyrighted works in training artificial intelligence tools, marking the latest intellectual property critique to target AI development.

[–] CmdrShepard@lemmy.one 2 points 1 year ago (2 children)

You've failed to explain how that relates to your point. Sure, you can purchase an economics textbook and then go become a finance bro, but that's not what they're doing here. They're taking that textbook (that wasn't paid for) and feeding it into their commercial product. The end product is derived from the author's work.

To put it a different way, would they still be able to produce ChatGPT if one of the developers simply read that same textbook and then inputted what they learned into the model? My guess is no.

It'd be the same if I went and bought CDs, ripped my favorite tracks, and then put them into a compilation album that I then sold for money. My product can't exist without having copied the original artists' work. ChatGPT just obfuscates that by copying a lot of songs.

[–] bouncing@partizle.com 1 points 1 year ago (1 children)

A better comparison would probably be sampling. Sampling is fair use in most of the world, though there are mixed judgments. I think most reasonable people would consider the output of ChatGPT to be transformative use, which is considered fair use.

[–] Eccitaze@yiffit.net -1 points 1 year ago (1 children)

If I created a web app that took samples from songs created by Metallica, Britney Spears, Backstreet Boys, Snoop Dogg, Slayer, Eminem, Mozart, Beethoven, and hundreds of other different musicians, and allowed users to mix all these samples together into new songs, without getting a license to use these samples, the RIAA would sue the pants off of me faster than you could say "unlicensed reproduction."

It doesn't matter that the output of my creation is clear-cut fair use. The input of the app--the samples of copyrighted works--is infringing.

[–] bouncing@partizle.com 1 points 1 year ago

> If I created a web app that took samples from songs created by Metallica, Britney Spears, Backstreet Boys, Snoop Dogg, Slayer, Eminem, Mozart, Beethoven, and hundreds of other different musicians, and allowed users to mix all these samples together into new songs, without getting a license to use these samples, the RIAA would sue the pants off of me faster than you could say “unlicensed reproduction.”

The RIAA is indeed a litigious organization, and they tend to use their phalanx of lawyers to pressure anyone who does anything creative or new into submission.

But sampling is generally considered fair use.

And if the algorithm you used actually listened to tens of thousands of hours of music, and fed existing patterns into a system that creates new patterns, well, you'd be doing the same thing anyone who goes from listening to music to writing music does. The first song ever written by humans was probably plagiarized from a bird.

> They’re taking that textbook (that wasn’t paid for) and feeding it into their commercial product.

Nobody has provided any evidence that this is the case, and until it is proven it should not be assumed. Bandwagoning against the ML people by repeating the claim over and over without proof is not fair. The whole point of the justice system is innocent until proven guilty.

> The end product is derived from the author’s work.

Derivative works are 100% protected under copyright law. https://www.legalzoom.com/articles/what-are-derivative-works-under-copyright-law

This is the same premise that allows the "fair use" we all got up in arms about on YouTube. Claiming that it doesn't apply in this case means that everything we fought for on YouTube needs to be rolled back.

> To put it a different way, would they still be able to produce ChatGPT if one of the developers simply read that same textbook and then inputted what they learned into the model? My guess is no.

Why not? Why can't someone grab a book, scan it... chuck it into an OCR and get the same content? There are plenty of ways that snippets of raw content could make it into these repositories WITHOUT asserting legal problems.

> It’d be the same if I went and bought CDs, ripped my favorite tracks, and then put them into a compilation album that I then sold for money.

No... You could, for all intents and purposes, have recorded all your songs from the radio onto a cassette... That would be 100% legal for personal consumption... which would be what the ML authors are doing. ChatGPT and others could have sourced information from published sources that are completely legit. No "Author" has yet provided any evidence that ChatGPT and others have actually broken a law. For all we know, the authors of these tools have library cards and fed in screenshots of the digital scans of the book, or hand-scanned the book. Or they didn't use the book at all and contextually grabbed a bunch of content from the internet at large.

Since the ML bots are all making derivative works, rather than spitting out original content... they'd be covered by copyright as a derivative work.

This only becomes an actual problem if you can prove that these tools have done BOTH of the following:

  1. obtained content in an illegal fashion
  2. provided the copyrighted content freely without fair-use or other protections.