this post was submitted on 09 Jul 2023
Technology
If I read a book to inform myself, put my notes in a database, and then write articles, it is called "research". If I write a computer program that reads a book and puts the notes in my database, it is called "copyright infringement". Is the problem that there just isn't a meatware component? Or is it that the OpenAI computer isn't doing a good enough job of following the "three references" rule to avoid plagiarism?
Say I see a book that sells well. It's in a language I don't understand, but I use a thesaurus to replace lots of words with synonyms. I switch some sentences around, and maybe even mix pages from similar books into it. I then go and sell this book (still not knowing what the book actually says).
I would call that copyright infringement. The original book didn't inspire me, it didn't teach me anything, and I didn't add any of my own knowledge to it. I didn't produce any original work; I simply mixed a bunch of things I don't understand.
That's what these language models do.