237
OpenAI says it’s “impossible” to create useful AI models without copyrighted material
(arstechnica.com)
A nice place to discuss rumors, happenings, innovations, and challenges in the technology sphere. We also welcome discussions on the intersections of technology and society. If it’s technological news or discussion of technology, it probably belongs here.
Remember the overriding ethos on Beehaw: Be(e) Nice. Each user you encounter here is a person, and should be treated with kindness (even if they’re wrong, or use a Linux distro you don’t like). Personal attacks will not be tolerated.
Subcommunities on Beehaw:
This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.
I don't think it should go unquestioned that the same standards should apply. No human is able to look at billions of creative works and then create a million new works in an hour. There's a meaningfully different level of scale here, and so it's not necessarily obvious that the same standards should apply.
A fundamental issue is that LLMs simply cannot do this. They can query a webpage, find a relevant chunk, and spit that back at you with a citation, but it is simply impossible for them to actually generate a response to a query, realize that they've generated a meaningful amount of copyrighted material, and disclose its source, because it literally does not know its source. This is not a fixable issue unless the fundamental approach to these models changes.