AI companies are violating a basic social contract of the web and ignoring robots.txt
(www.theverge.com)
I don't think it's even part of the HTTP protocol; I think it's just a pseudo add-on. It's barely even a protocol: it's basically just a page that bots can look at, with no real pre-agreed syntax.
If you want to make a bot that doesn't respect robots.txt, you don't even need to do anything complicated; you just leave out the step where the bot looks at the page. It's not enforceable at all.
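To make that concrete, here's a rough sketch in Python using the standard library's `urllib.robotparser`. The robots.txt contents, the bot names, and the two helper functions are all made up for illustration; the point is only that the "polite" crawler has to opt in to the check, and the "rude" one gets the same access by skipping it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, invented for this example.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def polite_may_fetch(user_agent: str, url: str) -> bool:
    """A well-behaved crawler parses robots.txt and obeys the answer."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(SAMPLE_ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


def rude_may_fetch(user_agent: str, url: str) -> bool:
    """A crawler that ignores robots.txt simply never performs the check."""
    return True


if __name__ == "__main__":
    url = "https://example.com/articles/1"
    print("polite:", polite_may_fetch("ExampleAIBot", url))
    print("rude:  ", rude_may_fetch("ExampleAIBot", url))
```

Nothing on the server side changes between the two cases. If a site actually wants to block a crawler, it has to do it some other way (user-agent filtering, rate limits, IP blocks), because robots.txt by itself is only a request.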