
NO AI TRAINING: Without in any way limiting the author’s (and blogger’s) exclusive rights under copyright, any use of this publication to “train” generative artificial intelligence (AI) technologies to generate text is expressly prohibited. The author reserves all rights to license uses of this work for generative AI training and development of machine learning language models.
This was an easy one. A simple add-on paragraph in my book and on my blog.
No AI training with my content!
I get the training part. I suppose it’s a bit like allowing cookies to collect (all) your data, but worse? I assume it means my content could be handed to someone else, a bit altered, of course. I assume I would sound smarter, better, more perfect, and less emotional. Which, in my book, equals boring.
AI crawling is a bit more complicated. What does it even mean? Google, Facebook, and other companies use crawlers to search our content, and we allow it, mainly because we don’t think much of it. Nobody ever asked our permission, and although we, by nature, always assume the worst, online we seem to have no worries.
Protect from AI Crawling: https://authorsguild.org/news/practical-tips-for-authors-to-protect-against-ai-use-ai-copyright-notice-and-web-crawlers/
I always had a copyright message on my blog, and of course, I have one in my book, but neither included AI until last week.
I changed that!
Sorry, AI, you train somewhere else. (Not that IT will obey.)

Here’s what I found: Crawlers are software that automatically navigates between links and records data about web pages. Google runs crawlers to gather data for its search engine, for example.
Because crawlers have the potential to interfere with the function of websites if they send too many requests or send them to the wrong places, people created a standard called “robots.txt” that lists what bots are allowed to visit what pages. Bots are also supposed to clearly identify themselves in their requests with something called a user-agent string.
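For the curious, here is a small sketch of how a well-behaved crawler is supposed to check “robots.txt” before visiting a page. It uses Python’s built-in robots.txt parser. The rules, the bot names, and the example.com addresses are made up for illustration and are not taken from my blog or any real site. It only shows the polite side of the agreement; nothing in it forces a crawler to ask.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, assumed for illustration only:
# one named bot is blocked everywhere, everyone else is kept out of /private/.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit this user-agent to fetch the URL."""
    parser = RobotFileParser()
    # Parse the rules from text instead of downloading them from a website.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The bot named in the rules is turned away from every page.
    print(is_allowed("GPTBot", "https://example.com/blog/post"))
    # Any other bot may read public pages...
    print(is_allowed("SomeSearchBot", "https://example.com/blog/post"))
    # ...but not the /private/ section.
    print(is_allowed("SomeSearchBot", "https://example.com/private/notes"))
```

The check depends entirely on the name the crawler gives for itself in its user-agent string, so a crawler that gives a different name, or never runs the check at all, is not stopped by it.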
This mostly worked for decades, but it was always a good-faith agreement. It relied on crawlers voluntarily following the social contract.
With the rise of AI, however, there is a kind of gold rush for data happening, since whatever company has the most data has the best chance at building a more effective AI. This has led to a huge swath of companies that have both huge resources and a strong incentive to ignore the existing social contract, and that’s what they are doing.
So “AI crawlers” will sometimes pretend to identify themselves at first, but if you block them they will switch to lying and saying they are a human using a browser. They will also swap between thousands of IPs so you can’t identify them that way. These crawlers are so aggressive that they threaten the open internet as a whole, because small websites can’t block them but also can’t support the amount of traffic they send. From the stories I’ve heard, they are often ridiculously aggressive, scraping the same page over and over again because they don’t even bother keeping track of where they have been before.
There are tools emerging now that are having some success at blocking these scrapers, but they rely on a slightly obtrusive loading screen before you visit the website, which annoys human visitors.
Fun fact: One of the ways scrapers get access to so many IPs is by leasing them from “residential proxy” services. What this means is that shady app developers sell remote access to people’s phones to a middleman company, and then the scrapers make requests using those phones. Because the requests via the phones are coming from local home networks, they are much harder to block, since you risk blocking a real person. So if you start seeing a lot more CAPTCHAs on websites, and you don’t use a paid VPN, maybe uninstall whatever sketchy app you recently installed. Free VPNs also do this. Source
My AI-free book “Losing it All: Houseless, with Love, Dogs and Sausages.”
Written by a human for humans (and dogs).


Thanks for pointing this out, Bridget. Life seems to get more complicated by the day!
Thanks for the information, Bridget. 🙂
That’s wonderful! I’m not sure how a Bot would be able to read that on your blog? I’m not that technical but I would like to put that on my Copyright page.
It’s in my footer, right underneath the copyright. It might only be cosmetic but it gives me the right to control my content and react if this right gets violated.
Of course, please, by all means, copy it. That’s why I published it for all to read. 🙂