In addition to the built-in variables available from Jekyll, you can specify your own custom data that can be accessed via the Liquid templating system.| Jekyll • Simple, blog-aware, static sites
AI companies are crawling the open web to, ostensibly, improve the quality of their models and products. This process is extractive and accrues the benefit to said companies, not the owners of sites both small and large.| coryd.dev
Robots.txt is used to manage crawler traffic. Explore this robots.txt introduction guide to learn what robot.txt files are and how to use them.| Google for Developers
“AI” companies think that we should have to opt-out of data-scraping bots that take our work to train their products. There isn’t even a required no-scraping period between the announcement and when they start. Too late? Tough. Once they have your data, they don’t provide you with a way to have it deleted, even before they’ve processed it for training.| Neil Clarke