To my robots…

08 Nov 2019

robots.txt is a simple text file placed in the web server root that gives web crawlers instructions to abide by (or disregard). Its main purpose is to prevent crawlers from indexing undesirable areas of the server and to ensure proper listings in search engines.

Instructions may target specific user agents (e.g. “Google”) or apply to any agent. User agents are free to disregard robots.txt, and because the file itself is public, it offers no way to hide content from users per se.

On this Jekyll site, I’ve saved the minima template files in a folder named “bak”, keeping them there out of laziness. To prevent indexing of the files therein, I’ve added the following to robots.txt:

User-agent: *  
Disallow: /bak/
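Well-behaved crawlers interpret these directives the same way; you can verify what a given rule allows with Python’s standard `urllib.robotparser` (a quick sketch using the rules above, with hypothetical paths, not code from this site):

```python
from urllib import robotparser

# Parse the same rules as in the robots.txt above.
rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /bak/",
])

# Paths under /bak/ are disallowed for every agent;
# everything else remains fetchable.
print(rp.can_fetch("*", "/bak/page.html"))  # False
print(rp.can_fetch("*", "/index.html"))     # True
```

The parser can also be pointed at a live file with `rp.set_url(...)` followed by `rp.read()`, which is handy for checking a deployed robots.txt.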

You’ll find my robots.txt here.