A bot, also known as a web robot, web spider, or web crawler, is a software application designed to perform simple, repetitive tasks automatically, faster and more consistently than a human could.
The most common use of bots is in web spidering or web crawling.
YankyMateBot is the search bot software that Banpei.net sends out to discover and collect new and updated web data.
Data collected by YankyMateBot is used for:
- testing and development purposes
How YankyMateBot Crawls Your Site
YankyMateBot’s crawl process starts with a list of web page URLs. When YankyMateBot visits these URLs, it crawls the site’s internal structure, detects the hyperlinks within the site, and adds them to the list of URLs to follow. This list, also known as the “crawl frontier”, is visited recursively according to a set of Banpei policies so that the site can be mapped for updates: content changes, new pages, and dead links.
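The crawl-frontier idea can be illustrated with a short Python sketch. This is not YankyMateBot's actual code: the `SITE` dictionary is a made-up stand-in for fetching pages over HTTP, the `crawl` function is hypothetical, and the only "policy" applied is staying on the seed's host and visiting pages breadth-first.

```python
from collections import deque
from urllib.parse import urljoin, urlparse

# Hypothetical in-memory "site": each page URL maps to the hrefs found on it.
SITE = {
    "https://example.com/": ["/about", "/blog/", "https://other.example/x"],
    "https://example.com/about": ["/"],
    "https://example.com/blog/": ["post-1", "/missing"],
    "https://example.com/blog/post-1": ["/about"],
}


def crawl(seed):
    """Breadth-first crawl of one site; returns (visited pages, dead links)."""
    host = urlparse(seed).netloc
    frontier = deque([seed])  # the crawl frontier: URLs still to visit
    seen = {seed}             # every URL ever queued, to avoid revisits
    visited, dead = [], []
    while frontier:
        url = frontier.popleft()
        links = SITE.get(url)  # stand-in for an HTTP fetch plus link extraction
        if links is None:
            dead.append(url)   # nothing is served at this URL: a dead link
            continue
        visited.append(url)
        for href in links:
            target = urljoin(url, href)  # resolve relative links
            # Follow internal links only, and queue each URL once.
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                frontier.append(target)
    return visited, dead
```

Calling `crawl("https://example.com/")` on this sample visits the four internal pages and reports `https://example.com/missing` as a dead link; the external link is never queued.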
How To Block YankyMateBot From Crawling Your Site
Bots crawl your web pages to parse your site content, so that the relevant information on your site is easily indexed and more readily available to users searching for the content you provide.
Although most bots are harmless and quite beneficial, you may still want to prevent bots from crawling your site (please note, however, that not every bot on the web is there to help index your site). The easiest and quickest way to do this is to use “robots.txt”. This text file contains instructions on how a bot should process your site data.
To stop YankyMateBot from crawling your site, add the following rules to your “robots.txt” file:
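The exact rules are not reproduced in this text. As a rough sketch only, the standard robots.txt form for blocking a single crawler is a `User-agent` line followed by `Disallow: /`. The example below assumes the bot matches the user-agent token `YankyMateBot` (an assumption; check your access logs for the string the bot actually sends) and uses Python's standard `urllib.robotparser` to confirm that such rules block that bot while leaving others alone. The helper name `is_allowed` is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed user-agent token; the real one is whatever the bot sends in requests.
BOT = "YankyMateBot"

# Standard robots.txt form for blocking one crawler from the whole site.
ROBOTS_TXT = """\
User-agent: YankyMateBot
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With these rules, `is_allowed(ROBOTS_TXT, BOT, "https://example.com/page")` is false, while the same call for a different user agent is true, because no rule group applies to it.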
To block YankyMateBot from crawling your site for the web graph of links, add:
Please note that there might be a delay of up to two weeks before YankyMateBot discovers the changes you made to “robots.txt”.
Make sure that the “robots.txt” file is in the top directory of the server; otherwise, there will be no effect on the YankyMateBot behavior.
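To make "top directory" concrete: crawlers look for the file at the root path of the host, whatever page they started from. The small Python sketch below derives that location from any page URL; the function name `robots_url` is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt location for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace the path and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))
```

For example, `robots_url("https://example.com/blog/post?id=1")` gives `https://example.com/robots.txt`. A file placed at `https://example.com/blog/robots.txt` would not be consulted.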
Please do not try to block YankyMateBot by IP address in .htaccess, because we do not use consecutive IP blocks.
If YankyMateBot is still crawling your site, make sure YankyMateBot can retrieve your “robots.txt”.
More Information About Bots
For more information about bots, please refer to http://www.robotstxt.org/.
If you still have any questions about YankyMateBot, please contact us via the contact form and we will respond as soon as possible.
If you think that YankyMateBot does not obey your “robots.txt” rules, please provide us with your website URL and the log entries showing YankyMateBot crawling pages that it was not supposed to, and we will work quickly to resolve the issue.
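If you need to collect those log entries, a sketch like the following may help. It assumes your server writes the common "combined" access log format and that the bot's user-agent string contains `YankyMateBot`; both are assumptions, and the sample lines (including the IP addresses and the `YankyMateBot/1.0` string) are made up for illustration. The function name `bot_hits` is hypothetical.

```python
import re

# Combined log format: ... "METHOD path PROTO" status size "referer" "user-agent"
LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def bot_hits(log_lines, token="YankyMateBot", disallowed=("/",)):
    """Return log lines where the bot requested a path under a disallowed prefix."""
    hits = []
    for line in log_lines:
        m = LINE.search(line)
        if (
            m
            and token.lower() in m["agent"].lower()
            and m["path"].startswith(tuple(disallowed))
        ):
            hits.append(line)
    return hits


# Made-up sample entries, for illustration only.
SAMPLE = [
    '203.0.113.7 - - [01/Jan/2024:10:00:00 +0000] "GET /private/a.html HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; YankyMateBot/1.0)"',
    '203.0.113.8 - - [01/Jan/2024:10:00:05 +0000] "GET /private/b.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [01/Jan/2024:10:00:09 +0000] "GET /public/c.html HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; YankyMateBot/1.0)"',
]
```

Here `bot_hits(SAMPLE, disallowed=("/private/",))` returns only the first entry: the second is from a different user agent, and the third is outside the disallowed path.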