Why Some Search Engines Ignore Robots.txt

Generally speaking, all legitimate search engines honour the robots.txt file and follow the directives it contains. Whenever a search engine bot visits your site, it first looks at the site's robots.txt file to learn which areas of the site it is allowed to crawl, and it skips any pages that the file disallows.
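For reference, a minimal robots.txt file might look like the following sketch; the paths are illustrative examples, not recommendations for any particular site:

```
User-agent: *
Disallow: /private/
Disallow: /admin/
```

Here `User-agent: *` means the rules apply to all bots, and each `Disallow` line names a path prefix that compliant crawlers will not fetch.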

Bots that ignore the robots.txt file usually have bad intentions and are often used to probe a site for weaknesses to exploit. You should write a robots.txt file for your site to tell search engine bots which pages they are allowed to crawl. Be very careful when changing your robots.txt file: a wrong change can end up denying access to search engines entirely, and your site may stop showing up in search results.
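As a sketch of how a well-behaved crawler applies these rules, Python's standard `urllib.robotparser` module can check a URL against a robots.txt policy. The rules, bot name, and URLs below are illustrative assumptions, not taken from any real site:

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt policy, parsed from text
# (a real crawler would fetch it from yourdomain.com/robots.txt)
rules = [
    "User-agent: *",
    "Disallow: /private/",
]

rp = RobotFileParser()
rp.parse(rules)

# A compliant bot asks before fetching each URL
print(rp.can_fetch("MyBot", "https://example.com/index.html"))        # allowed
print(rp.can_fetch("MyBot", "https://example.com/private/data.html")) # disallowed
```

A compliant crawler simply skips any URL for which `can_fetch` returns False; a malicious bot skips this check entirely, which is exactly why robots.txt is a convention rather than an enforcement mechanism.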

Search engines that ignore robots.txt are usually not doing so for any good reason, and you should ban any such crawler that shows suspicious behaviour while crawling your site. If you are looking for search engines that ignore the robots.txt file, you should know that all the major search engines, such as Google and Bing, always follow your site's robots.txt file. You can find your site's robots.txt file at "yourdomain.com/robots.txt". Any path listed in a Disallow: directive is not crawled by compliant search engines.
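One common way to ban a misbehaving bot is to block its User-Agent at the web-server level. A minimal sketch for Apache using mod_rewrite, assuming the bot identifies itself as "BadBot" (a hypothetical name you would replace with the actual User-Agent string from your access logs):

```apacheconf
# .htaccess — deny requests whose User-Agent contains "BadBot" (hypothetical name)
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} BadBot [NC]
RewriteRule .* - [F,L]
```

Keep in mind that the User-Agent header can be spoofed, so persistent bad bots may also require blocking by IP address or rate limiting.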