Petalbot |
Brian Chandler |
Jan 13 2022, 11:47 PM
Post #1 |
Jocular coder Group: Members Posts: 2,460 Joined: 31-August 06 Member No.: 43 |
My error log is full of accesses to the nonexistent https://imaginatorium.com/addbskt.php from something identifying itself as Petalbot. This links to a page here:
https://webmaster.petalsearch.com/site/petalbot

This explains that Petalbot follows the robots.txt protocol, and describes how to block it by (e.g.)

CODE
User-agent: PetalBot
Disallow: /*.php

But https://imaginatorium.com/robots.txt already includes

CODE
User-agent: *
Allow: /*.html
Disallow: /*.php

Unless I misunderstand something, if Petalbot followed the robots.txt protocol it would not attempt to access this page. Or do I have to go around adding in the names of all the robots I want to exclude? |
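For anyone wanting to check the reasoning above, here is a minimal Python sketch of how a compliant crawler would be expected to read those rules, assuming the matching conventions later written up in RFC 9309: `*` matches any run of characters, the longest matching pattern wins, and a bot with no group of its own falls back to `User-agent: *`. The helper names (`pattern_to_regex`, `is_allowed`, `rules_for`) are made up for illustration, and this says nothing about what PetalBot actually does internally.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start.
    '*' matches any run of characters; a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_allowed(rules, path):
    """rules: list of (directive, pattern), e.g. ("disallow", "/*.php").
    Longest matching pattern wins; on a tie, allow wins; no match means allowed."""
    best_len, allowed = -1, True
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed

def rules_for(groups, user_agent):
    """groups: dict of lowercase product token -> rules.
    A bot uses its own group if one exists, otherwise the '*' group."""
    ua = user_agent.lower()
    for token, rules in groups.items():
        if token != "*" and token in ua:
            return rules
    return groups.get("*", [])

# The rules quoted in the post: no PetalBot group, so '*' applies.
groups = {"*": [("allow", "/*.html"), ("disallow", "/*.php")]}
rules = rules_for(groups, "PetalBot")

print(is_allowed(rules, "/addbskt.php"))  # False
print(is_allowed(rules, "/index.html"))   # True
```

Under these assumptions the `User-agent: *` group already covers `/addbskt.php`, so naming each robot individually should not be necessary for a crawler that honours the wildcard syntax.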
Brian Chandler |
Jan 27 2022, 12:08 PM
Post #2 |
Well, I am still seeing huge numbers of robot accesses to .php files. Not only Petalbot, also DuckDuckWhateveritis, and others. Here is my robots.txt file, as of about two weeks ago; does it look OK?
https://imaginatorium.com/robots.txt

And how long do you think I need to give bots to update their copy of robots.txt? Any ideas? |
Christian J |
Jan 27 2022, 05:13 PM
Post #3 |
. Group: WDG Moderators Posts: 9,686 Joined: 10-August 06 Member No.: 7 |
QUOTE
Well, I am still seeing huge numbers of robot accesses to .php files. Not only Petalbot, also DuckDuckWhateveritis, and others. Here is my robots.txt file, as of about two weeks ago; does it look OK? https://imaginatorium.com/robots.txt

I wouldn't use ending slashes, unless e.g. "ack.php" is a directory and not a PHP file...

QUOTE
And how long do you think I need to give bots to update their copy of robots.txt? Any ideas?

No idea what they actually do. But since the purpose of a returning bot is to update its database, surely that would include the robots.txt file as well (if they care about it)? |
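The trailing-slash point can be checked with Python's standard `urllib.robotparser`. This is only a sketch: the rule `Disallow: /ack.php/` is a hypothetical stand-in, since the actual file contents are not quoted in the thread, and literal paths are used because the standard-library parser does plain prefix matching rather than `*` wildcards.

```python
from urllib.robotparser import RobotFileParser

def can_fetch(robots_lines, agent, url):
    """Parse robots.txt lines and ask whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, url)

url = "https://imaginatorium.com/ack.php"

# Hypothetical rules: same path, with and without the ending slash.
with_slash = ["User-agent: *", "Disallow: /ack.php/"]
without_slash = ["User-agent: *", "Disallow: /ack.php"]

print(can_fetch(with_slash, "PetalBot", url))     # True: "/ack.php" does not start with "/ack.php/"
print(can_fetch(without_slash, "PetalBot", url))  # False: the file itself is blocked
```

With the ending slash the rule only covers paths under a directory named `ack.php`, so a request for the PHP file itself is still permitted.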
Brian Chandler |
Jan 28 2022, 01:57 AM
Post #4 |
QUOTE
Well, I am still seeing huge numbers of robot accesses to .php files. Not only Petalbot, also DuckDuckWhateveritis, and others. Here is my robots.txt file, as of about two weeks ago; does it look OK? https://imaginatorium.com/robots.txt

I wouldn't use ending slashes, unless e.g. "ack.php" is a directory and not a PHP file...

Thanks Christian! My blunder somehow. Pandy's links are interesting, but rather evidence-free claims of Petalbot not complying with robots.txt. I'll see what happens now. I don't think we can expect them to read the robots.txt file every day, even - something like once a week or month would seem quite reasonable, so I am happy to be patient. |