OK, so a quick bit of background: I've always seen thefind.com crawling my site in whos_online.php, but its IPs have never been marked as a spider, despite my best efforts to correctly add the UA to the spiders.txt file.
I have scoured the forums and found the latest spiders.txt file, posted by Dr. Byte, which to my knowledge contains the header:
$Id: spiders.txt 18372 2011-02-08 21:21:42Z drbyte $
thefind.com's bot UA is FatBot 2.0, or more specifically: Mozilla/5.0 (compatible; FatBot 2.0; http://www.thefind.com/crawler)
Based on the spiders.txt file referenced above, the "Bot" part of "FatBot" should trigger the admin to identify it as a spider, since the very first line after the header in spiders.txt is, in fact, "bot". Needless to say, it doesn't. So, of course, the next step was to simply add "fatbot" on a new line of the spiders.txt file. OK, did it; no dice. "FatBot 2.0"? Nope, no dice.
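Just so we're on the same page about what I *think* should be happening: my understanding is that each non-comment line of spiders.txt is treated as a case-insensitive substring to look for in the visitor's user-agent string. Here's a rough Python sketch of that logic (this is my assumption of how it works, not Zen Cart's actual code; the `is_spider` name is made up):

```python
# Hypothetical sketch of spiders.txt matching (assumed behavior, not the
# real Zen Cart implementation): lowercase substring search of each entry
# against the lowercased user-agent string.

SPIDERS_TXT = """\
$Id: spiders.txt 18372 2011-02-08 21:21:42Z drbyte $
bot
fatbot
"""

def is_spider(user_agent: str, spiders: str = SPIDERS_TXT) -> bool:
    ua = user_agent.lower()
    for line in spiders.splitlines():
        entry = line.strip().lower()
        # Skip blank lines and the $Id$ header line.
        if not entry or entry.startswith("$"):
            continue
        if entry in ua:
            return True
    return False

fatbot_ua = "Mozilla/5.0 (compatible; FatBot 2.0; http://www.thefind.com/crawler)"
print(is_spider(fatbot_ua))  # True
```

If the matching really is a plain substring check like this, then "bot" alone matches FatBot's UA, and adding "fatbot" should be belt-and-braces. Which makes it all the stranger that neither works on my install.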
This really wouldn't be a problem for me, but at any given time I have a dozen or so different IPs from TheFind crawling my site, and every one of them is being assigned a new session ID; it's really starting to **** me off. What the heck do I need to do to prevent sessions for this dang thing?
BTW, applicable sessions settings are:
cookie domain: true
force cookie use: true (full ssl - not on shared ip)
check ssl session id: false
check user agent: false
check ip address: false
prevent spider sessions: true
recreate session: true
ip to host conversion status: true
Please note: IP addresses for true guests shown in the screenshots are blurred intentionally to protect their anonymity, and full domain paths are blurred due to the adult nature of the website (I also don't feel the path is germane to the issue at hand). But as you'll see, each IP from TheFind has its own unique session. googlebot is correctly pinned as a spider... so why not FatBot???