  1. #1
    Join Date
    Apr 2008
    Location
    Covington, Washington, United States
    Posts
    205
    Plugin Contributions
    1

    spiders.txt still not catching fatbot (thefind.com)

    OK, so a quick background: I've always seen thefind.com crawling my site in whos_online.php, but the IPs have never been marked as a spider, despite my best efforts to correctly add the UA to the spiders.txt file.

    I have scoured the forums and found the latest spiders.txt file that Dr. Byte posted in the forum, which to my knowledge contains the header:
    $Id: spiders.txt 18372 2011-02-08 21:21:42Z drbyte $

    thefind.com's bot UA is FatBot 2.0, or more specifically: Mozilla/5.0 (compatible; FatBot 2.0; http://www.thefind.com/crawler)

    Based on the above-referenced spiders.txt file, the "Bot" part of "FatBot" should trigger the admin to identify it as a spider, since the very first line after the header in spiders.txt is, in fact, "bot". Needless to say, it doesn't. So of course, the next step would be to simply add "fatbot" on a new line of the spiders.txt file. OK, did it, no dice. "FatBot 2.0"? Nope, no dice.

    This really wouldn't be a problem for me, but at any given time I have a dozen or so different IPs from thefind crawling my site, and each one is being assigned a new session ID. It's really starting to **** me off. What the heck do I need to do to prevent sessions for this dang thing?

    BTW, applicable sessions settings are:
    cookie domain: true
    force cookie use: true (full ssl - not on shared ip)
    check ssl session id: false
    check user agent: false
    check ip address: false
    prevent spider sessions: true
    recreate session: true
    ip to host conv status: true

    Please note: IP addresses for true guests shown in the screenshot are blurred intentionally to protect their anonymity, and full domain paths are blurred due to the adult nature of the website (I also don't feel the path is germane to the issue at hand). But as you'll see, each IP from thefind has its own unique session. Googlebot is correctly pinned as a spider, so why not FatBot?
    Last edited by litepockets; 30 Dec 2011 at 04:11 AM. Reason: ss image too small.

  2. #2
    Join Date
    Sep 2003
    Location
    Ohio
    Posts
    69,402
    Plugin Contributions
    6

    Re: Most recent spiders.txt still not catching fatbot (thefind.com)

    Those are caught on my site without any issues ...

    Check your file:
    /includes/spiders.txt

    and make sure that the first bot under the ID is:
    bot
    Linda McGrath
    If you have to think ... you haven't been zenned ...


  3. #3
    Join Date
    Apr 2008
    Location
    Covington, Washington, United States
    Posts
    205
    Plugin Contributions
    1

    Re: Most recent spiders.txt still not catching fatbot (thefind.com)

    Quote Originally Posted by litepockets View Post
    Based on the above referenced spiders.txt file, the "Bot" part of "FatBot" should trigger the admin to identify it as a spider since the very first line after the header in spiders.txt is, in fact, "bot".
    Hi Ajeh, I referenced that in the OP. Not sure what the heck the issue is.

  4. #4
    Join Date
    Oct 2006
    Location
    Alberta, Canada
    Posts
    4,571
    Plugin Contributions
    1

    Re: Most recent spiders.txt still not catching fatbot (thefind.com)

    litepockets,
    $Id: spiders.txt 18372 2011-02-08 21:21:42Z drbyte $ - is 1.5

    $Id: spiders.txt 16983 2010-07-25 17:40:59Z drbyte $ - is 1.3.9h

    If using ZC v1.3.9h then why not try using the appropriate file?


    Ajeh, I noticed these entries in the 'spiders.txt' file of both 1.3.9h and 1.5 RC3:

    gulperbot
    h¦m¦h¦kki
    hämähäkki
    hamahakki

    Am I missing something, or are those actual robot names that will be recognized?

  5. #5
    Join Date
    Jan 2004
    Posts
    66,373
    Blog Entries
    7
    Plugin Contributions
    274

    Re: Most recent spiders.txt still not catching fatbot (thefind.com)

    litepockets,
    Perhaps your site's code isn't fully recognizing mixed-case names. If so, "bot" won't match the "Bot" in "FatBot".
    Workaround: since "thefind.com" is mentioned in the UA string and is all lowercase, you can add "thefind" to the end of your own spiders.txt file.
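
To illustrate the case-sensitivity idea above, here is a minimal Python sketch. It is not Zen Cart's actual code; the two helper functions are hypothetical and only contrast a case-sensitive substring test with a case-insensitive one, using the FatBot UA string quoted in the first post.

```python
# Sketch only: contrasts case-sensitive and case-insensitive substring
# matching. The helper names are made up for illustration and are not
# part of Zen Cart.

# UA string as quoted in the first post of this thread.
FATBOT_UA = "Mozilla/5.0 (compatible; FatBot 2.0; http://www.thefind.com/crawler)"


def matches_case_sensitive(user_agent: str, token: str) -> bool:
    """True if token appears in user_agent exactly as written."""
    return token in user_agent


def matches_case_insensitive(user_agent: str, token: str) -> bool:
    """True if token appears in user_agent when letter case is ignored."""
    return token.lower() in user_agent.lower()


if __name__ == "__main__":
    for token in ("bot", "fatbot", "FatBot", "thefind"):
        print(
            token,
            matches_case_sensitive(FATBOT_UA, token),
            matches_case_insensitive(FATBOT_UA, token),
        )
```

With a case-sensitive comparison, "bot" and "fatbot" both miss the UA while "thefind" still hits, which is why an all-lowercase token taken from the URL part of the UA is a reasonable thing to try.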

    Zen Cart - putting the dream of business ownership within reach of anyone!
    Donate to: DrByte directly or to the Zen Cart team as a whole

    Remember: Any code suggestions you see here are merely suggestions. You assume full responsibility for your use of any such suggestions, including any impact ANY alterations you make to your site may have on your PCI compliance.
    Furthermore, any advice you see here about PCI matters is merely an opinion, and should not be relied upon as "official". Official PCI information should be obtained from the PCI Security Council directly or from one of their authorized Assessors.

  6. #6
    Join Date
    Apr 2008
    Location
    Covington, Washington, United States
    Posts
    205
    Plugin Contributions
    1

    Re: Most recent spiders.txt still not catching fatbot (thefind.com)

    Quote Originally Posted by Website Rob View Post
    litepockets,
    $Id: spiders.txt 18372 2011-02-08 21:21:42Z drbyte $ - is 1.5

    $Id: spiders.txt 16983 2010-07-25 17:40:59Z drbyte $ - is 1.3.9h

    If using ZC v1.3.9h then why not try using the appropriate file?
    Rob, I appreciate the input, but I downloaded the file per Dr. Byte's suggestion in this post: http://www.zen-cart.com/forum/showpo...5&postcount=32. As you can see, he said to use that version, which was the one planned for v1.5. It turns out the 1.5 version has since been modified, but it's still Dr. Byte's suggested version for 1.3.9h.

    Quote Originally Posted by DrByte View Post
    litepockets,
    Perhaps your site's code isn't fully recognizing mixed-case names. If so, "bot" won't match the "Bot" in "FatBot".
    Workaround: since "thefind.com" is mentioned in the UA string and is all lowercase, you can add "thefind" to the end of your own spiders.txt file.
    Dr. Byte, thanks, I'll give "thefind" a try. I had recently tried "thefind.com" but had no success. I'll report back.

  7. #7
    Join Date
    Apr 2008
    Location
    Covington, Washington, United States
    Posts
    205
    Plugin Contributions
    1

    Re: spiders.txt still not catching fatbot (thefind.com)

    Dr. Byte,

    The addition of "thefind" to the end of the spiders.txt file seemed to bring about mixed results...

    Good News: For the first time, I actually saw their dang bot correctly identified as a spider.

    Bad News:
    1. Correct identification isn't consistent. It reported 1 IP with the fatbot UA string as a spider, and another IP with an identical UA string as a guest (chronologically AFTER the one that was correctly reported).
    2. Googlebot is suddenly not being reported as a spider.
    3. It would appear that a true guest is being incorrectly tagged as a spider.
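
One way to investigate results like these is to check which spiders.txt entries, if any, match a given UA string. The Python sketch below is a hypothetical diagnostic, not Zen Cart code: the function name, the sample lines, and the guest UA are made up for illustration, and lowercasing both sides before comparing is an assumption about how the matching ought to behave.

```python
# Hypothetical diagnostic: list every spiders.txt entry that appears as a
# substring of a user-agent string. Lowercasing both sides is an assumption,
# not a statement about what any particular Zen Cart version does.

def matching_tokens(user_agent, spider_lines):
    """Return the spiders.txt entries found inside user_agent."""
    ua = user_agent.lower()
    hits = []
    for line in spider_lines:
        token = line.strip().lower()
        # Skip blank lines and the "$Id: ..." header line.
        if not token or token.startswith("$id:"):
            continue
        if token in ua:
            hits.append(token)
    return hits


# A tiny stand-in for spiders.txt (the real file is much longer).
SAMPLE_LINES = [
    "$Id: spiders.txt 18372 2011-02-08 21:21:42Z drbyte $",
    "bot",
    "gulperbot",
    "thefind",
    "",
]

if __name__ == "__main__":
    fatbot = "Mozilla/5.0 (compatible; FatBot 2.0; http://www.thefind.com/crawler)"
    # Made-up guest UA: the word "Robot" in it contains "bot".
    guest = "Mozilla/5.0 (Linux; Android 4.0; ExampleRobotPhone)"
    print(matching_tokens(fatbot, SAMPLE_LINES))
    print(matching_tokens(guest, SAMPLE_LINES))
```

If a real guest's UA returns a non-empty list, the matched entry shows why that visitor was tagged as a spider. If a spider's UA returns an empty list, the file contents (stray whitespace, encoding, line endings) are worth a closer look.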

    Screenshot:


    Attached: my current spiders.txt

    Also worth noting, I have already followed the suggestions given by Ajeh in these posts:
    http://www.zen-cart.com/forum/showpo...52&postcount=4
    http://www.zen-cart.com/forum/showpo...28&postcount=5

  8. #8
    Join Date
    Jan 2004
    Posts
    66,373
    Blog Entries
    7
    Plugin Contributions
    274

    Re: spiders.txt still not catching fatbot (thefind.com)

    Sounds like something is wrong with one of the many addons you've got.
    I can't replicate your problem here.

  9. #9
    Join Date
    Apr 2008
    Location
    Covington, Washington, United States
    Posts
    205
    Plugin Contributions
    1

    Re: spiders.txt still not catching fatbot (thefind.com)

    Dr. Byte,

    Thank you for trying. Any general idea of which file(s) to examine other than those outlined?

  10. #10
    Join Date
    Jan 2008
    Posts
    103
    Plugin Contributions
    0

    Re: spiders.txt still not catching fatbot (thefind.com)

    I'm having the same trouble the other users have experienced with this spider not being recognized as a spider. I have added thefind, thefind.com, fatlens, fatbot, FatBot, and any possible combination I can think of to my spiders.txt file, but no luck. At the moment, this spider has 25 separate pages/sessions running. Is there any other way I can get this properly classified?

    Thanks!

 

 