-
I have unwanted pages in google index, like products_all
Hi, I'm using 1.3.9h. Could someone suggest a basic robots.txt for this version? The reason is that I have unwanted pages in the Google index, like products_all, new_products, popup_image, discount_coupon, login etc., which really aren't needed in the index.
Is there a basic list I could put into robots.txt to stop Google crawling these, and a way of blocking them from being indexed?
Also, are there any specific settings to enter in WMT for URL parameters?
Thanks
-
Re: I have unwanted pages in google index, like products_all
1) Did you look at the included robots.txt example? That should stop pop-up images appearing in Google, etc.
Quote:
User-agent: *
Disallow: /cgi-bin/
Disallow: /*zenid=
Disallow: /index.php?main_page=popup_image*
2) Pages like the login page have the noindex tag on them by default to stop them being indexed, or at least all the ones I have seen do.
This is a very old one from someone somewhere; you may be able to chop useful bits out of it.
Quote:
User-agent: Googlebot
Disallow: /*&action=notify$
Disallow: /*&number_of_uploads=0&action=notify
Disallow: /index.php?main_page=discount_coupon
Disallow: /index.php?main_page=checkout_shipping
Disallow: /index.php?main_page=shippinginfo
Disallow: /index.php?main_page=privacy
Disallow: /index.php?main_page=conditions
Disallow: /index.php?main_page=contact_us
Disallow: /index.php?main_page=advanced_search
Disallow: /index.php?main_page=login
Disallow: /index.php?main_page=unsubscribe
Disallow: /index.php?main_page=shopping_cart
Disallow: /index.php?main_page=product_reviews_write&cPath=*
Disallow: /index.php?main_page=tell_a_friend&products_id=*
Disallow: /index.php?main_page=product_reviews_write&products_id=*
Disallow: /index.php?main_page=popup_shipping_estimator
Disallow: /index.php?main_page=account
Disallow: /index.php?main_page=password_forgotten
Disallow: /index.php?main_page=checkout_shipping_address
Disallow: /index.php?main_page=logoff
Disallow: /index.php?main_page=gv_faq
Disallow: /gv_faq.html?faq_item=*
Disallow: /*&sort=*
Disallow: /*alpha_filter_id=*
Disallow: /*&disp_order=*
User-agent: Slurp
Disallow: /*&action=notify$
Disallow: /*&number_of_uploads=0&action=notify
Disallow: /index.php?main_page=discount_coupon
Disallow: /index.php?main_page=checkout_shipping
Disallow: /index.php?main_page=shippinginfo
Disallow: /index.php?main_page=privacy
Disallow: /index.php?main_page=conditions
Disallow: /index.php?main_page=contact_us
Disallow: /index.php?main_page=advanced_search
Disallow: /index.php?main_page=login
Disallow: /index.php?main_page=unsubscribe
Disallow: /index.php?main_page=shopping_cart
Disallow: /index.php?main_page=product_reviews_write&cPath=*
Disallow: /index.php?main_page=tell_a_friend&products_id=*
Disallow: /index.php?main_page=product_reviews_write&products_id=*
Disallow: /index.php?main_page=popup_shipping_estimator
Disallow: /index.php?main_page=account
Disallow: /index.php?main_page=password_forgotten
Disallow: /index.php?main_page=checkout_shipping_address
Disallow: /index.php?main_page=logoff
Disallow: /index.php?main_page=gv_faq
Disallow: /gv_faq.html?faq_item=*
Disallow: /*&sort=*
Disallow: /*alpha_filter_id=*
Disallow: /*&disp_order=*
User-agent: *
Disallow: /index.php?main_page=faqs_new
Disallow: /index.php?main_page=discount_coupon
Disallow: /index.php?main_page=checkout_shipping
Disallow: /index.php?main_page=shippinginfo
Disallow: /index.php?main_page=privacy
Disallow: /index.php?main_page=conditions
Disallow: /index.php?main_page=contact_us
Disallow: /index.php?main_page=advanced_search
Disallow: /index.php?main_page=login
Disallow: /index.php?main_page=unsubscribe
Disallow: /index.php?main_page=shopping_cart
Disallow: /index.php?main_page=popup_shipping_estimator
Disallow: /index.php?main_page=account
Disallow: /index.php?main_page=password_forgotten
Disallow: /index.php?main_page=checkout_shipping_address
Disallow: /index.php?main_page=logoff
Disallow: /index.php?main_page=gv_faq
Disallow: /gv_faq.html?faq_item=1
Disallow: /gv_faq.html?faq_item=2
Disallow: /gv_faq.html?faq_item=3
Disallow: /gv_faq.html?faq_item=4
Disallow: /gv_faq.html?faq_item=5
-
Re: I have unwanted pages in google index, like products_all
The main culprit is /index.php?main_page=popup_image... I've got lots of them, and they are useless in the index.
I use Ceon mapping 4.07 for the site and it works great. I'm just after info on first getting rid of them, then blocking them and stopping them being re-indexed.
Anyone?
-
Re: I have unwanted pages in google index, like products_all
So does the
Disallow: /index.php?main_page=popup_image*
mentioned in my previous post above not work?
It helps to know what you have tried in the past. Is there a link to your site?
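For anyone wanting to sanity-check a rule like this offline, here is a minimal Python sketch of Google-style pattern matching, where `*` matches any run of characters and a trailing `$` anchors the end of the URL. The helper names (`rule_matches`, `is_blocked`) and the sample paths are made up for illustration, and the sketch ignores Allow lines and per-bot groups, so treat it as a rough check rather than how any crawler actually behaves.

```python
import re

# Rules copied from the example robots.txt quoted earlier in the thread.
RULES = [
    "/cgi-bin/",
    "/*zenid=",
    "/index.php?main_page=popup_image*",
]


def rule_matches(pattern, url_path):
    """Return True if a wildcard Disallow pattern matches url_path.

    '*' matches any run of characters, a trailing '$' anchors the end,
    and everything else is literal. Matching starts at the first character.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, url_path) is not None


def is_blocked(url_path, rules=RULES):
    """Return True if any Disallow pattern in rules matches url_path."""
    return any(rule_matches(rule, url_path) for rule in rules)


if __name__ == "__main__":
    # Hypothetical paths; substitute URLs from your own site.
    for path in (
        "/index.php?main_page=popup_image&pID=12",
        "/index.php?main_page=product_info&products_id=12",
    ):
        print(path, "->", "blocked" if is_blocked(path) else "allowed")
```

Because a Disallow value is matched as a prefix, the trailing `*` on the popup_image rule is harmless but not needed. If the site rewrites URLs (for example with a URI mapping module), the paths tested should be the rewritten ones that actually appear in the index.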
-
Re: I have unwanted pages in google index, like products_all
Thanks Nigel, I was beginning to think I was on my own here. Thanks for the example.
I've never had a robots.txt on my site since it was set up, but then again, reading some posts on here, some say you don't need one unless you have specific requirements.
I'll see what bits I can take from the example. The URLs have been indexed for a long time, so I'm not sure whether 301s are needed, or even how to implement them.
-
Re: I have unwanted pages in google index, like products_all
What version of Zen Cart do you have?
Zen Cart has come with a robots_example.txt file in the root directory for quite a while now, and most pages that shouldn't be indexed also have a noindex tag,
so pretty much these days you just rename robots_example.txt to robots.txt and you are ready to go. Obviously, as you want to remove products_all and new products, you would need to add these.
Oh, and I realise you are probably not this daft, but I have seen it done way too many times, so I'll state it here just in case someone reads this: don't put your secret admin directory name in the robots.txt file.
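As a rough illustration of the two points above (adding extra pages, and keeping the admin directory out of the file), here is a small Python sketch that builds the extra Disallow lines and checks the result for an admin directory name. Everything named here is an assumption for the example: `products_new` is assumed to be the page name behind "new products", `my-renamed-admin` is an invented directory name, and the URL form follows the `index.php?main_page=` pattern used elsewhere in this thread.

```python
def extra_rules(pages):
    """Build one Disallow line per Zen Cart page name."""
    return [f"Disallow: /index.php?main_page={page}" for page in pages]


def leaks_admin_dir(robots_text, admin_dir):
    """Return True if the (publicly readable) robots.txt text names the admin directory."""
    name = admin_dir.strip("/").lower()
    return bool(name) and name in robots_text.lower()


# Assumed page names; check them against the URLs that are actually indexed.
EXTRA_PAGES = ["products_all", "products_new"]

robots_text = "\n".join(
    ["User-agent: *", "Disallow: /cgi-bin/", *extra_rules(EXTRA_PAGES)]
) + "\n"

# "my-renamed-admin" is a made-up name standing in for the real admin directory.
if leaks_admin_dir(robots_text, "my-renamed-admin"):
    raise SystemExit("robots.txt would reveal the admin directory; remove that line")

print(robots_text)
```

The admin check is there because robots.txt can be read by anyone, so listing the directory would advertise exactly what the rename was meant to hide.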
-
Re: I have unwanted pages in google index, like products_all
It's 1.3.9h. I had the site built for me just over a year ago, but it was my first site,
so basically it was built and then I had to go and learn. Now I see these URLs, which could be classed as duplicates,
so I need to get rid of them.
The robots.txt was blank.
-
Re: I have unwanted pages in google index, like products_all
I'm assuming that putting the URLs into the robots.txt file will stop them being crawled.
So what would be the best way to get them removed: the URL removal tool in WMT?
Or leave them to die off and fall out of the index once Google can't crawl them?
-
Re: I have unwanted pages in google index, like products_all
Quote:
Originally Posted by sash
It's 1.3.9h. I had the site built for me just over a year ago, but it was my first site,
so basically it was built and then I had to go and learn. Now I see these URLs, which could be classed as duplicates,
so I need to get rid of them.
The robots.txt was blank.
But did it have a robots_example.txt sitting in the root? It does seem odd that your builder didn't set up the default one, though.
-
Re: I have unwanted pages in google index, like products_all
Quote:
Originally Posted by nigelt74
But did it have a robots_example.txt sitting in the root? It does seem odd that your builder didn't set up the default one, though.
Yes, it did have the example in the root folder, plus a blank one. He did help me with lots of things,
but being totally green at this, there have been a million things to learn in a year.
-
Re: I have unwanted pages in google index, like products_all
You can remove the URLs in Webmaster Tools, but they should eventually drop off anyway. Just remember it can take months for any changes on your site to show in Google's index.
As to the robots.txt issue, some people don't bother on Zen Cart. Having a blank one actually stops 404 errors being generated when the search bots look for it. Personally, I like to have one that gets rid of the pop-ups and the zenid, as in my opinion neither of those adds value to searches.
-
Re: I have unwanted pages in google index, like products_all
Quote:
Originally Posted by nigelt74
You can remove the URLs in Webmaster Tools, but they should eventually drop off anyway. Just remember it can take months for any changes on your site to show in Google's index.
As to the robots.txt issue, some people don't bother on Zen Cart. Having a blank one actually stops 404 errors being generated when the search bots look for it. Personally, I like to have one that gets rid of the pop-ups and the zenid, as in my opinion neither of those adds value to searches.
Thanks Nigel, that's great.
-
Re: robot.txt and parameters
Um, by default Zen Cart already tells Google not to index the main_page=popup_image pages.
Perhaps whoever "built" your site for you has broken that normal functionality. A link to your site would be helpful.
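One way to confirm whether a page really carries that instruction is to look at its HTML for a robots meta tag. Below is a minimal Python sketch, assuming the tag takes the usual `<meta name="robots" content="noindex, ...">` form; the class and function names and the sample HTML strings are invented for illustration, and fetching the live page is left out.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from any <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Invented sample pages, not real Zen Cart output.
WITH_TAG = '<html><head><meta name="robots" content="noindex, nofollow" /></head></html>'
WITHOUT_TAG = "<html><head><title>Popup</title></head></html>"

print(has_noindex(WITH_TAG), has_noindex(WITHOUT_TAG))
```

Keep in mind that a crawler can only see this tag on pages it is allowed to fetch, so a page that is also disallowed in robots.txt will not have its noindex read.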
-
Re: I have unwanted pages in google index, like products_all
Hi DrByte, thanks. The URL is http://bit.ly/ibqR2D and I would appreciate it if the shortened URL was kept that way. Cheers.
-
Re: I have unwanted pages in google index, like products_all
As I suspected, someone has tampered with the /includes/languages/english/YOUR_TEMPLATE_NAME/meta_tags.php file and altered the default values for ROBOTS_PAGES_TO_SKIP.
You would also benefit from improvements made in v1.5.0 to canonical URL support, and more. A site upgrade would help that, along with fixing many other bugs.
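To see what the constant currently contains, something like the following Python sketch could be run against a copy of the file. It assumes ROBOTS_PAGES_TO_SKIP is set with a PHP `define()` holding a comma-separated list of page names; the sample string is illustrative only and is not the shipped default, so compare against a fresh copy of meta_tags.php from the matching Zen Cart release.

```python
import re

# Matches define('ROBOTS_PAGES_TO_SKIP', '...') with single or double quotes.
DEFINE_RE = re.compile(
    r"""define\(\s*['"]ROBOTS_PAGES_TO_SKIP['"]\s*,\s*['"]([^'"]*)['"]\s*\)"""
)


def pages_to_skip(php_source):
    """Return the page names listed in ROBOTS_PAGES_TO_SKIP, or [] if not found."""
    match = DEFINE_RE.search(php_source)
    if match is None:
        return []
    return [page.strip() for page in match.group(1).split(",") if page.strip()]


# Illustrative content only; read the real file from disk in practice.
SAMPLE = "define('ROBOTS_PAGES_TO_SKIP', 'login,logoff,popup_image');"

pages = pages_to_skip(SAMPLE)
print("popup_image listed:", "popup_image" in pages)
```

If a page such as popup_image is missing from the list in the live file but present in the stock one, that difference is a likely place to start.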
-
Re: I have unwanted pages in google index, like products_all
Quote:
Originally Posted by DrByte
As I suspected, someone has tampered with the /includes/languages/english/YOUR_TEMPLATE_NAME/meta_tags.php file and altered the default values for ROBOTS_PAGES_TO_SKIP.
You would also benefit from improvements made in v1.5.0 to canonical URL support, and more. A site upgrade would help that, along with fixing many other bugs.
Thanks for taking a look, DrByte. A site upgrade is something I have looked at, but I would really need a beginner's guide to do it myself. Cheers.