I have unwanted pages in google index, like products_all
Hi, I'm using 1.39h. Could someone tell me a basic robots.txt to use for this version? The reason is that I have unwanted pages in the Google index, like products_all, new_products, pop_up-Image, Discount-coupon, login_page etc., which really aren't needed in the index.
Is there a basic list I could insert into robots.txt to stop Google crawling these, and a way of blocking them from being indexed?
And are there any specific parameters to set in WMT for URL parameters?
Thanks
Re: robots.txt and parameters
Re: I have unwanted pages in google index, like products_all
1) Did you look at the included robots.txt example? That should get rid of the pop-up images appearing in Google etc.
Quote:
User-agent: *
Disallow: /cgi-bin/
Disallow: /*zenid=
Disallow: /index.php?main_page=popup_image*
2) Pages like the login page have the noindex tag on them by default to stop them being indexed, or at least all the ones I have seen do.
This is a very old one from someone somewhere; you may be able to chop useful bits out of it.
Quote:
User-agent: Googlebot
Disallow: /*&action=notify$
Disallow: /*&number_of_uploads=0&action=notify
Disallow: /index.php?main_page=discount_coupon
Disallow: /index.php?main_page=checkout_shipping
Disallow: /index.php?main_page=shippinginfo
Disallow: /index.php?main_page=privacy
Disallow: /index.php?main_page=conditions
Disallow: /index.php?main_page=contact_us
Disallow: /index.php?main_page=advanced_search
Disallow: /index.php?main_page=login
Disallow: /index.php?main_page=unsubscribe
Disallow: /index.php?main_page=shopping_cart
Disallow: /index.php?main_page=product_reviews_write&cPath=*
Disallow: /index.php?main_page=tell_a_friend&products_id=*
Disallow: /index.php?main_page=product_reviews_write&products_id=*
Disallow: /index.php?main_page=popup_shipping_estimator
Disallow: /index.php?main_page=account
Disallow: /index.php?main_page=password_forgotten
Disallow: /index.php?main_page=checkout_shipping_address
Disallow: /index.php?main_page=logoff
Disallow: /index.php?main_page=gv_faq
Disallow: /gv_faq.html?faq_item=*
Disallow: /*&sort=*
Disallow: /*alpha_filter_id=*
Disallow: /*&disp_order=*
User-agent: Slurp
Disallow: /*&action=notify$
Disallow: /*&number_of_uploads=0&action=notify
Disallow: /index.php?main_page=discount_coupon
Disallow: /index.php?main_page=checkout_shipping
Disallow: /index.php?main_page=shippinginfo
Disallow: /index.php?main_page=privacy
Disallow: /index.php?main_page=conditions
Disallow: /index.php?main_page=contact_us
Disallow: /index.php?main_page=advanced_search
Disallow: /index.php?main_page=login
Disallow: /index.php?main_page=unsubscribe
Disallow: /index.php?main_page=shopping_cart
Disallow: /index.php?main_page=product_reviews_write&cPath=*
Disallow: /index.php?main_page=tell_a_friend&products_id=*
Disallow: /index.php?main_page=product_reviews_write&products_id=*
Disallow: /index.php?main_page=popup_shipping_estimator
Disallow: /index.php?main_page=account
Disallow: /index.php?main_page=password_forgotten
Disallow: /index.php?main_page=checkout_shipping_address
Disallow: /index.php?main_page=logoff
Disallow: /index.php?main_page=gv_faq
Disallow: /gv_faq.html?faq_item=*
Disallow: /*&sort=*
Disallow: /*alpha_filter_id=*
Disallow: /*&disp_order=*
User-agent: *
Disallow: /index.php?main_page=faqs_new
Disallow: /index.php?main_page=discount_coupon
Disallow: /index.php?main_page=checkout_shipping
Disallow: /index.php?main_page=shippinginfo
Disallow: /index.php?main_page=privacy
Disallow: /index.php?main_page=conditions
Disallow: /index.php?main_page=contact_us
Disallow: /index.php?main_page=advanced_search
Disallow: /index.php?main_page=login
Disallow: /index.php?main_page=unsubscribe
Disallow: /index.php?main_page=shopping_cart
Disallow: /index.php?main_page=popup_shipping_estimator
Disallow: /index.php?main_page=account
Disallow: /index.php?main_page=password_forgotten
Disallow: /index.php?main_page=checkout_shipping_address
Disallow: /index.php?main_page=logoff
Disallow: /index.php?main_page=gv_faq
Disallow: /gv_faq.html?faq_item=1
Disallow: /gv_faq.html?faq_item=2
Disallow: /gv_faq.html?faq_item=3
Disallow: /gv_faq.html?faq_item=4
Disallow: /gv_faq.html?faq_item=5
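If you do borrow from the old file above, it can be trimmed a fair bit. A rough sketch, not a tested file: the Googlebot and Slurp groups are identical, so they can share one group by stacking the User-agent lines, and a trailing * on a rule is redundant because rules already match by prefix. The rules themselves are just the ones from the old example; whether each one suits your site is for you to check.

```
# Sketch only: same rules as the old Googlebot and Slurp groups, merged
User-agent: Googlebot
User-agent: Slurp
Disallow: /*&action=notify$
Disallow: /*&number_of_uploads=0&action=notify
Disallow: /index.php?main_page=discount_coupon
Disallow: /index.php?main_page=checkout_shipping
Disallow: /index.php?main_page=shippinginfo
Disallow: /index.php?main_page=privacy
Disallow: /index.php?main_page=conditions
Disallow: /index.php?main_page=contact_us
Disallow: /index.php?main_page=advanced_search
Disallow: /index.php?main_page=login
Disallow: /index.php?main_page=unsubscribe
Disallow: /index.php?main_page=shopping_cart
Disallow: /index.php?main_page=product_reviews_write
Disallow: /index.php?main_page=tell_a_friend
Disallow: /index.php?main_page=popup_shipping_estimator
Disallow: /index.php?main_page=account
Disallow: /index.php?main_page=password_forgotten
Disallow: /index.php?main_page=logoff
Disallow: /index.php?main_page=gv_faq
Disallow: /gv_faq.html?faq_item=
Disallow: /*&sort=
Disallow: /*alpha_filter_id=
Disallow: /*&disp_order=
```

The checkout_shipping_address line is dropped here only because the checkout_shipping rule already covers it as a prefix; likewise the two product_reviews_write lines collapse into one.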
Re: I have unwanted pages in google index, like products_all
The main culprit is /index.php?main_page=popup_image... I've got lots of them, and they're useless to have in the index.
I use Ceon mapping 4.07 for the site and it works great; I'm just after info on firstly getting rid of them, then blocking them and stopping them being re-indexed.
Anyone????
Re: I have unwanted pages in google index, like products_all
So does the
Disallow: /index.php?main_page=popup_image*
mentioned in my previous post above not work?
It helps to know what you have tried in the past. Is there a link to your site?
Re: I have unwanted pages in google index, like products_all
Thanks Nigel, I was beginning to think I was on my own here. Thanks for the example.
I've never had a robots.txt on my site since it was set up, but then again, reading some posts on here, some say you don't need one unless you have specific requirements.
I'll see what bits I can take from the example. The URLs have been indexed for a long time, so I'm not sure whether 301s are needed, or even how to implement them.
Re: I have unwanted pages in google index, like products_all
What version of Zen Cart do you have?
Zen Cart has come with a robots_example.txt file in the root directory for quite a while now, and most pages that shouldn't be indexed also have a noindex tag,
so these days you pretty much just rename robots_example.txt to robots.txt and you are ready to go. Obviously, as you want to remove products_all and new products, you would need to add these.
Oh, and I realise you are probably not this daft, but I have seen it done way too many times, so I'll state it here just in case someone reads this: don't put your secret admin directory name in the robots.txt file.
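For the two listing pages, the extra lines might look something like this. It's only a sketch and assumes the stock, unrewritten Zen Cart URLs with the page names products_all and products_new; if a URI mapping module has rewritten them, the paths in the index will differ, so check what Google actually lists before copying anything.

```
# Sketch only: add under the existing "User-agent: *" group in robots.txt
# Assumes stock index.php?main_page= URLs (not rewritten ones)
User-agent: *
Disallow: /index.php?main_page=products_all
Disallow: /index.php?main_page=products_new
```

These rules match by prefix, so paged or sorted variants of the same listing are covered too.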
Re: I have unwanted pages in google index, like products_all
It's 1.39h. I had the site built for me just over a year ago, but it was my first site,
so basically it was built and then I had to go and learn. But now I see these URLs, which could be classed as duplicates,
so I need to get rid of them.
The robots.txt was blank.
Re: I have unwanted pages in google index, like products_all
I'm assuming putting the URLs into the robots.txt file will stop them being crawled.
So what would be the best way to get them removed: the URL removal tool in WMT?
Or leave them to die off and fall out of the index once Google can't crawl them?
Re: I have unwanted pages in google index, like products_all
Quote:
Originally Posted by
sash
It's 1.39h. I had the site built for me just over a year ago, but it was my first site,
so basically it was built and then I had to go and learn. But now I see these URLs, which could be classed as duplicates,
so I need to get rid of them.
The robots.txt was blank.
But did it have a robots_example.txt sitting in the root? It does seem odd that your builder didn't set up the default one, though.