Need Robots.txt help with Google Shopping - Ecommerce Forums

Thread: Need Robots.txt help with Google Shopping

  1. #1

    Need Robots.txt help with Google Shopping

    I have a bunch of products on Google Shopping, and while some are being listed just fine, the majority are being flagged with critical errors and are not listed. I finally figured out what Google wasn't liking: they say that "product pages cannot be crawled because of robots.txt restrictions."

    First, I'm not sure why this would only affect certain items and not others, but they say it can be fixed by changing the robots.txt as follows:

    User-agent: Googlebot
    Disallow:

    User-agent: Googlebot-image
    Disallow:

    So does that mean I just have to add those lines to my robots.txt file? Any help would be appreciated.

    here is my file as it is right now:

    # robots.txt for search engines

    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /ProductDetails.asp
    Disallow: /help_options.asp?*
    Disallow: /ReviewsList.asp?ProductCode=*
    Disallow: /ReviewsList.asp?SortBy=*
    Disallow: /v/vspfiles/assets/images/sizecharts/*
    Disallow: /GiftCert_default.asp?*
    Disallow: /help_answer.asp?ID=*
    Disallow: /login.asp?message=*
    Disallow: /v/vspfiles/assets/images/sizecharts/*
    Disallow: /SearchResults.asp?Search=*
    Disallow: /AccountSettings.asp?*
    Disallow: /PhotoDetails.asp?*
    Disallow: /PhotoGallery.asp?*
    # Block Mobile product images
    Disallow: /mobile/Zoom.aspx?id=*
    # Block Mobile product pages
    Disallow: /mobile/Product.aspx?id=*
    # Block Mobile category pages
    Disallow: /mobile/Category.aspx?id=*

    EDIT: Looking at the URLs that Volusion is providing, is it my Disallow: /ProductDetails.asp that is causing this, and if so, should I remove that from the robots.txt file?

    Here is a sample URL that Volusion feeds to Google:

    http://www.christeninggowns.com/Prod...1-0003&click=2
    Last edited by BabyBeauandBelle; 01-15-2014 at 10:15 AM.
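    A quick way to see which rule is tripping the feed is to run the rules through Python's stdlib robots.txt parser. A minimal sketch, using an abbreviated copy of the file above and a made-up ProductDetails URL (the real feed URL is truncated above, so this one is a stand-in):

```python
from urllib import robotparser

# Abbreviated stand-in for the rules in question, not the full live file
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /ProductDetails.asp
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# The feed sends Google to ProductDetails.asp?... URLs; the Disallow is a
# prefix match, so the query-string variants are blocked too
url = "http://www.christeninggowns.com/ProductDetails.asp?ProductCode=1-0003"
print(rp.can_fetch("Googlebot", url))  # False: the blanket rule blocks Googlebot as well

# The SEO-friendly URL (made-up path) matches no Disallow prefix
print(rp.can_fetch("Googlebot", "http://www.christeninggowns.com/p-1-0003-gown.htm"))  # True
```

    If the first call prints False, that Disallow line is exactly what Google's error is complaining about.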

  2. #2
    >Disallow: /ProductDetails.asp

    I would put it back in and see what happens. Google may have changed something. I am surprised that robots.txt has anything to do with Google Shopping.

  3. #3
    Sorry, I need a little clarification.

    You're saying to put in - Disallow: /ProductDetails.asp

    That line is in my file right now, but I am wondering if I should remove it? I thought having it there kept Google from crawling the URLs that are not SEO-friendly, thereby avoiding duplicate pages in Google's eyes. BUT if that same line is restricting Google from crawling my shopping items' "landing pages," I seem to be stuck in a bad spot.

    Has this not come up for anybody else?


    Quote Originally Posted by ritchey View Post
    >Disallow: /ProductDetails.asp

    I would put it back in and see what happens. Google may have changed something. I am surprised that robots.txt has anything to do with Google Shopping.

  4. #4
    BBB,

    I have ProductDetails.asp blocked in robots.txt without issue. Is your mobile store enabled? When did you block the mobile stuff...is that a recent robots.txt change...could that be the problem?

    I use V's Googlefeed api without problem for my PLAs. If you just cursed me, I'm coming after you...

  5. #5
    Ha...

    Mobile is NOT enabled; the mobile blocks were put in after Scott started the thread about it.

    Other than the mobile blocks, I have not changed my robots.txt file in a long time.

    Google says:

    Some of your items specify a landing page (via the 'link' attribute) which cannot be crawled by Google because robots.txt forbids Google's crawler to download the landing page. These items stop showing up on Google Shopping until we are able to crawl the landing page.



    Quote Originally Posted by GGG View Post
    BBB,

    I have ProductDetails.asp blocked in robots.txt without issue. Is your mobile store enabled? When did you block the mobile stuff...is that a recent robots.txt change...could that be the problem?

    I use V's Googlefeed api without problem for my PLAs. If you just cursed me, I'm coming after you...

  6. #6
    I would take out the disallow for your product page. It sounds like Google now requires the ability to crawl the URL for an item to appear on Google Shopping. You should see a difference in your error reporting within a day, since I assume Google is crawling your site daily and your Google Product Feed is updating daily as well.

    My guess is this used to be OK. Google may also be testing the rollout of this change, so you got lucky and are a Beta Tester! Congratulations!

  7. #7
    hmmm, I am afraid that if I start allowing ProductDetails I will get dinged by Google for duplicate content... Thoughts?

    Also, dumb question - by having User-agent: * does that mean all user agents are allowed, including what Google is recommending:

    User-agent: Googlebot
    Disallow:

    User-agent: Googlebot-image
    Disallow:

    So there is no need to manually add those?
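    For what it's worth, the two are not equivalent: under robots.txt group-matching rules, a crawler obeys only the most specific group that names it, so a Googlebot group with an empty Disallow exempts Googlebot from all the * blocks rather than merging with them. A sketch with Python's stdlib parser, using a hypothetical two-group file (not BBB's real one):

```python
from urllib import robotparser

# Hypothetical file: blanket block for everyone, plus Google's suggested group
ROBOTS_TXT = """\
User-agent: *
Disallow: /ProductDetails.asp

User-agent: Googlebot
Disallow:
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

url = "http://example.com/ProductDetails.asp?ProductCode=123"

# Googlebot matches its own group; an empty Disallow means allow everything
print(rp.can_fetch("Googlebot", url))  # True

# Other crawlers fall through to the * group and stay blocked
print(rp.can_fetch("Bingbot", url))    # False
```

    So adding Google's suggested lines isn't a no-op: it would let Googlebot crawl everything, including the pages the * group blocks.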

  8. #8
    haha! First thing I'd try is removing the mobile blocks and see what happens. After I mess something up, I usually go back to the way it was and see if that fixes it.

    Yep, I'm a real pro.

    Where the heck are Scott and Erik?!

  9. #9
    well... I would rather have my products not show in Google Shopping than be hit with a duplicate content penalty by Google from either the mobile or ProductDetails pages, so I will wait to make any robots.txt changes until somebody can give me a solid answer. No offense GGG

  10. #10
    My understanding is that using canonical will take care of the duplicate issue; trust Google to figure out which is the right page and what is not duplicate content.

    My guess is you don't have a choice about allowing Google access to your ProductDetails URL instead of just the SEO-friendly URL.

    >hmmm, I am afraid if I start allowing the ProductDetails that I will get dinged by Google for duplicate content... Thoughts?

  11. #11
    Quote Originally Posted by BabyBeauandBelle View Post
    so I will wait to make any robots.txt changes until somebody can give me a solid answer. No offense GGG
    lol. None taken. However, I'm still catching up from before the holidays and haven't yet blocked mobile in robots.txt. Maybe Google's running behind, but I haven't yet seen any 'duplicate' penalties or notifications in GWT.

    Great. Now you really cursed me... (fingers crossed)

  12. #12
    We do not have ProductDetails.asp blocked in our robots file and have not blocked it since we upgraded in 2012. We have not had duplicate content issues, I am assuming because of the canonical tag. We have not experienced issues with our feed.
    Good luck!

  13. #13
    Canonical will take care of the duplicate content issue. Also, all of your internal and external links should be pointing to the SEO-friendly URLs; the only things pointing to ProductDetails.asp should be the links in your shopping feed. Your sitemap contains SEO-friendly URLs too.

    If you're worried about it, you could manually edit the feed to include the SEO-friendly URLs, but I wouldn't bother; this is exactly what canonical is for.
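    For anyone unsure what the canonical tag looks like: it's a single line in the page head, which the posts above suggest upgraded Volusion stores emit automatically. An illustrative fragment with a made-up store URL:

```html
<head>
  <!-- Both /ProductDetails.asp?ProductCode=123 and the SEO-friendly URL
       declare the same preferred address, so Google folds them into one
       page instead of treating them as duplicates -->
  <link rel="canonical" href="http://www.example.com/christening-gown-p/123.htm" />
</head>
```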

  14. #14
    BBB, hi! We are having the same issue, I think. Our products cannot be crawled because of robots.txt restrictions.

    Our current robots.txt file is similar to the one suggested at this link:
    http://www.schawelcoles.com/volusion-robots-txt-file/
    and we've had it that way for a long time. It does contain Disallow: /ProductDetails.asp and I am thinking that we need to remove that.

    Looking at your current (as of 1/31/14) robots.txt file, it does look like you removed the /ProductDetails.asp line. Can you comment on how things are going with it now? How did you decide what the robots.txt file should include? Thank you for any input you have!

  15. #15
    You should remove the Disallow: /ProductDetails.asp line.

    As mentioned, the canonical tag will take care of the duplication, and if you have the redirect to friendly URLs in place, those requests will 301.

    Also, robots.txt doesn't prevent indexing, it only prevents crawling. Just because your robots.txt file blocks URLs doesn't necessarily mean those pages will not get indexed. To pull pages out of the index you would have to block them with the NOINDEX meta tag.
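    One caveat worth spelling out: for NOINDEX to work, the page has to stay crawlable, because Google must fetch the page to see the tag; a robots.txt block hides the tag from Google entirely. An illustrative fragment:

```html
<!-- Goes in the <head> of any page you want pulled from the index.
     The page must NOT be blocked in robots.txt, or Google never sees this. -->
<meta name="robots" content="noindex">
```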


    Thanks,
    Erik Ellsworth
    CEO
    Convergent7
    Free SEO Analysis

Similar Threads

  1. relooking at 301 and robots.txt
    By CarlWister in forum Tips & Tricks
    Replies: 2
    Last Post: 06-12-2013, 11:11 AM
  2. New Google shopping feed changes
    By GGG in forum Ecommerce Industry
    Replies: 7
    Last Post: 05-17-2013, 02:27 PM
  3. Google Shopping
    By Marc_NY in forum Shopping Feeds
    Replies: 28
    Last Post: 01-14-2013, 07:32 PM
  4. Google Shopping Products
    By GGG in forum Shopping Feeds
    Replies: 5
    Last Post: 11-28-2012, 04:39 AM
  5. Google Shopping
    By OHC in forum Shopping Feeds
    Replies: 21
    Last Post: 10-25-2012, 01:46 PM
