If you have implemented a CDN (Content Delivery Network) service to speed up the loading time of your website, you might have faced this challenge. I am going to share the steps you need to take to ensure your website images get indexed properly by Google even though they are hosted on a CDN domain.
Several months ago I implemented MaxCDN on my website along with CloudFlare, and my website’s page loading time dropped from 25+ seconds to under 1 second. It was absolutely incredible to see the impact of this implementation on the browsing experience of my readers from around the world.
I shared my initial experience in this post:
But CDN Implementation Came With A Challenge
While the page speed of my website was a joy to watch, the number of images Google was indexing gradually decreased. I noticed it late, but the count of indexed images kept falling and, within a few months, it reached zero.
The number of indexed images came down from 275 to 0, and that was a huge loss of traffic.
I started to do my research. I contacted MaxCDN customer support, and they pointed me towards setting up canonical headers for image files through .htaccess. They have a well-written article that explains the concept, and I want you to look at it so you know what I am talking about:
I followed all the steps MaxCDN recommended in this article and got it verified with their customer support but it didn’t help in the indexing of my website images. They didn’t know what the problem could be and obviously I didn’t know any better.
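If you want to double-check that step yourself, here is a minimal Python sketch, assuming the canonical is sent as a standard HTTP `Link` header with `rel="canonical"`. The function name and the example.com URLs are my own placeholders, not anything from MaxCDN's article. You copy the `Link` header value from the response headers of one of your CDN images and pass it in:

```python
import re
from typing import Optional


def canonical_from_link_header(link_header: str) -> Optional[str]:
    """Return the URL marked rel="canonical" in a Link header value, or None."""
    # A Link header can hold several comma-separated entries, for example:
    #   <url-1>; rel="preload"; as="style", <url-2>; rel="canonical"
    entry = re.compile(r'<([^>]*)>\s*((?:;\s*[^,;]+)*)')
    for match in entry.finditer(link_header):
        url, params = match.group(1), match.group(2)
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            rel_values = value.strip("\"'").lower().split()
            if name.lower() == "rel" and "canonical" in rel_values:
                return url
    return None


# Hypothetical header value copied from a CDN image response.
header = '<http://www.example.com/wp-content/uploads/photo.jpg>; rel="canonical"'
print(canonical_from_link_header(header))
```

If the function returns `None` for your CDN images, the canonical header is probably not being sent at all.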
But I kept researching the issue because I wasn’t willing to sacrifice the page speed of my website by giving up the CDN. I knew there had to be a solution.
The Solution To Get Your Website Images Indexed
When I was setting up MaxCDN for my website, I created four custom domain names to host files so that browsers could download them in parallel, as shown below:
Among these custom domain names, as you can see, img.gauraw.com is the one I created to host CDN images for my website. With the W3 Total Cache WordPress plugin it is really easy to set these up, and everything works like a charm.
However, the real challenge was the conflict between the image URLs in my sitemap and the image URLs served to crawlers when they visited my website.
For example, my sitemap file submitted the image URL to Google (through Google Webmaster Tools) as:
But when Google's crawlers came, they discovered images just like anybody else. That means they were served images from the CDN subdomains as:
Now, this created a duplicate content situation for Google, and as you know, Google doesn’t appreciate duplicate content. It therefore began de-indexing my website images, and pretty soon none of them were indexed.
If your images are not indexed, this is what is happening to you as well.
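To check whether your own site has the same mismatch, here is a rough Python sketch. It assumes you have collected two lists by hand: the image URLs listed in your sitemap and the image URLs your pages actually serve. The function name and the example.com URLs are hypothetical placeholders, not my real URLs. It pairs images by path and reports every image whose sitemap host differs from the host it is served from:

```python
from urllib.parse import urlparse


def find_host_mismatches(sitemap_image_urls, served_image_urls):
    """Return (path, sitemap_host, served_host) for images listed under one host but served from another."""
    # Index the served images by their path so they can be matched to sitemap entries.
    served_by_path = {}
    for url in served_image_urls:
        parts = urlparse(url)
        served_by_path[parts.path] = parts.netloc

    mismatches = []
    for url in sitemap_image_urls:
        parts = urlparse(url)
        served_host = served_by_path.get(parts.path)
        if served_host is not None and served_host != parts.netloc:
            mismatches.append((parts.path, parts.netloc, served_host))
    return mismatches


# Hypothetical example: the sitemap lists the root domain, the page serves the CDN subdomain.
sitemap_urls = ["http://www.example.com/wp-content/uploads/photo.jpg"]
served_urls = ["http://img.example.com/wp-content/uploads/photo.jpg"]
for path, sitemap_host, served_host in find_host_mismatches(sitemap_urls, served_urls):
    print(f"{path}: sitemap says {sitemap_host}, page serves {served_host}")
```

Any line it prints is an image that Google sees under two different URLs.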
I use WordPress SEO by Yoast for on-page SEO and sitemap file generation on my website. With a little research I found a solution to my problem in their documentation. I implemented the recommended solution, and within a few days Google began indexing my website images again, one at a time.
Here is what you need to do to help Google discover your images the right way so that it starts indexing them:
1. Confirm CDN Subdomain Ownership In Google Webmaster Tools
As a first step, make sure that you claim ownership of your CDN subdomains through Google Webmaster Tools. Here is a link to detailed instructions in case you need the steps to complete the verification:
Once the verification is complete, you are done with the first step of getting your CDN-hosted images indexed by Google. Now you can proceed to the second and final step in this process.
2. Switch XML Sitemap Image URLs To Point To CDN Subdomain
Add the following code to the functions.php file located in your theme’s folder. While adding this code, replace the “example.com” URLs with your own domain name and CDN URL.
This code will modify your sitemap XML file so that it contains your CDN image URLs instead of your root domain URLs for your images.
This ensures that Google always sees your images under the CDN subdomain URL. It will no longer find any duplicates, because the root domain URLs for images no longer appear anywhere, not even in the sitemap XML file.
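To make the effect of that change concrete, here is a minimal Python sketch. It is not the functions.php code and it does not run inside WordPress; it only illustrates the rewrite on a sitemap string. It assumes the sitemap uses the standard Google image sitemap namespace for its `image:loc` entries, and the function names and example.com hosts are hypothetical placeholders:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse, urlunparse

# Namespace used by <image:loc> entries in image sitemaps.
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def rewrite_image_host(url, origin_host, cdn_host):
    """Swap origin_host for cdn_host in one URL; leave other hosts untouched."""
    parts = urlparse(url)
    if parts.netloc.lower() != origin_host.lower():
        return url
    return urlunparse(parts._replace(netloc=cdn_host))


def rewritten_image_urls(sitemap_xml, origin_host, cdn_host):
    """Rewrite every image:loc in the sitemap and return the resulting image URLs."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for loc in root.iter(f"{{{IMAGE_NS}}}loc"):
        loc.text = rewrite_image_host((loc.text or "").strip(), origin_host, cdn_host)
        urls.append(loc.text)
    return urls
```

Note that only the image entries change. The page URLs in the plain `loc` elements stay on the root domain, which is what you want, because your pages are still served from there.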
3. Wait For About A Week For Indexing To Begin
Once you have completed both steps outlined above, give Google a good week to begin indexing your website images. Keep watching your Webmaster Tools and it will eventually begin to show the number of images indexed.
Google Webmaster Tools usually takes a couple of days to show the numbers after images get indexed, so keep that in mind and be patient.
You can also check whether your images are being indexed by Google by searching through Google Image search. Go to images.google.com and search for the following term:
For example: I can search for “site:Gauraw.com” to find out how many of my images have been indexed by Google so far.
When you implement a CDN service like MaxCDN (formerly NetDNA) or CDN77, you may find that your website images get de-indexed by Google because of the CNAME-based CDN URLs. But with the steps I discussed above, you can get Google to index your site images once again. I have been able to do it and I am sure you can do it too.
If you have any questions, or you run into trouble following these instructions, go to the comment section for this post and ask me. I will be glad to help you find an answer.
If you have had similar problems because of a CDN and resolved them using another technique, would you care to share your experience? Please visit the comment section and share your thoughts.
Thank you kindly!