TL;DR
A CDN caches the whole HTTP response: the HTML, including its meta tags, and the response headers. When you add or remove a noindex directive, the change exists on your origin server, but Cloudflare may keep serving the cached copy with the old directive to Googlebot. Pages can then stay indexed when you want them removed, or stay deindexed when you want them back in the index. Cache invalidation is the step most site owners skip.
The Conversation
Nathan, technical SEO at a media company, 50,000 articles, $2M monthly ad revenue, WordPress with Cloudflare, advanced SEO knowledge, pages won’t deindex despite noindex tags being live for weeks: I added noindex to 3,000 pages three weeks ago. They’re still indexed. I’ve verified the noindex meta tag is in the page source. Google just isn’t respecting it.
SDC SEO Brain: When you view page source, are you viewing through Cloudflare or directly from your origin? Your browser might show the origin version while Google sees the cached version.
Nathan: I’m just viewing source in Chrome. How would I check what Cloudflare is serving?
SDC SEO Brain: Use a tool that bypasses your local cache and shows what actually gets served. Try curl from command line with the URL, or use a web-based request viewer. Check the response headers for Cloudflare’s cache status.
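A minimal Python sketch of that check, as an alternative to curl. The function names (fetch, inspect_response) and the User-Agent string are illustrative assumptions, not part of any Cloudflare or Google tooling. It reports the cf-cache-status header, the Age header if one is present, and whether noindex appears in a robots meta tag or an X-Robots-Tag header.

```python
import urllib.request
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the content of every <meta name="robots"> or "googlebot" tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k.lower(): (v or "") for k, v in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            self.directives.append(attr.get("content", "").lower())


def inspect_response(headers, html):
    """Summarise cache status and noindex signals for one fetched response."""
    lowered = {k.lower(): v for k, v in headers.items()}
    parser = RobotsMetaParser()
    parser.feed(html)
    meta_noindex = any("noindex" in d for d in parser.directives)
    header_noindex = "noindex" in lowered.get("x-robots-tag", "").lower()
    return {
        "cache_status": lowered.get("cf-cache-status"),  # e.g. HIT, MISS, EXPIRED
        "age_seconds": lowered.get("age"),               # None if the header is absent
        "noindex": meta_noindex or header_noindex,
    }


def fetch(url):
    """Fetch a URL without any browser cache in the way (hypothetical helper)."""
    req = urllib.request.Request(url, headers={"User-Agent": "cache-check-sketch"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return dict(resp.headers.items()), resp.read().decode("utf-8", "replace")


# Usage (needs network access; the URL is a placeholder):
# headers, html = fetch("https://example.com/some-article/")
# print(inspect_response(headers, html))
```

A HIT combined with a missing noindex would point at a stale cached copy; a HIT with noindex present means the cache has already caught up.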
Nathan: The curl response shows the noindex meta tag. And the cf-cache-status header says HIT.
SDC SEO Brain: That tells you Cloudflare is serving cached content, but the cache now contains your noindex. The question is when that cache was refreshed. What’s your cache TTL set to in Cloudflare?
Nathan: Let me check… Edge Cache TTL is set to “Respect Existing Headers” and our origin sends max-age of 1 week.
SDC SEO Brain: One week cache. If Googlebot crawled within the first few days after you added noindex, before the cache expired, it would have seen the old version without noindex. When did Google last crawl these pages according to Search Console?
Nathan: Most of them show last crawl date around two weeks ago. So before the cache expired.
SDC SEO Brain: That’s your problem. Google crawled, received the cached version without noindex, and recorded those pages as indexable. By the time your cache expired and started serving the noindex version, Google had already made its indexing decision and moved on.
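The timing argument can be made concrete with a small sketch. It assumes the worst case, where the page was cached just before the change and so the old copy can survive for a full TTL unless a purge cuts that short. The function names and the dates in the usage lines are hypothetical.

```python
from datetime import datetime, timedelta


def stale_until(change_at, ttl, purged_at=None):
    """Latest moment the edge could still serve the pre-change copy.

    Worst case: the page was cached just before the change, so the old
    copy lives for a full TTL unless a later purge removes it sooner.
    """
    expiry = change_at + ttl
    if purged_at is not None and purged_at >= change_at:
        return min(expiry, purged_at)
    return expiry


def crawl_may_have_seen_old_copy(crawled_at, change_at, ttl, purged_at=None):
    """True if a crawl at crawled_at could have received the old directives."""
    return crawled_at < stale_until(change_at, ttl, purged_at)


# Hypothetical timeline: noindex deployed on 1 March, one-week TTL,
# purge on 4 March, Googlebot visits on 3 March.
change = datetime(2024, 3, 1)
week = timedelta(days=7)
print(crawl_may_have_seen_old_copy(datetime(2024, 3, 3), change, week,
                                   purged_at=datetime(2024, 3, 4)))
```

Comparing the last crawl date from Search Console against this window for each URL gives a rough list of pages that were probably crawled with the old directive.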
Nathan: But Google should recrawl and see the noindex now.
SDC SEO Brain: Should, eventually. But 3,000 low-priority pages don’t get crawled frequently. Google might not revisit them for weeks or months. Your crawl priority is based on how often content changes and how important Google thinks the pages are. If these were old articles you decided to noindex, they’re probably low on Google’s crawl queue.
Nathan: Can I force Google to recrawl?
SDC SEO Brain: You can request indexing through Search Console, but only for individual URLs. You can’t bulk request recrawling for 3,000 pages. Submitting a sitemap update might help signal that something changed, but it’s not a guarantee.
Nathan: What should I have done differently?
SDC SEO Brain: Purge your Cloudflare cache immediately after adding noindex. When you make any change that affects how Google should treat your pages, the cache needs to clear so Google sees the change on the next crawl. Adding noindex without cache purge is like changing your phone number but not telling anyone.
Nathan: I did purge the cache. I purged everything.
SDC SEO Brain: When did you purge relative to adding noindex?
Nathan: A few days after. I was testing to make sure noindex was working on a few pages first.
SDC SEO Brain: During those few days, Googlebot may have crawled. Even if it fetched only a small share of the 3,000 pages each day, that adds up to dozens or hundreds of pages served the stale cached version. The purge happened after the damage was already done.
Nathan: So cache purge needs to happen immediately?
SDC SEO Brain: Immediately, or as part of the same deployment. The safest workflow is: prepare the noindex change, verify it in a local or staging environment, then deploy it and purge the cache in the same step. Or use a staged rollout where you purge specific URLs as you update them.
Nathan: For 3,000 pages, purging individually isn’t realistic.
SDC SEO Brain: For bulk changes, purge everything, which you did, but do it at the moment of deployment, not days later. Also consider reducing your cache TTL for pages likely to change indexing status. A week-long cache is aggressive for pages that might need quick SEO adjustments.
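A scripted purge at deploy time might look like the sketch below. It targets Cloudflare's purge_cache endpoint, which accepts either a list of URLs under "files" or "purge_everything". The batch size of 30 is an assumption about the per-request URL cap and should be checked against the limit for your plan; zone_id and api_token are placeholders you would supply.

```python
import json
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"
BATCH_SIZE = 30  # assumption: per-request URL cap; confirm for your plan


def chunk(urls, size=BATCH_SIZE):
    """Yield successive batches of at most `size` URLs."""
    for i in range(0, len(urls), size):
        yield urls[i:i + size]


def build_purge_request(zone_id, api_token, urls=None):
    """Build (but do not send) a purge request; urls=None purges everything."""
    payload = {"purge_everything": True} if urls is None else {"files": list(urls)}
    return urllib.request.Request(
        f"{API_BASE}/zones/{zone_id}/purge_cache",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


def purge_urls(zone_id, api_token, urls):
    """Purge the given URLs batch by batch, stopping on the first failure."""
    for batch in chunk(urls):
        req = build_purge_request(zone_id, api_token, batch)
        with urllib.request.urlopen(req, timeout=30) as resp:
            result = json.load(resp)
        if not result.get("success"):
            raise RuntimeError(f"purge failed: {result.get('errors')}")
```

With batches of 30, a list of 3,000 URLs becomes 100 requests, which is practical to run as the last step of the deployment that adds noindex.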
Nathan: What TTL should I use?
SDC SEO Brain: For media sites with frequently changing content and SEO needs, 4-8 hours is more practical. You lose some cache efficiency, but you gain the ability to make changes that Google sees within a day rather than a week.
Nathan: That’s going to increase our origin load.
SDC SEO Brain: It’s a tradeoff. You can also tier your caching: longer TTL for static assets like images and CSS, shorter TTL for HTML that might change. Cloudflare Page Rules or Cache Rules can set different behaviors for different URL patterns.
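Because Edge Cache TTL is set to respect origin headers in this setup, one way to tier caching is to have the origin send different Cache-Control values by resource type. The sketch below is illustrative only: the function name, the extension list, and the 30-day value for static assets are assumptions, and the 6-hour HTML value is simply one point inside the 4-8 hour range discussed above.

```python
from pathlib import PurePosixPath

HTML_TTL = 6 * 60 * 60          # 6 hours, inside the 4-8 hour range
STATIC_TTL = 30 * 24 * 60 * 60  # 30 days; an assumed value for static assets
STATIC_EXTENSIONS = {
    ".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp", ".woff2",
}


def cache_control_for(path):
    """Return a Cache-Control value: long for static assets, short for HTML."""
    # Drop any query string before looking at the file extension.
    suffix = PurePosixPath(path.split("?", 1)[0]).suffix.lower()
    ttl = STATIC_TTL if suffix in STATIC_EXTENSIONS else HTML_TTL
    return f"public, max-age={ttl}"


print(cache_control_for("/2023/05/some-article/"))
print(cache_control_for("/wp-content/themes/news/style.css?ver=6.4"))
```

The same split can be expressed in Cloudflare Cache Rules instead of at the origin; the point is only that HTML and static assets do not need to share one TTL.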
Nathan: Okay, let’s fix the immediate problem. How do I get these 3,000 pages deindexed now?
SDC SEO Brain: First, confirm the cache is now serving noindex, and purge again to be safe. Then submit an updated sitemap that still lists these URLs with a recent lastmod date, which signals to Google that something changed and the pages should be recrawled. Once they have dropped out of the index, you can remove them from the sitemap.
Nathan: Will changing the sitemap speed up crawling?
SDC SEO Brain: Potentially. Google monitors sitemap lastmod dates. If you update lastmod for all 3,000 URLs to today, Google might prioritize recrawling them. It’s not guaranteed, but it’s a signal you can send.
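As a rough sketch, a dedicated sitemap for the affected URLs can be generated with the standard library. The function name and example URLs are hypothetical; the namespace is the standard sitemaps.org one, and a single sitemap file may hold up to 50,000 URLs, so 3,000 fit comfortably.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, lastmod):
    """Return sitemap XML (bytes) listing each URL with the same lastmod date."""
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}lastmod").text = lastmod.isoformat()
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Placeholder URLs; in practice this would be the list of noindexed pages.
xml_bytes = build_sitemap(
    ["https://example.com/old-article-1/", "https://example.com/old-article-2/"],
    lastmod=date.today(),
)
```

The lastmod date should reflect a real change, which it does here because the noindex directive was added to each page.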
Nathan: Should I use the Removals tool?
SDC SEO Brain: The Removals tool temporarily hides URLs from search results, usually for about six months. It doesn’t actually deindex them or remove them permanently. If you want them gone forever, noindex plus waiting for recrawl is the right approach. Removals is a band-aid while you wait.
Nathan: So I could use Removals now for the worst ones while waiting for the noindex to take effect?
SDC SEO Brain: Yes, that’s a valid strategy. Use Removals for the highest-urgency pages, like anything causing duplicate content issues or pages with problematic content. For the rest, sitemap update plus patience.
Nathan: What if some of these pages have backlinks? Will noindex lose that link equity?
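SDC SEO Brain: Noindex removes the page from the index but doesn’t immediately stop link equity from flowing through it. Backlinks to a noindexed page can still pass value to the pages it links to, so the equity isn’t lost right away; it just doesn’t help the noindexed page itself rank. One caveat: Google has indicated that a page left noindexed for a long time may eventually have its links treated as nofollow, so don’t rely on that flow indefinitely.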
Nathan: Good to know. Last question: how do I prevent this from happening again?
SDC SEO Brain: Build cache invalidation into your SEO change workflow. Any time you modify meta robots, canonical tags, redirects, or anything Google interprets as a directive, cache purge should be step two, immediately after deployment. Document this in your SOPs so developers understand why it matters.
Nathan: Should we automate this?
SDC SEO Brain: If possible, yes. Some CMS plugins can trigger Cloudflare cache purges on post update. Or use Cloudflare’s API in your deployment pipeline. When code pushes that include SEO changes go live, automatically purge affected URLs. This removes human error from the equation.
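One way to decide which URLs a deployment should purge is to compare the crawler-facing directives before and after the change. The sketch below is an assumption about how such a hook could work, not a description of any particular CMS plugin: it extracts the robots meta tag and canonical link from the old and new HTML and returns the URLs where either one differs.

```python
from html.parser import HTMLParser


class DirectiveParser(HTMLParser):
    """Extracts the robots meta content and canonical href from a page."""

    def __init__(self):
        super().__init__()
        self.found = {"robots": "", "canonical": ""}

    def handle_starttag(self, tag, attrs):
        attr = {k: (v or "") for k, v in attrs}
        if tag == "meta" and attr.get("name", "").lower() == "robots":
            # Normalise case and spacing so "noindex, follow" == "NOINDEX,FOLLOW".
            self.found["robots"] = attr.get("content", "").lower().replace(" ", "")
        elif tag == "link" and attr.get("rel", "").lower() == "canonical":
            self.found["canonical"] = attr.get("href", "")


def directives(html):
    """Return the directives Google would read from this HTML."""
    parser = DirectiveParser()
    parser.feed(html)
    return parser.found


def urls_needing_purge(changes):
    """changes: iterable of (url, old_html, new_html) tuples from a deploy."""
    return [url for url, old, new in changes if directives(old) != directives(new)]
```

The resulting list can be handed to whatever purge mechanism the pipeline already uses. Redirects and X-Robots-Tag headers are not covered by this HTML-only comparison and would need their own check.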
FAQ
Q: How do I know if Cloudflare is caching my noindex tags?
A: Use curl or a web-based HTTP request tool to fetch your URL and examine both the HTML response and headers. The cf-cache-status header shows HIT if Cloudflare served cached content. Compare the cached HTML to what you expect. If noindex is missing from the cached version but present on your origin, caching is the issue.
Q: How long does it take for cache purge to take effect?
A: Cloudflare cache purge is typically near-instantaneous globally. Once purged, the next request to that URL will fetch fresh content from your origin. However, Google still needs to recrawl to see the change. Purging doesn’t notify Google; it just ensures when Google crawls, it gets fresh content.
Q: Should I reduce my cache TTL for SEO reasons?
A: Consider shorter TTLs for HTML pages that might need SEO adjustments, especially if you frequently change noindex, canonicals, or redirects. Static assets like images, CSS, and JavaScript can retain longer TTLs. Tiered caching lets you balance performance and SEO responsiveness.
Q: Does noindex on a cached page prevent indexing?
A: Only if the copy Google is served actually contains noindex. If Google crawled while the cache still held the version without noindex, it indexed the page based on what it saw at crawl time. CDN caching can delay how quickly Google sees SEO changes, which makes cache invalidation critical for time-sensitive directives.
Summary
CDN caching stores your entire HTML response, including meta tags and directives. When you add noindex, the cache might still serve the old version without noindex to Google until the cache expires or is purged.
The fix requires cache invalidation at the moment of deployment, not hours or days later. Any delay creates a window where Googlebot might crawl and receive outdated directives.
For bulk changes like noindexing 3,000 pages, update sitemaps with fresh lastmod dates to signal that content changed and needs recrawling. The Removals tool can temporarily hide urgent pages while waiting for noindex to take effect permanently.
Long-term, build cache purging into your SEO change workflow. Automate purges through your deployment pipeline or CMS so human error doesn’t create caching gaps.
Cache TTL settings should balance performance with SEO responsiveness. Week-long HTML caches are too slow for sites that need to make quick directive changes. Consider 4-8 hour TTLs for HTML with longer TTLs for static assets.
Sources
- Cloudflare: Cache documentation
- Google Search Central: Noindex directive
- Google Search Central: Sitemap lastmod usage