TL;DR
WordPress and other CMSs automatically create author archive pages that list posts by each contributor. When you have multiple authors or guest contributors, these archives become thin pages with duplicate content that compete with your actual blog posts. Author archives with minimal content and no unique value dilute your site’s topical authority and spend Google’s crawling and evaluation effort on pages that shouldn’t rank. Noindexing these archives concentrates your site’s SEO equity on pages that actually deserve to rank.
The Conversation
Tom (content director at a marketing agency blog; 15 writers, 500 posts, WordPress; intermediate SEO knowledge; noticed author pages ranking instead of actual articles): Something weird is happening. When I search for topics we’ve covered, sometimes our author archive pages rank instead of the actual articles. The archive just lists post titles with excerpts. Why would Google rank that over the full article?
SDC SEO Brain: Author archives can accumulate ranking signals unintentionally. Each post links back to its author archive in the byline. If an author has written 50 posts, that’s 50 internal links pointing to their archive page. The archive also contains excerpts from all those posts, so it carries keyword-rich text for the topics they cover most often.
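To see how quickly those byline links add up, a rough tally can be run over rendered post HTML. The sketch below is illustrative only: it assumes archives live under the default /author/<slug>/ URL base, and the names AuthorLinkCounter and count_author_links are made up for this example.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse


class AuthorLinkCounter(HTMLParser):
    """Counts <a href> targets that look like /author/<slug>/ archives."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        parts = [p for p in urlparse(href).path.split("/") if p]
        # Assumption: author archives use the /author/<slug>/ URL shape.
        if len(parts) >= 2 and parts[0] == "author":
            self.counts[parts[1]] += 1


def count_author_links(pages):
    """pages: iterable of HTML strings, one per post. Returns slug -> link count."""
    totals = Counter()
    for html in pages:
        parser = AuthorLinkCounter()  # fresh parser per page avoids leftover state
        parser.feed(html)
        parser.close()
        totals.update(parser.counts)
    return totals
```

Fed the HTML of 50 posts that each carry one byline link to the same author, the tally for that slug comes out at 50, which is the accumulation described above.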
Tom: But the archive is just a list. There’s no real content there, just titles and snippets.
SDC SEO Brain: From Google’s perspective, it’s a page with your domain authority, dozens of internal links, and text containing your target keywords from all those excerpts. The lack of depth is a problem, but the signals can still push it into rankings. Once there, it might satisfy some navigational queries, keeping it ranked despite being thin.
Tom: We have 15 authors. That’s 15 archive pages Google has to figure out.
SDC SEO Brain: And some of those authors might have written only 2-3 posts. Archives with minimal content are even worse. They exist as indexable pages with almost nothing on them. Google has to crawl and evaluate them, and their low quality contributes to overall site quality assessment.
Tom: What’s the actual harm? If they rank occasionally, does that hurt our real articles?
SDC SEO Brain: Multiple harms. First, cannibalization: when the archive ranks, it takes a SERP slot from your actual article. The archive gets the click, the user finds a list instead of content and bounces back to Google. That pattern is widely treated as a negative user signal, although Google doesn’t confirm exactly how it uses such signals. Second, site quality dilution: Google’s helpful content signals, folded into its core ranking systems in March 2024, include site-wide assessments. Thin archive pages can drag down that overall assessment.
Tom: So even if the archive pages don’t rank, having them indexed hurts us?
SDC SEO Brain: Potentially, yes. Google sees your site as 500 substantive posts plus 15 thin archives plus however many category pages plus tag pages. The ratio of thin to substantive content matters. High-quality sites have high proportions of valuable pages. Low-quality sites are padded with empty index pages.
Tom: Should we just delete author archives?
SDC SEO Brain: You can’t easily delete them in WordPress without code changes because they’re generated automatically. But you can noindex them. Adding noindex to author archives removes them from search results while keeping them functional for site navigation. Users can still browse by author if they want; Google just won’t include those pages in its index.
Tom: How do I noindex author archives in WordPress?
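SDC SEO Brain: If you’re using Yoast SEO, older versions keep the setting under SEO, then Search Appearance, then the Archives tab; since the settings redesign in version 20 it sits under Yoast SEO, then Settings, then Advanced, then Author archives. There’s a toggle that controls whether author archives show in search results. Rank Math has a similar setting under Titles & Meta, then Authors. Either way, switching it off adds a noindex robots meta tag to all author archive pages automatically.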
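After changing the plugin setting, it is worth confirming that the archive HTML really carries the directive. The following is a minimal sketch, not the plugin’s own logic: it assumes you have already fetched the page source (and optionally the X-Robots-Tag response header, which is another supported way to send noindex). RobotsMetaParser and is_noindexed are hypothetical names.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            for token in (attr.get("content") or "").split(","):
                self.directives.add(token.strip().lower())


def is_noindexed(html, x_robots_tag=""):
    """True if the page source or the X-Robots-Tag header value asks for noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    header = {t.strip().lower() for t in x_robots_tag.split(",")}
    # "none" is shorthand for "noindex, nofollow".
    blocking = {"noindex", "none"}
    return bool(blocking & (parser.directives | header))
```

Running this against one author archive URL before and after the change gives a quick pass or fail without waiting for Search Console to update.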
Tom: What if we want some author archives to rank? We have a few high-profile contributors where their author page could be valuable.
SDC SEO Brain: That requires more granular control. You’d need custom code or a plugin that lets you set index status per author. Or better approach: build custom author pages that aren’t the default archive. A proper author page with bio, expertise, credentials, photo, and curated best work is different from an auto-generated archive list.
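The per-author control mentioned here boils down to an allowlist: everything is noindexed unless the author has been explicitly approved. A real WordPress implementation would live in PHP or a plugin setting; the Python sketch below only illustrates the decision logic. The slugs, the /author/<slug>/ URL shape, and the function name author_robots are all assumptions for the example.

```python
from urllib.parse import urlparse

# Hypothetical slugs of contributors whose custom author pages are built out.
INDEXABLE_AUTHORS = {"jane-doe", "sam-lee"}


def author_robots(url, indexable=INDEXABLE_AUTHORS):
    """Return the robots directive for an author archive URL, or None otherwise."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Assumption: author archives use the /author/<slug>/ URL shape.
    if len(parts) < 2 or parts[0] != "author":
        return None
    # Default to noindex; only allowlisted authors are indexable.
    return "index, follow" if parts[1] in indexable else "noindex, follow"
```

Defaulting to noindex means a newly added writer never gets an indexable thin archive by accident; someone has to opt the page in once it has real content.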
Tom: What makes a good author page that should rank?
SDC SEO Brain: Unique content about the author: their background, credentials, areas of expertise, published works, speaking engagements, links to external profiles. Then curated highlights of their best articles, not just a chronological list of everything. This creates a page with genuine value for someone searching for that person specifically.
Tom: We don’t have that kind of content for most authors. They’re staff writers.
SDC SEO Brain: Then noindex them all for now. You can selectively build out real author pages for key contributors later when you have substantive content to include. But generic archive pages without unique content shouldn’t be indexed. They’re pure overhead.
Tom: You mentioned category and tag pages too. Should those be noindexed?
SDC SEO Brain: It depends on their quality. Default WordPress tag archives are just lists of posts with that tag. If you have 300 tags, that’s 300 thin pages. Keep tags for internal organization but noindex them unless they have substantial unique content. Categories deserve more consideration because they often represent real topics with enough content to be valuable.
Tom: Our categories are pretty broad. Marketing, Sales, Design, things like that.
SDC SEO Brain: Broad categories with dozens of posts might be worth keeping indexed if you add unique intro content, curate featured posts, and make them genuine destination pages. If they’re just chronological lists like archives, noindex them too.
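The tag and category guidance can be written down as a small triage rule for auditing a list of archives. This is a sketch under stated assumptions: the thresholds (24 posts for "dozens", 100 words of intro copy for "unique content") are arbitrary placeholders to tune for your own site, and Archive and should_index are hypothetical names.

```python
from typing import NamedTuple


class Archive(NamedTuple):
    kind: str                # "category" or "tag"
    post_count: int
    unique_intro_words: int  # words of intro copy written for this archive
    curated: bool            # featured posts hand-picked, not just chronological


def should_index(archive, min_posts=24, min_intro_words=100):
    """Apply the rule of thumb: index only archives with unique content."""
    has_unique_content = archive.unique_intro_words >= min_intro_words
    if archive.kind == "tag":
        # Tags stay noindexed unless they carry substantial unique content.
        return has_unique_content
    if archive.kind == "category":
        # Categories also need enough posts and a curated selection.
        return (
            has_unique_content
            and archive.curated
            and archive.post_count >= min_posts
        )
    return False  # unknown archive types default to noindex
```

A default tag archive with no intro copy fails the check, while a broad category with an intro and featured posts passes.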
Tom: This is going to cut our indexed page count significantly.
SDC SEO Brain: That’s the point. Indexed page count isn’t a metric to maximize. You want every indexed page to be one that could rank and satisfy users. Quality over quantity. A site with 500 strong pages beats a site with 500 strong pages plus 200 thin archives. Google’s algorithms increasingly reward this focus.
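As a back-of-the-envelope illustration of those numbers, the share of thin pages in the index can be computed directly. Google publishes no such ratio or threshold, so thin_share is a hypothetical helper for reasoning about your own site, not a metric Google is known to use.

```python
def thin_share(substantive, thin):
    """Fraction of indexed pages that are thin archives."""
    total = substantive + thin
    return thin / total if total else 0.0


# 500 strong pages plus 200 thin archives, then the same site after noindexing.
before = thin_share(500, 200)
after = thin_share(500, 0)

print(f"before: {before:.1%}")  # before: 28.6%
print(f"after: {after:.1%}")    # after: 0.0%
```

In this example more than a quarter of the indexed pages are thin before the change and none are afterwards, even though the number of strong pages is identical.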
Tom: How do I know when the noindexing has worked?
SDC SEO Brain: Monitor Search Console’s Page indexing report. After you add noindex, Google will recrawl those pages and update their status to “Excluded by ‘noindex’ tag.” You’ll see your indexed page count decrease. Then watch your rankings for actual articles to see if they improve without archive competition.
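One precondition for that recrawl: Google can only see a noindex tag on pages it is allowed to fetch, so author archives must not be disallowed in robots.txt. The sketch below uses Python’s standard urllib.robotparser to flag archive URLs that a given robots.txt would block. The function name blocked_archives and the example URLs are assumptions, and the standard-library parser may not match Google’s own robots.txt handling in every edge case.

```python
from urllib.robotparser import RobotFileParser


def blocked_archives(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that robots_txt forbids user_agent from fetching."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]


robots = "User-agent: *\nDisallow: /author/\n"
urls = [
    "https://example.com/author/jane/",
    "https://example.com/blog/some-post/",
]
# Any archive listed here would never have its noindex tag read.
print(blocked_archives(robots, urls))
```

If the list is non-empty for your archive URLs, remove the Disallow rule first and let the noindex tag do the work.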
Tom: How long until we see ranking improvements?
SDC SEO Brain: Once the noindexed pages drop out of the index, they can no longer compete with your articles, so cannibalization stops at that point. Site-wide quality improvements take longer because Google needs to reassess your domain. As a rough guide, allow 4-8 weeks before expecting measurable changes, and only if the thin archives were actually dragging you down.
FAQ
Q: Why do author archives rank over actual articles?
A: Author archives accumulate internal links from every post by that author and contain excerpts with target keywords. Despite being thin, these signals can push archives into rankings where they compete with your actual content pages.
Q: Should I delete or noindex author archives?
A: Noindex is the better approach. Deleting can break navigation and user experience. Noindexing keeps archives functional for site browsing while removing them from search results. Use your SEO plugin’s archive settings to implement this easily.
Q: When should an author page be indexed?
A: Only when it has genuine unique value: a real bio, credentials, expertise areas, curated best work, external profile links. A page someone might actually search for specifically. Generic auto-generated archives don’t meet this standard.
Q: Do thin archive pages hurt overall site rankings?
A: Potentially yes. Google’s helpful content signals, now part of its core ranking systems, include site-wide assessments. A high proportion of thin pages can drag down Google’s perception of your entire domain, affecting rankings for your good content.
Q: Should I noindex category pages too?
A: It depends on their quality. Categories with substantial unique intro content and curated posts can be valuable. Default lists without unique content should be noindexed. Evaluate each category based on whether it could genuinely rank and satisfy search intent.
Summary
Auto-generated author archives accumulate ranking signals through internal links and keyword excerpts, sometimes outranking actual articles despite being thin content with no unique value.
Default WordPress archives are lists without unique value. They consume crawl and evaluation effort (a bigger concern on very large sites) and can dilute site-wide quality perception in Google’s helpful content signals, affecting your entire domain.
Noindex archive pages (author, tag, and potentially categories) unless they have substantial unique content. This concentrates indexing on pages that can genuinely rank and satisfy users.
Custom author pages with real bios, credentials, and curated content can be valuable indexed pages. Generic chronological lists should never be indexed regardless of how much internal link equity they’ve accumulated.
Quality over quantity in indexed pages. Removing thin archives from the index often improves rankings for actual content by eliminating internal competition and improving overall site quality signals.
Sources
- Google Search Central: Avoid creating thin content
- Google Search Central: Site-wide signals and helpful content
- WordPress Developer Resources: Author archives