TL;DR
Google choosing not to index pages is increasingly common and usually intentional. The main reasons: “Crawled – currently not indexed” means Google saw your content and decided it isn’t worth indexing (a quality/value issue). “Discovered – currently not indexed” means Google found the URL but hasn’t prioritized crawling it (a crawl-priority issue, often tied to weak internal linking). Each requires a different fix. Google has become more selective; they no longer index everything, especially content they perceive as low-value, duplicate, or redundant with already-indexed content.
Do This Today (3 Quick Checks)
- Check your coverage report: GSC → Indexing → Pages. What percentage are “Not indexed”? High percentages indicate systematic issues. Check specific reasons under “Why pages aren’t indexed.”
- Compare indexed vs submitted: How many URLs in your sitemap vs how many indexed? Big gaps indicate Google is choosing not to index your content.
- Test a problem URL: URL Inspection → Test Live URL. Does it show “URL is available to Google”? If yes but not indexed, it’s a quality decision, not a technical block.
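The sitemap-vs-indexed comparison in check 2 can be scripted. A minimal sketch, assuming you have exported your indexed URLs from GSC into a set (the `indexed` set and example URLs here are hand-filled for illustration):

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_urls(sitemap_xml: str) -> set:
    """Extract all <loc> URLs from a sitemap XML string."""
    root = ET.fromstring(sitemap_xml)
    return {loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc")}

def index_gap(submitted: set, indexed: set) -> dict:
    """Summarize the gap between submitted and indexed URLs."""
    missing = submitted - indexed
    return {
        "submitted": len(submitted),
        "indexed": len(submitted & indexed),
        "not_indexed": len(missing),
        "not_indexed_pct": round(100 * len(missing) / len(submitted), 1) if submitted else 0.0,
    }

# Hypothetical example: two of three submitted URLs are indexed.
xml_doc = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/a</loc></url>
  <url><loc>https://example.com/b</loc></url>
  <url><loc>https://example.com/c</loc></url>
</urlset>"""

submitted = sitemap_urls(xml_doc)
indexed = {"https://example.com/a", "https://example.com/b"}
print(index_gap(submitted, indexed))
# → {'submitted': 3, 'indexed': 2, 'not_indexed': 1, 'not_indexed_pct': 33.3}
```

A high `not_indexed_pct` is the “big gap” signal described above; the specific not-indexed reasons then come from the GSC coverage report.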
Index Bloat: When Too Many Indexed Pages Hurt
What is index bloat?
Having too many low-quality or unnecessary pages indexed, which:
- Wastes crawl budget on unimportant pages
- Dilutes overall site quality perception
- Creates internal competition between similar pages
Common index bloat sources:
| Source | Example | Solution |
|---|---|---|
| Parameter URLs | /products?color=red&size=large | Canonicalize to the clean URL or noindex |
| Pagination | /blog/page/47/ | Noindex deep paginated pages (Google no longer uses rel=prev/next) |
| Tag/archive pages | /tag/marketing/, /author/john/ | Noindex thin taxonomy pages |
| Search results | /search?q=keyword | Noindex, or block via robots.txt (not both: a blocked page’s noindex can’t be seen) |
| Faceted navigation | /shoes?brand=nike&color=black | Canonicalize or robots.txt |
| Old/thin content | 300-word posts from 2015 | Audit and remove or consolidate |
How to identify bloat:
- Compare indexed pages (GSC) vs pages you actually want indexed
- Search “site:yourdomain.com” and review what’s appearing
- Check GSC for indexing of parameter URLs, pagination, etc.
Bloat reduction: Noindex or remove pages that shouldn’t compete in search. Focus Google’s attention on your best content.
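A quick way to triage a URL export for the bloat patterns above is a small classifier. The rules below are illustrative assumptions mirroring the table, not universal patterns; adapt them to your own URL scheme:

```python
from urllib.parse import urlparse, parse_qs

def classify_bloat(url: str) -> str:
    """Tag a URL with a likely bloat category (hypothetical rules, adjust per site)."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    path = parsed.path
    if path.startswith("/search") or "q" in params:
        return "search results"
    if any(k in params for k in ("color", "size", "brand", "sort", "order")):
        return "parameter/faceted URL"
    if "/page/" in path or "page" in params or "p" in params:
        return "pagination"
    if path.startswith(("/tag/", "/author/")):
        return "tag/archive page"
    return "keep"

urls = [
    "https://example.com/products?color=red&size=large",
    "https://example.com/blog/page/47/",
    "https://example.com/tag/marketing/",
    "https://example.com/search?q=keyword",
    "https://example.com/guides/seo-basics",
]
for u in urls:
    print(u, "→", classify_bloat(u))
```

Run this over your sitemap or a `site:` export and compare the “keep” count against what GSC reports as indexed.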
Crawl Budget and Indexing Relationship
How crawl budget affects indexing:
Limited Crawl Budget
↓
Google must prioritize what to crawl
↓
Low-priority pages don't get crawled
↓
Uncrawled pages = "Discovered - currently not indexed"
Crawl budget is limited by:
- Site size and server speed
- Overall site authority
- How often content updates
- Number of internal links to each page
Improving crawl efficiency:
- Remove or noindex low-value pages (reduces waste)
- Improve site speed (Google can crawl more, faster)
- Fix crawl errors (broken links waste budget)
- Update important content (signals freshness priority)
- Strengthen internal linking to important pages
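Because internal link counts drive crawl priority, it helps to measure them. A sketch that counts inbound internal links from a crawler’s edge-list export (the threshold of 3 is an assumed cutoff for “weakly linked,” not a number from Google):

```python
from collections import Counter

# edges: (source_page, linked_page) pairs, e.g. from a crawl tool's inlinks export.
def inlink_counts(edges):
    """Count how many internal links point at each target URL."""
    return Counter(target for _, target in edges)

def weakly_linked(edges, threshold=3):
    """Pages with fewer inbound internal links than `threshold` (assumed cutoff)."""
    counts = inlink_counts(edges)
    return sorted(url for url, n in counts.items() if n < threshold)

edges = [
    ("/category/hvac", "/listing/acme"),
    ("/category/hvac", "/listing/best-air"),
    ("/austin", "/listing/acme"),
    ("/blog/top-hvac", "/listing/acme"),
]
print(inlink_counts(edges))   # /listing/acme has 3 inlinks, /listing/best-air has 1
print(weakly_linked(edges))   # → ['/listing/best-air']
```

Pages surfacing in `weakly_linked` are the ones most likely to sit in “Discovered – currently not indexed.”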
Who needs to worry about crawl budget:
- Sites with 10,000+ pages: Definitely
- Sites with 1,000-10,000 pages: If seeing indexing issues
- Sites with <1,000 pages: Usually not a concern
URL Parameter Handling in GSC
For sites with parameter-generated URLs:
- Go to GSC → Settings → Crawling → URL Parameters (Google retired this tool in 2022, so it may not be available; if it isn’t, use the canonical-tag approach below)
- For each parameter type:
| Parameter Type | Setting | Example |
|---|---|---|
| Tracking parameters | "No URLs" (don't crawl) | ?utm_source=, ?ref= |
| Sorting parameters | "No URLs" | ?sort=price, ?order=desc |
| Filtering parameters | "Let Google decide" or "No URLs" | ?color=red, ?size=large |
| Pagination | "Let Google decide" | ?page=2, ?p=3 |
| Session IDs | "No URLs" | ?sessionid=abc123 |
Alternative approach: Use canonical tags consistently instead of GSC parameter settings. Canonical is often more reliable.
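With the canonical approach, the canonical URL for a parameterized page is typically the URL with tracking and sorting parameters stripped. A sketch of that normalization (the `STRIP_PARAMS` set is an assumption; audit which parameters actually change page content on your site before stripping them):

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Parameters assumed never to change page content (hypothetical list).
STRIP_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "ref",
                "sort", "order", "sessionid"}

def canonical_url(url: str) -> str:
    """Drop tracking/sorting parameters so equivalent URLs share one canonical form."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in STRIP_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

print(canonical_url("https://example.com/shoes?color=red&utm_source=news&sort=price"))
# → https://example.com/shoes?color=red
```

The resulting URL is what you would emit in each page’s `rel="canonical"` tag, so every tracking or sorting variant points at one clean version.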
Expected Indexing Timelines
How long until Google indexes new content?
| Site Type | Typical Timeline | Factors |
|---|---|---|
| New site | Days to weeks | Lower priority, building trust |
| Established site, new page | Hours to days | If well-linked internally |
| Updated existing page | Hours to days | Depends on crawl frequency |
| Low-authority page | Weeks to never | May stay "Discovered" |
What affects indexing speed:
- Site authority (established sites indexed faster)
- Internal linking (well-linked pages indexed faster)
- Sitemap inclusion (helps discovery, not speed)
- Content quality (low quality may never index)
- Server response time (slow sites crawled less)
If pages aren’t indexing within 4 weeks:
- Check for technical blocks (noindex, robots.txt)
- Verify internal links exist to the page
- Assess content quality honestly
- Request indexing via URL Inspection (one-time, not repeatedly)
- Accept that Google may choose not to index
Don’t do:
- Submit URL Inspection requests repeatedly
- Create sitemap spam (submitting same URLs daily)
- Expect Google to index everything
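The first technical check above (noindex tags and robots.txt blocks) can be spot-checked offline against fetched HTML and your robots.txt. A simplified sketch; a real audit should also parse HTML properly and check the `X-Robots-Tag` HTTP header:

```python
import re
from urllib import robotparser

def has_noindex(html: str) -> bool:
    """Detect <meta name="robots" content="...noindex..."> (simplified regex check)."""
    pattern = r'<meta[^>]+name=["\']robots["\'][^>]+content=["\'][^"\']*noindex'
    return re.search(pattern, html, re.IGNORECASE) is not None

def blocked_by_robots(robots_txt: str, path: str, agent: str = "Googlebot") -> bool:
    """Check whether a path is disallowed for the given crawler."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return not rp.can_fetch(agent, path)

html = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
robots = "User-agent: *\nDisallow: /search"

print(has_noindex(html))                          # → True
print(blocked_by_robots(robots, "/search?q=x"))   # → True
print(blocked_by_robots(robots, "/blog/post"))    # → False
```

If both checks come back clean and the page still isn’t indexed after several weeks, you are back in quality-assessment territory, not technical-block territory.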
Index Coverage Status Explained
| Status | What It Means | Common Causes | Fix Priority |
|---|---|---|---|
| **Crawled – currently not indexed** | Google crawled, saw content, chose not to index | Thin content, duplicate, low value | High (quality issue) |
| **Discovered – currently not indexed** | Google knows URL exists, hasn't crawled | Low crawl priority, poor internal linking | Medium (priority issue) |
| **Duplicate without canonical** | Google found duplicates, no canonical specified | Multiple URLs same content, no canonical tag | High (technical fix) |
| **Duplicate, Google chose different canonical** | Google ignored your canonical | Signals conflict with your canonical preference | Medium |
| **Excluded by noindex** | Noindex tag present | Intentional or accidental noindex | Check if intentional |
| **Blocked by robots.txt** | Robots.txt blocks crawling | Intentional or misconfigured robots.txt | Check if intentional |
| **Soft 404** | Page returns 200 but Google sees it as error page | Empty pages, "no results" pages | Remove or add content |
The Conversation
Derek (runs a B2B directory website with 15,000 business listings; GSC shows only 4,000 indexed and 11,000 in various “not indexed” statuses; technical background but newer to SEO; worried the site is fundamentally broken): We have about 15,000 business listings. Search Console shows only 4,000 are indexed. The other 11,000 show “Crawled – currently not indexed” or “Discovered – currently not indexed.” Is our site broken? Why isn’t Google indexing our content?
SDC SEO Brain: Your site probably isn’t broken technically. Google has become much more selective about what they index. For directory sites especially, they often choose not to index pages they perceive as low-value or duplicative. With 15,000 listings, Google is evaluating whether each listing page provides enough unique value to be worth indexing. Let’s understand what’s happening. What does a typical listing page contain?
Derek: Business name, address, phone number, category, a map, and some basic info like hours and services offered.
SDC SEO Brain: How much unique content is on each page? Is there anything beyond the structured business data?
Derek: That’s mostly it. Some listings have a few sentences of description, but many are just the basic info. Maybe 50-100 words total per page.
SDC SEO Brain: There’s your likely problem. Google sees 15,000 pages with 50-100 words each, mostly structured data that’s available on many other directories (Yelp, Google Business Profile, Yellow Pages). From Google’s perspective, what unique value does your listing page provide that isn’t available elsewhere?
Derek: We’re a niche industry directory. Our listings are more targeted than generic directories.
SDC SEO Brain: That helps with user value but not necessarily indexing decisions. Google needs to see that your page is the best or unique result for relevant queries. If someone searches for a specific business, Google already has Google Business Profile. If they search for category + location, Google shows local packs. Your individual listing pages are competing against Google’s own local results and established directories.
Derek: So Google will just never index directory listing pages?
SDC SEO Brain: They’ll index pages they consider valuable. The question is how to make your pages more valuable. Options for directory sites:
Option 1: Add unique content. User reviews, detailed business descriptions (not duplicated from elsewhere), unique insights about the business, original photos, content that only your directory has.
Option 2: Focus on category/location pages instead of individual listings. A page about “HVAC contractors in Austin” with 30 listings might be more indexable than 30 individual thin listing pages.
Option 3: Accept that not all pages need to be indexed. Your 4,000 indexed pages might be your category pages and stronger listings. The thin listings serve users who navigate to them but don’t need to rank in Google.
Derek: Let me check which pages are indexed… You’re right. Most indexed pages are category pages and listings that have reviews. Listings with just basic info aren’t indexed.
SDC SEO Brain: Google is making the decision for you: pages with unique content (reviews) get indexed; pages without don’t. This is actually rational behavior. Your job is to either add unique content to more listings or accept that thin listings won’t be indexed.
Derek: What about “Discovered – currently not indexed”? We have 3,000 pages in that status.
SDC SEO Brain: “Discovered” means Google knows the URL exists, probably from your sitemap or internal links, but hasn’t prioritized crawling it. This is different from “Crawled – currently not indexed.” For discovered pages, Google hasn’t even looked at the content yet. They’re in a queue but not being prioritized.
Derek: Why wouldn’t Google crawl them?
SDC SEO Brain: Crawl priority is influenced by: how important Google thinks the page is based on internal links, how fresh/updated the content is, how well similar pages on your site have performed, and overall site authority. If you have 15,000 pages but weak internal linking to most of them, Google only allocates enough crawl resources for pages that seem important.
Derek: How do we improve crawl priority?
SDC SEO Brain: Better internal linking is the primary lever. How are listing pages linked from the rest of your site?
Derek: Each listing is linked from its category page and maybe a location page. That’s about it.
SDC SEO Brain: So each listing has 1-3 internal links. That’s weak. Compare to a product on Amazon with links from category pages, search results, “related products,” “customers also viewed,” and many other pages. More internal links signal importance. For your discovered pages, adding more internal link paths could push them into the crawl queue.
Derek: We could add “related businesses” sections to each listing page.
SDC SEO Brain: That would help. Also consider: featured listings on category pages, “popular in this area” sections, internal search results that are crawlable, links from blog content mentioning businesses. Every additional internal link path increases the likelihood Google crawls the page.
Derek: Even if we get them crawled, won’t they just move to “Crawled – currently not indexed” since they’re thin?
SDC SEO Brain: Possibly. The real solution for directory sites is adding genuine unique value. Do you collect user reviews or testimonials?
Derek: We have a review system but only about 500 businesses have reviews.
SDC SEO Brain: Those 500 are probably among your indexed pages. Reviews are unique content that differentiates your listing from the same business on other directories. Strategies to get more reviews: email campaigns to businesses asking them to share their directory listing with customers, incentives for reviews, making the review process extremely simple, following up after business interactions.
Derek: What about all the pages that are crawled but not indexed? Can we force indexing?
SDC SEO Brain: You can request indexing via URL Inspection, but Google isn’t obligated to index. If they’ve crawled a page and decided not to index it, requesting again usually doesn’t help unless you’ve significantly changed the page. Google’s John Mueller has explicitly said: if pages are crawled but not indexed, it’s usually a quality signal, and the solution is improving quality, not requesting indexing repeatedly.
Derek: What’s the minimum content to get indexed?
SDC SEO Brain: There’s no magic minimum. It’s about providing value that justifies indexing. Some pages with 200 words get indexed because they’re the best answer for a specific query. Some pages with 2,000 words don’t get indexed because similar content already exists. The question isn’t word count; it’s “does this page serve users better than what’s already indexed?”
Derek: Should we noindex the thin pages?
SDC SEO Brain: If they’re providing no SEO value anyway (not indexed), noindexing doesn’t change much. But it can help consolidate Google’s quality perception. If Google is crawling 15,000 pages and deciding 11,000 aren’t worth indexing, that’s a lot of “this site has low-quality pages” signals. Noindexing them removes them from that evaluation. Some sites have improved overall rankings by noindexing their weakest content.
Derek: So we should noindex all unindexed pages?
SDC SEO Brain: Not automatically. Some might be temporarily unindexed but could be indexed later after improvements. Consider noindexing: pages you’ve confirmed have no unique value and won’t be improved, pages that exist only for user navigation (thin category filters, empty search results), pages that are essentially duplicates of indexed pages. Keep indexed pages and pages you plan to improve.
FAQ
Q: Is “Crawled – currently not indexed” a penalty?
A: No, it’s not a penalty. It’s Google’s assessment that the page doesn’t provide enough unique value to warrant indexing. It’s increasingly common as Google becomes more selective. The fix is improving content quality and uniqueness, not submitting reconsideration requests.
Q: Why are some of my pages “Discovered – currently not indexed”?
A: Google knows these URLs exist but hasn’t prioritized crawling them. Common causes: low internal link count, new pages on low-authority sites, pages far from the homepage in site architecture. Improve internal linking to increase crawl priority.
Q: Can I force Google to index a page?
A: You can request indexing via URL Inspection, but Google isn’t obligated to index. If they’ve already crawled and chosen not to index, requesting again rarely helps without significant content changes.
Q: How do I improve indexing for a directory or listing site?
A: Add unique content that differentiates your listings: user reviews, original descriptions, unique data points, detailed business information. Category/hub pages often index better than individual thin listings. Accept that not all thin pages will or should be indexed.
Q: Should I noindex pages that Google won’t index anyway?
A: Consider it. If Google is crawling pages and deciding they’re not worth indexing, those pages still consume crawl budget and may send negative quality signals. Noindexing definitively removes them from quality evaluation. Test with a subset first.
Summary
Google doesn’t index everything anymore. They’ve become selective, only indexing content they consider valuable and worth serving to searchers. Having large portions of low-value content go unindexed is normal, not a sign of a technical problem.
“Crawled – currently not indexed” is a quality assessment. Google saw your content and decided not to index it. This usually indicates: thin content, duplicate content, or content that doesn’t add unique value compared to already-indexed pages.
“Discovered – currently not indexed” is a priority issue. Google knows the URL exists but hasn’t allocated crawl resources. Improve internal linking to signal importance. Add more link paths to these pages.
Directory and listing sites face particular challenges. Thin listing pages with structured data available elsewhere often won’t be indexed. Solutions: add unique content (reviews, original descriptions), focus on category/hub pages, accept that not all listings need to index.
Requesting indexing rarely forces Google’s hand. If they’ve crawled and declined to index, requesting again without content changes is ineffective. Google has explicitly stated: improve quality rather than requesting indexing.
Noindexing weak content can help overall site quality. If many pages are crawled but not indexed, those negative quality signals might affect indexed pages too. Consider noindexing content you won’t improve.
Focus on value, not volume. 1,000 indexed pages with unique value beats 15,000 submitted pages with 4,000 indexed. Quality perception matters for the whole site.
Sources
- Google Search Central: Index coverage report – https://support.google.com/webmasters/answer/7440203
- Google Search Central: Why pages might not be indexed – https://developers.google.com/search/docs/crawling-indexing/troubleshoot-page-indexing
- Google Search Central: URL inspection tool – https://developers.google.com/search/docs/monitor-debug/url-inspection-tool