How to Find Content That Google and LLMs Might Not See

This tip comes from Chris Long. Chris shares amazing tips on LinkedIn, and you can get them in your inbox through his newsletter at Nectiv. I would also recommend following him on LinkedIn if you are not already.

Screaming Frog has a report that shows you how much of your page content depends on JavaScript to render. If a significant percentage of your text only appears after JS executes, that content is at risk of being missed by Google, and even more so by LLMs.

Here’s how to find it.

The Report

  1. Open Screaming Frog
  2. Go to Configuration > Spider > Rendering
  3. Select “JavaScript” from the Rendering dropdown
  4. In the same menu, make sure “Store HTML” and “Store Rendered HTML” are both checked
  5. Run your crawl
  6. Navigate to JavaScript > Contains JavaScript Content in the right-hand sidebar

You’ll see every URL with “JavaScript % Change” and “Word Count Change” columns. Together these tell you how much content is being loaded via JavaScript versus what’s in the initial HTML.

Bonus: You can drill down to see exactly which text is JS-dependent. Click on a URL, go to “View Source,” and click “Show Differences.” You’ll see the specific content that JavaScript adds to the page.
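If you want to spot-check a single URL without running a full crawl, you can approximate the same comparison with a short script. Here’s a minimal sketch using Playwright (my choice of tool, not part of Chris’s tip; the URL is hypothetical). It fetches the initial HTML, then loads the same page in a headless browser and compares visible word counts:

```typescript
// Minimal sketch: compare raw HTML word count vs JS-rendered word count.
// Assumes Node 18+ (built-in fetch) and `npm install playwright`.
import { chromium } from "playwright";

const countWords = (text: string): number =>
  text.split(/\s+/).filter(Boolean).length;

// Crudely strip tags to approximate visible text in the raw HTML.
const stripTags = (html: string): string =>
  html
    .replace(/<script[\s\S]*?<\/script>/gi, " ")
    .replace(/<style[\s\S]*?<\/style>/gi, " ")
    .replace(/<[^>]+>/g, " ");

async function compare(url: string): Promise<void> {
  // 1. What a non-rendering crawler sees: the initial HTML response.
  const rawHtml = await (await fetch(url)).text();
  const rawWords = countWords(stripTags(rawHtml));

  // 2. What Google sees after rendering: the page once JS has run.
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: "networkidle" });
  const renderedText = await page.innerText("body");
  await browser.close();

  const renderedWords = countWords(renderedText);
  const pctFromJs =
    ((renderedWords - rawWords) / Math.max(renderedWords, 1)) * 100;
  console.log(
    `${url}: ${rawWords} raw words, ${renderedWords} rendered, ` +
      `~${pctFromJs.toFixed(1)}% of content depends on JavaScript`
  );
}

compare("https://example.com"); // hypothetical URL
```

The numbers won’t match Screaming Frog exactly, since this tag-stripping is cruder than its parser, but a big gap points at the same problem.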

Why This Matters for Google

Google can render JavaScript. It uses a headless version of Chrome to execute scripts and see the final page. But there’s a catch.

Rendering is expensive. Google doesn’t render pages instantly. It queues them. The page gets crawled first, then sits in a render queue until Google has resources to process the JavaScript. This can take seconds, hours, days, or longer depending on your site’s crawl priority.

During that delay, Google is working with whatever was in your initial HTML. If your main content, links, or metadata only exist after JavaScript runs, there’s a window where Google doesn’t see them. And if something goes wrong during rendering (timeouts, blocked resources, script errors), that content may never get indexed.

This isn’t theoretical. Sites with heavy client-side rendering regularly see indexing gaps, missing content in search results, and pages that take weeks to reflect updates.

Why This Matters More for LLMs

Here’s where it gets worse.

Most LLM crawlers don’t render JavaScript at all. GPTBot, ClaudeBot, PerplexityBot… none of them execute scripts. They grab the raw HTML and that’s it.

A joint analysis from Vercel and MERJ tracked over half a billion GPTBot requests and found zero evidence of JavaScript execution. Even when GPTBot downloads .js files, it doesn’t run them. Same story for Anthropic’s crawler, Perplexity’s crawler, and others.

This means if your product descriptions, pricing, reviews, or main article content loads via JavaScript, these systems literally cannot see it. Your page might rank fine in Google, but when someone asks ChatGPT or Perplexity about your product category, you won’t exist in their answers because you don’t exist in their index.

Google’s own LLM infrastructure (Gemini, AI Overviews) benefits from Googlebot’s rendering capabilities. But everyone else is working with raw HTML only. And that gap is significant.
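You don’t have to take the report’s word for it. A plain HTTP request with no rendering shows roughly what these crawlers receive. Here’s a minimal sketch (the User-Agent string is simplified for illustration; each vendor publishes its exact string):

```typescript
// Sketch: fetch a page the way a non-rendering LLM crawler does --
// one HTTP request, raw HTML back, no script execution.
// Assumes Node 18+ for the built-in fetch API.

async function fetchLikeAnLlmCrawler(url: string): Promise<string> {
  const res = await fetch(url, {
    headers: {
      // Simplified for illustration; real crawler UA strings are longer.
      "User-Agent": "GPTBot",
    },
  });
  // Raw HTML only. No JS runs, so JS-injected content is absent.
  return res.text();
}

fetchLikeAnLlmCrawler("https://example.com") // hypothetical URL
  .then((html) => console.log(html.slice(0, 500)));
```

Some sites serve different HTML to known bot user agents, so it can be worth running this twice: once with a bot-style UA and once with your normal browser UA.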

What to Do With This Data

Run the Screaming Frog report on your site. Look for pages where:

  • A high percentage of word count comes from JavaScript
  • Critical content (product details, pricing, key copy) appears in the “differences” view
  • Important pages show large JS % changes

For those pages, you have a few options:

Server-side rendering (SSR). Frameworks like Next.js, Nuxt, and SvelteKit can render your JavaScript on the server and deliver complete HTML to crawlers. This solves the problem at the architecture level.
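As a sketch of what that looks like in Next.js’s pages router (the data fetcher and API URL are hypothetical):

```typescript
// pages/products/[slug].tsx -- a minimal Next.js SSR sketch.
// Product data is fetched on the server, so the description and price
// arrive in the initial HTML that crawlers receive.
import type { GetServerSideProps } from "next";

type Product = { name: string; description: string; price: string };

// Hypothetical data fetcher; swap in your real API or database call.
async function getProduct(slug: string): Promise<Product> {
  const res = await fetch(`https://api.example.com/products/${slug}`);
  return res.json();
}

export const getServerSideProps: GetServerSideProps<{ product: Product }> =
  async ({ params }) => {
    const product = await getProduct(String(params?.slug));
    return { props: { product } };
  };

export default function ProductPage({ product }: { product: Product }) {
  return (
    <main>
      <h1>{product.name}</h1>
      <p>{product.description}</p>
      <p>{product.price}</p>
    </main>
  );
}
```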

Static generation. If your content doesn’t change frequently, tools like Astro, Hugo, or Gatsby can pre-render pages as static HTML.
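If you’re already in Next.js, the same page can be statically generated by swapping getServerSideProps for getStaticProps. A sketch of that variant, reusing the hypothetical getProduct and Product type from the SSR sketch above:

```typescript
// Static-generation variant of the SSR sketch: pages are rendered once
// at build time and served as plain HTML.
import type { GetStaticPaths, GetStaticProps } from "next";

export const getStaticPaths: GetStaticPaths = async () => ({
  paths: [],            // list known product slugs here at build time
  fallback: "blocking", // render unknown slugs on first request
});

export const getStaticProps: GetStaticProps = async ({ params }) => ({
  props: { product: await getProduct(String(params?.slug)) },
});
```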

Pre-rendering services. Tools like Prerender.io detect bot requests and serve them a fully rendered HTML version. This is a band-aid, but it works.
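The underlying mechanic is straightforward: check the User-Agent, and route known bots to a rendered snapshot. A rough Express sketch of the pattern (the snapshot service URL is hypothetical; in practice Prerender.io’s own middleware handles this for you):

```typescript
// Sketch of the pre-rendering pattern: detect bots by User-Agent and
// proxy them to a pre-rendered HTML snapshot.
// Assumes `npm install express`; the snapshot URL is hypothetical.
import express from "express";

const BOT_PATTERN = /googlebot|gptbot|claudebot|perplexitybot|bingbot/i;
const app = express();

app.use(async (req, res, next) => {
  if (!BOT_PATTERN.test(req.get("user-agent") ?? "")) {
    return next(); // Humans get the normal client-rendered app.
  }
  // Bots get fully rendered HTML from the snapshot service.
  const snapshot = await fetch(
    `https://snapshots.example.com/render?url=${encodeURIComponent(
      `https://www.example.com${req.originalUrl}`
    )}`
  );
  res.status(snapshot.status).send(await snapshot.text());
});

app.listen(3000);
```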

Move critical content out of JS (my recommendation). Sometimes the simplest fix is restructuring. If your main headline, product description, or key paragraph can live in the initial HTML, put it there.
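A before-and-after sketch in React shows the difference (the component names and endpoint are hypothetical):

```typescript
// BEFORE: the headline only exists after JavaScript runs.
// Crawlers that don't render JS see an empty <h1>.
import { useEffect, useState } from "react";

function HeadlineClientSide() {
  const [headline, setHeadline] = useState("");
  useEffect(() => {
    fetch("/api/headline") // hypothetical endpoint
      .then((r) => r.json())
      .then((data) => setHeadline(data.text));
  }, []);
  return <h1>{headline}</h1>;
}

// AFTER: the headline is part of the markup the server sends, so it is
// visible in the initial HTML with no JS required.
function HeadlineInInitialHtml() {
  return <h1>Industrial-grade widgets, shipped in 48 hours</h1>;
}
```

If the copy genuinely has to be dynamic, fetch it at request or build time instead, as in the SSR and static-generation sketches above.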

The Quick Test

Want to see what LLMs see on any page? Disable JavaScript in your browser and reload. Whatever’s left is what ChatGPT, Claude, and Perplexity can access.

In Chrome:

  1. Open Chrome DevTools (F12 or right-click > Inspect)
  2. Press Cmd+Shift+P (Mac) or Ctrl+Shift+P (Windows)
  3. Type “Disable JavaScript” and select it
  4. Reload the page

If your core content disappears, you have a problem worth fixing.


Thanks to Chris Long for the original tip. Subscribe to his newsletter at nectivdigital.com/newsletter.

Sign up for weekly notes straight from my vault.

Tools I Use:

🔎  Semrush – Competitor and Keyword Analysis

✅  Monday.com – For task management and organizing all of my client work

📄  Frase – Content optimization and article briefs

📈  Keyword.com – Easy, accurate rank tracking

🗓️  Akiflow – Manage your calendar and daily tasks

📊  Conductor Website Monitoring – Site crawler, monitoring, and audit tool

👉  SEOPress – It’s like Yoast, if Yoast wasn’t such a mess.

