Key Takeaways

  • Google Search fetches the first 2MB of most supported files and the first 64MB of PDF files.
  • Google’s broader crawler system has a default 15MB file limit, but that is not the same as Google Search’s 2MB Googlebot limit.
  • These limits apply to each file on its own, not to the whole page and all assets combined.
  • Page weight is the total size of the page plus all of its resources, which makes it mainly a user-experience concern rather than a Googlebot fetch concern.
  • Most sites will never hit the 2MB HTML limit, but bloated inline code can still cause trouble.

Googlebot file size limits are easier to understand when you split them into two parts. First, Google Search uses Googlebot, which fetches the first 2MB of a supported file type and the first 64MB of a PDF. Second, Google’s wider crawler infrastructure has a default 15MB limit unless a specific crawler sets a different one.

That sounds confusing at first. But the main point is simple. Google is not measuring your whole page as one giant lump. It looks at files one by one. So a page can have a heavy total page weight and still be far below Googlebot’s HTML fetch limit.

What Google actually means by file size limits

Google clarified this point in its documentation update in February 2026. It then explained the process in more detail in a March 2026 blog post.

Here is the plain-English version:

  • 2MB is the Googlebot fetch limit for supported files used by Google Search.
  • 64MB is the fetch limit for PDF files.
  • 15MB is the default limit for Google’s crawlers and fetchers unless a product sets its own rule.
  • The limit is applied per file, not per page.
  • Google says the limit is based on uncompressed data.

This means your HTML file has its own limit. Your CSS file has its own limit. Your JavaScript file has its own limit too. They are not all added together into one shared cap.
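To make the per-file idea concrete, here is a minimal Python sketch with made-up file names and sizes. The 2MB and 64MB figures are the limits described above; nothing here calls a real Google API, and the sample page is hypothetical.

```python
# Per-file limits are applied to each fetched file on its own,
# never to the combined total of the page.
FETCH_LIMITS = {
    ".pdf": 64 * 1024 * 1024,    # 64MB for PDF files
    "default": 2 * 1024 * 1024,  # 2MB for other supported files
}

def bytes_fetched(filename: str, size: int) -> int:
    """How many bytes of this file would be fetched under its own per-file cap."""
    ext = filename[filename.rfind("."):].lower()
    limit = FETCH_LIMITS.get(ext, FETCH_LIMITS["default"])
    return min(size, limit)

# A hypothetical page whose total weight is about 5MB, spread across files:
page_files = {
    "index.html": 120_000,   # 120 KB of HTML, far below the 2MB cap
    "styles.css": 300_000,
    "app.js": 900_000,
    "hero.jpg": 3_700_000,   # images usually dominate total page weight
}

total_weight = sum(page_files.values())   # heavy as a whole (~5MB)
truncated = [name for name, size in page_files.items()
             if bytes_fetched(name, size) < size]  # only files over their own cap
```

Even though the page's total weight is around 5MB, only the single file that exceeds its own cap would be cut short; the HTML itself is nowhere near the limit.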

That is why many site owners worry about the wrong number. A page might weigh several megabytes once images, fonts, scripts, and third-party files load. Even so, the HTML itself may still be tiny.

Page weight is not the same as HTML size

Page weight is the total byte size of a page and the assets needed to build it. That includes HTML, CSS, JavaScript, images, media, and third-party resources.

So page weight is about the full experience. Googlebot’s file size limit is about what gets fetched from each file.

This difference matters a lot.

A page can feel heavy to users because it loads many images, scripts, and videos. Yet the raw HTML file may still be small enough that Googlebot can fetch it without any problem. On the other hand, a page with huge inline CSS, inline JavaScript, base64 images, or oversized blocks of hidden markup can push the HTML file toward Google’s cutoff.

In the 2024 Web Almanac, the median mobile page weight was 2,311 KB. But the median mobile homepage HTML weight was only 18 KB. That gap shows why page weight and HTML size should never be treated as the same thing.

Why page weight still matters

Even when crawl limits are not a problem, page weight still matters for people.

Heavy pages take longer to send, process, and render. They can cost more on mobile data. They can also feel slow on weaker devices. In short, page weight is often a user problem before it becomes a Googlebot problem.

That is also why this topic matters beyond SEO. A page may be indexable but still frustrating to use. Slow pages can hurt satisfaction, engagement, and conversions.

So the smart move is to treat this as a two-part job:

  1. Keep important SEO signals inside the part of the file Google can fetch.
  2. Keep the full page light enough that users get a smooth experience.

What can push a page toward Google’s cutoff

Most websites will never come close to a 2MB HTML file. Google has said that a 2MB HTML payload is massive for most of the web.

Still, some pages do grow too large. The usual culprits are not normal text content but things like:

  • huge inline CSS blocks
  • huge inline JavaScript blocks
  • base64-encoded images placed inside HTML
  • oversized SVG code
  • giant navigation systems repeated on every page
  • bulky structured data dumped low in the source
  • large chunks of hidden or duplicated markup

This is where site owners get into trouble. If the important parts of the page appear after the cutoff, Googlebot may never fetch them. That can affect text, internal links, structured data, canonicals, and other signals.

How to keep important content within reach

The process is straightforward.

Step 1: Check raw HTML size

Do not guess based on what the page feels like in the browser. Check the raw HTML response size, and remember that Google measures uncompressed data. That raw HTML is the first file Googlebot fetches.

Step 2: Move bulky code out of the HTML

External CSS and JavaScript are fetched separately. So moving large blocks out of the HTML can make the main document much leaner.
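As a rough sketch of the potential saving, the snippet below totals the bytes sitting inside inline <style> and <script> bodies, which are the bytes that would leave the main document if they were moved to external files. It is a crude regex approximation, not a real HTML parser, and it skips scripts that already load externally via a src attribute.

```python
import re

# Inline <style> or <script> blocks; the negative lookahead skips
# <script src="..."> tags, which are already external.
INLINE = re.compile(r"<(style|script)(?![^>]*\bsrc=)[^>]*>(.*?)</\1>",
                    re.DOTALL | re.IGNORECASE)

def bytes_recoverable(html: str) -> int:
    """Bytes the main HTML document would shed if every inline <style>
    and <script> body were moved to an external file."""
    return sum(len(m.group(2).encode("utf-8")) for m in INLINE.finditer(html))
```

Run against a page's raw HTML, this gives a quick upper bound on how much leaner the main document could get before touching any visible content.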

Step 3: Put critical SEO elements high in the source

Place your title, meta tags, canonical, key text, and essential structured data early in the HTML. Google has explicitly recommended putting critical elements higher up so they are less likely to fall below a cutoff.
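One way to sanity-check this is to find the byte offset of each critical tag in the raw HTML and compare it to the cutoff. The sketch below is an illustration using plain string search on made-up markup, not a real parser, and the 2MB constant is the figure cited in this article.

```python
TWO_MB = 2 * 1024 * 1024  # the fetch limit cited above

def signals_within_cutoff(html: bytes, needles: list[bytes]) -> dict[bytes, bool]:
    """For each critical fragment, report whether its first occurrence
    starts before the 2MB cutoff in the raw HTML."""
    return {n: 0 <= html.find(n) < TWO_MB for n in needles}

# Made-up document: the canonical link sits near the top, but the
# structured data block is pushed below more than 2MB of filler markup.
html = (b'<html><head><link rel="canonical" href="https://example.com/">'
        b'</head><body>' + b'<div>filler</div>' * 150_000 +
        b'<script type="application/ld+json">{}</script></body></html>')

report = signals_within_cutoff(html, [b'rel="canonical"', b'application/ld+json'])
```

In this contrived example the canonical tag falls safely inside the fetched portion, while the JSON-LD at the bottom lands past the cutoff, which is exactly the failure mode that placing critical elements high in the source avoids.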

Step 4: Cut waste before cutting content

Look for repeated markup, bloated menus, unused scripts, oversized inline SVG, and base64 assets. These often add more weight than the visible content itself.
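A script can help flag the oversized inline blocks described above. The regexes below are a crude sketch rather than a real HTML parser, and the 10KB threshold is an arbitrary illustration, but they cover the usual suspects: inline CSS, inline JavaScript, inline SVG, and base64 data URIs.

```python
import re

def inline_bloat_report(html: str, min_bytes: int = 10_000) -> list[tuple[str, int]]:
    """List inline <style>, <script>, and <svg> blocks plus base64 data URIs
    whose size exceeds min_bytes, largest first."""
    patterns = {
        "inline style": r"<style[^>]*>.*?</style>",
        # the lookahead skips <script src="..."> tags, which are external
        "inline script": r"<script(?![^>]*\bsrc=)[^>]*>.*?</script>",
        "inline svg": r"<svg[^>]*>.*?</svg>",
        "base64 data URI": r"data:[^;]+;base64,[A-Za-z0-9+/=]+",
    }
    findings = []
    for label, pattern in patterns.items():
        for match in re.finditer(pattern, html, flags=re.DOTALL | re.IGNORECASE):
            size = len(match.group(0).encode("utf-8"))
            if size >= min_bytes:
                findings.append((label, size))
    return sorted(findings, key=lambda item: item[1], reverse=True)
```

Anything this flags is a candidate for externalizing, compressing, or deleting before you ever consider trimming visible content.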

Step 5: Optimize for users too

Shrink images, reduce script weight, delay non-critical assets, and remove third-party tools you do not need. Even if Googlebot can crawl the page, users still pay the performance cost.

Did You Know?

The median mobile page in the 2024 Web Almanac weighed 2,311 KB, yet the median mobile homepage HTML was only 18 KB. In many cases, the biggest weight comes from images and JavaScript, not from the main HTML file.

Conclusion

Googlebot file size limits are not the same as page weight. Google Search fetches the first 2MB of a supported file and the first 64MB of a PDF, while Google’s broader crawler system has a default 15MB limit. The real lesson is simple: keep your HTML lean, place critical signals early, and reduce page weight for users. That way, your site is easier to crawl and easier to use.

FAQs

What is Googlebot’s 2MB limit?

Googlebot for Google Search fetches the first 2MB of a supported file type. If the file is larger, Google stops fetching at that point and works with what it already downloaded. Any bytes after the cutoff are not fetched for indexing.

Does Google count the whole page toward one limit?

No. Google applies file size limits per file, not to the full page as one combined bundle. Your HTML, CSS, and JavaScript files are fetched separately, and each one has its own limit. That is why total page weight and HTML size are different things.

Is the real limit 2MB or 15MB?

Both numbers are real, but they apply in different ways. Googlebot for Google Search uses a 2MB limit for supported files and 64MB for PDFs. Google’s broader crawler infrastructure has a default 15MB limit unless a specific crawler sets a different limit.

Can a large page still rank in Google?

Yes, a large page can still rank. The bigger risk is when important content or signals sit after the cutoff in the HTML, or when the page is so heavy that users have a poor experience. Large total weight is more often a usability issue than a pure crawl issue.

How can I tell if my page is at risk?

Start by checking the raw HTML response size, not just the loaded page in the browser. Then review the source for big inline CSS, JavaScript, SVG, base64 images, and repeated markup. If key SEO elements sit very low in the HTML, move them higher.

References