Technical SEO Guide

Crawlability, indexing, and site performance — the foundation everything else sits on

Content and links can't rank a site that Google can't crawl. Technical SEO is the unglamorous work that makes everything else possible. Fix it once, benefit indefinitely.

Crawlability

Googlebot has to be able to access your pages and be allowed to index them. Three things commonly get in the way:

  • robots.txt Disallow: Check that you're not accidentally blocking pages you want indexed. The classic mistake is Disallow: / deployed to production from a staging config.
  • noindex meta tag: <meta name="robots" content="noindex"> keeps a page out of the index even though Googlebot can still crawl it. Check it didn't get left on from development.
  • Login walls: Content behind authentication can't be crawled. If you want it indexed, it needs to be accessible without login.

Use Google Search Console's URL Inspection tool to see exactly what Googlebot sees when it crawls a specific URL.
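
The first two checks above can also be scripted before a deploy. The following is a minimal sketch, assuming you already have the robots.txt text and the page HTML as strings; the function name crawl_blockers and the class RobotsMetaParser are made up for this example, and Python's standard-library robots.txt parser is only an approximation of how Googlebot reads the file.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",")]


def crawl_blockers(robots_txt, html, url, user_agent="Googlebot"):
    """Return the reasons (if any) why `url` may not be crawled or indexed."""
    problems = []

    # Check 1: does robots.txt disallow this URL for the given crawler?
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(user_agent, url):
        problems.append("blocked by robots.txt")

    # Check 2: does the page carry a noindex directive?
    meta = RobotsMetaParser()
    meta.feed(html)
    if "noindex" in meta.directives:
        problems.append("noindex meta tag")

    return problems
```

An empty list means neither check found a problem. Login walls are not covered here; those need a real request made without credentials.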

robots.txt

Your robots.txt lives at yourdomain.com/robots.txt. It tells crawlers which paths they can and can't access. A minimal but correct robots.txt:

User-agent: *
Allow: /

Sitemap: https://yourdomain.com/sitemap.xml

If you want to block specific paths (admin, internal search results, staging prefixes), add Disallow: /admin/ etc. Don't block CSS or JavaScript — Googlebot needs them to render your pages.
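
To illustrate the CSS and JavaScript warning, here is a rough sketch that tests a list of URLs against a robots.txt file. The rules below are hypothetical, and the Disallow: /assets/ line is a deliberate mistake so the check has something to catch. The helper name blocked_urls is an assumption for this example, and the standard-library parser applies rules in file order, which can differ from Google's matching in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; "Disallow: /assets/" is an intentional error.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /search
Disallow: /assets/

Sitemap: https://yourdomain.com/sitemap.xml
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs from `urls` that the rules disallow for `user_agent`."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return [u for u in urls if not rules.can_fetch(user_agent, u)]


urls = [
    "https://yourdomain.com/admin/users",
    "https://yourdomain.com/blog/post",
    "https://yourdomain.com/assets/site.css",
]
print(blocked_urls(ROBOTS_TXT, urls))
```

The admin URL is blocked on purpose; the stylesheet showing up in the result is the kind of accidental block worth fixing.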

XML Sitemap

A sitemap lists all the URLs you want Google to know about. It doesn't force indexing, but it speeds up discovery — especially for new content and pages without many inbound links.

  • Include only indexable pages (no noindex, no 404, no redirects)
  • Include <lastmod> dates for content that updates regularly
  • Submit via Google Search Console under Sitemaps
  • Fix broken sitemap URLs (Google Search Console will report these)
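
A sitemap following the rules above can be generated with a few lines of code. This is a minimal sketch, assuming the list of indexable URLs and their last-modified dates comes from your own CMS or database; build_sitemap is a name invented for this example and the URLs are placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, lastmod) pairs; lastmod may be None."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            # Only emit <lastmod> when a real modification date is known.
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_sitemap([
    ("https://yourdomain.com/", "2024-01-15"),
    ("https://yourdomain.com/about", None),
]))
```

Filtering out noindex pages, 404s, and redirects is left to the caller, since only your application knows which URLs those are.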

HTTPS

HTTPS has been a Google ranking signal since 2014. The protocol also affects browser behavior — Chrome shows a "Not secure" warning on HTTP sites. There's no argument for HTTP in 2024.

Common issues: mixed content (HTTP assets on an HTTPS page), HTTP→HTTPS redirects using 302 instead of 301, or redirect chains that add latency.
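
The redirect problems are easy to check once you have the list of hops a request went through (most HTTP clients can record this). The sketch below works on that list rather than making requests itself. The function name audit_redirects and the (status, url) input shape are assumptions for this example, and treating 308 as acceptable alongside 301 is a choice made here, since both are permanent redirects.

```python
PERMANENT = {301, 308}  # permanent redirect status codes


def audit_redirects(hops):
    """Flag redirect issues.

    `hops` is a list of (status_code, url) pairs in the order they were
    followed; the last pair is the final, non-redirect response.
    """
    issues = []
    redirects = [(status, url) for status, url in hops if 300 <= status < 400]

    if len(redirects) > 1:
        issues.append(f"redirect chain of {len(redirects)} hops")

    for status, url in redirects:
        if url.startswith("http://") and status not in PERMANENT:
            issues.append(f"{url} redirects with {status} instead of a permanent redirect")

    return issues


hops = [
    (302, "http://example.com/"),
    (301, "https://example.com/"),
    (200, "https://www.example.com/"),
]
for issue in audit_redirects(hops):
    print(issue)
```

Mixed content is a separate check on the page HTML and is not covered by this sketch.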

Core Web Vitals

Google's page experience ranking signals. Three metrics:

LCP — Largest Contentful Paint

How long until the main content loads. Under 2.5 seconds is good. The most common culprit is a large above-the-fold image that isn't preloaded.

INP — Interaction to Next Paint

How responsive the page is to user interactions. Under 200ms is good. Heavy JavaScript on the main thread is the usual problem.

CLS — Cumulative Layout Shift

How much content moves around as the page loads. Under 0.1 is good. Reserve space for images and ads. Don't inject content above existing content.

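Check your CWV scores in PageSpeed Insights and Google Search Console's Core Web Vitals report. Field data (real users) matters more than lab data: the Search Console report is built from real-user measurements, while a lab test only simulates a single page load.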
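
If you collect these metrics yourself, the thresholds can be applied in a few lines. This is a small sketch: the "good" limits are the ones given above, while the "poor" limits (4.0 s, 500 ms, 0.25) are not from this guide and are taken from Google's published Core Web Vitals thresholds. The names THRESHOLDS and rate are made up for the example.

```python
# (upper bound for "good", upper bound for "needs improvement") per metric
THRESHOLDS = {
    "LCP": (2.5, 4.0),   # seconds
    "INP": (200, 500),   # milliseconds
    "CLS": (0.1, 0.25),  # unitless layout-shift score
}


def rate(metric, value):
    """Classify a measurement as 'good', 'needs improvement', or 'poor'."""
    good, needs_improvement = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= needs_improvement:
        return "needs improvement"
    return "poor"


for metric, value in [("LCP", 2.1), ("INP", 350), ("CLS", 0.3)]:
    print(metric, value, rate(metric, value))
```

Google assesses these at the 75th percentile of page loads, so feed the function a p75 value rather than an average where you can.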

Canonical tags

The canonical tag tells Google which version of a URL is the "real" one. Without it, Google might index example.com/page, www.example.com/page, example.com/page?ref=newsletter, and example.com/page/ as separate pages — splitting ranking signals between all of them.

Every page should have a self-referencing canonical. If you have duplicate or near-duplicate content, the canonical on the copies should point to the primary.
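
As an illustration, the URL variants listed above can be collapsed to one canonical form in code. This is a sketch under stated assumptions: the preferred form is taken to be HTTPS on the bare domain with no trailing slash, and the set of tracking parameters is a guess at what a typical site would strip. Both function names are invented for the example.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of parameters that never change page content.
TRACKING_PARAMS = {"ref", "utm_source", "utm_medium", "utm_campaign"}


def canonical_url(url, host="example.com"):
    """Normalize scheme, host, trailing slash, and tracking parameters."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    path = parts.path.rstrip("/") or "/"  # keep "/" for the home page
    return urlunsplit(("https", host, path, urlencode(query), ""))


def canonical_tag(url):
    """Render the <link> element for the page's <head>."""
    return f'<link rel="canonical" href="{escape(canonical_url(url))}">'


for variant in [
    "http://www.example.com/page/",
    "https://example.com/page?ref=newsletter",
    "https://example.com/page",
]:
    print(canonical_tag(variant))
```

All three variants produce the same tag, which is the point: every version of the page names the same primary URL.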

Structured data

JSON-LD structured data, usually placed in the <head>, gives Google context about your content type. It's what makes a page eligible for rich results in search: star ratings, FAQ dropdowns, breadcrumbs, event dates.

Start with the schema type that matches your page: Article, Product, LocalBusiness, FAQPage, WebApplication. Validate using Google's Rich Results Test at search.google.com/test/rich-results.
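
For a sense of what the markup looks like, here is a minimal sketch that renders an Article block. The field selection is a small subset chosen for illustration, not a complete list of what Google recommends, and the function name and sample values (including the author) are hypothetical.

```python
import json


def article_jsonld(headline, author, date_published, url):
    """Render a JSON-LD <script> element describing an Article."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "datePublished": date_published,
        "mainEntityOfPage": url,
    }
    # Escape "</" so a value can never close the <script> element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


print(article_jsonld(
    "Technical SEO Guide",
    "Jane Doe",  # placeholder author
    "2024-01-15",
    "https://example.com/technical-seo-guide",
))
```

Paste the output into the Rich Results Test before shipping it; the validator is the authority on which fields are required for each type.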

URL structure

Short, descriptive, lowercase, hyphens (not underscores). example.com/technical-seo-guide beats example.com/article?id=1294&cat=5&ref=main. Avoid changing URL structures on live sites: every change risks broken links and lost rankings. If a change is unavoidable, 301-redirect each old URL to its new one.
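
The naming rules above are simple enough to automate when URLs are generated from page titles. A minimal sketch, assuming ASCII-only slugs are acceptable for your site (accented characters are reduced to their base letters, other non-ASCII characters are dropped); slugify is a name chosen for this example.

```python
import re
import unicodedata


def slugify(title):
    """Turn a page title into a lowercase, hyphen-separated URL slug."""
    # Split accented letters into base letter + accent, then drop non-ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace every run of other characters (spaces, underscores,
    # punctuation) with a single hyphen and trim hyphens from the ends.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


print(slugify("Technical SEO Guide"))
print(slugify("Core_Web Vitals: LCP, INP & CLS!"))
```

Generate the slug once when the page is created and store it, so later title edits don't silently change a live URL.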

Mobile rendering

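Google uses mobile-first indexing: the mobile version of your site is what Google primarily crawls and indexes. Google retired its standalone Mobile-Friendly Test in late 2023, so test with Lighthouse in Chrome DevTools or the URL Inspection tool in Search Console instead. Check that the mobile version has the same content as desktop, because content that is missing from the mobile version may not be indexed at all.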

Where to start

Fix HTTPS first. Then crawlability (robots.txt, noindex). Then canonical tags. Then Core Web Vitals. Then structured data. That order covers the most impactful technical issues for most sites.
