Technical SEO Guide
Crawlability, indexing, and site performance — the foundation everything else sits on
Content and links can't rank a site that Google can't crawl. Technical SEO is the unglamorous work that makes everything else possible. Fix it once, benefit indefinitely.
Crawlability
Googlebot has to be able to access your pages. Three things block it:
- robots.txt Disallow: Check that you're not accidentally blocking pages you want indexed. The classic mistake is Disallow: / deployed to production from a staging config.
- noindex meta tag: <meta name="robots" content="noindex"> prevents indexing entirely. Check it didn't get left on from development.
- Login walls: Content behind authentication can't be crawled. If you want it indexed, it needs to be accessible without login.
Use Google Search Console's URL Inspection tool to see exactly what Googlebot sees when it crawls a specific URL.
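For checking many pages at once, a leftover noindex tag can also be caught with a small script. The sketch below uses only Python's standard library; the class and function names are illustrative, and it inspects the HTML only, so a noindex sent through an X-Robots-Tag HTTP header would not be detected.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in {"robots", "googlebot"}:
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page's robots meta tags ask not to be indexed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in parser.directives or "none" in parser.directives


print(has_noindex('<head><meta name="robots" content="noindex"></head>'))
print(has_noindex('<head><meta name="robots" content="index, follow"></head>'))
```

Running this against the HTML of each production URL after a deploy is one way to spot a tag that was left on from development.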
robots.txt
Your robots.txt lives at yourdomain.com/robots.txt. It tells crawlers which paths they can and can't access. A minimal but correct robots.txt:
User-agent: *
Allow: /
Sitemap: https://yourdomain.com/sitemap.xml
If you want to block specific paths (admin, internal search results, staging prefixes), add Disallow: /admin/ etc. Don't block CSS or JavaScript — Googlebot needs them to render your pages.
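Rules like these can be sanity-checked before deployment with Python's standard-library urllib.robotparser. This is a minimal sketch: the ROBOTS text and URLs are placeholders, and the helper name is an assumption. One caveat: Python's parser applies the first rule that matches, while Google uses the most specific (longest) matching rule, so results can differ when Allow and Disallow rules overlap.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Check a URL against robots.txt content supplied as a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder rules: block the admin area, allow everything else.
ROBOTS = """\
User-agent: *
Disallow: /admin/
Allow: /
Sitemap: https://yourdomain.com/sitemap.xml
"""

# The staging config that should never reach production.
STAGING_ROBOTS = """\
User-agent: *
Disallow: /
"""

print(is_crawlable(ROBOTS, "https://yourdomain.com/blog/post"))
print(is_crawlable(ROBOTS, "https://yourdomain.com/admin/users"))
print(is_crawlable(STAGING_ROBOTS, "https://yourdomain.com/blog/post"))
```

A check like this fits naturally into a deploy pipeline: assert that a handful of important URLs are crawlable and fail the build if they are not.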
XML Sitemap
A sitemap lists all the URLs you want Google to know about. It doesn't force indexing, but it speeds up discovery — especially for new content and pages without many inbound links.
- ✓ Include only indexable pages (no noindex, no 404, no redirects)
- ✓ Include <lastmod> dates for content that updates regularly
- ✓ Submit via Google Search Console under Sitemaps
- ✗ Broken sitemap URLs (Google will report these)
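As an illustration of the format, the sketch below builds a small sitemap with Python's standard library. The URLs and dates are placeholders and build_sitemap is a hypothetical helper; a real generator would pull the page list from the CMS or router and skip anything that is noindexed, redirected, or returning 404.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, lastmod) pairs.

    lastmod is a YYYY-MM-DD string, or None to omit the element.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder pages for illustration only.
pages = [
    ("https://yourdomain.com/", "2024-01-15"),
    ("https://yourdomain.com/technical-seo-guide", None),
]
print(build_sitemap(pages))
```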
HTTPS
HTTPS has been a Google ranking signal since 2014. The protocol also affects browser behavior — Chrome shows a "Not secure" warning on HTTP sites. There's no argument for HTTP in 2024.
Common issues: mixed content (HTTP assets on an HTTPS page), HTTP→HTTPS redirects using 302 instead of 301, or redirect chains that add latency.
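Mixed content is straightforward to scan for. The following is a rough heuristic sketch rather than a full audit: it flags any src attribute and any <link href> that starts with http://, and it ignores ordinary <a> links because those are navigation, not assets loaded by the page. The names and sample markup are illustrative, and it will not see assets requested from CSS or JavaScript.

```python
from html.parser import HTMLParser


class MixedContentFinder(HTMLParser):
    """Collects (tag, url) pairs for assets loaded over plain HTTP."""

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if not value or not value.lower().startswith("http://"):
                continue
            # src loads a subresource on any tag; href only counts on <link>.
            if name == "src" or (tag == "link" and name == "href"):
                self.insecure.append((tag, value))


def find_mixed_content(html: str):
    finder = MixedContentFinder()
    finder.feed(html)
    return finder.insecure


# Sample markup for illustration only.
page = """
<link rel="stylesheet" href="http://yourdomain.com/style.css">
<img src="http://yourdomain.com/hero.png">
<script src="https://yourdomain.com/app.js"></script>
<a href="http://example.org/">an outbound link</a>
"""
for tag, url in find_mixed_content(page):
    print(tag, url)
```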
Core Web Vitals
Google's page experience ranking signals. Three metrics:
LCP — Largest Contentful Paint
How long until the main content loads. Under 2.5 seconds is good. The most common culprit is a large above-the-fold image that isn't preloaded.
INP — Interaction to Next Paint
How responsive the page is to user interactions. Under 200ms is good. Heavy JavaScript on the main thread is the usual problem.
CLS — Cumulative Layout Shift
How much content moves around as the page loads. Under 0.1 is good. Reserve space for images and ads. Don't inject content above existing content.
Check your CWV scores in PageSpeed Insights and Google Search Console's Core Web Vitals report. Field data (real users) matters more than lab data.
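When reviewing exported field data in bulk, the thresholds can be applied mechanically. In this sketch the "good" limits come from the figures above; the upper limits (4.0 s, 500 ms, 0.25) are Google's published boundaries for "poor" and are not stated in this guide. The function and dictionary names are illustrative.

```python
# (good_limit, poor_limit) per metric. Units: LCP in seconds, INP in ms, CLS unitless.
THRESHOLDS = {
    "lcp": (2.5, 4.0),
    "inp": (200, 500),
    "cls": (0.1, 0.25),
}


def rate(metric: str, value: float) -> str:
    """Classify a Core Web Vitals measurement."""
    good_limit, poor_limit = THRESHOLDS[metric.lower()]
    if value <= good_limit:
        return "good"
    if value <= poor_limit:
        return "needs improvement"
    return "poor"


# Hypothetical measurements for one page.
sample = {"lcp": 2.1, "inp": 350, "cls": 0.3}
for metric, value in sample.items():
    print(metric.upper(), value, rate(metric, value))
```

Google evaluates these at the 75th percentile of real-user page loads, so the value passed in should be the p75 figure from field data rather than a single lab run.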
Canonical tags
The canonical tag tells Google which version of a URL is the "real" one. Without it, Google might index example.com/page, www.example.com/page, example.com/page?ref=newsletter, and example.com/page/ as separate pages — splitting ranking signals between all of them.
Every page should have a self-referencing canonical. If you have duplicate or near-duplicate content, the canonical on the copies should point to the primary.
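One way to generate consistent canonicals is to normalize every URL variant to a single form. The sketch below makes several assumptions that should be adapted per site: HTTPS is the preferred scheme, the bare domain is preferred over www, trailing slashes are dropped, and the TRACKING_PARAMS set lists the query parameters that never change page content. All names here are illustrative.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of parameters that only carry tracking information.
TRACKING_PARAMS = {"ref", "utm_source", "utm_medium", "utm_campaign"}


def canonical_url(url: str) -> str:
    """Normalize an absolute URL to its canonical form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    return urlunsplit(("https", host, path, urlencode(query), ""))


def canonical_link_tag(url: str) -> str:
    """Render the <link> element for the page's <head>."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'


variants = [
    "https://example.com/page",
    "https://www.example.com/page",
    "https://example.com/page?ref=newsletter",
    "https://example.com/page/",
]
for variant in variants:
    print(canonical_link_tag(variant))
```

All four variants produce the same tag, which is the point: every duplicate declares the same primary URL.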
Structured data
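JSON-LD structured data, typically placed in the <head>, gives Google context about your content type. It's what makes a page eligible for rich results in search: star ratings, FAQ dropdowns, breadcrumbs, event dates. Valid markup doesn't guarantee a rich result; Google decides when to show one.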
Start with the schema type that matches your page: Article, Product, LocalBusiness, FAQPage, WebApplication. Validate using Google's Rich Results Test at search.google.com/test/rich-results.
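As a sketch of what the markup looks like, the following generates an Article block with Python's json module. The property values are placeholders, the function name is hypothetical, and the set of properties is a minimal illustration rather than the full list Google recommends for Article.

```python
import json


def article_jsonld(headline: str, author: str, date_published: str, url: str) -> str:
    """Render a schema.org Article as a JSON-LD script tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "datePublished": date_published,
        "mainEntityOfPage": url,
    }
    # Escape "</" so a value can never close the script element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Placeholder values for illustration only.
print(
    article_jsonld(
        headline="Technical SEO Guide",
        author="Jane Doe",
        date_published="2024-01-15",
        url="https://example.com/technical-seo-guide",
    )
)
```

Building the block from a dictionary rather than a hand-written string avoids the quoting and trailing-comma mistakes that commonly fail validation.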
URL structure
Short, descriptive, lowercase, hyphens (not underscores). example.com/technical-seo-guide beats example.com/article?id=1294&cat=5&ref=main. Avoid changing URL structures on live sites — every change risks broken links and lost rankings.
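These rules are easy to enforce in code when URLs are generated from page titles. A minimal sketch, assuming ASCII-only slugs are acceptable (accented characters are reduced to their base letters and anything else non-alphanumeric becomes a hyphen); slugify is an illustrative name, not a library function.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a page title into a short, lowercase, hyphenated URL slug."""
    # Decompose accented characters, then drop anything outside ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse every run of non-alphanumerics (spaces, underscores, punctuation).
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower())
    return slug.strip("-")


print(slugify("Technical SEO Guide"))
print(slugify("Core_Web Vitals: LCP & INP!"))
```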
Mobile rendering
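Google uses mobile-first indexing: the mobile version of your site is what Google primarily crawls and indexes. Google retired its standalone Mobile-Friendly Test in late 2023, so test with Lighthouse in Chrome DevTools or the URL Inspection tool in Search Console instead. Check that the mobile version has the same content as desktop. Content tucked into tabs or accordions on mobile can still be indexed, but content that is missing from the mobile page entirely generally won't be.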
Where to start
Fix HTTPS first. Then crawlability (robots.txt, noindex). Then canonical tags. Then Core Web Vitals. Then structured data. That order covers the most impactful technical issues for most sites.