
Reducing 404 Errors with Log Analysis

Find and fix broken links that hurt your SEO and user experience. Learn to prioritize 404s by traffic impact and recover lost link equity.


The Impact of 404 Errors

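404 errors hurt your website in several ways: visitors land on dead ends, crawlers spend crawl budget on URLs that return nothing, and backlinks pointing at a missing page stop passing link equity.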

🔑 Priority Rule: Fix 404s that receive the most traffic and have the most backlinks first. Not all 404s are equal.

Types of 404 Errors

Real 404s (Legitimate)

Content that was deleted intentionally and has no replacement. A 404 is the correct response for these URLs.

Soft 404s (Problematic)

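Pages that return 200 OK but show "not found" content. Search engines treat these as a problem: the status code says the page exists, so crawlers keep requesting it and crawl budget is wasted on pages with no real content.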
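
Soft 404s never show up as 404 in the logs, so they need a different heuristic. One rough sketch, assuming the combined log format (status in field 9, response size in field 10) and assuming your "not found" template renders at the same byte size every time: look for a 200 response size shared by many distinct URLs. The function name soft_404_candidates and the default threshold are illustrative, and the output is only a list of candidates to inspect by hand.

```shell
# soft_404_candidates LOGFILE [MIN_URLS]
# Prints "<distinct_urls> <bytes>" for every 200-response size that is shared
# by at least MIN_URLS distinct URLs (default 5), largest group first.
soft_404_candidates() {
    awk -v min="${2:-5}" '
        $9 == 200 && $10 ~ /^[0-9]+$/ {
            key = $10 SUBSEP $7
            # Count each (size, URL) pair once so repeat hits do not inflate it
            if (!(key in seen)) { seen[key] = 1; urls[$10]++ }
        }
        END {
            for (size in urls)
                if (urls[size] >= min) print urls[size], size
        }
    ' "$1" | sort -rn
}
```

Usage: soft_404_candidates access.log 10. A size that appears for dozens of unrelated URLs is worth opening in a browser; legitimate pages can also share a size, so treat hits as leads rather than proof.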

Broken 404s (Most Damaging)

Content that should exist but doesn't:

Finding 404s in Server Logs

Basic 404 Discovery

# Find all 404 errors
awk '$9 == 404' access.log

# Count 404s by URL
awk '$9 == 404 {print $7}' access.log | sort | uniq -c | sort -rn | head -50

# 404s with referrer (find broken internal links)
awk '$9 == 404' access.log | awk -F'"' '{print $4, "->", $2}' | grep -v "^-"
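
The per-URL count above treats /old?utm=a and /old?utm=b as different URLs, which can hide a heavily hit path behind many small entries. A minimal sketch that strips the query string before counting, again assuming the combined log format; top_404_paths is a hypothetical helper name.

```shell
# top_404_paths LOGFILE [LIMIT]
# Counts 404s per path with the query string removed, most frequent first.
top_404_paths() {
    awk '$9 == 404 { sub(/\?.*/, "", $7); print $7 }' "$1" |
        sort | uniq -c | sort -rn | head -n "${2:-50}"
}
```

Usage: top_404_paths access.log 20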

Find 404s Hit by Googlebot

# Googlebot 404s (priority - affects SEO)
grep "Googlebot" access.log | awk '$9 == 404 {print $7}' | sort | uniq -c | sort -rn

# Googlebot 404s per day (trending); tr strips the leading "[" from the timestamp
grep "Googlebot" access.log | awk '$9 == 404 {print $4}' | cut -d: -f1 | tr -d '[' | sort | uniq -c
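
Raw counts are hard to judge without a baseline, so it can help to express Googlebot 404s as a share of all Googlebot requests. The sketch below assumes the combined log format with the user agent as the sixth quote-delimited field; googlebot_404_rate is an illustrative name. It matches on the user-agent string only, which can be spoofed, so the figure is approximate.

```shell
# googlebot_404_rate LOGFILE
# Prints "<404 hits>/<all hits> (<percent>)" for requests whose user agent
# contains "Googlebot".
googlebot_404_rate() {
    awk -F'"' '
        $6 ~ /Googlebot/ {
            total++
            split($3, f, " ")          # $3 holds " <status> <bytes> "
            if (f[1] == 404) errors++
        }
        END {
            if (total)
                printf "%d/%d (%.1f%%)\n", errors, total, 100 * errors / total
            else
                print "no Googlebot requests"
        }
    ' "$1"
}
```

Usage: googlebot_404_rate access.log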

Find Referrers (Source of Broken Links)

# Where are 404 clicks coming from?
awk '$9 == 404' access.log | awk -F'"' '{print $4}' | sort | uniq -c | sort -rn | head -30

# Internal referrers only (replace yourdomain.com with your own host;
# use grep -v instead to list external and empty referrers)
awk '$9 == 404' access.log | awk -F'"' '{print $4}' | grep "yourdomain.com" | sort | uniq -c
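
To see the internal/external split as three numbers in one pass, the referrer host can be compared against your own domain. This is a sketch under assumptions: combined log format, a hypothetical helper called referrer_split, and a simple rule that treats the bare domain and its www. variant as internal (other subdomains count as external).

```shell
# referrer_split LOGFILE DOMAIN
# Prints how many 404 hits came from internal, external, and empty referrers.
referrer_split() {
    awk -F'"' -v domain="$2" '
        {
            split($3, f, " ")               # $3 holds " <status> <bytes> "
            if (f[1] != 404) next
            host = $4
            if (host == "-" || host == "") { none++; next }
            sub("^[a-z]+://", "", host)     # drop the scheme
            sub("[/:?#].*", "", host)       # drop port, path, query, fragment
            sub("^www\\.", "", host)
            if (host == domain) internal++
            else external++
        }
        END { printf "internal=%d external=%d none=%d\n", internal, external, none }
    ' "$1"
}
```

Usage: referrer_split access.log yourdomain.com. A high internal count means the fix is in your own templates or content; a high external count points toward redirects.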

Prioritizing What to Fix

Priority Matrix

  1. High traffic + backlinks: Fix immediately with 301 redirect
  2. High traffic, no backlinks: Redirect or recreate content
  3. Low traffic + backlinks: Redirect to relevant page
  4. Low traffic, no backlinks: Leave as 404 or redirect to category

Calculating Impact

# Combine with traffic data
# Most hit 404s = highest priority
awk '$9 == 404 {print $7}' access.log | sort | uniq -c | sort -rn | head -20

# Check if important pages link to 404s
# (high referring page authority = high priority)
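
The matrix above can be approximated from the log alone. The sketch below prints, for each 404 URL, total hits, Googlebot hits, and the number of distinct external referrer hosts, sorted by traffic first and referrers second. Two assumptions to keep in mind: the combined log format, and the use of distinct external referrer hosts as a stand-in for backlinks. The log only sees links that someone actually clicked, so a backlink tool remains the better source. prioritize_404s is a hypothetical helper name.

```shell
# prioritize_404s LOGFILE DOMAIN
# Output columns: hits, Googlebot hits, distinct external referrer hosts, URL
prioritize_404s() {
    awk -F'"' -v domain="$2" '
        {
            split($3, f, " ")               # $3 holds " <status> <bytes> "
            if (f[1] != 404) next
            split($2, req, " ")             # $2 holds "GET /path HTTP/1.1"
            url = req[2]
            hits[url]++
            if ($6 ~ /Googlebot/) bot[url]++
            host = $4
            sub("^[a-z]+://", "", host)
            sub("[/:?#].*", "", host)
            sub("^www\\.", "", host)
            if (host != "-" && host != "" && host != domain &&
                !((url, host) in seen)) {
                seen[url, host] = 1
                ext[url]++
            }
        }
        END {
            for (url in hits)
                printf "%d %d %d %s\n", hits[url], bot[url], ext[url], url
        }
    ' "$1" | sort -k1,1nr -k3,3nr
}
```

Usage: prioritize_404s access.log yourdomain.com | head -20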

💡 Pro Tip: Use LogBeast to automatically prioritize 404s by traffic volume, Googlebot hits, and referrer importance. Get a fix-first list without manual analysis.

How to Fix 404 Errors

1. Redirect (Most Common)

# .htaccess (Apache)
Redirect 301 /old-page /new-page

# Or with pattern matching
RedirectMatch 301 ^/products/old-(.*)$ /products/new-$1

# nginx
location = /old-page {
    return 301 /new-page;
}

# Or in server block
rewrite ^/old-page$ /new-page permanent;

2. Restore Content

If the page was deleted accidentally, restore it at its original URL so that existing links keep working without a redirect.

3. Update Internal Links

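# Find pages linking to 404
# (match the requested path only, so lines where the URL merely appears as a referrer are excluded)
awk 'index($7, "/old-broken-url") == 1' access.log | awk -F'"' '{print $4}' | sort -u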

# Then update those pages to use correct URLs

4. Custom 404 Page

For legitimate 404s, create a helpful error page that still returns a 404 status code, so it does not turn into a soft 404.

Preventing Future 404s

Before Deleting Content

  1. Check for incoming traffic in analytics
  2. Check for backlinks (Ahrefs, Search Console)
  3. Set up redirect before deletion
  4. Update internal links
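
The first check in the list above can also be run against the server log, which catches crawler traffic that analytics tools usually miss. A minimal sketch, assuming the combined log format; pre_delete_check is an illustrative name and the URL must match the logged request path exactly, including any query string.

```shell
# pre_delete_check LOGFILE URL
# Reports how often URL was requested and how many of those hits were Googlebot.
pre_delete_check() {
    awk -v url="$2" '
        $7 == url {
            hits++
            if ($0 ~ /Googlebot/) bot++
        }
        END { printf "hits=%d googlebot=%d\n", hits, bot }
    ' "$1"
}
```

Usage: pre_delete_check access.log /page-to-delete. If either number is above zero, set up the redirect before removing the page.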

During Site Migrations

  1. Map all old URLs to new URLs
  2. Set up 301 redirects
  3. Test thoroughly before launch
  4. Monitor 404s after migration
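
For the first two migration steps, the URL map can live in a plain text file and the redirect rules can be generated from it. The sketch below assumes a hypothetical map file with one "old-path new-path" pair per line and emits Apache Redirect 301 lines like the ones shown earlier; make_redirects is an illustrative name, and paths containing spaces are not handled.

```shell
# make_redirects MAPFILE
# MAPFILE holds one "old-path new-path" pair per line.
# Blank lines and lines starting with # are skipped.
make_redirects() {
    awk '
        /^[[:space:]]*(#|$)/ { next }
        NF != 2 || $1 !~ /^\// {
            # Report malformed lines on stderr and keep going
            printf "skipping line %d: %s\n", NR, $0 > "/dev/stderr"
            next
        }
        $1 == $2 { next }       # a redirect to itself would loop
        { printf "Redirect 301 %s %s\n", $1, $2 }
    ' "$1"
}
```

Usage: make_redirects url-map.txt >> .htaccess. Review the generated rules before deploying; Apache's Redirect matches by path prefix, so a rule for /old also applies to /old/anything.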

Ongoing Monitoring

#!/bin/bash
# Weekly 404 report (assumes access.log is rotated weekly, so it covers one week of requests)
echo "=== 404 Report ==="
echo "Total 404s this week:"
awk '$9 == 404' access.log | wc -l
echo ""
echo "Top 404 URLs:"
awk '$9 == 404 {print $7}' access.log | sort | uniq -c | sort -rn | head -20
echo ""
echo "Googlebot 404s:"
grep "Googlebot" access.log | awk '$9 == 404 {print $7}' | sort | uniq -c | sort -rn | head -10
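
A weekly total says little about what changed. One way to surface only the URLs that started returning 404 since the last report is to diff the 404 sets of two log files. This sketch assumes the combined log format and that the previous period is available as a separate file (access.log.1 is a common rotation name, but that is an assumption about your setup); new_404s is a hypothetical helper.

```shell
# new_404s PREVIOUS_LOG CURRENT_LOG
# Prints URLs that return 404 in CURRENT_LOG but did not in PREVIOUS_LOG.
new_404s() {
    prev=$(mktemp) || return 1
    curr=$(mktemp) || { rm -f "$prev"; return 1; }
    awk '$9 == 404 {print $7}' "$1" | sort -u > "$prev"
    awk '$9 == 404 {print $7}' "$2" | sort -u > "$curr"
    comm -13 "$prev" "$curr"        # keep lines found only in the current set
    rm -f "$prev" "$curr"
}
```

Usage: new_404s access.log.1 access.log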

🎯 Recommendation: Set up weekly 404 monitoring. Catching broken links early prevents traffic loss and maintains link equity. LogBeast includes automated 404 alerts and prioritized fix lists.