Error Handling

Errors come in two flavors and they're really not the same thing:

  1. Network / protocol errors. DNS failed, connection refused, TLS handshake blew up, or the request timed out before a response arrived. Either the request never made it, or it made it and the server never replied. These come back as a Go error, a Python exception, a thrown error in Node, or a .NET exception.

  2. Real responses with a non-2xx status. The server got your request, processed it, didn't like it, and sent back 404 or 500 or whatever. The HTTP exchange is done. These come back as normal Response objects. You check StatusCode.

Mixing them up is the most common bug in this space. A 500 isn't a network error. The server told you no. The connection's fine.

The split, with code

resp, err := s.Get(ctx, url)
if err != nil {
    // Network / protocol / context error. No response.
    return err
}
defer resp.Close()

if resp.StatusCode >= 500 {
    // Server-side error. The exchange completed.
    return fmt.Errorf("server error: %d", resp.StatusCode)
}
if resp.StatusCode >= 400 {
    // Client-side error. You sent something the server rejected.
    return fmt.Errorf("client error: %d", resp.StatusCode)
}

// Below 400: a 2xx, or a 3xx if redirects aren't being followed.
body, err := resp.Bytes()
if err != nil {
    // The status arrived, but reading the body failed.
    return err
}

Common error shapes

Things that come back as a real error (not a Response):

DNS failure

dns_resolve nope.example: lookup nope.example: no such host

In Go, this wraps a *net.DNSError. Check IsNotFound, IsTemporary, IsTimeout on it.

var dnsErr *net.DNSError
if errors.As(err, &dnsErr) {
    if dnsErr.IsNotFound { /* domain doesn't exist */ }
    if dnsErr.IsTimeout { /* DNS server didn't reply in time */ }
}

In other bindings the message string contains dns_resolve or lookup.

Connection refused

dial example.com: dial tcp 1.2.3.4:443: connect: connection refused

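Server isn't listening on that port, or a firewall is actively rejecting the connection. (A firewall that silently drops packets shows up as a timeout instead.) Same shape across all bindings.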
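If you'd rather not match on the message string in Go, here's a minimal sketch that checks for the underlying errno instead. It assumes the library keeps the standard *net.OpError in the wrap chain, which isn't confirmed here. The helper names isConnRefused and simulatedRefused are made up for illustration, and the error is built by hand rather than coming from a real dial.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"syscall"
)

// isConnRefused reports whether ECONNREFUSED appears anywhere in err's chain.
// Assumption: the wrap chain from the dial is preserved by the caller's library.
func isConnRefused(err error) bool {
	return errors.Is(err, syscall.ECONNREFUSED)
}

// simulatedRefused builds an error shaped like a refused dial, for demo purposes only.
func simulatedRefused() error {
	opErr := &net.OpError{
		Op:  "dial",
		Net: "tcp",
		Err: os.NewSyscallError("connect", syscall.ECONNREFUSED),
	}
	return fmt.Errorf("dial example.com: %w", opErr)
}

func main() {
	fmt.Println(isConnRefused(simulatedRefused()))        // true
	fmt.Println(isConnRefused(errors.New("other error"))) // false
}
```

If the check comes back false on an error whose message clearly says "connection refused", the chain was flattened somewhere and string matching is the fallback.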

TLS handshake failure

tls: handshake failure
remote error: tls: protocol_version

Could be a cert mismatch, an expired cert, a server that only speaks TLS 1.3 when your config disabled it, or an anti-bot system rejecting your fingerprint at the TLS level. The message usually has a hint, but it's not always easy to read.

Timeout

context deadline exceeded
i/o timeout

In Go: errors.Is(err, context.DeadlineExceeded) returns true when the request didn't finish before your context deadline. There's also a wrapped *net.OpError with Timeout() == true for raw socket timeouts.

ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()

_, err := s.Get(ctx, "https://httpbin.org/delay/10")
if errors.Is(err, context.DeadlineExceeded) {
    // we hit our timeout
}
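To cover both timeout shapes in one check, here's a rough sketch using only the standard library. isTimeout and the two simulated* helpers are hypothetical names. The sketch assumes the wrap chain is intact, and it stands in os.ErrDeadlineExceeded for a raw socket timeout.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"os"
)

// isTimeout reports whether err is a context deadline or any error in the
// chain that implements net.Error and says Timeout() == true.
func isTimeout(err error) bool {
	if errors.Is(err, context.DeadlineExceeded) {
		return true
	}
	var netErr net.Error
	return errors.As(err, &netErr) && netErr.Timeout()
}

// simulatedDeadline mimics a request that outlived its context deadline.
func simulatedDeadline() error {
	return fmt.Errorf("request failed: %w", context.DeadlineExceeded)
}

// simulatedSocketTimeout mimics a low-level "i/o timeout" on a socket.
func simulatedSocketTimeout() error {
	return fmt.Errorf("read: %w", os.ErrDeadlineExceeded)
}

func main() {
	fmt.Println(isTimeout(simulatedDeadline()))      // true
	fmt.Println(isTimeout(simulatedSocketTimeout())) // true
	fmt.Println(isTimeout(context.Canceled))         // false
}
```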

Cancellation

context canceled

Same as a timeout, but voluntary: your code (or a parent context) called cancel before the request finished. Check with errors.Is(err, context.Canceled).
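A small sketch of what cancellation looks like from the caller's side. slowOp is a hypothetical stand-in for a request (it is not the library's Get), so this runs without a network.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// slowOp stands in for a request: it finishes after d unless ctx ends first.
func slowOp(ctx context.Context, d time.Duration) error {
	select {
	case <-time.After(d):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// runAndCancel starts slowOp and cancels it from another goroutine.
func runAndCancel() error {
	ctx, cancel := context.WithCancel(context.Background())
	go func() {
		time.Sleep(10 * time.Millisecond)
		cancel()
	}()
	return slowOp(ctx, 5*time.Second)
}

// wasCanceled reports whether err is a voluntary cancellation.
func wasCanceled(err error) bool {
	return errors.Is(err, context.Canceled)
}

func main() {
	err := runAndCancel()
	fmt.Println(wasCanceled(err))                         // true
	fmt.Println(errors.Is(err, context.DeadlineExceeded)) // false
}
```

Cancellation is usually not worth logging as a failure or retrying, since something in your own program asked for it.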

What's a real response (not an error)

These all come back as a populated Response with a status code, not as an error:

  • 4xx: Bad Request, Unauthorized, Forbidden, Not Found, Method Not Allowed, the usual suspects.
  • 5xx: Internal Server Error, Bad Gateway, Service Unavailable, Gateway Timeout.
  • 3xx redirects (when the lib stops following them, e.g. with WithoutRedirects()).
  • Empty bodies, weird Content-Types, malformed JSON in the body.

The HTTP exchange completed. The server replied. Whether you treat it as a failure is business logic, not a transport concern.
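To make the "business logic" point concrete, here's a sketch of an existence check where a 404 is a perfectly good answer rather than a failure. existsFromStatus is a hypothetical helper, and it takes the status code as a plain int so it doesn't depend on any particular Response type.

```go
package main

import "fmt"

// existsFromStatus interprets a status code for a "does this resource exist?" call.
// A 404 answers the question, so it is not an error here. Other non-2xx codes are.
func existsFromStatus(status int) (bool, error) {
	switch {
	case status >= 200 && status < 300:
		return true, nil
	case status == 404:
		return false, nil
	default:
		return false, fmt.Errorf("unexpected status %d", status)
	}
}

func main() {
	for _, code := range []int{200, 404, 503} {
		exists, err := existsFromStatus(code)
		fmt.Println(code, exists, err)
	}
}
```

A different caller could reasonably treat the same 404 as fatal. The transport layer can't know which, so it hands you the Response either way.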

Retry guidance

Warning: Don't retry on 4xx (429 is the one exception, see the rules below). The server told you no for a reason. Retrying just hammers them and won't change the outcome. Fix the request.

Rough rules:

| Situation | Retry? |
|---|---|
| Network error (DNS, refused, reset) | Yes, with backoff |
| Timeout | Yes, but bump the deadline if the upstream is slow |
| TLS handshake failure | No, fix the config |
| 4xx (other than 429) | No |
| 5xx + idempotent verb (GET, HEAD, PUT, DELETE) | Yes |
| 5xx + POST/PATCH | Only if you're sure the server didn't already process it. POST is not idempotent. |
| 429 Too Many Requests | Yes, but back off harder. Honor Retry-After if present. |
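If you're writing the retry decision yourself, the rules above could be sketched roughly like this. The failure type and shouldRetry are hypothetical names. Working out which kind of failure you have (for example from the error checks earlier on this page) is left to the caller. This is not the library's built-in retry logic.

```go
package main

import "fmt"

// failure says what kind of transport error occurred, if any.
type failure int

const (
	none         failure = iota // a Response came back; look at the status
	network                     // DNS, refused, reset
	timeout                     // deadline or i/o timeout
	tlsHandshake                // TLS setup failed
)

// idempotent lists the verbs treated as safe to repeat.
var idempotent = map[string]bool{"GET": true, "HEAD": true, "PUT": true, "DELETE": true}

// shouldRetry applies the rough rules. status is ignored unless f == none.
func shouldRetry(f failure, method string, status int) bool {
	switch f {
	case network, timeout:
		return true // with backoff; a timeout may also need a longer deadline
	case tlsHandshake:
		return false // a config problem, retrying won't help
	}
	switch {
	case status == 429:
		return true // back off harder and honor Retry-After
	case status >= 500:
		return idempotent[method] // POST/PATCH: only if the caller knows it's safe
	default:
		return false // other 4xx, and anything that isn't a failure
	}
}

func main() {
	fmt.Println(shouldRetry(none, "GET", 503))       // true
	fmt.Println(shouldRetry(none, "POST", 503))      // false
	fmt.Println(shouldRetry(none, "GET", 404))       // false
	fmt.Println(shouldRetry(none, "POST", 429))      // true
	fmt.Println(shouldRetry(tlsHandshake, "GET", 0)) // false
}
```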

The session has built-in retry support. Default is off (issue #57 flipped the default from 3 to 0, because the old behavior silently retried POSTs on 5xx and broke idempotency assumptions):

s := httpcloak.NewSession("chrome-latest",
    httpcloak.WithRetry(3),
    // or fine-grained:
    httpcloak.WithRetryConfig(3, 500*time.Millisecond, 10*time.Second, []int{500, 502, 503, 504}),
)

Pass status codes you actually want to retry on. Don't blindly retry on 4xx.
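If you handle retries by hand instead, for example to honor Retry-After on a 429, the wait calculation might look something like the sketch below. backoff, retryAfter and nextWait are hypothetical helpers, and the 500ms / 10s bounds are illustrative choices. Nothing here claims to match how the built-in retry computes its delays. Only the delay-seconds form of Retry-After is parsed (the HTTP-date form is skipped), and jitter is left out to keep it short.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// backoff returns the wait before retry number attempt (0-based):
// base doubled once per attempt, capped at max.
func backoff(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= max {
			return max
		}
	}
	if d > max {
		return max
	}
	return d
}

// retryAfter parses the delay-seconds form of a Retry-After header value.
func retryAfter(header string) (time.Duration, bool) {
	secs, err := strconv.Atoi(strings.TrimSpace(header))
	if err != nil || secs < 0 {
		return 0, false
	}
	return time.Duration(secs) * time.Second, true
}

// nextWait uses the server's Retry-After when it asks for longer than our backoff.
func nextWait(attempt int, header string) time.Duration {
	wait := backoff(attempt, 500*time.Millisecond, 10*time.Second)
	if ra, ok := retryAfter(header); ok && ra > wait {
		return ra
	}
	return wait
}

func main() {
	fmt.Println(nextWait(0, ""))   // 500ms
	fmt.Println(nextWait(3, ""))   // 4s
	fmt.Println(nextWait(0, "7"))  // 7s
	fmt.Println(nextWait(10, "1")) // 10s
}
```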

A timeout test

Easiest way to verify your timeout handling: hit httpbin.org/delay/N with a context shorter than N.

ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
defer cancel()

_, err := s.Get(ctx, "https://httpbin.org/delay/10")
fmt.Println(err)
// dial httpbin.org [h1]: dial tcp4 ...: i/o timeout
fmt.Println(errors.Is(err, context.DeadlineExceeded))
// true

/delay/10 makes the server sit on the request for 10 seconds. With a 2-second timeout you'll get back a deadline-exceeded error, no Response.
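If you'd rather not depend on httpbin.org in a test, the same idea can be sketched offline with a local server that stalls. This version uses Go's standard net/http client and httptest, not the library's session, so it only demonstrates the context-deadline behavior. slowGet and timedOut are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// slowGet starts a local server that stalls for serverDelay, then requests it
// with a context that expires after clientTimeout. It returns the client error.
func slowGet(serverDelay, clientTimeout time.Duration) error {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(serverDelay):
			w.WriteHeader(http.StatusOK)
		case <-r.Context().Done(): // the client gave up, stop waiting
		}
	}))
	defer srv.Close()

	ctx, cancel := context.WithTimeout(context.Background(), clientTimeout)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, srv.URL, nil)
	if err != nil {
		return err
	}
	resp, err := srv.Client().Do(req)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

// timedOut reports whether err is a context deadline error.
func timedOut(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	err := slowGet(2*time.Second, 100*time.Millisecond)
	fmt.Println(timedOut(err)) // true
}
```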

Logging tip

When debugging unknown failures in prod, log three things:

  1. The full error message (don't strip the wrap chain).
  2. The error's Go type (or Python class) so you can pattern-match later.
  3. If you got a Response: the status code and a tail of the body.

The error message alone isn't always enough to tell DNS-failed from timeout-on-DNS-server. The type usually is.
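One way to capture those three things in Go, as a rough sketch. typeChain and bodyTail are hypothetical helpers, and the sample error is built by hand to resemble the DNS failure shown earlier. typeChain follows single-error wrapping only; errors that wrap several errors at once would need extra handling.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"strings"
)

// typeChain returns the Go type of every error in the wrap chain, outermost first.
func typeChain(err error) string {
	var types []string
	for e := err; e != nil; e = errors.Unwrap(e) {
		types = append(types, fmt.Sprintf("%T", e))
	}
	return strings.Join(types, " -> ")
}

// bodyTail returns at most the last n bytes of body (n >= 0) for logging.
func bodyTail(body []byte, n int) string {
	if len(body) > n {
		body = body[len(body)-n:]
	}
	return string(body)
}

// sampleDNSError builds a DNS-shaped error for the demo.
func sampleDNSError() error {
	return fmt.Errorf("dns_resolve nope.example: %w",
		&net.DNSError{Err: "no such host", Name: "nope.example", IsNotFound: true})
}

func main() {
	err := sampleDNSError()
	// 1 and 2: full message plus the type of each link in the chain.
	fmt.Printf("error=%q types=%s\n", err, typeChain(err))
	// 3: status code and the tail of the body, when a Response exists.
	fmt.Printf("status=%d tail=%q\n", 502, bodyTail([]byte("<html>upstream connect error</html>"), 16))
}
```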