---
title: Using Caddy with iocaine
description: Setting up Caddy to front for iocaine
---
# Getting started
Here, I assume that iocaine has already been [configured](@/configuration/index.md) and [deployed](@/deploying/iocaine.md). Let's assume that we have a site running at `[::1]:8080`, and we want to serve that with `Caddy`. Normally, that would look something like this:
```caddyfile
blog.example.com {
	reverse_proxy [::1]:8080
}
```
# Routing AI agents elsewhere
To serve `iocaine`'s garbage to AI visitors, what we need is a matcher, and a matched `reverse_proxy`:
```caddyfile
blog.example.com {
	@ai {
		header_regexp user-agent (?i:gptbot|chatgpt|ccbot|claude)
	}
	reverse_proxy @ai 127.0.0.1:42069
	reverse_proxy [::1]:8080
}
```
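A quick way to sanity-check the routing is to request the site once with a user agent that matches the `@ai` matcher, and once without, and compare the responses. A rough sketch, assuming the hostname from the example above:
```sh
# This should land on iocaine (the user agent matches the @ai matcher)
curl -sA "GPTBot" https://blog.example.com/ | head

# This should be served by the real site at [::1]:8080
curl -s https://blog.example.com/ | head
```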
# Applying rate limits
We can do even better than this, though! We can apply rate limits using [caddy-ratelimit](https://github.com/mholt/caddy-ratelimit)! Unfortunately, that leads to a slightly more complex configuration, involving a bit of repetition, but one we can mitigate with a snippet. Let's start with that:
```caddyfile
(ai-bots) {
	header_regexp user-agent (?i:gptbot|chatgpt|ccbot|claude)
}
```
This is essentially the same thing as the `@ai` matcher, lifted out into a snippet. It had to be lifted out because the same matcher has to be reused in slightly differing contexts, including ones where I can't use a named matcher. It sounds more complicated than it is, really, so let me show the final result:
```caddyfile
blog.example.com {
	rate_limit {
		zone ai-bots {
			match {
				import ai-bots
			}
			key {header.User-Agent}
			events 16
			window 1m
		}
	}
	@ai {
		import ai-bots
	}
	@not-ai {
		not {
			import ai-bots
		}
	}
	reverse_proxy @ai 127.0.0.1:42069
	reverse_proxy @not-ai [::1]:8080
}
```
This does two things: it routes AI user-agents to `iocaine`, and applies a 16 requests / minute rate limit. If the rate limit is exceeded, Caddy will return an HTTP 429 ("Too Many Requests") with a `Retry-After` header, to encourage them to come back to our little maze later. Rate limiting is keyed by user agent, because most crawlers use *many* hosts to crawl a site at the same time: each host on its own would remain well under any reasonable per-host limit, but together, they're a massive pain. Keying by user agent catches the whole swarm at once.
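Note that `rate_limit` is not part of a standard Caddy build: Caddy needs to be compiled with the [caddy-ratelimit](https://github.com/mholt/caddy-ratelimit) module included. One way to do that is with [xcaddy](https://github.com/caddyserver/xcaddy), roughly like this:
```sh
# Build a Caddy binary that includes the rate_limit directive
xcaddy build --with github.com/mholt/caddy-ratelimit
```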