---
title: Using Caddy with iocaine
description: Setting up Caddy to front for iocaine
---
## Getting started
Here, I assume that iocaine has already been configured and deployed. Let's assume that we have a site running at `[::1]:8080`, and we want to serve that with Caddy. Normally, that would look something like this:
```caddyfile
blog.example.com {
	reverse_proxy [::1]:8080
}
```
## Routing AI agents elsewhere
To serve iocaine's garbage to AI visitors, what we need is a matcher, and a matched `reverse_proxy`:
```caddyfile
blog.example.com {
	@ai {
		header_regexp user-agent (?i:gptbot|chatgpt|ccbot|claude)
	}

	reverse_proxy @ai 127.0.0.1:42069
	reverse_proxy [::1]:8080
}
```
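As an aside, the same routing can also be expressed with mutually exclusive `handle` blocks, which avoids having to think about the ordering of the two `reverse_proxy` directives. This is just a sketch of an equivalent variant, not something the rest of this page builds on:

```caddyfile
blog.example.com {
	@ai {
		header_regexp user-agent (?i:gptbot|chatgpt|ccbot|claude)
	}

	# Requests matching @ai go to iocaine...
	handle @ai {
		reverse_proxy 127.0.0.1:42069
	}

	# ...everything else goes to the real site.
	handle {
		reverse_proxy [::1]:8080
	}
}
```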
## Applying rate limits
We can do even better than this, though! We can apply rate limits using caddy-ratelimit! Unfortunately, that leads to a slightly more complex configuration, involving a bit of repetition, but one we can mitigate with a snippet. Let's start with that:
```caddyfile
(ai-bots) {
	header_regexp user-agent (?i:gptbot|chatgpt|ccbot|claude)
}
```
This is essentially the same thing as the `@ai` matcher, lifted out. The reason it had to be lifted out is that the same matcher has to be reused in slightly differing contexts, including ones where I can't use a named matcher. It sounds more complicated than it is, really, so let me show the final result:
```caddyfile
blog.example.com {
	rate_limit {
		zone ai-bots {
			match {
				import ai-bots
			}
			key {user_agent}
			events 16
			window 1m
		}
	}

	@ai {
		import ai-bots
	}
	@not-ai {
		not {
			import ai-bots
		}
	}

	reverse_proxy @ai 127.0.0.1:42069
	reverse_proxy @not-ai [::1]:8080
}
```
This does two things: it routes AI user agents to iocaine, and it applies a 16 requests / minute rate limit, keyed by user agent. If the rate limit is exceeded, Caddy will return an HTTP 429 ("Too Many Requests") with a `Retry-After` header, to encourage them to come back to our little maze. The rate limit is keyed by user agent rather than by client address, because most crawlers use many hosts to crawl a site at the same time: each host on its own would stay well under any reasonable limit, but together they're a massive pain.
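For contrast, if you ever did want per-client limits instead, the zone's `key` could be swapped for Caddy's `{remote_host}` placeholder (the client address). This is only a hypothetical variant of the zone above, not what this page recommends:

```caddyfile
rate_limit {
	zone ai-bots {
		match {
			import ai-bots
		}
		# Hypothetical variant: limit each client address separately,
		# instead of limiting a user agent as a whole.
		key {remote_host}
		events 16
		window 1m
	}
}
```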