Engineering · April 14, 2026

How Our Redis Caching Layer Achieves Sub-Millisecond Redirects

by Qorlivo Team

When someone clicks a short link, they expect it to be instant. Not "pretty fast" — instant. Every millisecond of redirect latency is a millisecond where your user is staring at a blank screen wondering if the link is broken.

At Qorlivo, our redirect handler responds in under a millisecond for cached links. Here's how we built it.

The architecture

Our redirect flow has two layers:

  1. Redis (Upstash) — edge-deployed, sub-millisecond reads
  2. Convex — source of truth, handles cache misses

When a request hits qrlvo.co/abc123, the handler does this:

Request → Redis lookup → HIT? → 302 redirect
                       → MISS? → Convex query → cache in Redis → 302 redirect

The vast majority of redirects never touch the database. Redis serves them directly from the edge.
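In code, the flow above is a classic cache-aside lookup. Here's a minimal sketch with in-memory Maps standing in for Redis and Convex (the names and shapes here are illustrative, not our production code):

```typescript
// Toy model of the two-layer redirect lookup.
type LinkRecord = { url: string; isActive: boolean };

function resolveSlug(
  cache: Map<string, string>,   // stands in for Redis
  db: Map<string, LinkRecord>,  // stands in for Convex (source of truth)
  slug: string
): string | null {
  const key = `link:${slug}`;

  // Layer 1: cache hit, return immediately, no database involved
  const cached = cache.get(key);
  if (cached) return cached;

  // Layer 2: cache miss, fall through to the source of truth
  const link = db.get(slug);
  if (!link || !link.isActive) return null;

  // Repopulate the cache so the next request for this slug is a hit
  cache.set(key, link.url);
  return link.url;
}
```

The key property: after one miss, every subsequent request for the same slug resolves entirely in layer 1.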

Why Upstash Redis

We chose Upstash for a few reasons:

  • Serverless-native — no persistent connections, works perfectly with Vercel's serverless functions
  • Global replication — data is replicated to multiple regions, so reads are fast regardless of where the user is
  • Pay-per-request — we only pay for the commands we actually run, which matters when you're starting out on the free tier

The Upstash REST API means we don't need to manage connection pools or worry about connection limits — a common headache with traditional Redis in serverless environments.
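Setting up that client is a few lines of configuration. A sketch with the @upstash/redis SDK, assuming the standard Upstash environment variable names:

```typescript
import { Redis } from "@upstash/redis";

// Stateless REST client: every command is a plain HTTPS request, so there
// are no sockets to pool and nothing leaks between serverless invocations.
const redis = new Redis({
  url: process.env.UPSTASH_REDIS_REST_URL!,
  token: process.env.UPSTASH_REDIS_REST_TOKEN!,
});
```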

Cache key design

Our cache keys are straightforward:

link:{slug} → destination URL

For custom domains, we namespace by domain:

link:{domain}:{slug} → destination URL

We store the full destination URL as a simple string value. No JSON parsing, no deserialization — just a raw string that goes straight into the Location header.
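A hypothetical key builder for this scheme (the function name is ours for illustration; the key format is the one described above):

```typescript
// Builds the Redis key for a link. The optional domain namespace keeps
// identical slugs on different custom domains from colliding.
function cacheKey(slug: string, domain?: string): string {
  return domain ? `link:${domain}:${slug}` : `link:${slug}`;
}
```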

Cache invalidation

The hardest problem in computer science, right? For us, it's actually simple:

  • Link created → write to cache immediately
  • Link updated → delete cache key (next request repopulates)
  • Link deleted → delete cache key
  • Link toggled off → delete cache key

We invalidate eagerly on writes and let the redirect handler repopulate on the next read. This means there's a brief window (one request) where a cache miss falls through to Convex, but that's by design — it keeps the logic simple and the data consistent.
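The four rules above reduce to two tiny handlers. A hedged sketch (the cache interface and handler names are invented for illustration; only the delete-on-write pattern is the point):

```typescript
// Minimal cache interface, easy to back with Redis or a Map.
interface Cache {
  set(key: string, value: string): void;
  del(key: string): void;
}

function onLinkCreated(cache: Cache, slug: string, url: string): void {
  cache.set(`link:${slug}`, url); // write-through: warm the cache immediately
}

function onLinkChanged(cache: Cache, slug: string): void {
  // Covers update, delete, and toggle-off alike: drop the key and let the
  // redirect handler repopulate it from the source of truth on the next read.
  cache.del(`link:${slug}`);
}
```

Treating update, delete, and toggle-off identically is what keeps this logic hard to get wrong.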

The redirect handler

The core handler is surprisingly small:

// Simplified version of our redirect logic. `redis` (Upstash client) and
// `convex` (Convex HTTP client) are assumed to be configured at module level.
export async function handleRedirect(slug: string, request: Request): Promise<Response> {
  const cached = await redis.get<string>(`link:${slug}`);

  if (cached) {
    // Cache hit — record click asynchronously, redirect immediately
    recordClick(slug, request); // fire-and-forget
    return Response.redirect(cached, 302);
  }

  // Cache miss — query Convex
  const link = await convex.query(api.links.getBySlug, { slug });

  if (!link || !link.isActive) {
    return new Response("Not found", { status: 404 });
  }

  // Populate cache for next time (24-hour TTL)
  await redis.set(`link:${slug}`, link.url, { ex: 86400 });

  recordClick(slug, request);
  return Response.redirect(link.url, 302);
}

Note that recordClick is fire-and-forget — we don't wait for analytics to complete before sending the redirect. The user gets their 302 as fast as possible, and click tracking happens asynchronously.
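The fire-and-forget pattern itself is just an un-awaited promise. A toy sketch, where an event log and a timer stand in for a real analytics write:

```typescript
const events: string[] = [];

// Stand-in for an analytics write that completes on a later tick.
function recordClick(slug: string): Promise<void> {
  return new Promise((resolve) => {
    setTimeout(() => {
      events.push(`click:${slug}`);
      resolve();
    }, 0);
  });
}

// Stand-in for the cache-hit path of the handler.
function handleHit(slug: string, url: string): string {
  void recordClick(slug);          // intentionally NOT awaited
  events.push(`redirect:${url}`);  // the "302" goes out before analytics lands
  return url;
}
```

Because the promise is never awaited, the redirect event is always logged before the click event, which is exactly the ordering we want in production.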

Why 302, not 301

This is a deliberate choice. A 301 Moved Permanently tells browsers to cache the redirect locally. That means:

  • Future clicks bypass our server entirely
  • We lose analytics data for those clicks
  • Link updates won't take effect until the browser cache expires

A 302 Found ensures every click passes through our server, giving us accurate analytics and instant link updates. The ~1ms overhead is worth it.
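Concretely, the choice is a one-argument difference in the handler. A sketch using the standard Fetch API Response.redirect helper (the URL is a placeholder):

```typescript
// Same Location header, different caching semantics on the client side.
const permanent = Response.redirect("https://example.com/landing", 301); // browsers may cache and skip us
const temporary = Response.redirect("https://example.com/landing", 302); // every click comes back through us
```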

Performance in practice

On a warm cache (which is most requests):

  • Redis lookup: ~0.5ms (edge region)
  • 302 response: ~0.2ms
  • Total redirect time: under 1ms

On a cold cache:

  • Convex query: ~15-30ms
  • Redis write: ~1ms
  • 302 response: ~0.2ms
  • Total: ~16-31ms (still fast, and subsequent requests are cached)

What's next

We're exploring a few optimizations:

  • Predictive warming — pre-cache links that are likely to be accessed based on creation patterns
  • Regional cache sharding — store cache entries closer to where they're most accessed
  • Cache TTL tuning — dynamic TTLs based on link popularity
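Purely as a sketch of that last idea (speculative; nothing like this ships today), a dynamic TTL might let hotter links stay cached longer, clamped to the current fixed 24-hour ceiling:

```typescript
// Hypothetical popularity-based TTL: grows with the log of recent clicks,
// floored at 1 hour and capped at the current fixed TTL of 24 hours.
function ttlForLink(clicksLastDay: number): number {
  const base = 3600;   // 1-hour floor, in seconds
  const max = 86400;   // 24-hour ceiling (today's fixed TTL)
  const scaled = base * (1 + Math.log2(1 + clicksLastDay));
  return Math.min(max, Math.round(scaled));
}
```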

But honestly, for most use cases, the current architecture is more than fast enough. Sometimes the best optimization is knowing when to stop optimizing.


Want to see it in action? Create a short link and check the redirect speed yourself.