Edge compute used to mean DNS tricks and a little Cloudflare Workers magic at the CDN layer. In 2026, it's a serious compute tier: Workers, Durable Objects, Deno Deploy, Vercel Edge, Fastly Compute. You can run real logic, real databases (sort of), and real ML at dozens of POPs around the world.
The question stops being "can we?" and becomes "should we?" Here's the framework we use.
What "edge" actually means
Before deciding, let's be specific. When we say "edge" we mean:
- CDN edge / edge workers — V8 or WASM runtimes at CDN POPs (Cloudflare, Fastly, Vercel, AWS Lambda@Edge)
- Regional compute — cloud regions close to users (AWS regions, multi-region deployments)
- Client edge — the user's own device (browser, mobile app)
- IoT edge — actual edge hardware (gateways, devices)
This post focuses on CDN edge and how it relates to origin. The patterns extend to the others.
The first question: latency budget
Every latency-sensitive feature has a budget. Interactive apps target ~100ms for responses to feel "instant" (Jakob Nielsen's classic research). Games and real-time video are tighter. Search autocomplete is ~150ms. AI chat is forgiving (~2s is acceptable).
For each feature, ask:
- What's the target end-to-end latency?
- What's unavoidable? Network RTT to the user, TLS handshake, time to first byte from wherever the response is served.
- What's left for compute?
If the math doesn't work at origin, edge becomes a candidate.
A concrete budget
A user in Mumbai, origin in us-east-1 (Virginia). Round-trip time ~200ms. If your target is 300ms end-to-end, you have 100ms for everything else (app server, database, rendering). That's tight.
Moving the same logic to a Mumbai edge POP: RTT ~15ms. You now have 285ms for app logic. Different universe.
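The arithmetic is trivial, but it's worth making explicit per feature. A throwaway helper with the Mumbai numbers plugged in (all figures illustrative):

```ts
// Back-of-envelope budget: what's left for compute after network round trips?
function computeBudgetMs(targetMs: number, rttMs: number, roundTrips = 1): number {
  return targetMs - rttMs * roundTrips;
}

console.log(computeBudgetMs(300, 200)); // origin in us-east-1 from Mumbai: 100ms
console.log(computeBudgetMs(300, 15)); // Mumbai edge POP: 285ms
```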
If your origin is more than 100ms away from a meaningful fraction of your users, edge is worth considering for latency-sensitive features.
The second question: what does the feature need?
Edge compute has constraints. Know them before deciding.
Strong data gravity → origin
If your feature needs to read or write data that lives in a specific region, edge doesn't help unless you replicate the data to the edge (complicated, expensive).
Examples:
- Writing a purchase to the primary orders DB
- Reading a user's full profile (if it's only in one region)
- Joining data across several tables in a SQL database
For these, edge adds a hop instead of removing one. Stay at origin.
Read-heavy, cacheable → edge shines
If the feature is mostly reads of data that can be cached or replicated cheaply, edge is excellent.
Examples:
- Personalization with precomputed user features stored in KV
- Content rendering from a CMS (the classic edge use case)
- Feature flag evaluation (see the sketch after this list)
- Rate limiting, bot detection
- A/B test assignment
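To make the read-heavy case concrete, here's a minimal sketch of feature flag evaluation in a Cloudflare Worker, reading rollout config replicated to edge KV. The FLAGS binding name and the JSON shape are our placeholders, not any flag product's schema:

```ts
interface Env {
  FLAGS: KVNamespace; // edge-replicated flag config (KVNamespace comes from @cloudflare/workers-types)
}

// Deterministic bucket in [0, 100) so a user gets a stable assignment
function bucket(userId: string, flag: string): number {
  let h = 0;
  for (const ch of `${userId}:${flag}`) h = (Math.imul(h, 31) + ch.charCodeAt(0)) >>> 0;
  return h % 100;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const userId = request.headers.get("x-user-id") ?? "anonymous";
    // e.g. { "new-checkout": 25 } means 25% of users see the new checkout
    const rollouts =
      (await env.FLAGS.get<Record<string, number>>("rollouts", "json")) ?? {};
    const enabled = Object.fromEntries(
      Object.entries(rollouts).map(([flag, pct]) => [flag, bucket(userId, flag) < pct]),
    );
    return Response.json(enabled); // no origin round trip at all
  },
};
```

The same shape works for A/B test assignment: swap the boolean for a variant index.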
Compute-heavy, stateless → edge works
Stateless or low-state compute near the user is a good fit.
Examples:
- Image transformations
- Lightweight ML inference (classifiers, embeddings)
- Input validation, geo-based routing (see the sketch after this list)
- Request/response rewriting
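For the routing case, a sketch of geo-based origin selection; Cloudflare exposes the caller's country on request.cf, and the hostnames here are placeholders:

```ts
export default {
  async fetch(request: Request): Promise<Response> {
    // request.cf is Cloudflare-specific; typed loosely here for brevity
    const country =
      (request as { cf?: { country?: string } }).cf?.country ?? "US";
    const origin = ["IN", "SG", "JP"].includes(country)
      ? "https://ap.example.com" // placeholder regional origins
      : "https://us.example.com";
    const url = new URL(request.url);
    // Re-issue the request against the chosen origin, preserving method,
    // headers, and body from the incoming request
    return fetch(new Request(origin + url.pathname + url.search, request));
  },
};
```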
Heavy compute or long-running → origin
Edge runtimes have strict limits:
- Cloudflare Workers: 128MB memory; CPU time from 10ms on the free plan up to 30 sec on paid plans
- Vercel Edge Functions: 30 sec max duration
- AWS Lambda@Edge: 5 sec and 128MB for viewer triggers, 30 sec for origin triggers
For anything longer-running or memory-intensive, you're at origin.
The third question: architecture complexity
Edge adds an explicit tier to your architecture. That means:
- Another place bugs can hide
- Another deploy pipeline
- Another monitoring surface
- Another runtime to maintain proficiency in
- Potentially different language (JS/TS for Workers, Rust/Go for some others)
For low-traffic apps, this complexity isn't worth the latency win. For high-traffic apps with aggressive latency targets, it's essential.
You can't ssh into an edge node. Logs are often sampled. Replay is difficult. Instrument heavily before deploying anything non-trivial.
The decision matrix
| Scenario | Recommendation |
|---|---|
| Reading precomputed/cacheable data, latency-sensitive | Edge |
| Writing to a consistent primary DB | Origin |
| A/B test assignment, feature flags | Edge |
| Complex SQL joins across large tables | Origin |
| Personalization with edge KV | Edge |
| Heavy ML inference (> 500ms CPU) | Origin |
| Lightweight ML inference (< 50ms) | Edge |
| Image / video transformation | Edge (CDN) |
| User auth (JWT verification) | Edge |
| User auth (session lookup in Redis) | Origin (or replicated sessions) |
| Rate limiting | Edge |
| Fraud detection requiring historical data | Origin |
| Form submission | Origin (usually) |
| Real-time chat / WebSockets | Origin (or Durable Objects) |
Four patterns that actually work
From our engagements, these are the edge patterns with consistent ROI:
Pattern 1: Edge personalization with precomputed features
Problem: Every homepage visit hits a personalization service at origin, adding 150-300ms.
Solution:
- Batch-compute user features nightly (or in real-time with streaming).
- Replicate to edge KV storage (Cloudflare KV, Vercel Edge Config).
- Edge Worker reads user features, applies scoring logic, returns personalized HTML.
- Origin is consulted only for cold-start or explicit cache bust.
Outcome: p95 latency drops from ~300ms to ~30ms. Origin traffic drops 90%+. We did exactly this for a media platform; see the case study linked at the end.
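A minimal sketch of the Worker-side read path, assuming a KV binding named USER_FEATURES plus simplified cookie and templating helpers (all hypothetical names):

```ts
interface Env {
  USER_FEATURES: KVNamespace; // nightly-replicated per-user features
}

// Stand-ins for real cookie parsing and edge-side templating
const getUserId = (req: Request): string =>
  req.headers.get("cookie")?.match(/uid=([\w-]+)/)?.[1] ?? "anonymous";
const renderHomepage = (features: Record<string, number>): string =>
  `<html><body>Top picks: ${Object.keys(features).slice(0, 3).join(", ")}</body></html>`;

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const features = await env.USER_FEATURES.get<Record<string, number>>(
      getUserId(request),
      "json",
    );
    // Cold start: fall through to origin, which can also warm the KV entry
    if (!features) return fetch(request);
    return new Response(renderHomepage(features), {
      headers: { "content-type": "text/html; charset=utf-8" },
    });
  },
};
```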
Pattern 2: Edge auth + origin data
Problem: Every API request hits origin to validate a session, even if validation is trivial.
Solution:
- Use JWTs with short expiry (15 minutes).
- Edge Worker validates the JWT signature (fast, ~1ms).
- Only requests that need actual user data proceed to origin.
- Public/cached endpoints bypass origin entirely.
Outcome: Origin traffic reduced significantly. Geo-distributed auth means users see faster responses globally.
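A minimal sketch of the validation step using the jose library, which runs on V8 edge runtimes. The JWT_SECRET binding and HS256 signing are assumptions; adapt to your key scheme:

```ts
import { jwtVerify } from "jose";

interface Env {
  JWT_SECRET: string; // HMAC secret for HS256 tokens (assumption)
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const token = request.headers.get("authorization")?.replace(/^Bearer /, "");
    if (!token) return new Response("missing token", { status: 401 });
    try {
      const key = new TextEncoder().encode(env.JWT_SECRET);
      const { payload } = await jwtVerify(token, key); // the ~1ms signature check
      // Forward verified identity so origin can skip its own session lookup
      const headers = new Headers(request.headers);
      headers.set("x-user-id", String(payload.sub));
      return fetch(new Request(request, { headers }));
    } catch {
      return new Response("invalid token", { status: 401 }); // never reaches origin
    }
  },
};
```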
Pattern 3: Edge content + origin commerce
Problem: Product detail pages need to be fast globally, but checkout needs strong consistency.
Solution:
- Product pages rendered at edge from cached product data.
- Inventory/stock info cached with short TTL (30 seconds).
- Checkout and purchases hit origin directly — consistency wins over latency.
Outcome: Browsing feels instant worldwide; purchases are still reliable.
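On Cloudflare, the TTL split can be expressed directly in fetch options (cacheTtl and cacheEverything are Cloudflare-specific; the path layout is a placeholder):

```ts
export default {
  async fetch(request: Request): Promise<Response> {
    const { pathname } = new URL(request.url);
    if (pathname.startsWith("/checkout")) {
      return fetch(request); // consistency wins: straight to origin
    }
    if (pathname.startsWith("/api/inventory")) {
      // Short TTL keeps stock info fresh enough without hammering origin
      return fetch(request, { cf: { cacheTtl: 30, cacheEverything: true } });
    }
    // Product pages: long TTL, invalidated by cache purge on publish
    return fetch(request, { cf: { cacheTtl: 3600, cacheEverything: true } });
  },
};
```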
Pattern 4: Edge image transformation
Problem: Every image needs multiple sizes, formats (WebP, AVIF), and DPR variants. Pre-generating all combinations is wasteful.
Solution:
- Store one high-res master image at origin.
- Edge Worker transforms on-demand based on query params (?w=800&q=80&fm=webp).
- CDN caches transformed variants at each edge POP.
Outcome: Minimal origin storage, fast delivery, huge flexibility.
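A sketch using Cloudflare Image Resizing, where the query params map onto cf.image fetch options; the origin hostname is a placeholder:

```ts
export default {
  async fetch(request: Request): Promise<Response> {
    const url = new URL(request.url);
    // Translate ?w=800&q=80&fm=webp into resizing options
    const image = {
      width: Number(url.searchParams.get("w")) || undefined,
      quality: Number(url.searchParams.get("q")) || undefined,
      format: (url.searchParams.get("fm") as "webp" | "avif" | null) ?? undefined,
    };
    // Fetch the single high-res master; Cloudflare transforms in flight,
    // and the CDN caches the resulting variant at this POP
    return fetch(`https://origin.example.com${url.pathname}`, { cf: { image } });
  },
};
```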
When edge is wrong
Equally important: the cases where edge is the wrong answer even though it seems appealing.
- "We want to be global." Geographic distribution of users isn't enough reason. You need a specific latency problem to solve.
- "We want to cut costs." Edge can be cheaper, but it's not guaranteed. Count compute-time + per-invocation costs + data transfer carefully.
- "We have a monolith and want to modernize." Edge is not a good first step in modernization. Get to a serviceable origin architecture first.
- "Our users are complaining about speed." Sometimes the answer is a simpler page, better caching at origin, or a CDN without edge compute.
Add real-user monitoring (RUM) to capture p50/p95 latency by region before deciding. If 90% of your users are in North America and your origin is in Virginia, edge won't move the needle.
Cost modeling
Rough comparison for 100M requests/month of moderate-complexity logic:
| Option | Approximate monthly cost |
|---|---|
| AWS Lambda (origin) | $500-800 |
| Cloudflare Workers (edge) | $500-700 |
| Vercel Edge Functions | $600-900 |
| Traditional servers with CDN caching | $300-1000 (depends heavily on cache hit rate) |
For write-heavy workloads, origin usually wins on cost. For read-heavy with high cache hit rates, edge is competitive or cheaper. Always factor in the hidden costs: engineering time, debugging, monitoring infrastructure.
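The structure of that math is worth writing down, even though the rates below are placeholders rather than quoted prices; substitute current pricing before drawing any conclusions:

```ts
// Monthly compute cost = per-request charge + duration-based charge.
// Every rate here is a placeholder for illustration only.
function monthlyCostUsd(opts: {
  requestsPerMonth: number;
  pricePerMillionRequests: number; // USD per 1M invocations
  gbSecondsPerRequest: number; // memory (GB) x avg duration (s)
  pricePerGbSecond: number; // USD per GB-second
}): number {
  return (
    (opts.requestsPerMonth / 1e6) * opts.pricePerMillionRequests +
    opts.requestsPerMonth * opts.gbSecondsPerRequest * opts.pricePerGbSecond
  );
}

// 100M requests/month at 1GB x 300ms each, with placeholder rates: ~$520
console.log(
  monthlyCostUsd({
    requestsPerMonth: 100e6,
    pricePerMillionRequests: 0.2,
    gbSecondsPerRequest: 1 * 0.3,
    pricePerGbSecond: 0.0000167,
  }),
);
```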
The migration playbook
If you've decided edge is worth it:
- Start with one feature. Don't migrate your whole app. Pick one high-leverage, stateless-ish feature — often personalization or auth.
- Measure first. Baseline latency, cost, error rate.
- Shadow traffic. Run edge in parallel with origin, compare results, confirm correctness.
- Gradual rollout. 1% → 10% → 50% → 100% with monitoring at each step (sketch after this list).
- Decommission the origin path only after edge has been stable for 2+ weeks.
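The gradual rollout can live in the Worker itself: hash each user deterministically into a bucket and send only the configured percentage down the edge path. A sketch, where EDGE_PCT and handleAtEdge are hypothetical names:

```ts
interface Env {
  EDGE_PCT: string; // "1" | "10" | "50" | "100", set per deploy (hypothetical)
}

// FNV-1a hash -> stable bucket in [0, 100), so a user stays in one cohort
function rolloutBucket(id: string): number {
  let h = 2166136261;
  for (let i = 0; i < id.length; i++) {
    h = Math.imul(h ^ id.charCodeAt(i), 16777619) >>> 0;
  }
  return h % 100;
}

// Stand-in for the new edge implementation being rolled out
async function handleAtEdge(request: Request): Promise<Response> {
  return new Response("edge path");
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const userId = request.headers.get("x-user-id") ?? "anonymous";
    if (rolloutBucket(userId) < Number(env.EDGE_PCT)) {
      return handleAtEdge(request); // new path, monitored separately
    }
    return fetch(request); // existing origin path stays untouched
  },
};
```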
Operational guardrails that prevent surprises
Before scaling edge workloads broadly, put these controls in place:
- regional error-rate and latency dashboards with alert thresholds
- explicit fallback paths to origin for each edge feature (sketch below)
- deployment blast-radius controls (geo, traffic percentage, feature flags)
- synthetic probes from key markets to validate user-path latency
- weekly cost-per-request review to detect hidden drift
Edge wins are real, but only when operational discipline matches architectural ambition.
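The fallback-path guardrail in particular is cheap to build in from day one. A sketch, where personalizeAtEdge stands in for whichever edge feature the Worker owns:

```ts
// Stand-in for the edge feature this Worker owns
declare function personalizeAtEdge(request: Request): Promise<Response>;

export default {
  async fetch(request: Request): Promise<Response> {
    try {
      return await personalizeAtEdge(request);
    } catch (err) {
      // Never fail the user because the edge path broke; origin is the net
      console.error("edge path failed, falling back to origin", err);
      return fetch(request);
    }
  },
};
```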
Closing
Edge compute in 2026 is legitimate, production-ready, and sometimes transformative. It's also not a silver bullet. The best use is surgical: identify the handful of features where origin latency is a real user problem, move those specifically, leave the rest.
If you're not sure where edge fits for your app, start with the decision matrix above. Most features end up staying at origin — and that's fine.
Related: our media edge case study, AWS cost optimization, and how we help with infrastructure optimization.