Caching Strategy Selection

Premature or misapplied caching adds complexity — stale data bugs, invalidation logic, and distributed consistency problems — without solving the actual bottleneck. This tree routes you to the caching pattern that matches your data access profile, so you apply the right tool to the right problem rather than defaulting to Redis for everything.

Overview

Type: Decision tree
Tags: backend, performance, architecture, software design, infrastructure
Entry: Q1
Questions: 4
Outcomes: 2
Author: Andrew
Last updated: 2026-05-12

Decision Tree

Start: Have you measured a concrete performance problem that caching would solve — slow response times, database overload, or high infrastructure cost from repeated expensive computations?

yes

  • Continues to question: Is the content identical for all users — a public response not personalised to the requesting user or session?

no

  • Outcome: NO-CACHE
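
Answering the opening question with confidence needs numbers. The following is a minimal sketch of recording a latency baseline, assuming the hot path can be wrapped in a zero-argument callable. The function name `measure_latency_ms` and its parameters are hypothetical; the choice of p50, p95, and p99 follows the NO-CACHE outcome text in the DSL export further down this page.

```python
import statistics
import time
from typing import Callable


def measure_latency_ms(operation: Callable[[], object], runs: int = 200) -> dict[str, float]:
    """Time `operation` repeatedly and return p50/p95/p99 latency in milliseconds.

    Hypothetical helper for illustration; a production service would more likely
    export these figures from its metrics or tracing system.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)

    # quantiles(..., n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(samples, n=100)
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}
```

Running the same measurement before and after a change gives the specific metric the first question asks for. If no percentile moves, the cache is probably not addressing the bottleneck.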

Machine-Readable JSON (Canonical Model)

{
  "_meta": {
    "schema": "https://www.drawdecisiontree.com/decision-dag.schema.json",
    "source": "https://www.drawdecisiontree.com",
    "description": "DrawDecisionTree.com is a free tool for building, sharing, and embedding interactive decision trees. This file is the machine-readable export of a published decision tree. The `dsl` field contains the original source in the Decision DAG DSL; the `dag` schema is documented at the URL in `schema` above.",
    "links": {
      "interactive": "https://www.drawdecisiontree.com/t/drawdecisiontree/cache-strategy.html",
      "embed": "https://www.drawdecisiontree.com/embed/path/drawdecisiontree/cache-strategy",
      "dsl_reference": "https://www.drawdecisiontree.com/decision-tree-dsl-reference.html",
      "guides": "https://www.drawdecisiontree.com/guides",
      "schema_docs": "https://www.drawdecisiontree.com/decision-dag.schema.json",
      "author_trees": "https://www.drawdecisiontree.com/trees/drawdecisiontree"
    },
    "generated_at": "2026-05-29T12:05:39.252Z"
  },
  "author": {
    "handle": "drawdecisiontree",
    "first_name": "Andrew",
    "last_name": null,
    "avatar_url": "1d32d828-b6ca-40ec-bdd7-771fe7b9c36a/avatar-1778531481027.svg",
    "display_name": "Andrew"
  },
  "file": {
    "id": "275faf72-e037-4f5b-a8a7-8b7673877174",
    "name": "Caching Strategy Selection",
    "public_slug": "cache-strategy",
    "updated_at": "2026-05-12T16:53:43.587978+00:00",
    "url": "https://www.drawdecisiontree.com/t/drawdecisiontree/cache-strategy.html",
    "json_url": "https://www.drawdecisiontree.com/t/drawdecisiontree/cache-strategy/tree.json",
    "dsl_url": "https://www.drawdecisiontree.com/t/drawdecisiontree/cache-strategy/tree.dag"
  },
  "meta": {
    "description": "Premature or misapplied caching adds complexity — stale data bugs, invalidation logic, and distributed consistency problems — without solving the actual bottleneck. This tree routes you to the caching pattern that matches your data access profile, so you apply the right tool to the right problem rather than defaulting to Redis for everything.",
    "mode": "decision",
    "entry": "Q1",
    "tags": [
      "backend",
      "performance",
      "architecture",
      "software design",
      "infrastructure"
    ],
    "image": "https://images.unsplash.com/photo-1543199000-bde8a1e2bba0?w=1200&q=80"
  },
  "questions": [
    {
      "id": "Q1",
      "text": "Have you measured a concrete performance problem that caching would solve — slow response times, database overload, or high infrastructure cost from repeated expensive computations?"
    },
    {
      "id": "Q2",
      "text": "Is the content identical for all users — a public response not personalised to the requesting user or session?"
    },
    {
      "id": "Q3",
      "text": "Is this content delivered over HTTP and primarily consumed by browsers or external API clients?"
    },
    {
      "id": "Q4",
      "text": "Does the cache need to be accessible from multiple application servers or services — is the data shared across instances?"
    }
  ],
  "outcomes": [
    {
      "id": "CDN",
      "label": "Content Delivery Network (CDN)"
    },
    {
      "id": "DISTRIBUTED",
      "label": "Distributed Cache (Redis / Memcached)"
    }
  ],
  "dsl": "dag: Caching Strategy Selection\nversion: 1.0.0\nimage: https://images.unsplash.com/photo-1543199000-bde8a1e2bba0?w=1200&q=80\ndescription: Premature or misapplied caching adds complexity — stale data bugs, invalidation logic, and distributed consistency problems — without solving the actual bottleneck. This tree routes you to the caching pattern that matches your data access profile, so you apply the right tool to the right problem rather than defaulting to Redis for everything.\ntags: backend, performance, architecture, software design, infrastructure\nentry: Q1\n\nQ1: Have you measured a concrete performance problem that caching would solve — slow response times, database overload, or high infrastructure cost from repeated expensive computations?\n  hint: Measure before caching. Add instrumentation, check slow query logs, and profile your hot paths first. The most common mistake is adding a caching layer before establishing a performance baseline — you end up maintaining invalidation complexity without a measurable improvement. If you can't point to a specific metric, caching is premature.\n  yes -> Q2\n  no  -> [NO-CACHE]\n\nQ2: Is the content identical for all users — a public response not personalised to the requesting user or session?\n  hint: Public, shared content (a product catalogue, marketing page, public API response, static asset) is the ideal caching target because one cached copy serves all users. User-specific content (a dashboard, a user's order history, personalised recommendations) cannot be safely served from a shared cache without leaking data between users.\n  yes -> Q3\n  no  -> Q4\n\nQ3: Is this content delivered over HTTP and primarily consumed by browsers or external API clients?\n  hint: HTTP-delivered content benefits from edge caching at a CDN — requests are intercepted at a server geographically near the user, dramatically reducing latency and completely offloading origin traffic for cache hits. If the content is consumed programmatically by internal services over a private network, a CDN is not the right layer.\n  yes -> [CDN]\n  no  -> Q4\n\nQ4: Does the cache need to be accessible from multiple application servers or services — is the data shared across instances?\n  hint: If you run multiple application instances (for load balancing, resilience, or horizontal scaling), in-process caches are instance-local — cache misses and staleness vary between instances, causing inconsistent responses. Shared data requires a distributed cache that all instances can reach.\n  yes -> [DISTRIBUTED]\n  no  -> [IN-PROCESS]\n\n[NO-CACHE]: No Cache — Measure First\n  color: #868E96\n  description: The absence of a cache is not a failure — it is often the correct engineering decision. Every cache introduces a new consistency problem: when does the cached value become stale, how do you invalidate it, and what happens when the cache layer itself fails? These are non-trivial questions that add operational complexity and a class of subtle bugs (serving stale data after a write, cache stampedes under load, cache poisoning) that are hard to reproduce in development. Before adding any caching layer, establish a baseline: instrument your application to record response times and database query latency at the p50, p95, and p99 percentiles. Identify the actual bottleneck — is it a slow query, an expensive computation, a third-party API call, or simply under-provisioned infrastructure? Fix the root cause first. If you have done that analysis and caching is genuinely the right fix, revisit this tree with a specific target in mind.\n  code: CACHE_NONE\n\n[CDN]: Content Delivery Network (CDN)\n  color: #00B4D8\n  description: A CDN caches HTTP responses at edge nodes distributed globally — when a user requests a resource, the request is served from the nearest edge server rather than your origin, reducing latency from hundreds of milliseconds to single-digit milliseconds for cache hits. CDN caching is appropriate for static assets (JS, CSS, images, fonts), public API responses with stable data (product catalogues, public documentation), server-rendered HTML pages, and file downloads. Cache-Control headers (max-age, s-maxage, stale-while-revalidate) control how long the CDN holds each response before revalidating with the origin. Cloudflare, AWS CloudFront, Fastly, and Akamai are the major providers — each supports cache purging via API so you can invalidate content immediately after an update without waiting for TTL expiry. CDN caching is the highest-leverage caching investment available: a single CDN layer can absorb 80–95% of traffic for read-heavy public content, dramatically reducing origin load and infrastructure costs.\n  code: CACHE_CDN\n\n[DISTRIBUTED]: Distributed Cache (Redis / Memcached)\n  color: #DC382D\n  description: A distributed cache — Redis being the dominant choice — stores data in memory on a dedicated server or cluster accessible to all application instances over the network. This is the appropriate layer for application-level data that is expensive to compute or fetch, shared across multiple service instances, and changes infrequently enough that serving slightly stale data is acceptable. Common patterns include caching database query results (cache aside: read from cache, fall back to DB on miss, write to cache after DB read), caching third-party API responses, storing pre-computed aggregations, and session data. Redis's rich data structure support (strings, hashes, sorted sets, streams) enables patterns beyond simple key-value storage — leaderboards, rate limiting counters, and pub/sub coordination. The critical design question is cache invalidation: decide upfront whether you will use TTL-based expiry, explicit invalidation on write, or event-driven invalidation, and be consistent across all cache usage in your codebase. A cache that is sometimes invalidated correctly and sometimes not is worse than no cache.\n  code: CACHE_DISTRIBUTED\n\n[IN-PROCESS]: In-Process (Application Memory) Cache\n  color: #27AE60\n  description: An in-process cache stores data in the application's own memory — a dictionary, LRU map, or a library like Caffeine (Java), functools.lru_cache (Python), or node-cache (Node.js) — with zero network latency for cache hits. This is the fastest possible caching layer and the right choice when the data is small enough to fit comfortably in memory, the application runs as a single instance or where per-instance inconsistency is acceptable, and the cache hit rate is high enough that occasional misses are not costly. Typical use cases include caching configuration values loaded from a database or environment, memoising expensive pure-function computations, and caching reference data (country codes, feature flag evaluations, permission sets) that changes rarely. The key limitation: in-process caches are not shared — each application instance has its own copy, leading to inconsistency across instances when data changes. For single-instance applications or data where per-instance staleness is acceptable (configuration cached for 60 seconds), this is the simplest and most performant option. For multi-instance deployments with shared mutable data, use a distributed cache instead.\n  code: CACHE_IN_PROCESS\n"
}
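
The `questions` array in this export carries only ids and text; the branch targets live in the `dsl` field. As a rough sketch, the yes/no edges can be recovered from that text with two regular expressions. The grammar assumed here (question headers like `Q1: ...`, indented `yes -> target` and `no -> target` lines, outcomes in square brackets) is inferred from this one export, not from the DSL reference, and `parse_edges`, `walk`, and `SAMPLE_DSL` are hypothetical names. The sample abbreviates the question text.

```python
import re

# Indented branch lines such as "  yes -> Q2" or "  no  -> [NO-CACHE]".
EDGE = re.compile(r"^\s+(yes|no)\s*->\s*\[?([A-Za-z0-9_-]+)\]?\s*$")
# Question headers such as "Q1: Have you measured ...".
QUESTION = re.compile(r"^(Q\d+):\s")


def parse_edges(dsl: str) -> dict[str, dict[str, str]]:
    """Return {question_id: {"yes": target, "no": target}} from DSL text."""
    edges: dict[str, dict[str, str]] = {}
    current = None
    for line in dsl.splitlines():
        header = QUESTION.match(line)
        if header:
            current = header.group(1)
            edges[current] = {}
            continue
        branch = EDGE.match(line)
        if branch and current is not None:
            edges[current][branch.group(1)] = branch.group(2)
    return edges


def walk(edges: dict[str, dict[str, str]], answers: dict[str, str], entry: str = "Q1") -> str:
    """Follow the answers from the entry question until an outcome id is reached."""
    node = entry
    while node in edges:
        node = edges[node][answers[node]]
    return node


# Abbreviated copy of the branch structure shown in the DSL on this page.
SAMPLE_DSL = (
    "Q1: Have you measured a concrete performance problem?\n"
    "  yes -> Q2\n"
    "  no  -> [NO-CACHE]\n"
    "Q2: Is the content identical for all users?\n"
    "  yes -> Q3\n"
    "  no  -> Q4\n"
    "Q3: Is this content delivered over HTTP?\n"
    "  yes -> [CDN]\n"
    "  no  -> Q4\n"
    "Q4: Is the data shared across instances?\n"
    "  yes -> [DISTRIBUTED]\n"
    "  no  -> [IN-PROCESS]\n"
)
```

With the real export, the input would be the parsed JSON object's `dsl` value rather than `SAMPLE_DSL`.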

DSL Representation

dag: Caching Strategy Selection
version: 1.0.0
image: https://images.unsplash.com/photo-1543199000-bde8a1e2bba0?w=1200&q=80
description: Premature or misapplied caching adds complexity — stale data bugs, invalidation logic, and distributed consistency problems — without solving the actual bottleneck. This tree routes you to the caching pattern that matches your data access profile, so you apply the right tool to the right problem rather than defaulting to Redis for everything.
tags: backend, performance, architecture, software design, infrastructure
entry: Q1

Q1: Have you measured a concrete performance problem that caching would solve — slow response times, database overload, or high infrastructure cost from repeated expensive computations?
  hint: Measure before caching. Add instrumentation, check slow query logs, and profile your hot paths first. The most common mistake is adding a caching layer before establishing a performance baseline — you end up maintaining invalidation complexity without a measurable improvement. If you can't point to a specific metric, caching is premature.
  yes -> Q2
  no  -> [NO-CACHE]

Q2: Is the content identical for all users — a public response not personalised to the requesting user or session?
  hint: Public, shared content (a product catalogue, marketing page, public API response, static asset) is the ideal caching target because one cached copy serves all users. User-specific content (a dashboard, a user's order history, personalised recommendations) cannot be safely served from a shared cache without leaking data between users.
  yes -> Q3
  no  -> Q4

Q3: Is this content delivered over HTTP and primarily consumed by browsers or external API clients?
  hint: HTTP-delivered content benefits from edge caching at a CDN — requests are intercepted at a server geographically near the user, dramatically reducing latency and completely offloading origin traffic for cache hits. If the content is consumed programmatically by internal services over a private network, a CDN is not the right layer.
  yes -> [CDN]
  no  -> Q4

Q4: Does the cache need to be accessible from multiple application servers or services — is the data shared across instances?
  hint: If you run multiple application instances (for load balancing, resilience, or horizontal scaling), in-process caches are instance-local — cache misses and staleness vary between instances, causing inconsistent responses. Shared data requires a distributed cache that all instances can reach.
  yes -> [DISTRIBUTED]
  no  -> [IN-PROCESS]

[NO-CACHE]: No Cache — Measure First
  color: #868E96
  description: The absence of a cache is not a failure — it is often the correct engineering decision. Every cache introduces a new consistency problem: when does the cached value become stale, how do you invalidate it, and what happens when the cache layer itself fails? These are non-trivial questions that add operational complexity and a class of subtle bugs (serving stale data after a write, cache stampedes under load, cache poisoning) that are hard to reproduce in development. Before adding any caching layer, establish a baseline: instrument your application to record response times and database query latency at the p50, p95, and p99 percentiles. Identify the actual bottleneck — is it a slow query, an expensive computation, a third-party API call, or simply under-provisioned infrastructure? Fix the root cause first. If you have done that analysis and caching is genuinely the right fix, revisit this tree with a specific target in mind.
  code: CACHE_NONE

[CDN]: Content Delivery Network (CDN)
  color: #00B4D8
  description: A CDN caches HTTP responses at edge nodes distributed globally — when a user requests a resource, the request is served from the nearest edge server rather than your origin, reducing latency from hundreds of milliseconds to single-digit milliseconds for cache hits. CDN caching is appropriate for static assets (JS, CSS, images, fonts), public API responses with stable data (product catalogues, public documentation), server-rendered HTML pages, and file downloads. Cache-Control headers (max-age, s-maxage, stale-while-revalidate) control how long the CDN holds each response before revalidating with the origin. Cloudflare, AWS CloudFront, Fastly, and Akamai are the major providers — each supports cache purging via API so you can invalidate content immediately after an update without waiting for TTL expiry. CDN caching is the highest-leverage caching investment available: a single CDN layer can absorb 80–95% of traffic for read-heavy public content, dramatically reducing origin load and infrastructure costs.
  code: CACHE_CDN

[DISTRIBUTED]: Distributed Cache (Redis / Memcached)
  color: #DC382D
  description: A distributed cache — Redis being the dominant choice — stores data in memory on a dedicated server or cluster accessible to all application instances over the network. This is the appropriate layer for application-level data that is expensive to compute or fetch, shared across multiple service instances, and changes infrequently enough that serving slightly stale data is acceptable. Common patterns include caching database query results (cache aside: read from cache, fall back to DB on miss, write to cache after DB read), caching third-party API responses, storing pre-computed aggregations, and session data. Redis's rich data structure support (strings, hashes, sorted sets, streams) enables patterns beyond simple key-value storage — leaderboards, rate limiting counters, and pub/sub coordination. The critical design question is cache invalidation: decide upfront whether you will use TTL-based expiry, explicit invalidation on write, or event-driven invalidation, and be consistent across all cache usage in your codebase. A cache that is sometimes invalidated correctly and sometimes not is worse than no cache.
  code: CACHE_DISTRIBUTED

[IN-PROCESS]: In-Process (Application Memory) Cache
  color: #27AE60
  description: An in-process cache stores data in the application's own memory — a dictionary, LRU map, or a library like Caffeine (Java), functools.lru_cache (Python), or node-cache (Node.js) — with zero network latency for cache hits. This is the fastest possible caching layer and the right choice when the data is small enough to fit comfortably in memory, the application runs as a single instance or where per-instance inconsistency is acceptable, and the cache hit rate is high enough that occasional misses are not costly. Typical use cases include caching configuration values loaded from a database or environment, memoising expensive pure-function computations, and caching reference data (country codes, feature flag evaluations, permission sets) that changes rarely. The key limitation: in-process caches are not shared — each application instance has its own copy, leading to inconsistency across instances when data changes. For single-instance applications or data where per-instance staleness is acceptable (configuration cached for 60 seconds), this is the simplest and most performant option. For multi-instance deployments with shared mutable data, use a distributed cache instead.
  code: CACHE_IN_PROCESS
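
The four questions above reduce to a small routing function. This is a sketch only: the function and parameter names are illustrative assumptions, while the returned strings are the outcome ids used in the DSL.

```python
def choose_cache_strategy(
    measured_problem: bool,
    identical_for_all_users: bool,
    http_to_external_clients: bool,
    shared_across_instances: bool,
) -> str:
    """Walk Q1 to Q4 and return the outcome id from the DSL."""
    # Q1: without a measured bottleneck, caching is premature.
    if not measured_problem:
        return "NO-CACHE"
    # Q2 yes, then Q3 yes: public content served over HTTP to browsers
    # or external API clients belongs at the edge.
    if identical_for_all_users and http_to_external_clients:
        return "CDN"
    # Q4 is reached from either Q2 "no" or Q3 "no".
    if shared_across_instances:
        return "DISTRIBUTED"
    return "IN-PROCESS"
```

Note that personalised content never reaches the CDN outcome, whatever the transport: a "no" at Q2 skips Q3 entirely.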

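The DISTRIBUTED outcome describes cache-aside (read from cache, fall back to the database on a miss, write the result back) and asks for an upfront invalidation policy. The sketch below combines TTL-based expiry with explicit invalidation on write. To stay runnable without a server it uses a dict-backed stand-in instead of a real Redis or Memcached client; `SharedCacheStandIn`, `get_product_name`, `rename_product`, the `product:` key prefix, and the 60-second TTL are all assumptions made for illustration.

```python
import time
from typing import Callable, Optional


class SharedCacheStandIn:
    """Dict-backed stand-in for a networked cache. Illustration only:
    a real deployment would call a Redis or Memcached client here."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: dict[str, tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired entries count as misses
            return None
        return value

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._items[key] = (self._clock() + ttl_seconds, value)

    def delete(self, key: str) -> None:
        self._items.pop(key, None)


def get_product_name(cache: SharedCacheStandIn, db: dict[str, str],
                     product_id: str, ttl_seconds: float = 60.0) -> str:
    """Cache-aside read: try the cache, fall back to the DB, then populate."""
    key = f"product:{product_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = db[product_id]  # stands in for the expensive query
    cache.set(key, value, ttl_seconds)
    return value


def rename_product(cache: SharedCacheStandIn, db: dict[str, str],
                   product_id: str, new_name: str) -> None:
    """Write path: update the DB, then invalidate the cached copy."""
    db[product_id] = new_name
    cache.delete(f"product:{product_id}")
```

A write that bypasses `rename_product` leaves the old value in the cache until the TTL expires, which is the inconsistency the outcome text warns about when invalidation is applied only some of the time.
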
How to use this decision tree

Open the interactive version at https://www.drawdecisiontree.com/t/drawdecisiontree/cache-strategy.html to step through the questions. Each answer narrows the tree until it reaches a recommended outcome. The tree can also be embedded on your own site.
