A design doc template gives engineers a shared structure for proposing how a feature, service, or system will be built before any code lands in the main branch. The good ones force the author to think through trade-offs, alternatives, and cross-cutting concerns that get expensive to fix once implementation starts. The bad ones become ceremony, padded with sections nobody reads.
This post gives you a design doc template you can copy into your wiki today, plus section-by-section guidance, two short worked examples, and the common pitfalls that turn design docs into wasted hours. If you also want generic project documentation patterns, the documentation template guide covers that broader category. If you're already comfortable with engineering specs and want to record decisions after the fact, head to the architecture decision record post for the sister artifact.
What a design doc actually is
A design doc is a pre-implementation engineering proposal. It describes the problem you're solving, the solution you're proposing, the alternatives you considered, and the trade-offs that pushed you toward your pick. It's the artifact senior engineers, security reviewers, and on-call partners read before they say "yes, build it" or "wait, you haven't thought about X."
Design docs are not requirements documents. They're not user stories. They're not implementation manuals. They sit at the layer where engineering intent gets pinned down: the system context, the goals, the proposed shape of the solution, the things you decided not to do, and what could go wrong on rollout.
Google popularized the format in its engineering culture. Stripe, Linear, Uber, and most large infrastructure teams use a close variant. The format works because it forces decisions out into prose where reviewers can challenge them, instead of letting those decisions hide inside a half-written pull request.
When to write one (and when not to)
Not every change deserves a design doc. The decision is mostly a function of ambiguity and blast radius.
Write a design doc when:
- The problem has more than one reasonable solution
- The change touches more than one team or service
- The work needs senior or cross-team review before it ships
- Security, privacy, or data integrity are in scope
- You're rewriting a system or making a migration that has to land cleanly the first time
Skip the design doc when:
- The fix is obvious and isolated to one file
- You're prototyping to learn, not to ship
- The work is small enough that a pull request description covers it
- The trade-offs are already settled by an existing doc or runbook
Industrial Empathy's writeup on Google's design doc culture puts it sharply: if a doc is really an implementation manual, you should have written the code instead. The whole point is to surface trade-offs, not to narrate steps.
The design doc template
Copy this into your wiki, your repo, or any blank doc. It works for back-end services, front-end features, ML systems, and infrastructure projects with light edits.
# [Project Name] Design Doc
| Field | Value |
|-------|-------|
| Author(s) | Your name, co-authors |
| Reviewers | Names + roles (eng lead, security, product) |
| Status | Draft / In review / Approved / Implemented |
| Last updated | YYYY-MM-DD |
| Related docs | Links to PRDs, ADRs, prior design docs |
## 1. Context and scope
What is the system today? What is the problem we are solving? Why now?
Keep this section short. Two or three paragraphs. Link out to background
material instead of restating it.
## 2. Goals and non-goals
### Goals
- Specific, measurable outcomes this design delivers
- One bullet per goal, not paragraphs
### Non-goals
- Things that could reasonably be goals but are explicitly out of scope
- Bound the work so reviewers know what is and isn't on the table
## 3. Proposed solution
Start with a one-paragraph summary so a busy reader gets the gist.
Then go deeper:
- High-level architecture (diagram or system context)
- Key components and what each owns
- Data model changes (schemas, new tables, new fields)
- API or interface changes (signatures, payloads)
- Sequence of operations for the main flows
Focus on the parts where you made trade-offs. Skip the parts that are
obvious applications of existing patterns.
## 4. Alternatives considered
For each alternative:
- What it is, in one or two sentences
- Why it's plausible
- Why you didn't pick it (the trade-off that tipped the decision)
Two or three alternatives is usually enough. The point is to show the
solution space was explored, not to exhaust it.
## 5. Cross-cutting concerns
### Security and privacy
- New attack surface
- Data classification of anything new being stored
- Auth and access controls
### Reliability and observability
- Failure modes
- Metrics, logs, traces you'll add
- Alert thresholds
### Performance and scale
- Expected load
- Bottlenecks
- Capacity headroom
### Cost
- Infrastructure delta
- Per-request or per-user cost where relevant
## 6. Rollout plan
- Feature flag or staged rollout strategy
- Migration steps if there's existing data
- Backwards compatibility window
- Rollback plan if something regresses
## 7. Open questions
Things you don't have answers to yet, that reviewers might.
List them so they don't get lost in comment threads.
## 8. Timeline
Rough phases with target dates. Two or three milestones is plenty.
This is a planning aid, not a contract.
That's the whole template. Ten to fifteen pages of prose at the high end, two to three pages for a narrower scope. If you're writing more than twenty pages, the project is probably too big for one doc and should be split.
Section-by-section guidance
A template is only as useful as the prose you put inside it. Here's what each section should and shouldn't contain.
Header metadata
The header is the part most people skip and then regret. Status, last-updated date, and the reviewer list are how readers six months later figure out whether the doc still reflects reality. Keep the status accurate as the doc moves: Draft means active editing, In review means feedback is being gathered, Approved means the decision is made, Implemented means the work shipped.
Context and scope
This is for objective background only. What does the current system look like? What's broken or missing? What's changed in the last quarter that makes this work necessary now? No requirements, no opinions, no pitching the solution. The reader should walk away knowing the situation, not the answer.
A common mistake is to bury the actual problem under three pages of system history. If somebody who's been on the team for two years already knows everything in this section, you wrote too much. Trim until only the new information remains.
Goals and non-goals
Goals are specific outcomes the work delivers. "Reduce p99 latency on /search from 800ms to under 250ms" is a goal. "Make search fast" is a wish. Tie each goal to something you can measure or demo.
Non-goals are the underrated half of this section. They aren't anti-goals like "don't crash." They're things that would be reasonable goals but aren't part of this project. "ACID compliance" might be a non-goal for a metrics pipeline. "Real-time streaming" might be a non-goal for a daily reporting service. Listing them prevents reviewers from asking "what about X?" five times.
Proposed solution
Lead with one paragraph that summarizes the entire approach. Anyone too busy to read the whole doc should be able to skim that paragraph and the alternatives section and have a sense of what you're doing.
Then expand. Use diagrams where they actually save words, especially for cross-service flows. Skip them when prose is faster. Don't paste full schema or interface definitions; they go stale and clutter the trade-off discussion. Sketch the API surface that matters and link out to the formal definition once it exists.
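To show what "sketch the API surface" means in practice, here's the level of detail that belongs in a design doc. The endpoint, fields, and status code are purely illustrative, not from any real system:

```
POST /v1/notifications
  { "channel": "email" | "push", "recipient_id": string, "template": string }
  → 202 Accepted, { "job_id": string }
```

Three lines capture the contract a reviewer needs to challenge. The full request schema, error taxonomy, and retry semantics belong in the formal spec, written after the design is approved.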
The section should focus on decisions, not steps. "We will use Postgres logical replication" is a decision. "First we open a connection, then we run a query, then we close the connection" is an implementation manual.
Alternatives considered
The most useful section in any design doc. Two to three alternatives, each with a one-paragraph summary and an explicit trade-off statement. "We considered Kafka. It's the right tool for high-throughput, multi-consumer pipelines, but our load is 50 events/sec and we already operate Postgres, so the operational cost outweighed the throughput headroom."
Every alternative should be one a smart reviewer might suggest. If you list a strawman nobody would actually propose, the section reads as defensive instead of rigorous.
Cross-cutting concerns
Security, privacy, reliability, observability, performance, cost. Most teams require these as a checklist because skipping them is how production incidents start. Each subsection should be short, often a single paragraph, but explicit about how the design addresses the concern.
If a concern doesn't apply, say so and why. "No new PII is stored, so privacy review is not required" is a valid entry. Empty sections are not.
Rollout plan
Feature flag, percentage rollout, or full deploy. Migration steps for any existing data. The backwards-compatibility window if you're changing an interface. Rollback plan that doesn't depend on the thing you're rolling out being healthy.
This is the section that gets sliced the thinnest and bites teams the hardest. If the rollout plan fits in two bullets, the system either has no risk or you haven't thought about it enough.
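To make the staged-rollout idea concrete, here's a minimal sketch of deterministic percentage bucketing, the mechanism behind a 1% → 10% → 100% ramp. The function and flag names are illustrative, not from any particular feature-flag library:

```python
import hashlib

def in_rollout(user_id: str, flag: str, percent: int) -> bool:
    """Deterministically bucket a user into a percentage rollout.

    Hashing flag + user_id gives each user a stable bucket in [0, 100),
    so ramping 1% -> 10% -> 100% only ever adds users to the rollout;
    nobody already in the new path gets flipped back out mid-ramp.
    """
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent
```

The hash is salted with the flag name so different rollouts get independent user populations, and the bucketing is stateless, which keeps the rollback story simple: drop the percentage and the same deterministic check routes traffic back.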
Open questions and timeline
The open questions section is where you get to be honest: list the things you genuinely don't know. Reviewers love it because it tells them where their input matters. The timeline is rough phases with target weeks or months. Three milestones is plenty.
Worked example 1: notification service migration
Here's what a slim version looks like for a real-shaped project.
# Move notifications from monolith to dedicated service
| Author | Sara C. |
| Reviewers | Eng lead, platform team, on-call rotation |
| Status | In review |
## Context and scope
Notifications (email, push, in-app) live in the monolith. They share a
database with billing, which has caused two incidents this quarter where
notification load slowed down checkout. We need to isolate them.
## Goals
- Notifications run in their own service with its own DB
- p99 send latency under 2s
- Zero dropped notifications during cutover
## Non-goals
- Adding new notification channels (SMS, Slack)
- Per-user notification preferences UX
- Replacing the templating engine
## Proposed solution
Stand up a Go service behind an internal HTTP API. Owns its own Postgres
DB. Monolith publishes notification jobs to a Redis queue; new service
consumes. Cutover via feature flag per-channel, starting with in-app.
## Alternatives considered
- Keep notifications in monolith, add resource isolation (DB read replica,
separate worker pool). Faster to ship but doesn't solve the shared-DB
blast radius. Rejected.
- Use SNS/SQS instead of internal Redis queue. Operationally simpler but
adds a vendor dependency we don't have elsewhere. Held for v2.
## Cross-cutting concerns
- Security: same auth model as monolith, internal-only endpoints
- Reliability: at-least-once delivery, dedup by job ID
- Observability: per-channel send rate, latency, failure rate metrics
- Cost: ~$80/mo additional infrastructure
## Rollout plan
1. Deploy service with no traffic, verify health checks
2. Cutover in-app notifications behind flag, 1% -> 10% -> 100%
3. Cutover push, then email, same ramp
4. Decommission monolith notification code after 30-day soak
## Open questions
- Do we need cross-region replication, or is single-region acceptable?
- Should the queue be Redis Streams or BullMQ?
Three pages, every section earns its place, alternatives are real options a reviewer might raise. That's the bar.
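The queue contract in that example, at-least-once delivery with dedup by job ID, is worth a sketch. This uses an in-memory stand-in for the Redis queue and the dedup store; the class and field names are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class NotificationConsumer:
    """Consumes jobs at-least-once; dedup by job ID makes the send idempotent."""

    seen: set = field(default_factory=set)    # in prod: a Redis SET or DB table with a TTL
    sent: list = field(default_factory=list)  # stand-in for actual deliveries

    def handle(self, job: dict) -> bool:
        """Process one job; return False if it's a redelivered duplicate."""
        job_id = job["job_id"]
        if job_id in self.seen:
            return False  # at-least-once queues may redeliver; second copy is a no-op
        self.seen.add(job_id)
        self.sent.append((job["channel"], job["recipient"]))  # the "real send" goes here
        return True
```

The design trade-off this encodes is the classic one: exactly-once delivery from the queue is expensive or impossible, so the doc pushes idempotency to the consumer instead, which is cheap when every job carries a unique ID.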
Worked example 2: search ranking experiment
The same template, narrower scope.
# Add learning-to-rank model to /search
| Author | Dev R. |
| Reviewers | Search team, ML platform |
| Status | Draft |
## Context and scope
/search currently uses BM25 with hand-tuned field boosts. CTR on the top
3 results has been flat for two quarters. Click-through data on 8M
queries is now available in the warehouse.
## Goals
- Rerank top 50 BM25 results with a learned model
- Improve top-3 CTR by 10% on the holdout set
- Hold p99 search latency under 350ms
## Non-goals
- Replacing BM25 as the recall stage
- Personalization (per-user signals)
- Rebuilding the search UI
## Proposed solution
Train a LightGBM model offline on click-through data. Serve via a sidecar
container that the search service calls after BM25. Features: text match
scores, document age, document type, popularity.
## Alternatives considered
- BERT-based reranker. Higher ceiling but 5-10x latency cost. Park for v2.
- Heuristic rules tuned on click data. Faster to ship but doesn't
generalize and adds maintenance burden. Rejected.
## Cross-cutting concerns
- Reliability: fall back to pure BM25 if reranker times out (50ms budget)
- Observability: log feature values + final score for offline analysis
- Privacy: training data is aggregated query stats, no user IDs
## Rollout plan
- Shadow mode for 2 weeks (compute scores, don't apply)
- A/B test 10% traffic for 1 week
- Ramp to 100% if CTR target hits and latency holds
Both examples fit on a single screen but contain every section a reviewer needs. That's the goal: enough structure to surface trade-offs, not so much that the doc becomes an artifact unto itself.
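The reliability line in the second example, falling back to pure BM25 when the reranker blows its 50ms budget, is a pattern worth spelling out. A minimal sketch under stated assumptions: the reranker call is a placeholder for the real sidecar RPC, and the pool size is arbitrary:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FuturesTimeout

# Reuse one worker pool across requests; tearing a pool down per call would
# block on the slow reranker and defeat the point of the timeout.
_pool = ThreadPoolExecutor(max_workers=4)

def rerank_with_fallback(bm25_results, rerank_fn, budget_s=0.05):
    """Apply the reranker within a time budget; on timeout, keep the BM25 order."""
    future = _pool.submit(rerank_fn, bm25_results)
    try:
        return future.result(timeout=budget_s)
    except FuturesTimeout:
        return bm25_results  # degrade gracefully to the recall-stage ranking
```

The key property for the design doc is that the failure mode is a quality degradation, not an outage: a timed-out reranker returns the results users were getting before the project shipped.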
Design doc vs ADR vs RFC
These three artifacts overlap, and engineers conflate them constantly. Each has a different job.
| Artifact | Time horizon | Audience | Purpose | Status changes |
|---|---|---|---|---|
| Design doc | Pre-implementation | Team + reviewers | Propose how to build something | Draft, in review, approved |
| ADR | At decision time | Team + future engineers | Record why a decision was made | Proposed, accepted, superseded |
| RFC | Org-wide proposal | Wider org | Get cross-team feedback on a change | Open, accepted, withdrawn |
A design doc is forward-looking. You write it before the work, you iterate during review, you implement against it, and once the work ships it settles into a historical record. An architecture decision record is a single decision frozen in time, written when the call gets made, and never edited again. An RFC is broader and more public, used when a proposal touches many teams or external contributors.
In practice, many teams use design docs and ADRs together. The design doc proposes the system. After the proposal is approved, the key decisions inside it (which database, which queue, which protocol) get extracted into one-page ADRs. Future engineers reading the codebase find the ADR fast; engineers wanting full context find the design doc.
Where teams keep design docs
Most engineering teams default to Google Docs or Notion for the writing-and-review phase, then either leave docs there or copy them into a more permanent home. Both work. The trade-off is review speed versus searchability.
Google Docs makes line-by-line review easy. Notion is friendlier for long-term knowledge. Neither is great when you want a public-facing or customer-visible archive of how systems were designed. That's where teams that publish their internal documentation on a real docs site get an advantage: the design doc archive becomes searchable, linkable from code comments, and discoverable to new engineers without granting Drive permissions to half the org.
For teams who want this without standing up a docs platform from scratch, Docsio generates a branded internal docs site you can drop a design doc archive into in minutes. Pair it with a docs-driven development workflow and the design doc becomes the artifact engineers actually open six months later.
Common pitfalls
The design doc template is easy. The discipline around it is what makes the difference between a doc that earns review time and a doc that gets approved without comments because nobody read it.
Skipping non-goals. A doc with goals but no non-goals invites scope creep. Reviewers will keep asking "have you considered X?" until you list X as out of scope.
One alternative. If the alternatives section has only one entry, you're either covering yourself or you didn't actually consider options. Two or three real alternatives, each with an honest trade-off statement, is the bar.
Implementation manual disguised as design. If your "proposed solution" reads like a numbered list of steps, you've written a runbook, not a design. Cut the steps and add the trade-offs you made when picking the approach.
Empty cross-cutting sections. "Security: N/A" is a red flag. Either the concern doesn't apply (say why) or you haven't thought about it. Either way, write the sentence.
No rollout plan. This is where production incidents are born. A design doc that doesn't say how the change reaches users is half-finished.
Letting status drift. A doc marked "Draft" that was approved six months ago and shipped is misleading. Update the header when status changes. It takes ten seconds.
Writing for permission instead of clarity. A doc that exists to get approval reads defensively, hedges every claim, and avoids the hard trade-offs. A doc that exists to align the team reads honestly, makes claims, and surfaces problems early.
Putting the template into practice
Pick the template above, paste it into wherever your team writes, and start with one design doc this week. Pick a project that's ambiguous enough to benefit from review but small enough that the doc takes two hours, not two days. Share it with two reviewers, iterate, then ship the work.
The reps matter more than the format. After three or four design docs, your team will know what each section needs, and the documents start to write themselves. The template fades into the background and the trade-offs come to the front, which is the point.
If you want to wire design docs into your engineering process more deliberately, the documentation strategy guide covers how to slot them alongside runbooks, ADRs, and onboarding docs without creating a documentation tax. And if you'd rather see a generic engineering-spec template for adjacent artifacts, the technical documentation template is a useful companion.
FAQ
What is a design doc template?
A design doc template is a reusable structure engineering teams use to propose how a system or feature will be built before implementation starts. It typically includes context, goals and non-goals, proposed solution, alternatives considered, cross-cutting concerns, and a rollout plan. The template forces authors to surface trade-offs early.
How long should a design doc be?
Most design docs land between 2 and 15 pages depending on scope. A small feature or migration fits on 2-3 pages. A large system rewrite might run 10-15 pages. If a doc passes 20 pages, the project is usually too big for one doc and should split into a parent doc plus children.
What's the difference between a design doc and an ADR?
A design doc is a forward-looking proposal written before the work, covering the full solution and trade-offs. An ADR (architecture decision record) is a short, single-decision artifact written when a call gets made and never edited after. Teams often use both: the design doc proposes, then key decisions inside it get extracted into ADRs.
Do small startups need design docs?
Small teams benefit most when the work is ambiguous, touches more than one person, or has to land cleanly the first time. For obvious or isolated changes, a thoughtful pull request description covers the same ground. The trigger isn't team size. It's whether the trade-offs are worth surfacing in prose before code starts.
What tools should I use to write design docs?
Most teams write in Google Docs or Notion during review because line-by-line commenting is fast there. After approval, copying the doc into a searchable internal docs site preserves it as a long-term reference. The exact tool matters less than keeping the doc alive after it ships.
