How to Do Threat Modeling in 4 Steps
Varun Kumar 10 min read 21 May 2026
Threat modeling is the practice of identifying what can go wrong with your system, deciding what to do about it, and documenting that decision. That's it. The output isn't a perfect risk register or a signed-off design document. It's a prioritized list of threats and your response to each.
Done in 30 minutes during sprint planning, that list is worth more than a theoretical STRIDE analysis that sits in Confluence and never gets read.
I've seen teams spend three weeks on a threat model and ship the exact vulnerability they were modeling against, because the model was too abstract to produce action items. Speed matters. Completeness doesn't. Good enough and shipped beats perfect and abandoned every single time.
What Threat Modeling Actually Produces
Before you start, get clear on the output. A threat model produces two things: a list of things that can go wrong, and a decision about each one. "Mitigate," "accept," or "transfer" are the only valid responses. "We'll look at this later" is not a decision. It's a risk that owns itself.
The decision log is as important as the threat list. If you accept a risk, write down why. That audit trail protects you when something goes wrong and someone asks why you didn't address it. Without the log, you can't tell whether a gap was missed or consciously accepted.
Most teams also expect the threat model to feed their backlog directly. Each threat you decide to mitigate should produce a ticket. If it doesn't, the threat model is theater.
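One way to keep the decision log honest is to make each entry a small record that can tell you when it is incomplete. The sketch below is an illustration, not a standard format: the `ThreatDecision` class, its field names, and the rule that a transfer needs a rationale are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Response(Enum):
    MITIGATE = "mitigate"
    ACCEPT = "accept"
    TRANSFER = "transfer"


@dataclass
class ThreatDecision:
    finding: str                     # plain-sentence description of the threat
    response: Response               # one of the three valid responses
    rationale: Optional[str] = None  # why; expected for accept and transfer
    ticket: Optional[str] = None     # backlog ticket ID; expected for mitigate

    def problems(self) -> list[str]:
        """Return the reasons this entry is not yet a documented decision."""
        issues = []
        if self.response is Response.MITIGATE and not self.ticket:
            issues.append("mitigate without a ticket")
        if self.response in (Response.ACCEPT, Response.TRANSFER) and not self.rationale:
            issues.append(f"{self.response.value} without a written rationale")
        return issues


entry = ThreatDecision(
    finding="The /admin endpoint only checks that the user is authenticated.",
    response=Response.MITIGATE,
)
print(entry.problems())
```

An entry with an empty `problems()` list is a decision. Anything else is still "we'll look at this later" in disguise.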
Step 1: Draw Your System
You don't need UML. You need boxes, arrows, and trust boundaries. That's a Data Flow Diagram (DFD), and a whiteboard version takes 10 minutes.
Draw every component that handles data: services, databases, caches, queues, external APIs, user browsers, admin interfaces. Draw every arrow that shows data moving between them. Then draw trust boundaries as dotted lines around groups of components that share the same security context.
A trust boundary is where you'd need to authenticate or authorize. The line between your frontend and your API is a trust boundary. The line between your API and your internal database probably isn't, unless your database is in a separate network segment. The line between your API and a third-party payment processor definitely is.
OWASP Threat Dragon and Microsoft Threat Modeling Tool both support DFDs natively. For teams already in Miro, a three-shape diagram with sticky note annotations works just as well. Don't let tool choice delay the diagram.
Label every arrow with the protocol and data type. "HTTPS: user credentials" is more useful than just an arrow. This labeling is what makes STRIDE analysis tractable in the next step.
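If the whiteboard diagram later needs to live in the repo, the same boxes, arrows, and boundaries fit in a few lines of code. This is a minimal sketch: the `Component` and `Flow` classes, the boundary names, and the example flows are all hypothetical, and it assumes each component sits in exactly one trust boundary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    boundary: str  # name of the trust boundary this component sits inside


@dataclass(frozen=True)
class Flow:
    source: Component
    target: Component
    protocol: str
    data: str

    @property
    def label(self) -> str:
        # The arrow label: protocol plus the kind of data it carries.
        return f"{self.protocol}: {self.data}"

    @property
    def crosses_boundary(self) -> bool:
        return self.source.boundary != self.target.boundary


browser = Component("Browser", boundary="internet")
api = Component("API", boundary="internal")
database = Component("Database", boundary="internal")
payments = Component("Payment processor", boundary="third-party")

flows = [
    Flow(browser, api, "HTTPS", "user credentials"),
    Flow(api, database, "SQL over TLS", "order rows"),
    Flow(api, payments, "HTTPS", "payment tokens"),
]

for flow in flows:
    note = "crosses a trust boundary" if flow.crosses_boundary else "stays inside one boundary"
    print(f"{flow.source.name} -> {flow.target.name} [{flow.label}] {note}")
```

The flows that cross a boundary are the ones that deserve the most attention when you start asking what can go wrong.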
Step 2: What Can Go Wrong? (STRIDE as a Checklist)
STRIDE is a checklist, not a methodology. For each component and each data flow, ask six questions:
- Spoofing: Can someone pretend to be this component or this user?
- Tampering: Can someone modify data in transit or at rest?
- Repudiation: Can someone deny they performed an action?
- Information Disclosure: Can someone read data they shouldn't?
- Denial of Service: Can someone make this component unavailable?
- Elevation of Privilege: Can someone gain permissions they shouldn't have?
Work through each arrow and each component on your DFD. Not every STRIDE category applies to everything. A message queue doesn't have an elevation of privilege concern the same way a service account does. Use judgment.
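The checklist above is mechanical enough to generate. The sketch below pairs each element with each STRIDE question and lets you record the pairings you have judged not applicable. The `checklist` function, the `skip` mapping, and the element names are assumptions for illustration.

```python
STRIDE = {
    "Spoofing": "Can someone pretend to be this component or this user?",
    "Tampering": "Can someone modify data in transit or at rest?",
    "Repudiation": "Can someone deny they performed an action?",
    "Information Disclosure": "Can someone read data they shouldn't?",
    "Denial of Service": "Can someone make this component unavailable?",
    "Elevation of Privilege": "Can someone gain permissions they shouldn't have?",
}


def checklist(elements, skip=None):
    """Yield (element, category, question) for every pairing not skipped.

    `skip` maps an element name to the set of categories judged not applicable.
    """
    skip = skip or {}
    for element in elements:
        for category, question in STRIDE.items():
            if category in skip.get(element, set()):
                continue
            yield element, category, question


elements = ["Browser -> API (HTTPS: user credentials)", "Message queue"]
not_applicable = {"Message queue": {"Elevation of Privilege"}}

prompts = list(checklist(elements, not_applicable))
for element, category, question in prompts:
    print(f"[{element}] {category}: {question}")
```

The generator only produces prompts. Deciding which ones turn into findings is still the judgment call the session exists for.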
Write findings as plain sentences. "The /admin endpoint doesn't check if the user has the admin role, only that they're authenticated" is a finding. "Elevation of privilege risk in admin module" is not. Specificity determines whether someone can act on it.
pytm, the Python-based threat modeling library, generates diagrams and a list of candidate threats from a data flow diagram defined in code. If your team has Python skills, the tm.py file becomes a living threat model: update it in the same change as the architecture and rerun it to regenerate the output.
Step 3: What Are You Doing About It?
For every finding, pick a response: mitigate, accept, or transfer. No action is also a decision, but it needs to be documented as acceptance, not as oversight.
Mitigate means you're adding a control. That control goes into the sprint backlog. Vague mitigations like "improve authentication" don't count. The ticket needs to say "add role check on GET /api/admin/users using the existing require_role('admin') middleware."
Accept means the risk is real but the cost of mitigation outweighs the impact. Write down the justification. "The admin endpoint is only accessible on the internal VPN, so the attack surface is limited to internal users" is an acceptance rationale.
Transfer is less common in software: it usually means you're relying on a third-party control (your cloud provider's DDoS protection, for example) or your insurance policy covers it.
Most findings in a 30-minute sprint session end up as mitigate or accept. If you have more than 10 findings from a session, you're probably either modeling too broadly or your system has real problems worth escalating.
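To make the mitigation ticket quoted above concrete, here is roughly what a role check of that shape could look like. The article only names a `require_role('admin')` middleware, so everything below is a hypothetical stand-in: the decorator body, the `Forbidden` exception, the `user` dictionary shape, and the handler are assumptions, not any particular framework's API.

```python
from functools import wraps


class Forbidden(Exception):
    """Raised when an authenticated user lacks the required role."""


def require_role(role):
    """Decorator that rejects the call unless user['roles'] contains `role`."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(user, *args, **kwargs):
            # Authentication already happened upstream; this is the authorization step.
            if role not in user.get("roles", ()):
                raise Forbidden(f"role '{role}' required")
            return handler(user, *args, **kwargs)
        return wrapper
    return decorator


@require_role("admin")
def list_admin_users(user):
    # Stand-in for the GET /api/admin/users handler.
    return ["alice", "bob"]


print(list_admin_users({"name": "root", "roles": ["admin"]}))
```

A ticket written at this level of detail can be picked up without a follow-up meeting, which is the point of rejecting "improve authentication" as a mitigation.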
Step 4: Validate Your Work
This is the step most teams skip. Before you close the threat model, do a quick sanity check.
Ask three questions:
- Did we cover every trust boundary crossing? If you have arrows that cross a dotted line and no STRIDE finding associated with them, that's a gap.
- Is there a known CVE for any technology we're using in this data flow? CVE-2021-44228 (Log4Shell) caught teams that had never thought about their logging pipeline as an attack surface.
- Does our threat model match what's actually deployed? I've reviewed threat models built on architecture diagrams that were two years out of date.
The validation step takes five minutes. It catches the most embarrassing gaps.
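The trust boundary coverage check is easy to automate once flows and findings are written down in any structured form. A minimal sketch, assuming flows and findings are plain dictionaries with the hypothetical keys shown in the docstring:

```python
def uncovered_crossings(flows, findings):
    """Return names of boundary-crossing flows that have no finding attached.

    flows:    list of dicts with "name" and "crosses_boundary" keys
    findings: list of dicts with a "flow" key naming the flow they relate to
    """
    covered = {finding["flow"] for finding in findings}
    return [
        flow["name"]
        for flow in flows
        if flow["crosses_boundary"] and flow["name"] not in covered
    ]


flows = [
    {"name": "Browser -> API", "crosses_boundary": True},
    {"name": "API -> Payment processor", "crosses_boundary": True},
    {"name": "API -> Cache", "crosses_boundary": False},
]
findings = [
    {"flow": "Browser -> API", "text": "Login endpoint has no rate limiting."},
]

gaps = uncovered_crossings(flows, findings)
print(gaps)
```

An empty result doesn't prove the model is complete. It only tells you that nobody skipped a boundary crossing entirely.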
Worked Example: Login + API Service
Walking through an abstract process is less useful than watching it applied. Here's a STRIDE analysis for a concrete system: a login service that authenticates users, issues JWTs, and hands off to a REST API that reads from a PostgreSQL database.
The system: the browser sends credentials to POST /auth/login. The Auth Service validates the credentials against the Users table and issues a signed JWT. The browser includes that JWT in subsequent requests to GET /api/orders. The Orders Service verifies the JWT, queries the Orders table, and returns results.
DFD trust boundaries:
- Browser to Auth Service (internet-facing, unauthenticated at this point)
- Browser to Orders Service (internet-facing, JWT-authenticated)
- Auth Service to Users DB (internal network, service account)
- Orders Service to Orders DB (internal network, service account)
Findings that produce tickets:
- Add rate limiting on /auth/login: 10 attempts per username per 15 minutes (SEC-301)
- Change error message to generic "Invalid credentials" for both bad username and bad password (SEC-302)
- Restrict Auth Service DB user to SELECT on users table, revoke all other permissions (SEC-303)
- Add structured logging for all login attempts: timestamp, IP, username (hashed), success/failure (SEC-304)
- Verify JWT algorithm is pinned to RS256 in Orders Service middleware (SEC-305)
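As an illustration of how small the first two tickets are, here is a sketch of SEC-301 and SEC-302 together. It assumes a single-process, in-memory counter, which a real deployment would likely replace with a shared store. `LoginRateLimiter`, `login`, and the `verify` callable are hypothetical names, and the 429 response for a throttled attempt is an assumption the tickets don't specify.

```python
import time
from collections import defaultdict, deque

MAX_ATTEMPTS = 10          # SEC-301: attempts allowed per username...
WINDOW_SECONDS = 15 * 60   # ...within a 15-minute window
GENERIC_ERROR = "Invalid credentials"


class LoginRateLimiter:
    """Sliding-window counter of login attempts per username."""

    def __init__(self, max_attempts=MAX_ATTEMPTS, window=WINDOW_SECONDS,
                 clock=time.monotonic):
        self.max_attempts = max_attempts
        self.window = window
        self.clock = clock
        self.attempts = defaultdict(deque)

    def allow(self, username):
        now = self.clock()
        recent = self.attempts[username]
        # Drop attempts that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_attempts:
            return False
        recent.append(now)
        return True


def login(limiter, verify, username, password):
    """Return (status_code, message).

    `verify(username, password)` is a hypothetical callable that returns True
    only when the credentials are valid.
    """
    if not limiter.allow(username):
        return 429, "Too many attempts"
    # SEC-302: same message whether the username or the password is wrong.
    if not verify(username, password):
        return 401, GENERIC_ERROR
    return 200, "OK"
```

The limiter takes an injectable clock so the 15-minute window can be tested without waiting 15 minutes.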
Common Mistakes That Kill Threat Modeling Programs
These patterns come up repeatedly across teams that try threat modeling and abandon it.
Modeling the happy path, not the system:
Engineers naturally describe how the system works when everything is correct. Threat modeling requires thinking about how the system behaves when inputs are wrong, malicious, or missing. Explicitly ask: "what happens if this input is null? What happens if this service is down? What happens if this user is an attacker?" If you only model the sunny-day flow, you produce a DFD, not a threat model.
Conflating threat modeling with vulnerability scanning:
A threat model identifies threats. A vulnerability scanner finds known CVEs. They're different things. I've seen teams point at their Trivy scan results and call it a threat model. Trivy tells you about CVE-2024-1234 in your base image. Threat modeling tells you that an SSRF flaw in the service running on that image would let an attacker reach your internal metadata service. You need both, but they're not interchangeable.
Producing a document nobody updates:
A threat model that isn't updated when architecture changes is worse than no threat model. It's a false sense of security. The model says you have mTLS between services. The actual deployment doesn't because someone forgot to add it to the new service that shipped last quarter. Treat the threat model like a test: it should fail when the architecture drifts from what it describes. Threagile does this by generating threat models from code-defined architecture YAML files. If the YAML doesn't match the infrastructure, the model is obviously out of date.
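The "treat it like a test" idea can be sketched without any particular tool. The function below is not how Threagile works internally. It is a hypothetical comparison between what the model claims and what an inventory of the deployment reports, with both sides assumed to be dictionaries keyed by flow name.

```python
def drift(model_flows, deployed_flows):
    """Compare what the threat model claims with what is deployed.

    Both arguments map a flow name to its properties, e.g. {"mtls": True}.
    Returns a list of human-readable mismatches; empty means no drift found.
    """
    problems = []
    # Flows that exist in production but were never modeled.
    for name in sorted(deployed_flows.keys() - model_flows.keys()):
        problems.append(f"{name}: deployed but missing from the threat model")
    # Flows that were modeled, but whose properties no longer match.
    for name in sorted(model_flows.keys() & deployed_flows.keys()):
        for key, expected in model_flows[name].items():
            actual = deployed_flows[name].get(key)
            if actual != expected:
                problems.append(
                    f"{name}: model says {key}={expected}, deployed has {key}={actual}"
                )
    return problems


model = {"orders -> billing": {"mtls": True}}
deployed = {
    "orders -> billing": {"mtls": True},
    "orders -> notifications": {"mtls": False},  # new service, never modeled
}

problems = drift(model, deployed)
print(problems)
```

Run as part of CI, a non-empty result fails the build, which is the behavior you want when the architecture drifts from the model.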
Making security the gatekeeper:
Threat modeling fails when security owns the output and developers are attendees. Developers need to own the DFD and the mitigation tickets. Security's role is to ask the hard questions and help classify threats correctly. If developers leave a threat modeling session without having drawn the diagram themselves, they don't have a mental model of their own system's attack surface. That defeats the purpose entirely.
Threat modeling only new features:
Every significant dependency upgrade, infrastructure change, and third-party integration warrants a threat model update. Log4Shell affected teams who had never thought about their logging pipeline as an attack surface. A threat model of Log4j-dependent services would not have found the bug itself, but it could have flagged attacker-controlled input flowing into the logging pipeline as a risk before the vulnerability became public.
Threat Modeling in Agile: The 30-Minute Ritual
The practical approach: 30 minutes at the start of sprint planning for any feature that touches authentication, authorization, data storage, or external integrations.
Draw the data flow for the feature on a whiteboard or in Miro. Run STRIDE. Write findings. Create tickets. Done. The threat model lives as a photo of the whiteboard attached to the Jira epic, or as a Threat Dragon JSON file committed to the repo.
Don't maintain a separate threat modeling document. Keep it in the same place as the architecture decision records. When the architecture changes, update the threat model in the same PR.
More Common Mistakes
Scope creep. You don't need to model your entire platform in one session. Model the feature in front of you. The platform threat model is a separate artifact that evolves over quarters, not sprints.
Analysis paralysis. If you're debating whether something is a Tampering threat or an Information Disclosure threat, it doesn't matter. Pick one, write the finding, move on. The STRIDE category is just a prompt to help you think. It's not load-bearing.
Treating it as a one-time activity. I've seen teams do a thorough threat model in Q1 and then add three new external integrations over the year without revisiting it. The threat model needs to live alongside the system. Every significant architecture change should trigger an update.
Inviting too many people. A good threat modeling session has three to five people, including at least an engineer who built the feature, a security engineer, and a product person who can make risk acceptance decisions. More than five and you'll spend 20 minutes agreeing on scope.
Varun is a Security Research Writer specializing in DevSecOps, AI Security, and cloud-native security. He takes complex security topics and makes them straightforward. His articles provide security professionals with practical, research-backed insights they can actually use.
