Find bugs in Sentry backend code by checking for the patterns that cause the most production errors.
This skill encodes patterns from 638 real production issues (393 resolved, 220 unresolved, 25 ignored) generating over 27 million error events across 65,000+ affected users. These are not theoretical risks -- they are the actual bugs that ship most often, with known fixes from resolved issues.
## Scope
You receive scoped code chunks from Warden's diff pipeline. Each chunk is a changed hunk (or coalesced group of nearby hunks) with surrounding context.
Analyze the chunk against the pattern checks below.
Use `Read` and `Grep` to trace data flow beyond the chunk when needed — follow function calls, check callers, verify types at boundaries.
Report only HIGH and MEDIUM confidence findings.
| Confidence | Criteria | Action |
| --- | --- | --- |
| HIGH | Traced the code path, confirmed the pattern matches a known bug class | Report with fix |
| MEDIUM | Pattern is present but context may mitigate it | Report as needs verification |
| LOW | Theoretical or mitigated elsewhere | Do not report |
## Step 1: Classify the Code
Determine what you are reviewing and load the relevant reference.
Alert and metric subscriptions referencing tags or functions that do not exist in the target dataset. These fire continuously once created.
Red flags:

- Creating Snuba subscriptions with `SubscriptionData` using user-provided query strings without validation
- Referencing `transaction.duration` in p95/p99 functions on the metrics dataset (it is a string type there)
- Using custom tag names (e.g., `customerType`) as filter dimensions without checking they exist
- Calling `resolve_apdex_function` without verifying the dataset supports threshold parameters
Safe patterns:

- Validate query fields against the dataset schema before subscription creation
- Wrap `_create_in_snuba` calls with try/except `SubscriptionError` and mark the subscription as invalid
- Use `IncompatibleMetricsQuery` checks before building metric subscription queries
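A minimal sketch of the try/except safe pattern. `SubscriptionError` and `_create_in_snuba` here are local stubs mirroring the Sentry names above, and the rejected-tag logic is invented for illustration; the point is marking the subscription invalid instead of letting a bad query fire continuously.

```python
class SubscriptionError(Exception):
    """Stand-in for the error raised when Snuba rejects a subscription query."""

def _create_in_snuba(subscription):
    # Stub: the real call would register the query with Snuba and return the
    # remote subscription ID. Here we reject a tag missing from the dataset.
    if "customerType" in subscription["query"]:
        raise SubscriptionError("unknown tag: customerType")
    return "remote-sub-id"

def create_subscription(subscription):
    # Safe pattern: wrap the remote call, and on failure mark the
    # subscription invalid rather than creating one that errors forever.
    try:
        subscription["subscription_id"] = _create_in_snuba(subscription)
        subscription["status"] = "active"
    except SubscriptionError:
        subscription["status"] = "invalid"
    return subscription

print(create_subscription({"query": "customerType:enterprise"})["status"])  # → invalid
print(create_subscription({"query": "event.type:error"})["status"])         # → active
```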
### Check 2: Missing Record / Stale Reference -- 81 issues, 1,403,592 events
Code calls `.get()` on a Django model assuming the record exists, but it has been deleted, merged, or never created.
Red flags:

- `Model.objects.get(id=some_id)` without try/except for `DoesNotExist`
- `Detector.objects.get(id=detector_id)` in workflow engine without handling deletion
- `Environment.objects.get(name=env_name)` in monitor/cron consumers
- `Subscription.objects.get(id=sub_id)` in billing tasks
- Using `Group.objects.get()` with IDs from Snuba query results (groups may be deleted/merged)
- Chained lookups where the second `.get()` fails
Safe patterns:

- `Model.objects.filter(...).first()` with a None check
- try/except `DoesNotExist` that returns a graceful fallback (404, skip, log)
- Queryset `.exists()` check before `.get()`
- In API endpoints: return 404 for `DoesNotExist`, 400 for validation errors. Never suggest returning 500 intentionally.
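A sketch of the graceful-fallback pattern. `Detector` and its in-memory "table" are stubs standing in for the Django model; in real code the except branch would log and return a 404 or skip the work item.

```python
_DETECTORS = {1: "detector-one"}  # stand-in for the database table

class DoesNotExist(Exception):
    pass

class Detector:
    """Stub mimicking the Django model API used in the red flags above."""
    DoesNotExist = DoesNotExist

    class objects:
        @staticmethod
        def get(id):
            if id not in _DETECTORS:
                raise Detector.DoesNotExist(f"Detector {id} not found")
            return _DETECTORS[id]

def process_detector(detector_id):
    # Safe pattern: the record may have been deleted between enqueue and
    # execution, so catch DoesNotExist and skip instead of crashing.
    try:
        detector = Detector.objects.get(id=detector_id)
    except Detector.DoesNotExist:
        return None  # deleted/merged record: log and skip in real code
    return detector

print(process_detector(1))   # → detector-one
print(process_detector(99))  # → None
```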
Not a bug — do not flag:

- **Infrastructure invariants:** `.get()` enforcing a deployment precondition (e.g., "default org must exist in single-org mode") should crash — a 500 signals misconfiguration, not a code defect.
- **Already validated by parent:** if the endpoint base class validates the object (e.g., `OrganizationEndpoint` resolves the org), don't flag `.get()` on related records unless there's a genuine race or deletion window. Read the endpoint's parent class before reporting.
- **Configuration lookups:** code that loads required config objects (`get_default()`, settings-based lookups) is expected to fail hard if the config is wrong.
Dictionary mutation during iteration, shared mutable state, and unimplemented code paths.
Red flags:

- `for key in self._dict:` while another thread modifies it (RuntimeError)
- Publishing to a shared `KafkaPublisher` dict that grows unbounded
- `dict.pop()` or `dict[key] = value` on a dict being iterated in another thread
- Missing `NotImplementedError` handlers for new search expression types
Safe patterns:

- `dict.copy()` before iteration
- Use `threading.Lock` for shared mutable state
- Implement all code paths before enabling features
- Use `list(dict.keys())` for safe iteration when mutation is needed
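The last two patterns combined, as a minimal sketch (the dict contents and threshold logic are invented for illustration):

```python
import threading

shared = {"a": 1, "b": 2, "c": 3}
lock = threading.Lock()

def drop_below(threshold):
    # Snapshot the keys with list(...) before mutating: iterating `shared`
    # directly while calling pop() would raise
    # RuntimeError: dictionary changed size during iteration.
    with lock:  # guards against other threads mutating concurrently
        for key in list(shared.keys()):
            if shared[key] < threshold:
                shared.pop(key)

drop_below(3)
print(sorted(shared))  # → ['c']
```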
### Check 11: Logic Correctness -- not pattern-based
After checking all known patterns above, reason about the changed code itself:
- Does every code path return the correct type?
- Are all branches of conditionals handled (especially `else` / default cases)?
- Can any input (None, empty list, 0, empty string) cause unexpected behavior?
- Are there off-by-one errors in loops, slices, or range checks?
- If this code runs concurrently, is shared state protected?
Only report if you can trace a specific input that triggers the bug. Do not report theoretical concerns.
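As an example of tracing a specific triggering input, here is an invented slice bug that only a concrete input (`n == 0`) exposes; `last_n` is hypothetical, not Sentry code:

```python
def last_n(items, n):
    # Bug: when n == 0, items[-0:] is items[0:], i.e. the whole list.
    return items[-n:]

def last_n_fixed(items, n):
    # Handle the zero case explicitly before slicing.
    return items[-n:] if n else []

print(last_n([1, 2, 3], 2))        # → [2, 3]      correct
print(last_n([1, 2, 3], 0))        # → [1, 2, 3]   bug: expected []
print(last_n_fixed([1, 2, 3], 0))  # → []
```

A finding here is reportable because the input `n == 0` provably reaches the wrong branch; "n might be negative someday" alone would stay theoretical.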
Not a bug — do not flag:

- `assert` statements enforcing infrastructure invariants — Sentry does not run with `python -O`, so assertions are always active. Crashing on a violated invariant is intentional.
- Speculative input concerns (e.g., "this URL could be too long", "this header could be malformed") unless you can show the input actually reaches the code path unvalidated. Check for existing validation (host checks, schema validation, DRF serializers) before reporting.
If no checks produced a potential finding, stop and report zero findings. Do not invent issues to fill the report. An empty result is the correct output when the code has no bugs matching these patterns.
Each code location should be reported once under the most specific matching pattern. Do not flag the same line under multiple checks.
## Step 3: Report Findings
For each finding, include:

| Field | Content |
| --- | --- |
| Title | Short description of the bug |
| Severity | high, medium, or low |
| Location | File path and line number |
| Description | Root cause → consequences (2-4 sentences) |
| Precedent | A real production issue ID (e.g., "Similar to SENTRY-5D9J: Detector.DoesNotExist, 610K events") |
| Fix | A unified diff showing the code fix |
Fix suggestions must include actual code. Never suggest a comment or docstring as a fix.
When suggesting fixes in API endpoints, use appropriate HTTP status codes (404 for not found, 400 for bad input, 409 for conflicts). Never suggest returning 500 intentionally.
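For instance, a Fix for an unhandled `DoesNotExist` in an endpoint might look like the following (the file path and surrounding names are illustrative, not a real Sentry location):

```diff
--- a/src/sentry/api/endpoints/example.py
+++ b/src/sentry/api/endpoints/example.py
@@ -10,1 +10,4 @@
-    detector = Detector.objects.get(id=detector_id)
+    try:
+        detector = Detector.objects.get(id=detector_id)
+    except Detector.DoesNotExist:
+        return Response(status=404)
```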
Do not prescribe your own output format — the review harness controls the response structure.