Benchmark: DVNA (Damn Vulnerable Node Application)

DVNA is an intentionally vulnerable Node.js/Express application with 19 documented security flaws spanning the OWASP Top 10. It is a standard benchmark for web application security tools.

  • Repository: appsecco/dvna
  • Stack: Node.js, Express, Sequelize ORM, EJS templates, SQLite
  • Source files: 5 files (core/appHandler.js, core/authHandler.js, core/passport.js, routes/app.js, routes/main.js)

V1 vs V2 Comparison

| Metric | V1 (file-level) | V2 (function-level) |
| --- | --- | --- |
| Approach | Whole file + rules in one prompt | tree-sitter call graph, one function per LLM call |
| Files scanned | 12 | 12 (40 functions extracted) |
| LLM calls | 12 (one per file) | ~48 (one per function + flow analysis) |
| Scan time | ~6 min (server.js alone: 228s) | ~5 min (48 calls, ~6s avg) |
| Raw findings | 22 | 58 |
| True positives | 13 | ~33 |
| False positives | 9 (7 server.js garbage + 2 config noise) | ~25 (redirects, ORM, category drift) |
| Signal-to-noise | 59% | 57% |
| Detection rate | 8/19 (42%) | 13/19 (68%) |
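
The signal-to-noise and detection-rate rows follow from the counts in the comparison. A minimal Python sketch of that arithmetic, using only the figures reported above (the helper name `pct` and the dictionary layout are illustrative, not part of the tool):

```python
def pct(numerator: int, denominator: int) -> int:
    """Return numerator/denominator as a whole-number percentage."""
    return round(100 * numerator / denominator)

# Counts copied from the comparison above (the V2 figures are approximate).
OFFICIAL_VULNS = 19
runs = {
    "V1": {"raw": 22, "true_pos": 13, "detected": 8},
    "V2": {"raw": 58, "true_pos": 33, "detected": 13},
}

for name, run in runs.items():
    signal = pct(run["true_pos"], run["raw"])          # true positives / raw findings
    detection = pct(run["detected"], OFFICIAL_VULNS)   # official vulns found / 19
    print(f"{name}: signal-to-noise {signal}%, detection rate {detection}%")
```

Signal-to-noise here is simply precision over raw findings, which is why V2 can report far more true positives while scoring slightly lower on that row.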

Head-to-head: Official 19 Vulnerabilities

| # | Vulnerability | V1 | V2 |
| --- | --- | --- | --- |
| 1 | SQL Injection (usersearch) | YES | YES |
| 2 | Command Injection (ping) | YES | YES |
| 3 | Insecure password reset (MD5) | YES | YES |
| 4 | Hardcoded session secret | -- | YES |
| 5 | Password hashes disclosed | -- | -- |
| 6 | Sensitive data in logs | -- | -- |
| 7 | XXE (bulkproducts) | YES | YES |
| 8 | Broken Access Control (admin API) | YES | YES |
| 9 | IDOR in user edit | -- | YES |
| 10 | Stack trace exposure | -- | -- |
| 11 | X-Powered-By header | -- | PARTIAL |
| 12 | Reflected XSS (search) | -- | YES |
| 13 | Stored XSS (products) | -- | YES |
| 14 | DOM XSS (admin users) | -- | YES |
| 15 | Insecure deserialization | YES | YES |
| 16 | mathjs RCE (CVE) | -- | -- |
| 17 | Insufficient logging | -- | -- |
| 18 | CSRF | -- | YES |
| 19 | Open redirect | YES | YES |
| | Total | 8/19 | 13/19 |

Why V2 detects more

  1. Function-level focus: V1 sends the entire file (~400 lines) in one prompt. The 7B model loses focus on large files — appHandler.js (450 lines) yielded only 4 findings in V1 vs 20 in V2 where each function gets its own dedicated analysis.

  2. Targeted rules per function: V2 selects rules based on function role (route handler gets IDOR+XSS+SQLi rules, auth function gets session+CSRF rules). V1 applies all matching rules at once, diluting the LLM's attention.

  3. Multi-phase analysis: V2 runs 5 phases (code map, function review, auth logic, attack surface, data flow). V1 does a single pass + optional verification. The auth and flow phases catch logic flaws (CSRF, IDOR) that pattern matching misses.

  4. Call graph context: V2 tells the LLM "this function is called by X and calls Y" — critical for understanding data flow. V1 has no inter-function awareness.
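
Points 2 and 4 above can be sketched together as a per-function prompt builder. This is a hypothetical illustration, not the scanner's actual code: the role names, the `FunctionNode` type, the prompt wording, and the example function are all assumptions. Only the role-to-rule pairing (route handler gets IDOR+XSS+SQLi, auth function gets session+CSRF) comes from the text.

```python
from dataclasses import dataclass, field

# Role -> rule categories, following the examples given in point 2.
# The role keys themselves are assumed names.
RULES_BY_ROLE = {
    "route_handler": ["IDOR", "XSS", "SQLi"],
    "auth": ["session", "CSRF"],
}

@dataclass
class FunctionNode:
    """One function extracted from the call graph (hypothetical shape)."""
    name: str
    role: str
    source: str
    callers: list[str] = field(default_factory=list)
    callees: list[str] = field(default_factory=list)

def build_prompt(fn: FunctionNode) -> str:
    """Assemble one prompt with role-specific rules and call-graph context."""
    rules = RULES_BY_ROLE.get(fn.role, [])
    called_by = ", ".join(fn.callers) or "(none found)"
    calls = ", ".join(fn.callees) or "(none found)"
    return (
        f"Review function `{fn.name}` for: {', '.join(rules)}.\n"
        f"This function is called by {called_by} and calls {calls}.\n"
        f"---\n{fn.source}"
    )

node = FunctionNode(
    name="userSearch",  # illustrative function name
    role="route_handler",
    source="function userSearch(req, res) { /* ... */ }",
    callers=["router.post"],
    callees=["db.sequelize.query"],
)
print(build_prompt(node))
```

Keeping the rule list short per role is the point of the design: each call carries only the checks that plausibly apply to that function, plus one line of caller/callee context.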

Why V1 still has value

  • Single-file speed: 10-30s for one file vs V2 needing the whole project mapped first
  • Simpler: no tree-sitter dependency, works on any single file in isolation
  • CI/CD integration: scan only changed files on commit, fast feedback loop
  • No project context needed: V1 works on a file you paste in; V2 needs a project root

Note: for project scans, V2 is actually faster. V1's large prompts choke the 7B model: server.js took 228s (nearly 4 min) in V1 because the model received the entire file plus all rules in one prompt. V2 splits this work into small, fast calls.

V1 noise problem: server.js

V1 produced 7 false positives from server.js — the LLM listed every rule it checked and said "not found" but formatted each as a HIGH finding. This is a known V1 issue with small config files where the model has nothing to report but fills the JSON anyway. V2 doesn't have this problem because it only analyzes security-relevant functions (server.js setup code is classified as utility and skipped).
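
One possible mitigation for V1 is a post-filter that drops "findings" whose description merely reports an absence. The sketch below is an assumption-laden illustration: the finding fields (`rule`, `severity`, `description`), the sample descriptions, and the phrase list are hypothetical and would need tuning against real model output.

```python
import re

# Phrases that signal "the rule was checked and nothing was found".
# This list is an assumption, not taken from the scanner.
NEGATIVE_RESULT = re.compile(
    r"\b(not found|no (issues?|vulnerabilit(y|ies)|instances?) (found|detected)|none detected)\b",
    re.IGNORECASE,
)

def drop_negative_findings(findings: list[dict]) -> list[dict]:
    """Keep only findings whose description does not report an absence."""
    return [
        f for f in findings
        if not NEGATIVE_RESULT.search(f.get("description", ""))
    ]

# Made-up sample output in the style described above.
findings = [
    {"rule": "SQLi", "severity": "HIGH", "description": "SQL injection: not found in this file."},
    {"rule": "XSS", "severity": "HIGH", "description": "No issues detected for XSS."},
    {"rule": "CMDi", "severity": "HIGH", "description": "exec() called with user-supplied address."},
]
kept = drop_negative_findings(findings)
print([f["rule"] for f in kept])
```

A phrase filter like this can also hide real findings that describe a missing control (for example "CSRF token not found"), so it is safer to apply it only to files where every finding matches the pattern.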

Extra findings only V2 caught

| Finding | Why V1 missed | Why V2 caught |
| --- | --- | --- |
| IDOR in user edit | Buried in 450-line file, no IDOR rule | OWASP-012 rule + function isolation |
| Reflected/Stored/DOM XSS | File too large, XSS patterns lost | Each render function analyzed individually |
| CSRF on auth flows | No dedicated CSRF rule, auth in separate file | OWASP-013 rule + auth function gets CSRF check |
| Multiple SQLi points | V1 found 1/8, rest buried in large file | Each query function reviewed separately |
| Mass assignment | Not a code pattern V1 looks for | V2's flow analysis traced input to create() |

V2 Test Configuration

| Setting | Value |
| --- | --- |
| Engine | V2 (function-level) |
| Model | Qwen2.5-Coder-7B-Instruct-4bit |
| Timeout | 30s per LLM call |
| Rules | 162 built-in (OWASP + language + framework packs) |
| Functions analyzed | 40 (34 security-relevant) |
| Total LLM calls | ~48 |
| Scan time | ~5 minutes |
| Findings (raw) | 58 |

Results: Official Vulnerabilities

DVNA documents 19 vulnerabilities across 12 OWASP categories.

| # | Vulnerability | Category | Detected | Severity | Details |
| --- | --- | --- | --- | --- | --- |
| 1 | SQL Injection in user search | SQL Injection | YES | HIGH | appHandler.js:9: string concatenation in SQL query |
| 2 | Command Injection in ping | Command Injection | YES | HIGH | appHandler.js:41: shell command with user input |
| 3 | Insecure password reset (MD5 token) | Weak Cryptography | YES | MEDIUM | authHandler.js:74: MD5 for token generation flagged |
| 4 | Hardcoded session secret | Broken Authentication | YES | MEDIUM | authHandler.js:5: hardcoded session configuration |
| 5 | Password hashes disclosed in API | Sensitive Data Exposure | NO | -- | API returns full user objects without field filtering |
| 6 | Sensitive data in Sequelize logs | Sensitive Data Exposure | NO | -- | ORM default logging config, not in route handlers |
| 7 | XXE in bulk product import | XXE | YES | MEDIUM | appHandler.js:239: noent:true in libxmljs parsing |
| 8 | Missing role check on admin API | Broken Access Control | YES | LOW | routes/app.js:36: no role/permission check on admin endpoint |
| 9 | IDOR in user edit | IDOR | YES | MEDIUM | appHandler.js:144: user-provided ID used without ownership check |
| 10 | Stack trace exposure in calculator | Security Misconfiguration | NO | -- | Runtime error handling, not visible in source |
| 11 | X-Powered-By header exposed | Security Misconfiguration | PARTIAL | LOW | Generic security misconfiguration flagged |
| 12 | Reflected XSS in product search | XSS | YES | MEDIUM | appHandler.js:121,152: unescaped user content in response |
| 13 | Stored XSS in product listing | XSS | YES | MEDIUM | appHandler.js:204: unsanitized data in DOM |
| 14 | DOM XSS in admin user listing | XSS | YES | MEDIUM | appHandler.js:204: API data injected into DOM |
| 15 | Insecure deserialization (node-serialize) | Insecure Deserialization | YES | HIGH | appHandler.js:220: unserialize() on user-controlled data |
| 16 | mathjs RCE (known CVE) | Component Vulnerability | NO | -- | Requires SCA tooling (npm audit), not SAST |
| 17 | Insufficient logging/monitoring | Insufficient Logging | NO | -- | Architectural concern: absence of code, not a pattern |
| 18 | CSRF on state-changing forms | CSRF | YES | MEDIUM | authHandler.js:48: redirect without CSRF protection |
| 19 | Open redirect via URL parameter | Open Redirect | YES | MEDIUM | Multiple findings: req.query.url used in redirects |

Detection Summary

| Result | Count | Rate |
| --- | --- | --- |
| Detected | 13 | 68% |
| Partial | 1 | 5% |
| Missed | 5 | 26% |

Why the 5 misses are out of scope for SAST

| # | Vuln | Why missed | What would catch it |
| --- | --- | --- | --- |
| 5 | Password hash disclosure | API returns raw DB objects: no dangerous code pattern, just missing filtering | Data-flow rule for "raw model objects in API response" |
| 6 | Sensitive data in logs | Sequelize logging config in models/index.js: default behavior, not explicit code | Config audit / framework-specific default check |
| 10 | Stack trace exposure | Express dev-mode error handling: runtime behavior | Runtime testing / DAST |
| 16 | mathjs RCE | Known CVE in dependency, not a code pattern | SCA tool (npm audit, Snyk) |
| 17 | Insufficient logging | Absence of logging code: detecting what's not there | Architecture review / compliance check |

Extra Findings: Beyond the Official 19

Foil identified several issues not documented in DVNA's official vulnerability list.

Confirmed real issues

  1. Mass assignment in bulk product import (appHandler.js:215): Product data from user input is passed directly to create() without field filtering. An attacker could inject extra fields (e.g., price: 0, isAdmin: true) if the model has additional columns. This is a real mass assignment / over-posting vulnerability.

  2. Multiple SQL injection points (appHandler.js:58,83,109,144,194, authHandler.js:19,44,71): The official docs highlight only the user search endpoint, but DVNA has SQL injection via string interpolation in at least 8 other query locations across login, password reset, product edit, and user lookup flows.

  3. Missing rate limiting on login (routes/main.js:10): The login endpoint has no rate limiting or account lockout, enabling brute-force attacks. It is reported as a finding under Broken Authentication.

  4. Additional IDOR on vulnerability display (routes/main.js:14): The /app/vulnerabilities/:id endpoint accepts a resource ID without authorization checks.

False positives / noise

  1. "Open Redirect" on hardcoded redirects (e.g., res.redirect('/login')): Several findings flag res.redirect('/path') with hardcoded paths as "open redirect." These are not exploitable because the redirect target is not user-controlled. ~8 findings follow this pattern.

  2. "Insecure Deserialization" on Sequelize ORM calls (passport.js:15,31,41,51): The LLM flags normal ORM findById()/findOne() calls as "insecure deserialization." Sequelize parameterizes these queries internally, so they are safe. The deserialization rule is too broad; it should only flag actual serialization libraries, not ORMs. ~4 findings follow this pattern.

  3. "OSI Model" category names: Some findings use invented category names like "OSI Model - Data Flow" or "OSI Model Layer" instead of standard vuln classes. These are mostly duplicate detections of open redirects. The LLM occasionally ignores the category naming instruction.
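
The first two noise patterns can be screened with simple checks on the flagged source line. The following is a rough sketch under stated assumptions: the finding fields (`category`, `code`) are hypothetical, the regexes only cover the single-argument res.redirect('...') form, and the deserialization marker list is illustrative rather than exhaustive.

```python
import re

# res.redirect('<literal>') or res.redirect("<literal>") with no concatenation
# or template interpolation, i.e. a target that is not built from user input.
LITERAL_REDIRECT = re.compile(r"""res\.redirect\(\s*(['"])[^'"`$+]*\1\s*\)""")

# Calls that look like real deserialization sinks. ORM lookups such as
# findOne()/findById() do not match. The names listed are an assumption.
DESERIALIZATION_SINK = re.compile(r"\b(unserialize|deserialize)\s*\(")

def is_likely_false_positive(finding: dict) -> bool:
    """Flag findings that match the two known noise patterns."""
    category = finding["category"].lower()
    code = finding["code"]
    if "open redirect" in category:
        return bool(LITERAL_REDIRECT.search(code))
    if "deserialization" in category:
        return not DESERIALIZATION_SINK.search(code)
    return False

samples = [
    {"category": "Open Redirect", "code": "res.redirect('/login')"},
    {"category": "Open Redirect", "code": "res.redirect(req.query.url)"},
    {"category": "Insecure Deserialization", "code": "db.User.findOne({ where: { id: id } })"},
    {"category": "Insecure Deserialization", "code": "serialize.unserialize(req.body.products)"},
]
for s in samples:
    print(s["code"], "->", is_likely_false_positive(s))
```

A regex on one line is deliberately conservative: anything it cannot prove to be a string literal stays in the report, so the filter can only remove findings, never add them.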

Noise analysis

| Type | Count | Notes |
| --- | --- | --- |
| True positive (official) | ~25 | Matches documented vulns |
| True positive (extra) | ~8 | Real issues not in docs |
| False positive (hardcoded redirect) | ~8 | Not exploitable |
| False positive (ORM = deserialization) | ~4 | Safe Sequelize calls |
| Duplicate / noisy category | ~13 | Same issue, bad category name |
| Total | 58 | |
| Signal-to-noise | ~57% true positives | |

Improvement History

| Version | Detection | Notes |
| --- | --- | --- |
| V2 initial (no targeted rules) | 8/19 (42%) + 4 partial | Generic rules, rule IDs as categories |
| V2 + IDOR/CSRF/XXE/Deser rules | 13/19 (68%) + 1 partial | Proper vuln class names, targeted OWASP rules |

Key improvements that moved the needle

  • OWASP-012 (IDOR/BOLA): Specific ownership-check detection on ID parameters
  • OWASP-013 (CSRF): Missing anti-CSRF token detection
  • OWASP-014 (DOM XSS): innerHTML/unescaped template sinks
  • JS-011 (XXE): noent:true in XML parsers
  • JS-012 (Insecure Deserialization): unserialize() from node-serialize
  • Proper category names: LLM instructed to use standard vuln class names

Reducing False Positives (TODO)

  1. Hardcoded redirect FPs: Add negative pattern — if the redirect target is a string literal (not user input), skip.
  2. ORM deserialization FPs: Scope the deserialization rule to actual serialization libraries; add ORM allowlist (Sequelize, TypeORM, Prisma, Mongoose).
  3. Category name drift: Post-processing normalization catches most but not all. Consider a stricter enum in the JSON schema for response_format.
  4. Dedup line 186: Multiple files report a finding at line 186 — this appears to be a shared function or template. Improve cross-file dedup.
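
Items 3 and 4 could look roughly like the sketch below: a closed set of category names (usable both as a JSON Schema enum and as a post-processing fallback) and a dedup key that ignores which file reported the finding. The finding fields (`file`, `line`, `category`, `code`), the file names in the example, and the choice of dedup key are assumptions for illustration. The class names are taken from the Category column in the results above.

```python
from enum import Enum

class VulnClass(str, Enum):
    SQL_INJECTION = "SQL Injection"
    COMMAND_INJECTION = "Command Injection"
    XSS = "XSS"
    CSRF = "CSRF"
    IDOR = "IDOR"
    XXE = "XXE"
    INSECURE_DESERIALIZATION = "Insecure Deserialization"
    OPEN_REDIRECT = "Open Redirect"
    UNKNOWN = "Unknown"  # fallback for post-processing only

# JSON Schema fragment restricting the model's category field to known classes.
CATEGORY_SCHEMA = {
    "type": "string",
    "enum": [c.value for c in VulnClass if c is not VulnClass.UNKNOWN],
}

def normalize_category(raw: str) -> VulnClass:
    """Map a free-form category to the enum, falling back to UNKNOWN."""
    wanted = raw.strip().lower()
    for member in VulnClass:
        if member.value.lower() == wanted:
            return member
    return VulnClass.UNKNOWN

def dedup_across_files(findings: list[dict]) -> list[dict]:
    """Keep the first finding per (category, line, whitespace-normalized code)."""
    seen = set()
    unique = []
    for f in findings:
        key = (normalize_category(f["category"]), f["line"], " ".join(f["code"].split()))
        if key not in seen:
            seen.add(key)
            unique.append(f)
    return unique

# Hypothetical duplicate reported from two files at the same line.
reports = [
    {"file": "a.js", "line": 186, "category": "Open Redirect", "code": "res.redirect(req.query.url)"},
    {"file": "b.js", "line": 186, "category": "open redirect", "code": "res.redirect( req.query.url )"},
]
print(len(dedup_across_files(reports)))
```

Including the category in the key keeps two different vulnerability classes on the same line as separate findings; drifted names such as "OSI Model - Data Flow" normalize to UNKNOWN and would still need a manual or rule-based mapping before they merge with their real class.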