Benchmark: DVNA (Damn Vulnerable Node Application)

DVNA is an intentionally vulnerable Node.js/Express application with 19 documented security flaws spanning the OWASP Top 10. It is commonly used to benchmark web application security tools.

Repository: appsecco/dvna
Stack: Node.js, Express, Sequelize ORM, EJS templates, SQLite
Core source files: 5 (core/appHandler.js, core/authHandler.js, core/passport.js, routes/app.js, routes/main.js)

V1 vs V2 Comparison

	V1 (file-level)	V2 (function-level)
Approach	Whole file + rules in one prompt	tree-sitter call graph, one function per LLM call
Files scanned	12	12 (40 functions extracted)
LLM calls	12 (one per file)	~48 (one per function + flow analysis)
Scan time	~6 min (server.js alone: 228s)	~5 min (48 calls, ~6s avg)
Raw findings	22	58
True positives	13	~33
False positives	9 (7 server.js garbage + 2 config noise)	~25 (redirects, ORM, category drift)
Signal-to-noise	59%	57%
Detection rate	8/19 (42%)	13/19 (68%)
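
The percentages in the table follow from the raw counts. The sketch below reproduces them, assuming that signal-to-noise means true positives divided by raw findings and that detection rate means detected official vulnerabilities divided by 19. The `pct` helper and the `runs` object are illustrative names, not part of the tool.

```javascript
// Counts copied from the comparison table. V2 true positives are approximate ("~33").
const runs = {
  V1: { rawFindings: 22, truePositives: 13, detected: 8 },
  V2: { rawFindings: 58, truePositives: 33, detected: 13 },
};
const OFFICIAL_VULNS = 19;

// Percentage rounded to the nearest whole number.
function pct(part, whole) {
  return Math.round((part / whole) * 100);
}

const summary = {};
for (const [name, r] of Object.entries(runs)) {
  summary[name] = {
    signalToNoise: pct(r.truePositives, r.rawFindings),
    detectionRate: pct(r.detected, OFFICIAL_VULNS),
  };
}

console.log(summary);
// V1: signalToNoise 59, detectionRate 42
// V2: signalToNoise 57, detectionRate 68
```

Read together, the two ratios suggest that V2 finds considerably more of the official list while keeping roughly the same proportion of useful findings as V1.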

Head-to-head: Official 19 Vulnerabilities

#	Vulnerability	V1	V2
1	SQL Injection (usersearch)	YES	YES
2	Command Injection (ping)	YES	YES
3	Insecure password reset (MD5)	YES	YES
4	Hardcoded session secret	--	YES
5	Password hashes disclosed	--	--
6	Sensitive data in logs	--	--
7	XXE (bulkproducts)	YES	YES
8	Broken Access Control (admin API)	YES	YES
9	IDOR in user edit	--	YES
10	Stack trace exposure	--	--
11	X-Powered-By header	--	PARTIAL
12	Reflected XSS (search)	--	YES
13	Stored XSS (products)	--	YES
14	DOM XSS (admin users)	--	YES
15	Insecure deserialization	YES	YES
16	mathjs RCE (CVE)	--	--
17	Insufficient logging	--	--
18	CSRF	--	YES
19	Open redirect	YES	YES
	Total	8/19	13/19

Why V2 detects more

Function-level focus: V1 sends the entire file in one prompt, and the 7B model loses focus on large files. appHandler.js (450 lines) yielded only 4 findings in V1 versus 20 in V2, where each function gets its own dedicated analysis.
Targeted rules per function: V2 selects rules based on function role (route handler gets IDOR+XSS+SQLi rules, auth function gets session+CSRF rules). V1 applies all matching rules at once, diluting the LLM's attention.
Multi-phase analysis: V2 runs 5 phases (code map, function review, auth logic, attack surface, data flow). V1 does a single pass + optional verification. The auth and flow phases catch logic flaws (CSRF, IDOR) that pattern matching misses.
Call graph context: V2 tells the LLM "this function is called by X and calls Y" — critical for understanding data flow. V1 has no inter-function awareness.
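
The rule selection and call-graph context described above can be sketched as follows. This is a minimal illustration rather than the tool's actual implementation: the role names, the `RULES_BY_ROLE` mapping, the contents of `callGraph`, and the `buildPrompt` function are all hypothetical, and the rule labels come only from the examples in the list.

```javascript
// Hypothetical role-to-rule mapping, based on the examples given in the text.
const RULES_BY_ROLE = {
  route_handler: ["IDOR", "XSS", "SQLi"],
  auth: ["session", "CSRF"],
};

// Hypothetical call graph: function name -> names of the functions it calls.
const callGraph = {
  userSearchRoute: ["userSearch"],
  userSearch: ["runQuery"],
  runQuery: [],
};

// Callers are found by inverting the "calls" edges.
function callersOf(name, graph) {
  return Object.keys(graph).filter((caller) => graph[caller].includes(name));
}

// Build the prompt for one function: its neighbours in the call graph,
// only the rules that match its role, and its own source.
function buildPrompt(fn, graph) {
  const rules = RULES_BY_ROLE[fn.role] ?? [];
  const callers = callersOf(fn.name, graph);
  const callees = graph[fn.name] ?? [];
  return [
    `Function: ${fn.name} (role: ${fn.role})`,
    `Called by: ${callers.join(", ") || "none"}`,
    `Calls: ${callees.join(", ") || "none"}`,
    `Rules to check: ${rules.join(", ") || "none"}`,
    "Source:",
    fn.source,
  ].join("\n");
}

const prompt = buildPrompt(
  {
    name: "userSearch",
    role: "route_handler",
    source: "function userSearch(req, res) { /* ... */ }",
  },
  callGraph
);
console.log(prompt);
```

The point of the design is that each prompt stays small: one function body, a couple of lines of call-graph context, and a short rule list, instead of a whole file plus every matching rule.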

Why V1 still has value

Single-file speed: 10-30s for one file vs V2 needing the whole project mapped first
Simpler: no tree-sitter dependency, works on any single file in isolation
CI/CD integration: scan only changed files on commit, fast feedback loop
No project context needed: V1 works on a file you paste in; V2 needs a project root

Note: for project scans, V2 is actually faster. V1's large prompts choke the 7B model: server.js took 228s (nearly 4 min) in V1 because the model received the entire file plus all rules in one prompt. V2 splits the work into small, fast calls.

V1 noise problem: server.js

V1 produced 7 false positives from server.js — the LLM listed every rule it checked and said "not found" but formatted each as a HIGH finding. This is a known V1 issue with small config files where the model has nothing to report but fills the JSON anyway. V2 doesn't have this problem because it only analyzes security-relevant functions (server.js setup code is classified as utility and skipped).
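
For V1 output, one possible mitigation is to drop findings whose description reports a negative result. The sketch below is an assumption about how such a filter could look, not something the tool is documented to do. The finding shape, the rule labels, and the phrase list are all hypothetical.

```javascript
// Hypothetical finding shape; the real V1 output format is not shown in this document.
const rawFindings = [
  { rule: "rule-a", severity: "HIGH", description: "SQL injection: not found in this file." },
  { rule: "rule-b", severity: "HIGH", description: "No evidence of command injection." },
  { rule: "rule-c", severity: "MEDIUM", description: "Session secret is hardcoded in the session config." },
];

// Phrases suggesting the model is reporting a check it ran, not a flaw it found.
const NEGATIVE_PATTERNS = [
  /\bnot found\b/i,
  /\bno evidence\b/i,
  /\bnot applicable\b/i,
  /\bnone detected\b/i,
];

function isNegativeResult(finding) {
  return NEGATIVE_PATTERNS.some((re) => re.test(finding.description));
}

const kept = rawFindings.filter((f) => !isNegativeResult(f));
console.log(kept.map((f) => f.rule)); // [ 'rule-c' ]
```

A phrase filter like this is blunt. A genuine finding worded as "CSRF token not found" would be dropped too, so it would need tuning against real output before use.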

Extra findings only V2 caught

Finding	Why V1 missed	Why V2 caught
IDOR in user edit	Buried in 450-line file, no IDOR rule	OWASP-012 rule + function isolation
Reflected/Stored/DOM XSS	File too large, XSS patterns lost	Each render function analyzed individually
CSRF on auth flows	No dedicated CSRF rule, auth in separate file	OWASP-013 rule + auth function gets CSRF check
Multiple SQLi points	V1 found 1/8, rest buried in large file	Each query function reviewed separately
Mass assignment	Not a code pattern V1 looks for	V2's flow analysis traced input to create()

V2 Test Configuration

Setting	Value
Engine	V2 (function-level)
Model	Qwen2.5-Coder-7B-Instruct-4bit
Timeout	30s per LLM call
Rules	162 built-in (OWASP + language + framework packs)
Functions analyzed	40 (34 security-relevant)
Total LLM calls	~48
Scan time	~5 minutes
Findings (raw)	58

Results: Official Vulnerabilities

DVNA documents 19 vulnerabilities across 12 OWASP categories.

#	Vulnerability	Category	Detected	Severity	Details
1	SQL Injection in user search	SQL Injection	YES	HIGH	`appHandler.js:9` — string concatenation in SQL query
2	Command Injection in ping	Command Injection	YES	HIGH	`appHandler.js:41` — shell command with user input
3	Insecure password reset (MD5 token)	Weak Cryptography	YES	MEDIUM	`authHandler.js:74` — MD5 for token generation flagged
4	Hardcoded session secret	Broken Authentication	YES	MEDIUM	`authHandler.js:5` — hardcoded session configuration
5	Password hashes disclosed in API	Sensitive Data Exposure	NO	--	API returns full user objects without field filtering
6	Sensitive data in Sequelize logs	Sensitive Data Exposure	NO	--	ORM default logging config, not in route handlers
7	XXE in bulk product import	XXE	YES	MEDIUM	`appHandler.js:239` — `noent:true` in libxmljs parsing
8	Missing role check on admin API	Broken Access Control	YES	LOW	`routes/app.js:36` — no role/permission check on admin endpoint
9	IDOR in user edit	IDOR	YES	MEDIUM	`appHandler.js:144` — user-provided ID used without ownership check
10	Stack trace exposure in calculator	Security Misconfiguration	NO	--	Runtime error handling, not visible in source
11	X-Powered-By header exposed	Security Misconfiguration	PARTIAL	LOW	Generic security misconfiguration flagged
12	Reflected XSS in product search	XSS	YES	MEDIUM	`appHandler.js:121,152` — unescaped user content in response
13	Stored XSS in product listing	XSS	YES	MEDIUM	`appHandler.js:204` — unsanitized data in DOM
14	DOM XSS in admin user listing	XSS	YES	MEDIUM	`appHandler.js:204` — API data injected into DOM
15	Insecure deserialization (node-serialize)	Insecure Deserialization	YES	HIGH	`appHandler.js:220` — `unserialize()` on user-controlled data
16	mathjs RCE (known CVE)	Component Vulnerability	NO	--	Requires SCA tooling (npm audit), not SAST
17	Insufficient logging/monitoring	Insufficient Logging	NO	--	Architectural concern — absence of code, not a pattern
18	CSRF on state-changing forms	CSRF	YES	MEDIUM	`authHandler.js:48` — redirect without CSRF protection
19	Open redirect via URL parameter	Open Redirect	YES	MEDIUM	Multiple findings — `req.query.url` used in redirects

Detection Summary

Result	Count	Rate
Detected	13	68%
Partial	1	5%
Missed	5	26%

Why the 5 misses fall outside pattern-based SAST

#	Vuln	Why missed	What would catch it
5	Password hash disclosure	API returns raw DB objects — no dangerous code pattern, just missing filtering	Data-flow rule for "raw model objects in API response"
6	Sensitive data in logs	Sequelize logging config in `models/index.js` — default behavior, not explicit code	Config audit / framework-specific default check
10	Stack trace exposure	Express dev-mode error handling — runtime behavior	Runtime testing / DAST
16	mathjs RCE	Known CVE in dependency — not a code pattern	SCA tool (npm audit, Snyk)
17	Insufficient logging	Absence of logging code — detecting what's not there	Architecture review / compliance check

Extra Findings: Beyond the Official 19

Foil identified several issues not documented in DVNA's official vulnerability list.

Confirmed real issues

Mass assignment in bulk product import (appHandler.js:215): Product data from user input is passed directly to create() without field filtering. An attacker could inject extra fields (e.g., price: 0, isAdmin: true) if the model has additional columns, which makes this a genuine mass assignment (over-posting) vulnerability.
Multiple SQL injection points (appHandler.js:58,83,109,144,194; authHandler.js:19,44,71): The official docs highlight only the user search endpoint, but DVNA builds SQL via string interpolation in at least 8 other query locations across the login, password reset, product edit, and user lookup flows.
Missing rate limiting on login (routes/main.js:10): The login endpoint has no rate limiting or account lockout, which enables brute-force attacks. Reported as a finding under Broken Authentication.
Additional IDOR on vulnerability display (routes/main.js:14): The /app/vulnerabilities/:id endpoint accepts a resource ID without authorization checks.
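
The usual remedy for the mass assignment finding is to copy only allowlisted fields from the request before handing the object to create(). The sketch below illustrates that idea in plain JavaScript. The `PRODUCT_FIELDS` list and the `pickAllowed` helper are hypothetical, since the real Product model's columns are not listed here.

```javascript
// Hypothetical allowlist; the real Product model's columns are not listed in this document.
const PRODUCT_FIELDS = ["name", "code", "description"];

// Copy only allowlisted keys that are present on the untrusted input.
function pickAllowed(input, allowed) {
  const out = {};
  for (const key of allowed) {
    if (Object.prototype.hasOwnProperty.call(input, key)) {
      out[key] = input[key];
    }
  }
  return out;
}

// Attacker-supplied record carrying extra fields.
const untrusted = { name: "Widget", code: "W-1", price: 0, isAdmin: true };
const safe = pickAllowed(untrusted, PRODUCT_FIELDS);

console.log(safe); // { name: 'Widget', code: 'W-1' }
```

The filtered object, not the raw request data, is what would then be passed to create(). Sequelize's create() also accepts a `fields` option that restricts which attributes are written, which may be a simpler fix depending on the version in use.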

False positives / noise

"Open Redirect" on hardcoded redirects (e.g., res.redirect('/login')): Several findings flag res.redirect('/path') calls with hardcoded paths as "open redirect." These are not exploitable because the redirect target is not user-controlled. ~8 findings follow this pattern.
"Insecure Deserialization" on Sequelize ORM calls (passport.js:15,31,41,51): The LLM flags normal ORM findById()/findOne() calls as "insecure deserialization." These are ordinary lookups that Sequelize parameterizes internally, and no untrusted serialized data is involved, so they are safe. The deserialization rule is too broad; it should flag only actual serialization libraries, not ORMs. ~4 findings follow this pattern.
"OSI Model" category names: Some findings use invented category names such as "OSI Model - Data Flow" or "OSI Model Layer" instead of standard vulnerability classes. These are mostly duplicate detections of open redirects. The LLM occasionally ignores the category naming instruction.

Noise analysis

Type	Count	Notes
True positive (official)	~25	Matches documented vulns
True positive (extra)	~8	Real issues not in docs
False positive (hardcoded redirect)	~8	Not exploitable
False positive (ORM = deserialization)	~4	Safe Sequelize calls
Duplicate / noisy category	~13	Same issue, bad category name
Total	58
Signal-to-noise	~57%	True positives as a share of all raw findings
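
The breakdown can be cross-checked by summing the buckets. The sketch below uses the approximate counts from the table. The `signal` flag marking which buckets count as true positives is an illustrative addition.

```javascript
// Approximate counts copied from the noise analysis table.
const buckets = [
  { type: "True positive (official)", count: 25, signal: true },
  { type: "True positive (extra)", count: 8, signal: true },
  { type: "False positive (hardcoded redirect)", count: 8, signal: false },
  { type: "False positive (ORM = deserialization)", count: 4, signal: false },
  { type: "Duplicate / noisy category", count: 13, signal: false },
];

const sum = (rows) => rows.reduce((acc, row) => acc + row.count, 0);

const total = sum(buckets);
const signal = sum(buckets.filter((b) => b.signal));
const noise = total - signal;
const signalShare = Math.round((signal / total) * 100);

console.log({ total, signal, noise, signalShare });
// { total: 58, signal: 33, noise: 25, signalShare: 57 }
```

The 25 noise findings match the V2 false-positive count in the comparison table, and more than half of them (13) are duplicates or category drift rather than wrong detections.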

Improvement History

Version	Detection	Notes
V2 initial (no targeted rules)	8/19 (42%) + 4 partial	Generic rules, rule IDs as categories
V2 + IDOR/CSRF/XXE/Deser rules	13/19 (68%) + 1 partial	Proper vuln class names, targeted OWASP rules

Key improvements that moved the needle

OWASP-012 (IDOR/BOLA): Specific ownership-check detection on ID parameters
OWASP-013 (CSRF): Missing anti-CSRF token detection
OWASP-014 (DOM XSS): innerHTML/unescaped template sinks
JS-011 (XXE): noent:true in XML parsers
JS-012 (Insecure Deserialization): unserialize() from node-serialize
Proper category names: LLM instructed to use standard vuln class names

Reducing False Positives (TODO)

Hardcoded redirect FPs: Add negative pattern — if the redirect target is a string literal (not user input), skip.
ORM deserialization FPs: Scope the deserialization rule to actual serialization libraries; add ORM allowlist (Sequelize, TypeORM, Prisma, Mongoose).
Category name drift: Post-processing normalization catches most but not all. Consider a stricter enum in the JSON schema for response_format.
Cross-file dedup: Multiple files report a finding at line 186, which appears to be a shared function or template. Improve cross-file deduplication.
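
The items above could be combined into a single post-processing pass. The sketch below is one way that might look under stated assumptions, not the planned implementation. The finding shape ({ file, line, category, snippet }), the regular expressions, the category set, the sample file names, and the dedup key are all hypothetical.

```javascript
// 1. Hardcoded redirect: the only argument to res.redirect() is a quoted string literal.
const LITERAL_REDIRECT = /res\.redirect\(\s*(['"])[^'"]*\1\s*\)/;

// 2. ORM lookups mistaken for deserialization (method names from the false positives above).
const ORM_LOOKUP = /\.(findById|findOne)\s*\(/;

// 3. Closed set of category names; the same list could back an enum in a JSON schema.
const ALLOWED_CATEGORIES = new Set([
  "Open Redirect", "Insecure Deserialization", "SQL Injection", "XSS", "CSRF", "IDOR",
]);

function isFalsePositive(f) {
  if (f.category === "Open Redirect" && LITERAL_REDIRECT.test(f.snippet)) return true;
  if (f.category === "Insecure Deserialization" && ORM_LOOKUP.test(f.snippet)) return true;
  return false;
}

function postProcess(findings) {
  const seen = new Set();
  const kept = [];
  const needsReview = [];
  for (const f of findings) {
    if (isFalsePositive(f)) continue;
    // Drifted category names are set aside for review instead of being dropped.
    if (!ALLOWED_CATEGORIES.has(f.category)) {
      needsReview.push(f);
      continue;
    }
    // 4. Cross-file dedup: same category and same code, regardless of file or line.
    const key = `${f.category}|${f.snippet.trim()}`;
    if (seen.has(key)) continue;
    seen.add(key);
    kept.push(f);
  }
  return { kept, needsReview };
}

// Hypothetical sample findings.
const sample = [
  { file: "a.js", line: 10, category: "Open Redirect", snippet: "res.redirect('/login')" },
  { file: "a.js", line: 20, category: "Open Redirect", snippet: "res.redirect(req.query.url)" },
  { file: "b.js", line: 186, category: "Open Redirect", snippet: "res.redirect(req.query.url)" },
  { file: "c.js", line: 15, category: "Insecure Deserialization", snippet: "db.User.findOne({ where: { id: id } })" },
  { file: "d.js", line: 5, category: "Insecure Deserialization", snippet: "serialize.unserialize(req.body.data)" },
  { file: "e.js", line: 7, category: "OSI Model - Data Flow", snippet: "res.redirect(req.query.url)" },
];

const result = postProcess(sample);
console.log(result.kept.map((f) => `${f.file}:${f.line}`)); // [ 'a.js:20', 'd.js:5' ]
console.log(result.needsReview.length); // 1
```

Matching on code snippets is deliberately conservative: template literals and concatenated redirect targets do not match the literal pattern, so they stay in the results.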