Synthetic Evidence Generation
T8 · External Deception & Misinformation
Synthetic evidence attacks weaponize the LLM's fluency at producing structured documentary artifacts (studies, legal filings, medical records, financial statements, chat logs, forensic reports) that *look like* the records institutions and courts rely on. The deception works because evidentiary documents derive their authority from format and internal consistency (consistent dates, plausible names, domain-correct jargon, cross-referencing), all of which a capable model generates coherently. The asymmetry is severe: producing a fabricated "study" with methods, tables, and a reference list takes seconds, while debunking it requires a domain expert to trace claims, check registries, and confirm that no real record exists.
- Registry and provenance cross-checks: Verify cited studies against DOI/registry records, legal filings against court dockets, and certificates against issuer databases; fabrications lack a real backing record
- Internal-consistency and metadata forensics: Inspect document metadata, fonts, generation timestamps, and statistical plausibility (e.g., fabricated data often shows unnatural digit or variance patterns)
- Citation grounding: Resolve every reference; hallucinated or non-existent citations are a hallmark of LLM-generated "studies"
- Stylometry and template-detection: Detect the homogenized phrasing and boilerplate that LLM output reuses across supposedly independent documents
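Two of the checks above lend themselves to a short illustration: the digit-pattern test from the metadata forensics item and the boilerplate-reuse test from the stylometry item. The Python sketch below is a minimal example under stated assumptions, not a reference implementation. All function names, the 5-word shingle size, and the 0.5 overlap threshold are hypothetical choices made for illustration, and the digit test assumes integer-valued measurements whose last digit would be roughly uniform if the data were genuine. A high score in either check is a prompt for expert review, not proof of fabrication.

```python
from collections import Counter
from itertools import combinations

# Critical value of the chi-square distribution with 9 degrees of
# freedom at p = 0.01 (standard statistical table value).
CHI2_9DF_P01 = 21.666


def terminal_digit_chi2(values):
    """Chi-square statistic testing whether last digits are uniform.

    Assumption: values are integer-valued measurements whose terminal
    digit carries no real signal, so genuine data is roughly uniform.
    """
    digits = [abs(int(v)) % 10 for v in values]
    n = len(digits)
    if n == 0:
        raise ValueError("need at least one value")
    expected = n / 10
    counts = Counter(digits)
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))


def shingles(text, k=5):
    """Set of k-word windows from a lowercased, whitespace-split text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as no overlap."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def flag_shared_templates(docs, k=5, threshold=0.5):
    """Return (name, name, similarity) for document pairs at or above threshold.

    docs maps a document name to its text. The threshold is illustrative
    and would need tuning on real corpora.
    """
    sets = {name: shingles(text, k) for name, text in docs.items()}
    flagged = []
    for x, y in combinations(sorted(sets), 2):
        score = jaccard(sets[x], sets[y])
        if score >= threshold:
            flagged.append((x, y, round(score, 3)))
    return flagged
```

As a usage example, `terminal_digit_chi2(range(100))` returns `0.0` because every last digit appears equally often, while a column in which every value ends in 0 scores far above `CHI2_9DF_P01` and would be queued for manual review. The shingle comparison is deliberately crude: it catches verbatim boilerplate shared across supposedly independent documents, but paraphrased templates would need a stronger stylometric method.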
Synthetic evidence is the corroboration layer that makes other T8 techniques stick: it supplies the "study" behind an authority impersonation (T8-AT-001), the "leaked documents" behind a conspiracy (T8-AT-003), and the paper trail behind a fabricated identity (T8-AT-015). It chains tightly with T9 synthetic media when a fake document is paired with a doctored image or a deepfake of the purported author, and with T15 human-workflow exploitation when forged records are injected into approval or claims processes.