HackerSignal: A Large-Scale Multi-Source Dataset Linking Hacker Community Discourse to the CVE Vulnerability Lifecycle
In Plain Terms
This paper introduces HackerSignal, a large public dataset that combines 7.45 million documents from hacker forums, exploit databases, vulnerability advisories, and software fix commits, collected over 36 years. The sources are linked through shared CVE identifiers, so researchers can trace a security flaw from early hacker chatter through to its official patch. The authors demonstrate three AI benchmark tasks that the dataset enables, and they release diagnostics and documentation to support responsible reuse.
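To make the linking idea concrete, here is a minimal sketch of grouping documents from different sources by the CVE identifiers they mention. It is an illustration under stated assumptions, not the authors' pipeline: the record fields (`id`, `source`, `text`), the function name `link_by_cve`, and the toy records are all hypothetical, and the regular expression only captures the standard `CVE-YYYY-NNNN` identifier format.

```python
import re
from collections import defaultdict

# CVE identifiers have the form CVE-<4-digit year>-<4 or more digits>.
CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)


def link_by_cve(documents):
    """Map each CVE identifier to the set of (source, id) pairs mentioning it.

    `documents` is an iterable of dicts with hypothetical keys
    "id", "source", and "text"; this schema is assumed for illustration.
    """
    linked = defaultdict(set)
    for doc in documents:
        for match in CVE_PATTERN.findall(doc["text"]):
            # Normalize case so "cve-2021-44228" and "CVE-2021-44228" agree.
            linked[match.upper()].add((doc["source"], doc["id"]))
    return dict(linked)


# Toy records invented for this example; they are not from the dataset.
documents = [
    {"id": "forum-1", "source": "forum",
     "text": "anyone got a PoC for cve-2021-44228?"},
    {"id": "exploit-7", "source": "exploit_db",
     "text": "Exploit for CVE-2021-44228 (Log4Shell)"},
    {"id": "commit-3", "source": "fix_commit",
     "text": "Fix JNDI lookup, addresses CVE-2021-44228"},
    {"id": "advisory-9", "source": "advisory",
     "text": "CVE-2014-0160: OpenSSL heartbeat read overrun"},
]

linked = link_by_cve(documents)
for cve_id, refs in sorted(linked.items()):
    print(cve_id, sorted(refs))
```

In this toy run, the forum post, the exploit entry, and the fix commit all end up under the same identifier, which is the kind of cross-source trace the summary describes. A real pipeline would also need to handle documents that discuss a vulnerability without naming its CVE, which this sketch does not attempt.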
Citation
Benjamin M. Ampel & Sagar Samtani (2026). HackerSignal: A Large-Scale Multi-Source Dataset Linking Hacker Community Discourse to the CVE Vulnerability Lifecycle. arXiv preprint arXiv:2605.03158. https://doi.org/10.48550/arXiv.2605.03158