Identifying and Categorizing Malicious Content on Paste Sites: A Neural Topic Modeling Approach

In Plain Terms

Cybercriminals dump stolen data, credit card numbers, and malware code onto public text-sharing sites like Pastebin. This study builds a new machine-learning method that combines a language model (BERT) with topic modeling to automatically sort millions of these posts into categories, helping security teams spot leaked sensitive information and emerging threats faster.

Citation

Tala Vahedi, Benjamin M. Ampel, Sagar Samtani, & Hsinchun Chen (2021). Identifying and Categorizing Malicious Content on Paste Sites: A Neural Topic Modeling Approach. IEEE ISI. https://doi.org/10.1109/ISI53945.2021.9624765
Benjamin M. Ampel
Assistant Professor in Computer Information Systems and Director, Center for CyberAI Research (CCAIR)

My research focuses on AI-enabled Cybersecurity, including Cyber Threat Intelligence, Large Language Models, and Phishing Detection.