Decepticons Attack! Attacking and Defending Cyber Defense Systems Using Transformers
Using transformers to generate and detect cyber threat intelligence
Open-source text has become an integral part of most cyber defense systems, which rely on this data for the latest open-source intelligence (OSINT). These systems use natural language processing techniques to extract information from the text, called cyber threat intelligence (CTI), and build knowledge graphs that feed downstream defense tasks. Because the training text is open source, it is vulnerable to manipulation: adversaries can flood the usual data-collection sources, such as blog posts and chat forums, with false CTI, poisoning the training data so that models learn incorrect patterns and serve the adversary's needs.
This project investigates the use of transformer models to generate false CTI and the use of the same models to classify a given piece of CTI as human-written or fake. We evaluate different classification methods and explore the efficacy of representation learning with large-scale language models.
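The detection side of this setup can be sketched as a binary text classifier. The toy example below is a minimal, dependency-free stand-in: it uses bag-of-words counts and a hand-rolled logistic regression where the real project would use transformer representations, and the tiny corpus and labels are invented for illustration only.

```python
# Toy sketch of the human-vs-generated CTI classification setup.
# In practice the feature vectors would come from a transformer
# (e.g. learned sentence representations); bag-of-words counts are
# used here only to keep the example self-contained.
from collections import Counter
import math

def featurize(text, vocab):
    """Map a text to a vector of word counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def train_logreg(X, y, lr=0.1, epochs=200):
    """Fit logistic regression with plain per-sample gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - yi  # gradient of the log loss w.r.t. z
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    """Return 1 for generated/fake CTI, 0 for human-written."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if z > 0 else 0

# Invented miniature corpus (labels: 0 = human-written, 1 = generated).
human = ["apt29 used spearphishing emails against the energy sector",
         "the malware exfiltrates credentials over dns tunneling"]
fake = ["the the threat actor actor deployed deployed ransomware",
        "malware malware was was observed observed in in logs"]
texts = human + fake
labels = [0, 0, 1, 1]

vocab = sorted({w for t in texts for w in t.lower().split()})
X = [featurize(t, vocab) for t in texts]
w, b = train_logreg(X, labels)
preds = [predict(w, b, x) for x in X]
print(preds)
```

The structure is the same one the project evaluates at scale: featurize each CTI sample, train a classifier on labeled human vs generated examples, and score new text. Swapping the count vectors for embeddings from a large language model is the representation-learning variant discussed above.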