Large Language Models Being Trained on the Dark Web

Large language models (LLMs) are becoming more common in cybersecurity, both for the tasks they can automate and for their other applications. LLMs are “trained” on large collections of text drawn from articles, websites, and other sources.

A group of South Korean researchers wanted to make an LLM for the dark web, the hidden part of the internet that standard search engines do not index and that requires special software to reach. The new LLM is called DarkBERT, and it was trained on data collected from the Tor network. Tor is anonymity software that is commonly used to access the dark web.

DarkBERT learned from hacking forums, scam websites, and other sources tied to illegal activity. As a result, the model was able to understand the language cybercriminals use and spot possible threats in it.

There are many uses for an LLM built specifically for the dark web. One is monitoring harmful forums. Another is flagging websites that publish confidential information.

DarkBERT is not available to the public because it is restricted to academic use, but future LLMs trained on the dark web could play an important role in preventing cyberattacks.

Article contributed by Anthony DiTaranto