Summer Observations

Sophia Behar
Aug 31

Updated: Sep 3

This summer, I had the opportunity to engage in fascinating cybersecurity research. This has prompted me to dive into the field further and explore the ways in which natural language processing (NLP) can be used to help enhance it.

The first main application of NLP to the field is identifying phishing emails. These emails are particularly dangerous because they can trick users into revealing sensitive information or downloading malicious software. Hence, filtering them out before users can even access them is very important, and NLP models happen to be great at this! They can check for suspicious keywords, spelling mistakes and other surface-level irregularities and use these as text classification features to determine whether an email should be considered “phishing”. Furthermore, the models are not limited to searching for small red flags in individual words. NLP can be used to identify an email’s broader intent and emotion through sentiment analysis, which, for example, allows models to check whether the email’s tone matches the fear or urgency typical of phishing emails. Named Entity Recognition is also a useful NLP tool, as it pinpoints the names of companies, individuals and locations, helping to identify the impersonation attempts that may exist within these emails. Therefore, when an NLP model combines all these different methods and adapts over time based on the phishing strategies it encounters, it can become highly effective at reducing the number of phishing emails that pass through spam filters and, in turn, the harm they can cause.
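To make the idea of surface-level features more concrete, here is a minimal Python sketch of my own. It is not taken from any of the sources: the keyword list, the scoring rule and the threshold are all hypothetical, and a real filter would learn them from labelled emails rather than hard-code them.

```python
import re

# Hypothetical keyword list (an assumption for illustration only);
# a real filter would learn these signals from labelled emails.
URGENCY_WORDS = {"urgent", "immediately", "suspended", "verify", "password", "expire"}


def phishing_features(text: str) -> dict:
    """Extract a few surface-level features from an email body."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "urgency_hits": sum(w in URGENCY_WORDS for w in words),
        "has_link": bool(re.search(r"https?://", text, re.IGNORECASE)),
        "exclamations": text.count("!"),
    }


def looks_like_phishing(text: str, threshold: int = 3) -> bool:
    """Combine the features into a crude score (the weights are assumptions)."""
    f = phishing_features(text)
    score = f["urgency_hits"] + (1 if f["has_link"] else 0) + min(f["exclamations"], 2)
    return score >= threshold


suspicious = ("URGENT! Your account is suspended. "
              "Verify your password immediately at http://example.test/login")
harmless = "Hi team, lunch is at noon tomorrow."
print(looks_like_phishing(suspicious), looks_like_phishing(harmless))
```

A trained text classifier would replace the hand-written score, but the inputs it relies on are the same kind of signals described above.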

Next, NLP is very helpful for analyzing logs. Logs are the digital records of system activity, and they allow security teams to understand system errors and investigate attacks. However, in their raw form, these logs are usually voluminous and unstructured. Traditional methods of parsing logs rely on fixed rules, which stop working well when the log formats change unless the rules are constantly updated too. In contrast, NLP-based models, like CyBERT, treat logs like language and can identify unusual behaviour even when the format differs, by relying on deeper contextual understanding and generalization of patterns. This reduces manual maintenance and enables security teams to gain insights more clearly and quickly.
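The brittleness of fixed rules can be shown with a small Python sketch. This is not how CyBERT works; it is only a toy comparison between a rigid pattern and a looser token-based check, and both log formats are made up for the example.

```python
import re

# A fixed rule written for one specific (hypothetical) log layout.
FIXED_RULE = re.compile(r"^(\d{4}-\d{2}-\d{2}) (\w+) login failed for (\w+)$")


def parse_fixed(line: str):
    """Return (date, host, user) if the line matches the rigid layout, else None."""
    m = FIXED_RULE.match(line)
    return m.groups() if m else None


def tokenize(line: str) -> list:
    """Split a line into lowercase tokens and mask numbers, so layout matters less."""
    tokens = re.findall(r"[A-Za-z_]+|\d+", line.lower())
    return ["<num>" if t.isdigit() else t for t in tokens]


def mentions_failed_login(line: str) -> bool:
    """Token-based check that ignores field order and separators."""
    tokens = set(tokenize(line))
    return "login" in tokens and ("failed" in tokens or "failure" in tokens)


old_format = "2025-08-31 host1 login failed for alice"
new_format = "ts=2025-08-31T10:00:00 event=login status=failed user=alice"

print(parse_fixed(old_format))            # the rigid rule matches the old layout
print(parse_fixed(new_format))            # None: the rule breaks on the new layout
print(mentions_failed_login(new_format))  # the token view still finds the event
```

A language model goes much further than this by learning which tokens matter from context, but the sketch shows why treating a log line as a sequence of tokens is more forgiving than matching one exact layout.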

Lastly, NLP can improve the efficiency and accuracy of threat intelligence. There is an overwhelming number of threat intelligence sources available on the Internet, from blogs to dark web forums and even social media. Although these are all very valuable in providing insight into current and future potential cyberthreats, the sheer amount of data and the fact that many sources are in languages other than English make it hard for human analysts to interpret the information optimally. Hence, NLP algorithms can help by scanning through the sources and extracting patterns, drawing connections between attacks and noting other relevant information about emerging threats. These threats can even be categorized by severity or other common themes, which allows cybersecurity teams to prioritize their response efforts and reduce the risk of overlooking potential harm.
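As a rough illustration of categorizing threats by severity, here is a short Python sketch. The severity vocabulary and the sample report snippets are invented for the example, and real systems would rely on trained models rather than simple keyword matching.

```python
# Hypothetical severity vocabulary (an assumption for illustration only).
SEVERITY_TERMS = {
    "critical": {"ransomware", "zero-day", "exploited"},
    "high": {"malware", "breach", "credential"},
    "low": {"scan", "spam"},
}
ORDER = ["critical", "high", "low"]


def classify(report: str) -> str:
    """Return the most severe level whose terms appear in the report text."""
    text = report.lower()
    for level in ORDER:
        if any(term in text for term in SEVERITY_TERMS[level]):
            return level
    return "unrated"


def prioritize(reports: list) -> list:
    """Sort reports so that the most severe ones come first."""
    rank = {level: i for i, level in enumerate(ORDER + ["unrated"])}
    return sorted(reports, key=lambda r: rank[classify(r)])


reports = [
    "Port scan seen on forum chatter",
    "New ransomware strain advertised",
    "Credential dump posted",
]
for report in prioritize(reports):
    print(classify(report), "-", report)
```

Even this crude ordering shows how automatic triage lets a team look at the most pressing reports first instead of reading every source in full.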

Overall, it is clear that there are numerous applications of natural language processing to cybersecurity. While it is not a perfect solution to tackle all cyberthreats, the intersection between NLP and cybersecurity is an area of research that will only continue to grow as machine learning models and artificial intelligence capabilities become more complex.

Credit: M K Pavan Kumar (Medium)

Works Cited

Hensley, MaKenna. “The Role of Natural Language Processing in Detecting Phishing Emails.” LinuxSecurity, 8 Aug. 2025, linuxsecurity.com/news/server-security/nlp-phishing-detection. Accessed 31 Aug. 2025.

Jeff. “NLP in Cybersecurity: Contextual Threat Analysis.” The Security Bulldog, 10 July 2025, securitybulldog.com/blog/nlp-in-cybersecurity-contextual-threat-analysis/. Accessed 31 Aug. 2025.

Kuriakose, Anil Abraham. “Leveraging Natural Language Processing (NLP) for Cyber Threat Analysis in MDR.” Algomox, 2 Jan. 2025, www.algomox.com/resources/blog/leveraging_nlp_cyber_threat_analysis_mdr/. Accessed 31 Aug. 2025.

Richardson, Bartley. “Changing Cybersecurity with Natural Language Processing.” NVIDIA Developer Technical Blog, 19 Oct. 2022, developer.nvidia.com/blog/changing-cybersecurity-with-natural-language-processing/. Accessed 31 Aug. 2025.