Machine learning in cybersecurity is not new. What is new is how accessible it has become to security practitioners who do not have a PhD in statistics. Libraries like scikit-learn, techniques like Random Forest and Isolation Forest, and data tools like pandas and Jupyter make it possible for working security professionals to apply ML directly to their operational data.
This is the area GTK Cyber has been teaching longest. Summer Rankin has 30+ peer-reviewed publications in ML. Charles Givre has been training security teams on data science since before it was fashionable.
Training
The Applied Data Science & AI for Cybersecurity course covers the full data science lifecycle applied to security problems. Threat Hunting with Data Science focuses specifically on operationalizing ML for threat detection. Python Coding for Security Analysts is the entry point if you are new to programming.
Reading
- Automated Advanced Analytics: An Unexpected Tool in the Cyber Arsenal
- The Power of Prediction: Machine Learning for Ransomware Prevention
- Why Cybersecurity Professionals Need AI Skills
Core Techniques
The ML techniques most applicable to security work:
- Anomaly detection: IsolationForest, statistical baselines, time-series analysis. Finds what signatures miss.
- Classification: Random Forest, SVM, gradient boosting for malicious URL detection, phishing, SQL injection
- Clustering: K-means and hierarchical clustering for surfacing behavioral patterns
- NLP: Entity extraction from threat intelligence, prompt injection detection
- Deep learning: For specific high-value use cases (adversarial examples, embeddings)
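
As a rough illustration of the "statistical baselines" item above, here is a minimal sketch that flags outliers with a robust z-score (median and median absolute deviation). It uses only the Python standard library. The hourly failed-login counts, the function names, and the 3.5 threshold are assumptions made for the example and do not come from any GTK Cyber course material.

```python
from statistics import median


def robust_z_scores(values):
    """Return modified z-scores based on the median and MAD.

    The median absolute deviation (MAD) is less sensitive to the
    outliers we are trying to find than the mean and standard deviation.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Simplification for this sketch: with no spread around the
        # median, treat every point as normal.
        return [0.0 for _ in values]
    # 0.6745 rescales MAD so scores are roughly comparable to
    # ordinary z-scores when the data is normally distributed.
    return [0.6745 * (v - med) / mad for v in values]


def flag_anomalies(counts, threshold=3.5):
    """Return the keys whose value deviates strongly from the baseline."""
    scores = robust_z_scores(list(counts.values()))
    return [key for key, score in zip(counts, scores) if abs(score) > threshold]


# Hypothetical hourly failed-login counts for a single account.
hourly_failed_logins = {
    "00:00": 3, "01:00": 2, "02:00": 4, "03:00": 3,
    "04:00": 97, "05:00": 2, "06:00": 5, "07:00": 4,
}

anomalous_hours = flag_anomalies(hourly_failed_logins)
print(anomalous_hours)  # ['04:00']
```

A single-feature baseline like this is a reasonable first pass. For data with several features per event, a model such as scikit-learn's IsolationForest is the more usual choice.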