ESG research indicates that only 30% of cybersecurity professionals feel very knowledgeable about AI or machine learning and its application to cybersecurity defenses. That gap shows in adoption: according to the same research, only 12% of enterprise organizations have deployed AI-based security defenses extensively, and 27% have deployed them on a limited basis. If AI is our “cyber savior,” why are the numbers so low?

In its current state, AI is very difficult to apply to security defenses for prevention or detection in the true sense of human-like decision making. Cybersecurity is not defined by observable rules the way chess or driving a car is; the unstructured environment and lack of rules challenge AI across detection, investigation and response. While machine learning is showing interesting capabilities in detecting anomalies and reducing alert false positives, this is not real AI. At the same time, improved automation is making security analysts more efficient and effective when defenses are combined and share data.
Metadata + machine learning
Machine learning allows you to train models for specific use cases, developing baselines that capture what is normal. An undesired activity can become part of a baseline, so peer groups are used to detect these exceptions, since collusion between random peers is unlikely. Baselines also shift over time, and models must be updated with new training sessions or they can experience ‘drift’ from changes in data sources. Humans still need to closely define use cases, select optimal data sources, provide feedback to refine models, and validate results.
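The peer-group idea above can be sketched in a few lines. This is a minimal illustration, not a production detector: the user names and activity metric are hypothetical, and a robust median/MAD score stands in for a trained model so that one extreme account cannot hide by inflating the group’s standard deviation.

```python
import statistics

def peer_group_outliers(activity, threshold=3.5):
    """Flag users whose metric deviates strongly from their peer group.

    activity: dict mapping user -> numeric metric (e.g., files accessed per day).
    Uses a modified z-score based on the median and median absolute deviation
    (MAD), which stays stable even when the outlier itself skews the data.
    """
    values = list(activity.values())
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return set()  # no variation in the peer group to compare against
    return {user for user, v in activity.items()
            if 0.6745 * abs(v - med) / mad > threshold}

# Hypothetical peer group: one account touches far more files than its peers.
engineers = {"ana": 40, "bob": 35, "cam": 38, "dee": 42, "mallory": 400}
print(peer_group_outliers(engineers))  # → {'mallory'}
```

If “mallory” behaved this way every day, her activity would simply become her own baseline; comparing against the peer group is what keeps the undesired behavior visible as an exception.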
Machine learning is quickly becoming a standard feature across a wide variety of security solutions. It can find anomalies and nuances in data too fine for humans to detect, consistently and over long periods of time. For example, DLP is often thought of as a first line of defense against insider threats; in reality, DLP alone is not very effective for this use case. Feeding DLP data into machine learning models focused on insider-threat anomalies has been shown to be more effective.
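One way to see why combining DLP signals helps: a per-signal DLP threshold misses a user who stays just under every individual limit, while a model that looks at the signals jointly flags the combination as unusual. The sketch below is a toy illustration with made-up feature names and values, scoring each user’s feature vector by its distance from the centroid of historical behavior.

```python
import math

# Hypothetical per-user DLP signals:
# (exfil-tagged file events, after-hours logins, USB writes)
baseline = [(2, 1, 0), (3, 0, 1), (1, 2, 0), (2, 1, 1)]
suspect = (9, 8, 6)  # under a naive per-signal threshold of 10, but jointly unusual

def anomaly_score(point, history):
    """Euclidean distance of a feature vector from the centroid of history."""
    dims = len(point)
    centroid = [sum(h[i] for h in history) / len(history) for i in range(dims)]
    return math.dist(point, centroid)

print(anomaly_score((3, 0, 1), baseline))  # a normal user scores low
print(anomaly_score(suspect, baseline))    # the suspect scores far higher
```

A real insider-threat model would use many more features and a trained classifier or density estimate, but the principle is the same: the anomaly lives in the combination, not in any one DLP counter.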
Machine learning paired with metadata is crucial; metadata goes beyond “data about other data” to provide data lineage, relationships, mapping and optimization. Metadata spans technical, operational, business and social data. For example, as digital transformation drives cloud adoption and increasingly more cloud-native apps, identity access becomes a new perimeter. Traditionally, CIOs control identity and access management with the objective of enabling access, while CISOs see the risk. Machine learning has proven effective at cleaning up access outliers, dormant accounts, shared access and unknown privileged access, and at detecting anomalies based on behavior, peer groups or analytics. This can augment a SOC team’s perspective on detection, investigation and response, treating identity and access as a threat plane. Without machine learning models and metadata, it would be an exhaustive human effort.
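An access-outlier cleanup like the one described can be sketched with set similarity over entitlement metadata. This is a simplified assumption-laden example, with invented users and permission names, using Jaccard similarity to flag accounts whose entitlements look nothing like their peers’.

```python
def jaccard(a, b):
    """Similarity of two permission sets (1.0 = identical, 0.0 = disjoint)."""
    return len(a & b) / len(a | b) if a | b else 1.0

def access_outliers(entitlements, min_similarity=0.5):
    """Flag accounts whose entitlement set is unlike the rest of the peer group."""
    flagged = set()
    for user, perms in entitlements.items():
        peers = [p for u, p in entitlements.items() if u != user]
        avg_sim = sum(jaccard(perms, p) for p in peers) / len(peers)
        if avg_sim < min_similarity:
            flagged.add(user)
    return flagged

# Hypothetical finance peer group; "eve" holds unexplained privileged access.
finance = {
    "ana": {"ledger_read", "ledger_write", "erp"},
    "bob": {"ledger_read", "erp"},
    "cam": {"ledger_read", "ledger_write", "erp"},
    "eve": {"ledger_read", "prod_db_admin", "ssh_prod"},
}
print(access_outliers(finance))  # → {'eve'}
```

At enterprise scale, with thousands of accounts and entitlements, this comparison is exactly the kind of exhaustive review that is impractical by hand but routine for a model fed with identity metadata.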
Conclusion
There’s a lot of work to be done before we get an “Alexa” for security. Yes, we can use machine learning to find anomalies in data sets using attributes with variation for specific use cases, but this is machine learning and advanced statistics, not AI. Instead of looking to AI, enterprises today should leverage metadata, machine learning and advanced statistics, while improving detection skills and developing internal threat intelligence.

At this phase of AI’s maturity, quality metadata for cybersecurity is key: it provides content and context at a lower cost than storing the original data itself, and it is indexed for the fast iterative queries used in threat detection and hunting. AI as a silver bullet is over the horizon and years away; focusing now on quality metadata and on specific machine learning models and use cases will provide a better payoff.