It’s baseball season and once again there are plenty of statistics to absorb. The back of any baseball card carries numerous stats for each player – batting average, slugging percentage, number of home runs, and so on. Making sense of all these stats has become a time-honored part of the game, but unless baseball pros and fans alike really understand what they’re looking at, all those stats can make it tricky to answer the all-important question: How good is the player on the card?
Similarly, when it comes to an organization’s vulnerabilities, there are many different metrics to help inform, analyze, and prioritize remediation. With alerts piling up and patching that needs to happen, too many metrics can create unnecessary complexity and slow down processes. In baseball, most people focus on a few important metrics to understand the quality of any one player. The same principle applies to vulnerability management (VM). Here are four metrics that can help security teams measure the outcomes of a VM program:
- Mean-Time-To-Remediate: MTTR has become a valuable metric because it answers a few important questions: Does the organization patch? And if so, does it patch fast enough? Generally, security teams create policies for how long they are willing to let a vulnerability exist in their environment, and the allowed time may vary with the vulnerability’s risk level. For example, an organization may want all vulnerabilities patched within 30 days, but critical vulnerabilities within 10 days. If a security team’s MTTR runs below 30 days, the team is, on average, meeting that policy. It’s an important metric because it tracks the organization’s ability to remediate important vulnerabilities (a minimal calculation is sketched after this list).
- Remediation Velocity: In a high-functioning VM program, all metrics collapse into two main categories: precision and speed. Precision helps the team prioritize its work, spending time only on the issues that matter most. Speed measures how quickly the team can work through those issues. In VM, all vulnerabilities matter to some degree, even if it’s a small one, so teams look for ways to work more efficiently and fix more vulnerabilities with less work. Remediation velocity measures how many vulnerabilities are remediated relative to how many are discovered over a given period. Some teams measure this weekly, benchmarking the new vulnerabilities entering the system against those resolved the week prior (see the velocity sketch below). Over time, this shows how well a VM program performs at the highest level and helps contextualize prioritization. Combined with other metrics, such as Risk Score and SLA Success Rate, it quickly gives the team high-level visibility into its program and helps identify gaps.
- SLA Success Rate: At its core, a service-level agreement (SLA) in VM measures the team’s commitment to doing the job. SLAs define the outcomes and goals the team wants to meet; in this case, fixing vulnerabilities within a certain timeframe. The team then measures how often it successfully hits that timeframe (a sample calculation appears below). High-functioning organizations leverage SLAs to set their own expectations and hold themselves accountable. If the team does not hit its SLA timelines consistently, it’s a clear indication that something is not functioning correctly. This generally stems either from dysfunction between the VM team and the teams doing the remediation work, or from unrealistic expectations that leave too little time and attention for fixing vulnerabilities. The metric can also help the team determine whether it needs more resources or a new approach to incentivizing remediation behaviors. Either way, SLA metrics give the team a high-level overview of the organization’s health and act as an alerting mechanism, signaling when the team needs to start digging into its VM processes and pipeline.
- Risk Score: The risk score tracks the average risk across all vulnerabilities in the organization. As the team fixes critical vulnerabilities with higher risk scores, the organization’s average should trend down, indicating reduced risk. Risk scores typically run on a scale of 1 to 1,000 and are helpful because they offer a measure of what a typical vulnerability looks like at an organization. But what goes into the score? Risk scores are computed from a mix of vulnerability attributes: Common Vulnerability Scoring System (CVSS) metrics, the business context the organization supplies, and threat intelligence (see the illustrative blend below). CVSS metrics are usually stable, as is the business context related to a vulnerability, unless the team changes it. But threat intelligence always changes; it’s the wild card. New exploits, or existing exploits being used more or less by attackers, can shift scores. It’s not uncommon for a vulnerability to carry a low risk score when it first appears and become increasingly critical as the threat landscape evolves.
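To make the arithmetic behind these metrics concrete, the sketches below use Python. First, MTTR: the `Vuln` record and the 10-day/30-day policy windows are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean

@dataclass
class Vuln:
    severity: str             # e.g. "critical", "high", "medium"
    discovered: date
    remediated: date | None   # None while still open

def mttr(vulns: list[Vuln]) -> float:
    """Mean time to remediate, in days, across closed vulnerabilities."""
    closed = [v for v in vulns if v.remediated is not None]
    return mean((v.remediated - v.discovered).days for v in closed)

# Hypothetical sample data: one critical closed in 7 days, one medium in 24.
vulns = [
    Vuln("critical", date(2023, 3, 1), date(2023, 3, 8)),
    Vuln("medium", date(2023, 3, 1), date(2023, 3, 25)),
    Vuln("medium", date(2023, 3, 5), None),  # still open; excluded from MTTR
]
print(f"MTTR: {mttr(vulns):.1f} days")  # 15.5 days, within a 30-day policy
```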
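Remediation velocity is then a simple ratio per period. The weekly counts here are made up for illustration; a velocity at or above 1.0 means the team is closing vulnerabilities at least as fast as new ones arrive.

```python
# Hypothetical weekly counts: (vulnerabilities discovered, vulnerabilities remediated)
weekly = [(120, 90), (80, 110), (100, 100)]

for week, (discovered, remediated) in enumerate(weekly, start=1):
    velocity = remediated / discovered
    trend = "keeping pace" if velocity >= 1.0 else "falling behind"
    print(f"Week {week}: velocity {velocity:.2f} ({trend})")
```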
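SLA success rate follows the same shape: the share of vulnerabilities closed within their deadline. The records and per-severity windows below are again invented for illustration.

```python
from datetime import date, timedelta

# Hypothetical (severity, discovered, remediated) records and SLA windows in days.
records = [
    ("critical", date(2023, 3, 1), date(2023, 3, 8)),   # closed in 7 days
    ("high", date(2023, 3, 1), date(2023, 4, 15)),      # closed in 45 days
    ("medium", date(2023, 3, 5), date(2023, 3, 20)),    # closed in 15 days
]
SLA_DAYS = {"critical": 10, "high": 30, "medium": 30}

on_time = sum(
    1 for severity, found, fixed in records
    if fixed - found <= timedelta(days=SLA_DAYS[severity])
)
print(f"SLA success rate: {on_time / len(records):.0%}")  # 2 of 3 on time -> 67%
```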
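Finally, the risk score. Real scoring formulas are vendor-specific and usually proprietary; this sketch only illustrates the idea of blending stable CVSS and business-context inputs with a changing threat-intelligence signal into a score between 1 and 1,000. The weights are invented.

```python
def risk_score(cvss: float, business_criticality: float, threat_activity: float) -> int:
    """Illustrative blend of stable and dynamic inputs into a 1-1,000 score.

    cvss: CVSS base score on a 0-10 scale (usually stable)
    business_criticality: 0-1 weight from business context (stable unless changed)
    threat_activity: 0-1 signal from threat intelligence (the wild card)
    """
    # Invented weighting: threat intel carries enough weight that a quiet
    # vulnerability can turn critical once exploitation picks up.
    blended = 0.4 * (cvss / 10) + 0.2 * business_criticality + 0.4 * threat_activity
    return max(1, round(blended * 1000))

# The same vulnerability before and after an exploit is observed in the wild:
print(risk_score(cvss=6.5, business_criticality=0.7, threat_activity=0.05))  # 420
print(risk_score(cvss=6.5, business_criticality=0.7, threat_activity=0.95))  # 780
```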
There are plenty of other metrics teams can use in their VM programs. However, it’s counterproductive to overcomplicate a practice that’s already overburdened by analysis and alerts. In the security industry, we have a tendency to hunt for novel ways to understand our data while procrastinating on doing a few tasks really well.
A VM program aims to answer a few basic questions: Am I remediating? Am I remediating effectively? And am I remediating the right bugs? If the team’s metrics don’t help answer these questions, they may create unnecessary complexity and slow progress.
When the team swings for security, best to just keep it simple.
Scott Kuffer, co-founder and COO, Nucleus Security