Ever since I started my career in information security I have been intrigued by metrics applied to vulnerabilities (and metrics in general, for that matter). CVSS is certainly not new; I have had to choose whether or not to use it in the past, and I have long wanted to share some of the issues I have with it. This blog post lay dormant in draft state for eight months, so I decided to publish it in parts rather than wait another year to finish it.

This blog series will explain a few elements of CVSS, in particular the points I feel are unclear, misleading, outdated or simply unfit for purpose.

This post assumes that you are familiar with CVSS; if you are not, you may want to have a look at: http://www.first.org/cvss/cvss-guide.html




Table of contents


  • Introduction to CVSS
  • CVSS Base Group
  • Focus on the Temporal Metric Group
  • Metric scoring (Temporal)
  • Critique (Temporal)
  • Comment about overall Data Skew

Introduction

The goal of this blog series is to take a fresh look at CVSS from different viewpoints and mix in use cases: we will dive into its fitness with regard to the current threat landscape, and into ideas on how it could be transformed, changed and/or reused.

CVSS is split into three distinct metric groups: the Base metric (the raw rating of the vulnerability), the Temporal metric (describing the vulnerability lifecycle: exploit maturity and patch maturity) and the Environmental metric (representing the impact of the vulnerability on a specific entity).


These three metrics result in an overall score. The concept makes sense: vulnerability database maintainers (Bugtraq, Mitre/DHS, Vupen, Secunia...) express the fundamental characteristics of a vulnerability (the CIA triad), and the enterprise security team adds the temporal and environmental scores, adjusting the rating to their particular environment.

Important to note: strictly speaking, after the introduction of the Temporal and Environmental metrics, CVSS becomes a metric expressing risk, trying to express how much risk a vulnerability poses to a specific entity at a specific point in time.

The CVSS Base Metric Group

I'll leave it to the CVSS SIG themselves to explain the purpose and goal "The purpose of the CVSS base group is to define and communicate the fundamental characteristics of a vulnerability. This objective approach to characterizing vulnerabilities provides users with a clear and intuitive representation of a vulnerability. Users can then invoke the temporal and environmental groups to provide contextual information that more accurately reflects the risk to their unique environment. This allows them to make more informed decisions when trying to mitigate risks posed by the vulnerabilities"

In other words, only the base metric measures the vulnerabilities themselves; the environmental and temporal metrics are purely individual risk metrics.

Focus on the Temporal Metric Group

Citing the CVSS documentation "Temporal: represents the characteristics of a vulnerability that change over time but not among user environments."

Exploitability

We see that "Exploitability" is rated on publicly available information, particularly on whether proof-of-concept code or detailed vulnerability descriptions exist that ease the exploitation of a given vulnerability.

This makes sense, as the effort that has to be put into getting an exploit to work is often an indicator of the immediate threat the flaw poses to an organisation.

However, it is not granular enough to make a distinction for the availability of commercial-grade (COTS) exploit kits such as Canvas or Core Impact. These should obviously be key factors to weigh in when scoring a vulnerability in terms of risk.

Possible Values : Unproven, Proof of Concept, Functional, High, Not defined


Remediation Level

We see that the Remediation Level is measured by the availability of a patch and its maturity (workaround, temporary fix, etc.).

CVSS defines "Remediation Level" as "The remediation level of a vulnerability is an important factor for prioritization. The typical vulnerability is unpatched when initially published. Workarounds or hotfixes may offer interim remediation until an official patch or upgrade is issued. Each of these respective stages adjusts the temporal score downwards, reflecting the decreasing urgency as remediation becomes final. "

Hence choosing "Official Fix" will reduce the end score of the vulnerability.

Example: if a vulnerability has a base score of 10 (TEN), choosing "Official Fix" reduces the score to 8.7, solely on the basis that a patch exists. It does not imply that you have actually deployed it.

Possible Values : Official Fix, Temporary Fix, Workaround, Unavailable, Not Defined

Report Confidence

"This metric measures the degree of confidence in the existence of the vulnerability and the credibility of the known technical details"

In other words this metric functions as a sort of trust metric, and it is not influenced by the choices you made in the "Exploitability" section. Theoretically you could have an Exploitability rating of "Proof of Concept" combined with a Report Confidence of "Unconfirmed" (which doesn't make sense).

Possible Values : Unconfirmed, Uncorroborated, Confirmed, Not defined

Metric Score

Below is a summary of how the temporal score weighs into the global score; notice that the overall score can only be decreased, never increased.

TemporalScore = round_to_1_decimal(BaseScore*Exploitability*RemediationLevel*ReportConfidence)
Exploitability   = case Exploitability of
                        unproven:             0.85
                        proof-of-concept:     0.9
                        functional:           0.95
                        high:                 1.00
                        not defined:          1.00
RemediationLevel = case RemediationLevel of
                        official-fix:         0.87
                        temporary-fix:        0.90
                        workaround:           0.95
                        unavailable:          1.00
                        not defined:          1.00
ReportConfidence = case ReportConfidence of
                        unconfirmed:          0.90
                        uncorroborated:       0.95      
                        confirmed:            1.00
                        not defined:          1.00
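The formula above can be transcribed directly into a short script. This is a sketch of the v2 temporal equation as shown in the table; the function and key names are my own spellings, not official identifiers.

```python
# CVSS v2 temporal-score multipliers, as listed in the table above.
EXPLOITABILITY = {
    "unproven": 0.85,
    "proof-of-concept": 0.90,
    "functional": 0.95,
    "high": 1.00,
    "not-defined": 1.00,
}

REMEDIATION_LEVEL = {
    "official-fix": 0.87,
    "temporary-fix": 0.90,
    "workaround": 0.95,
    "unavailable": 1.00,
    "not-defined": 1.00,
}

REPORT_CONFIDENCE = {
    "unconfirmed": 0.90,
    "uncorroborated": 0.95,
    "confirmed": 1.00,
    "not-defined": 1.00,
}

def temporal_score(base, exploitability, remediation, confidence):
    """TemporalScore = BaseScore * E * RL * RC, rounded to one decimal."""
    return round(base
                 * EXPLOITABILITY[exploitability]
                 * REMEDIATION_LEVEL[remediation]
                 * REPORT_CONFIDENCE[confidence], 1)

# The "Official Fix" example from the Remediation Level section:
# base 10, all other temporal metrics left at "Not Defined".
print(temporal_score(10.0, "not-defined", "official-fix", "not-defined"))  # 8.7
```

Since every multiplier is at most 1.00, the product can never exceed the base score, which is the property the critique below turns on.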

 

Critique on the Temporal Score

My comments on the Temporal Score and how it weighs into the global CVSS score :


Counter-intuitive use of "Exploitability" and "Report Confidence"

This raises the question of whether the two metrics shouldn't have been integrated into one; let me explain with a few use cases.
Use Case Examples
  • As "Exploitability" you choose either "Proof of Concept", "Functional" or "High" (which are 3 of the 4 defined values of the Exploitability index)

  • Report Confidence = ?
Isn't the above somehow implicit? Since proof-of-concept code exists, isn't it clear that the report is confirmed by itself? The only way the Report Confidence metric makes sense to me is the case where Exploitability is "Unproven".

In that case, theoretically at least, "Report Confidence" could weigh in and compensate for the lack of evidence on the "Exploitability" index. However, it simply can't: the Temporal metric can only decrease the score, never increase it. So it can never actually compensate for the decrease from "Exploitability: Unproven", regardless of whether you are 100% confident in the report.

To illustrate, take an example: the decrease from "Exploitability: Unproven" (×0.85) cannot be compensated by "Report Confidence: Confirmed" (×1.0).

So what use does the "Report Confidence" index have, if its goal is not to compensate for a lack of proof-of-concept code or other evidence?

Furthermore, CVSS allows all possible combinations of Exploitability and Report Confidence ratings; for instance, it is possible to have a rating indicating an Exploitability of "High" combined with a Report Confidence of "Unconfirmed".

This leads to a lower score, even though the vulnerability is exploitable and working exploit code is known to exist.
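Both situations can be checked numerically with the v2 temporal multipliers. This is a minimal sketch (the helper name is my own; Remediation Level is held at "Not Defined", i.e. 1.0):

```python
# TemporalScore = base * E * RC * RL, rounded to one decimal (CVSS v2).
def temporal(base, e, rc, rl=1.00):
    return round(base * e * rc * rl, 1)

BASE = 10.0

# Case 1: Exploitability "Unproven" (0.85). Even a "Confirmed" report (1.00)
# cannot lift the score back toward the base, because no multiplier exceeds 1.0.
print(temporal(BASE, 0.85, 1.00))  # 8.5 -- confidence cannot compensate

# Case 2: Exploitability "High" (1.00) but Report Confidence "Unconfirmed" (0.90).
# The score drops although working exploit code is known to exist.
print(temporal(BASE, 1.00, 0.90))  # 9.0
```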


Unclear use of Remediation Level

Rating the vulnerability as having an "Official Fix" will reduce its overall score. I ask myself why that is the case: why should an "Official Fix" decrease the rating of a vulnerability?

I personally believe the reason lies with who created CVSS: FIRST (the Forum of Incident Response and Security Teams). Indeed, if you are an incident response team, it might make a difference whether a vulnerability has an official patch or not.

In that case you are interested in a temporal rating that reflects your business purpose (handling incidents). In situations where limited resources need to investigate a number of new vulnerabilities, it is essential to know where to concentrate the effort in developing mitigations.

That is not the case, however, when managing vulnerabilities outside of incident response. CVSS nevertheless seems to have been widely adopted to manage vulnerabilities in non-incident-response scenarios, and the current rating is unfit to reflect this. The existence of an official fix should not decrease the overall rating of a vulnerability if the rating is used, for example, as the basis of a patching policy.
CVSS based Patch Policy
Let me clarify this with a simple example: FIRST encourages mapping CVSS scores onto a patching policy; the higher the score, the faster the patch cycle. That makes sense. What doesn't, however, is that the metric that is supposed to indicate whether a patch exists ("Remediation Level") actually reduces the overall score, and thus directly influences the patch policy. That shouldn't happen.
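To illustrate how this plays out, here is a hypothetical sketch: the score bands and SLA days are invented for illustration only; the 0.87 multiplier is the v2 "Official Fix" value.

```python
def patch_sla_days(score):
    """Map a CVSS score to a (made-up) patching deadline in days."""
    if score >= 9.0:
        return 7
    if score >= 7.0:
        return 30
    if score >= 4.0:
        return 90
    return 180

base = 10.0
with_official_fix = round(base * 0.87, 1)  # Remediation Level: Official Fix

print(patch_sla_days(base))               # 7  -> patch within a week
print(patch_sla_days(with_official_fix))  # 30 -> a month
```

The perverse outcome: the very fact that a patch now exists (score 10.0 → 8.7) moves the vulnerability into a slower patching band, relaxing the deadline for deploying that patch.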


Final comment for this part - CVSS Data Skew

The CVSS score distribution below clearly indicates that something is off with the scoring calculations: out of 49654 vulnerabilities, only 152 score between 8 and 9. This is a clear sign that something is skewing the scores in a certain direction (I don't know what, and haven't really looked into it).
Source: CVEDETAILS.COM

Next up : Environmental Metrics / Score Distribution / Introducing Threat Agent categorisation into CVSS scores

3 comments

Olivier said... @ 26 March, 2012 09:25

Waiting for your discussions of the environmental score...

I'm facing so many problems with the environmental score, where one cannot express vulnerabilities such as: "a client-side vulnerability on a server that is not connected to the internet".

This kind of vulnerability is often considered as "remotely exploitable".

In a vulnerability management process, we want to have a clear view of risks in order to delay patching of low-risk vulnerabilities, which represents a great cost.
The only tool provided by CVSS is the environmental score, which allows expressing temporal facts and asset criticality.

The environmental score does not work correctly in this example, and we end up tweaking the CVSS vector from "remotely exploitable" to "local network exploitable" if the vulnerability is considered a client-side exploit and the server has no internet communication.

As a consequence, we tweak the base vector when we should be interacting with the environmental vector.
We also have to check manually whether it is a client-side vulnerability, because this is not expressed in the CVSS base vector, so CVSS databases do not provide this information.

That is the most frequent example I'm facing.
I hope you have found some ideas to map this case to CVSS, maybe through threat agents?


Anonymous said... @ 09 May, 2012 00:36

There are some good observations above.

I agree completely with the observations concerning computation of exploitability.

I don't have the same problem as Olivier does above. I would code "client side vulnerability on a server that is not connected to the internet" as: AccessVector - local. In order to exploit that vuln one must be on the machine. For a standalone machine that would be the case even for network vulnerabilities.

I think that you also have a good point regarding the Remediation Level scoring. However, I think that all vulnerabilities should be scored with regard to their overall effect on the organization. This is the difference between fragility and resilience. A system can be fragile but still useful. It can still perform its job at an acceptable level of risk because the organization is more resilient. The ability of the organization to handle the incident is a large component of its resilience.

Vulnerabilities rated only on the effect on a system are simply statements of the fragility of that component. They do not reflect the importance to the organization.

I agree with your statement concerning use of CVSS for a patching policy. Similarly, one would have to eliminate the environmental scores if one were going to use the scoring system to examine the fragility of components in their architecture. One might do this in development of a continuity plan. I think the CVSS is most aligned to utility in an overall risk management framework. Focused use of the tool requires some reassessment of tool - which requires understanding of the tool.

I'm going to be a bit pedantic concerning your score chart. I'm not certain what ranges you used; they overlap, i.e. is a 2 counted in row 2 or row 3?

Your graph has made me consider: what is the possible distribution of scores? It might be that the possible score combinations are weighted such that certain numbers appear more often. If not, what we may be seeing in your graph is a statement of the types of exploits or vulnerabilities discovered. This may show a predilection on the part of researchers and not be a symptom of the tool at all. It might be that researchers focus more on network-exploitable vulns that affect C, I and A.

Also, what is the purpose of f(impact) in computing the base score? It seems to be some coefficient of impact seeing as CIA impact is already taken into account. But I haven't found an actual explanation of it yet.

Good article and brings up some good points for consideration.
