Wednesday, December 13, 2006

An Operations-Based Approach To Network Security

One of the problems that I often encounter when dealing with network security issues is deciding what needs to be fixed. Many of the formal models which have been developed just aren't that applicable in real life, since they deal in abstractions that don't necessarily map well to life in the data center.

I bring this up now because I'm currently reading Securing Storage[1] by Himanshu Dwivedi. Mr. Dwivedi talks about security issues in terms of "Security Risk" and "Business Risk", which correspond roughly to "severity of consequences from a technical perspective" and "severity of consequences from a business perspective". Using this mechanism he goes on to classify various types of attacks against SAN fabrics as "low", "medium", and "high" risk.

The whole point of performing a threat assessment is to determine what steps should be taken to best improve your security profile. Ideally you'd like the items that end up at the top of your list to have a relative sense of urgency, i.e. it makes sense to address them before any other items on the list. But here's where a lot of risk assessment metrics run into issues. For example, Mr. Dwivedi classifies some attacks as "medium" or "high" risk based on their security and business implications, but this ranking is less than helpful in terms of actually guiding remediation efforts. In this case the attacks in question require physical access to the SAN; as long as you're following some basic standards of datacenter hygiene, such as physical access control, there are probably more pressing security issues which need to be addressed.

The CISSP approach is a little better in that it (allegedly) lets you calculate how much you should spend to remediate an issue. Leaving aside the difficulties in calculating your exposure, much less the expected annual incidence of a particular risk, this approach still doesn't provide complete guidance.
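
For readers who haven't seen it, the CISSP-style calculation is usually written as an annualized loss expectancy (ALE): the single loss expectancy (asset value times exposure factor) multiplied by the annualized rate of occurrence. The sketch below is only an illustration of that textbook formula; the function name and the sample numbers are made up for this example.

```python
def annualized_loss_expectancy(asset_value, exposure_factor, annual_rate):
    """Textbook ALE: (asset value * exposure factor) * annualized rate of occurrence."""
    # Single loss expectancy: what one incident is expected to cost.
    single_loss = asset_value * exposure_factor
    # Scale by how many incidents are expected per year.
    return single_loss * annual_rate


# Hypothetical numbers: a $10,000 asset, half its value lost per incident,
# one incident expected every four years.
ale = annualized_loss_expectancy(10_000, 0.5, 0.25)
print(ale)
```

The arithmetic is trivial; the hard part is that the exposure factor and the annual rate are exactly the inputs that are so difficult to estimate.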
If I have a security budget of $100 and I have 3 projects, each of which suggests that I spend $45 to implement it, which 2 should I pursue?

The two approaches listed above, and others that I've encountered, all fail to factor in the actual nuts and bolts of security in an operations environment. Security projects compete for dollars and person-hours, just like any other IT undertaking. In such a situation a comprehensive approach to risk management would tell you which projects maximize your ROI.

Here's the part where I start venturing out into somewhat uncharted territory. Where do Mr. Dwivedi's model and the CISSP model break? In Mr. Dwivedi's model it's pretty easy to spot the problem; it doesn't pay any regard to the actual yield of any particular remediation step. You should not spend time protecting yourself from a difficult and somewhat theoretical attack if you're also vulnerable to a less theoretical and less difficult attack, even if the latter doesn't pose as big a risk as the former. The breakage in the CISSP model is a little more subtle; even though 2 assets might both have the same ALE (annualized loss expectancy), the cost of protecting those two assets is going to differ, sometimes wildly.

So how to build a better mousetrap? I'm going to be heretical and suggest putting the cart before the horse. All of the risk analysis approaches to which I've been exposed start with some sort of formalism such as the quantification of risks. I'd suggest that we should reduce the use of formalisms, since most of them are badly broken to begin with. How, exactly, are you supposed to calculate the annualized frequency with which you'll get hacked? IT isn't the insurance industry... we're just not capable of making those kinds of assessments at this point.

Instead, here's what you do: hire yourself a competent security specialist and have them audit your network for general hygiene.
A first approximation to a secure network can be had by the systematic application of a few basic rules: least access, encrypt it if possible, collect and read your logs, etc. Lots of these things don't require money, just the time and know-how to implement them. A good security analyst will be able to point out almost immediately where your network is falling down.

A lot of people will probably object to this approach, because it relies on the expertise of a particular individual and treats security as an art rather than a science. This is a valid criticism to a point, and I'll accept it for what it's worth. But such criticism disregards the fact that every institution with a computer can greatly benefit by first remediating a list of common problems (or verifying that such remediation isn't necessary). You shouldn't need a formal risk analysis to tell you that you should be using SSH instead of telnet.

The starting point for security shouldn't be formal risk assessment, since most organizations don't meet the security baseline that makes such assessments useful. Instead organizations should focus foremost on the laundry list of hardening steps and best practices which can be implemented for free. If they manage to get through that list then they can start to consider formal assessment and its associated expenditures.

Here's another area where formal models break down: they don't take politics into account. Rather, they assume that any issue can be remediated if it is serious enough. This, I know from personal experience, is demonstrably false. So take the list of suggestions that your consultant came up with and draw a line through all of the ones which you know you'll never be able to implement. Go on, do it now, we'll wait... done?

Continuing on then... We're still left with the question of how to prioritize the fixes which need to be implemented... this is where the "cost-benefit" from the title comes in.
It's very easy to say "cost-benefit", wave your hands, and call it a solution; this is essentially the approach which the CISSP model takes. But such an approach is unsatisfactorily nebulous; what do "cost" and "benefit" mean within the domain of IT operations?

In the current context of the discussion "cost" can't be measured in dollars; the fixes which I propose implementing are generally free. Nor can you easily convert person-hours to dollars, since most IT employees are salaried (and generally expected to work long hours to fix things if need be). "Cost" is best measured directly in the number of hours it will take to implement the fix. Such an approach has the added benefit that a competent administrator can usually make such estimates accurately; if you have X servers to fix and each server takes Y minutes then the aggregate time required is damn close to X*Y.

But cost is only half the equation... what about "benefit"? Admittedly this is a hard question to answer. The benefits of security are notoriously difficult to quantify, given that security is largely about preventing hypotheticals. I'm going to suggest a practical metric for measuring benefit which will probably piss a lot of people off: the suck factor.

The use of this particular metric is ultimately why this is an operations-based approach. You, as an operations person, generally have a pretty good idea of how much it would suck if a particular problem were exploited. Lest I be accused of undue cynicism I'll point out that there's often a strong correlation between business requirements and the suck factor. But really, this metric is about operations folks making their lives easier, not about the needs of the business. Because, frankly, if there's no inherent suckage in a system getting pwned, then why bother in the first place?

So really, we're looking at the ratio (implementation hours)/(suckage)[2].
Admittedly "suckage" is kind of vague, so let's see if we can come up with a concrete analogue, preferably denominated in hours so we can deal with a dimensionless number. A good proxy for suckage is "time needed to fix", meaning the time needed to clean up after the problem has been exploited, which has the additional benefit of also being fairly easy to estimate. The lower this ratio, the more it makes sense to implement the fix.

In a nutshell: if you're in charge of security for a typical shop, the best thing you can do is chuck formal metrics; they probably don't apply to you. Instead find yourself a security analyst or some hardening guides and rank the resulting fixes based on the ratio of the time needed to implement the fix vs. the time needed to clean up if the vulnerability is exploited.
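
To make the ranking concrete, here's a rough sketch in Python of how the list might be sorted. Everything in it is illustrative: the `Fix` class, the `rank_fixes` helper, and all of the hour estimates are invented for this example, and the only idea taken from the discussion above is the ratio itself (implementation hours, estimated as X servers times Y minutes, divided by cleanup hours).

```python
from dataclasses import dataclass


@dataclass
class Fix:
    name: str
    servers: int               # X: number of machines that need the fix
    minutes_per_server: float  # Y: time to apply the fix on one machine
    cleanup_hours: float       # proxy for suckage: hours to clean up if exploited

    @property
    def implementation_hours(self) -> float:
        # Aggregate cost is roughly X * Y, converted from minutes to hours.
        return self.servers * self.minutes_per_server / 60

    @property
    def ratio(self) -> float:
        # Dimensionless: hours to implement / hours to clean up.
        return self.implementation_hours / self.cleanup_hours


def rank_fixes(fixes):
    """Return fixes sorted by ratio, lowest (most worthwhile) first."""
    # A fix with no cleanup cost has no suckage, so there's nothing to rank.
    worthwhile = [f for f in fixes if f.cleanup_hours > 0]
    return sorted(worthwhile, key=lambda f: f.ratio)


# Hypothetical estimates, purely for illustration.
fixes = [
    Fix("tighten file share permissions", servers=5, minutes_per_server=60, cleanup_hours=5),
    Fix("replace telnet with SSH", servers=20, minutes_per_server=15, cleanup_hours=40),
    Fix("collect and review logs", servers=20, minutes_per_server=30, cleanup_hours=20),
]

for fix in rank_fixes(fixes):
    print(f"{fix.name}: {fix.ratio:.3f}")
```

With these made-up numbers the telnet replacement floats to the top, since it costs 5 hours to implement against 40 hours of cleanup. Anything you've already crossed off for political reasons simply never makes it into the list.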

[1] Which, incidentally, isn't that good of a book, regardless of what the Amazon reviews may say. It claims to be a 'practical' guide, but actually has very few specifics regarding how to secure existing technologies. A lot of the time Mr. Dwivedi says something along the lines of "this is what the relevant standard says you can do, see your vendor documents for implementation details". That's fine for a theoretical work, but as I said this book claims to be practical. I would expect to see a more thorough review of existing technologies. There are also some technical errors, such as equating "zone hopping" in a SAN fabric to VLAN hopping when the two phenomena are distinctly different. Aside from those criticisms the book needs heavy editing; it is readily apparent that English isn't Mr. Dwivedi's primary language.

[2] Here's where I tried to do MathML, see previous post.

