Secure Channel Ziff Davis Enterprise
Friday, October 30, 2009 8:18 AM/EST

Congressional Ethics Leak Demonstrates DLP Shortcomings

Data loss prevention products and technologies are what's buzzing in the security market. PowerRating, the security division of distributor Bell Micro, is pushing its resellers to adopt DLP and push it to their customers. McAfee is partnering with Adobe to embed DLP technologies into the software company's offerings. And Symantec, IBM and McAfee are fielding a steady stream of reports that show insider threats are on the rise and that DLP will mitigate potential for intentional and accidental data security breaches.

DLP is hot among large and small enterprises, and increasingly among midmarket companies. According to Nemertes, one-third of businesses it surveyed are using DLP products. Another 21 percent plan to adopt DLP in the next 24 months. And 24 percent are evaluating or planning to evaluate DLP solutions. The research firm says that DLP should reach an 80 percent adoption rate within the next two years.

That's a significant opportunity that security vendors want to cash in on, and the reason behind their push to have solution providers get on board with DLP.

But if you want an example of DLP's limitations, look no further than Congress, which is dealing with an ethics probe that got fouled up by a security breach. This isn't to deter you from adopting and selling DLP to your customers, but rather to understand what the technology can and can't do, and set user expectations.

According to the New York Times, a "low-level" congressional aide took home a computer file that contained the names of at least 10 congressmen and women who are either under investigation or under suspicion of using their influence to gain the favor of government agencies to help businesses in their districts. According to the report, the aide "accidentally" placed the file in a folder that was open to file sharing applications. As a result, it was leaked and the politicians' names became public.

Prior to the incident, only California representatives Maxine Waters and Laura Richardson, and New York's Charles Rangel, all Democrats, were known targets. Now the names of seven others--five Democrats (John P. Murtha of Pennsylvania, Peter J. Visclosky of Indiana, James P. Moran of Virginia, Norm Dicks of Washington and Marcy Kaptur of Ohio) and two Republicans (Todd Tiahrt of Kansas and C. W. Bill Young of Florida)--are in the open.

While DLP vendors want to sell the idea that they can intercept and block the disclosure of sensitive and confidential information, the reality is the technology would have been powerless in this and similar situations. Here's why.

Security vendors have grappled with the problem of data becoming increasingly ubiquitous and portable for years. Encryption, access control and file permission security measures have been tried to curb this threat. DLP, which essentially matches the pattern of known data sets, is the latest incarnation of security measures to counter unauthorized data disclosure. Each of these technologies has fallen short because of two things: They either rely on the user to understand and classify the data as sensitive, or they rely on known data strings to identify sensitive data.

DLP remains an immature but improving technology. It's gotten very good at finding and blocking files and traffic streams that contain information such as Social Security numbers, credit card and banking account numbers, addresses, and dates of birth. With some fine tuning, DLP could likely capture data sets down to specific words, such as "investigation" and "Maxine Waters."
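As a rough illustration of the pattern matching described above, here is a minimal sketch in Python. It assumes U.S.-style Social Security numbers written as NNN-NN-NNNN and 13- to 16-digit card numbers validated with the Luhn checksum; the function names and patterns are hypothetical and are not taken from any DLP product.

```python
import re

# Hypothetical patterns: real products use far richer rule sets.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_sensitive(text: str) -> list:
    """Return (kind, value) pairs for SSN-like and card-like strings."""
    hits = []
    for m in SSN_RE.finditer(text):
        hits.append(("ssn", m.group()))
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if luhn_ok(digits):  # drop digit runs that are not valid card numbers
            hits.append(("card", digits))
    return hits
```

A scanner like this has nothing to match in a plain list of names, which is the gap the rest of this post is concerned with.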

Where DLP falls down is in identifying unclassified and contextual data. Maxine Waters is a well-known, active and controversial figure on Capitol Hill. Imagine the volume of false positives if security administrators tuned the DLP to block or quarantine every file and e-mail that contained the words "Maxine Waters" and "investigation."

Every security pro knows there's no panacea for ensuring data integrity and confidentiality. In fact, most security measures are designed and function as roadblocks and speed bumps—they lower the probability of a security breach, but never prevent one from happening. DLP is one of those technologies that is getting increasingly better at filtering out common data leaks and thefts, but is still relatively powerless against contextual data breaches.

When selling, deploying and supporting DLP in customer environments, it's important that you—as the trusted solution provider—set reasonable expectations as to what DLP can and cannot do. This may cause the customer to think twice about purchasing DLP, but it will also prevent them from holding you responsible should DLP fail to prevent a breach that it was never designed to prevent.

Comments (7)

Kevin Rowney :

I think some basic fact checking is in order here. This article clearly demonstrates a misunderstanding of what Data Loss Prevention technologies are and how they work.

Pre-classification of data is *not* a requirement with modern DLP solutions.

Keyword-only searching is the hallmark of bottom-of-the-barrel vendor solutions.

New advanced detection algorithms (many of them pioneered by Vontu) have made many of the types of breach you talk about above a quite solvable problem. They have high accuracy, low false positives, and don't require pre-classification.

In fact, these forms of breach defense are not just theory...we've done lots of diving catches very similar to these circumstances of breach.

DLP is no panacea, but the range and breadth of protection is way wider than many practitioners and bloggers now understand.

Kevin Rowney
Founder, Data Loss Prevention Division
Symantec

Kevin, I appreciate your position, and understand the advances made in DLP technology and approaches. However, I would beg to disagree that DLP has gotten to the point where it can automatically differentiate between routine "Maxine Waters" correspondence and "Maxine Waters" on a list of potential ethics violations without triggering a high volume of false positives.

That said, I will say that Symantec (technology acquired through Vontu), RSA and Websense are among the best when it comes to data loss prevention. In fact, Gartner and other analysts have said that these vendors not only have the best technology, but the professional services to deliver DLP.

From my vantage point, the professional services really make the difference here, since what you're describing sounds as though it requires a lot of fine tuning to catch contextual leaks.

If I'm wrong, out of step or just plain ignorant, I welcome getting an education.

Kevin Rowney :

Larry-
I think there are some common misconceptions out there, so perhaps we are not doing a good enough job of letting the world know the facts; but yes, your article (and response) reflects an understanding of the state of DLP technology prior to the year 2001.

There really are better ways (that are now currently available) to spot confidential data outside of simply matching for the occurrence of 'Maxine Waters' in the text. This kind of keyword matching is a minimum-necessary capability with DLP solutions, but it is by no means the limit (nor even the best practice) of what can be done for content detection.

In more detail: new advanced indexing algorithms can produce "rolling hash" summaries of the inter-relationship of fragments of text of a given profiled document or database table. These indices can then in turn be used to produce highly accurate search outcomes.

We have 100s of deployments using these techniques allowing brand new forms of protection.

For this particular case, it would've been possible to set up indexing of the various folders holding the panel's confidential documents and then unleash automatic searches for the exposure and proliferation of this data. These detections would ignore any simple mention of a congressperson, but would trigger on multi-paragraph direct quotes (or even multi-sentence quotes) of the document embedded inside other documents.
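For readers who want a concrete picture, the following is a minimal sketch of fingerprint-style detection, not any vendor's actual algorithm. It assumes word-level shingles of length k hashed with a Rabin-Karp style rolling hash; the names `build_index` and `leaked_fraction`, the constants, and the choice of k are all hypothetical.

```python
import re

BASE = 1_000_003          # multiplier for the rolling hash (arbitrary choice)
MOD = (1 << 61) - 1       # large prime modulus to keep collisions unlikely


def tokens(text: str) -> list:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_hash(word: str) -> int:
    """Map a word to an integer with a simple polynomial hash."""
    h = 0
    for ch in word:
        h = (h * 131 + ord(ch)) % MOD
    return h


def shingle_hashes(text: str, k: int = 8) -> set:
    """Hash every run of k consecutive words, rolling the window forward."""
    words = [word_hash(w) for w in tokens(text)]
    if len(words) < k:
        return set()
    top = pow(BASE, k - 1, MOD)  # weight of the word leaving the window
    h = 0
    for w in words[:k]:
        h = (h * BASE + w) % MOD
    out = {h}
    for i in range(k, len(words)):
        h = (h - words[i - k] * top) % MOD   # drop the oldest word
        h = (h * BASE + words[i]) % MOD      # append the newest word
        out.add(h)
    return out


def build_index(documents, k: int = 8) -> set:
    """Fingerprint a collection of protected documents."""
    index = set()
    for doc in documents:
        index |= shingle_hashes(doc, k)
    return index


def leaked_fraction(text: str, index: set, k: int = 8) -> float:
    """Share of the text's k-word shingles that appear in the index."""
    hashes = shingle_hashes(text, k)
    if not hashes:
        return 0.0
    return len(hashes & index) / len(hashes)
```

In this sketch a message that only mentions a name shares no k-word run with the indexed document and scores zero, while a message quoting several sentences of it scores high. The threshold at which to block or alert is a policy decision the sketch leaves open.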

I don't mean to impugn the reputation of any of the security teams who are involved with this case. These kinds of protections are relatively new and it's understandable that practitioners don't yet know what kind of protection is possible. Most textbook training in security today is wholly ignorant of these new advances.

What I mean to do in this response to your post and comment is correct the mistaken notion that this class of data breach is not preventable with DLP. These detection algorithms, combined with universality of coverage across data-at-rest/data-in-motion/data-on-the-endpoint, provide brand new ways to get on top of these risks.


Kevin Rowney

Kevin, I appreciate what you're saying, but I think saying that the description in the post is that of DLP prior to 2001 is a bit disingenuous. I concede that there are and continue to be advances in the technology, but most of those advances have been made in just the past two or three years.

I will concede your point that the algorithms today are able to detect and block contextual data. However, you said something very interesting in your response: the location of the data in a file. One of the more interesting trends I've been monitoring is the integration of DLP with identity management; tying an ID to data adds a new layer of abstraction to validate (or invalidate) a transaction. But if some of the algorithm's performance is based on the file's source location, wouldn't that make the entire scheme dependent on some user's pre-classification of the document? If the data is removed from that source file, would the algorithm still be able to differentiate "Maxine Waters" the routine mention from "Maxine Waters" the investigation target?

Honestly, this is a fascinating discussion. I look forward to your response.

Omri D. :

Larry,

I actually believe you are getting this right. The challenges of CIOs nowadays are three: a) The IT infrastructures they have built are too expensive for the revenues their companies generate - this pushes companies to shrink the IT footprints and costs, move to more collaborative, infrastructure independent business models (e.g. Virtualization, Outsourcing, Off shoring, Homeworkers); b) Their investments should be oriented towards enabling business and growing profits rather than investing in more security; and c) they have to do more with less IT staff.

All of this creates a need for a new data-centric, unified, proactive, infrastructure-independent, and policy-based reference architecture that focuses on protecting information at an enterprise level as a business enabler, rather than on providing commodity plugs for data leakage.

Such a platform, as leaders of many large companies in the private and public sectors increasingly recognize, needs to combine categorization of data, management and segregation of access to data elements integrated with IAM, dynamic context-based authorization of data usage, and a real-time data-level risk and compliance assessment back end in one solution.

DLP and the commoditization of that small sliver of information protection it provides is a far cry from the need and will constantly continue to provide challenges the likes of which you describe in your fine review.

I will be delighted to engage with you on a separate channel.

I’ve been watching the evolution of DLP for 6 years. At Gartner we view the presence of “sophisticated” detection mechanisms as a requirement to get into our annual DLP Magic Quadrant. Sophisticated means beyond keyword matching and regular expressions which are usually good enough for simple PII detection. This would include mechanisms like partial document match, term proximity, and other registered data mechanisms. So I would agree with Kevin that those capabilities have been around for a while.
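To make "term proximity" concrete, here is a minimal sketch that assumes a rule of the form "phrase A within N words of phrase B". The function name and the window size are hypothetical; shipping products define and tune such rules in their own ways.

```python
import re


def within_proximity(text: str, term_a: str, term_b: str, window: int = 10) -> bool:
    """Return True if term_a and term_b start within `window` words of each other."""
    words = re.findall(r"[a-z0-9']+", text.lower())

    def positions(term: str) -> list:
        # Word offsets at which the (possibly multi-word) term begins.
        phrase = term.lower().split()
        n = len(phrase)
        return [i for i in range(len(words) - n + 1) if words[i:i + n] == phrase]

    starts_a = positions(term_a)
    starts_b = positions(term_b)
    return any(abs(i - j) <= window for i in starts_a for j in starts_b)
```

A rule like this is narrower than a bare keyword match, but it still fires on any sentence that puts the two terms close together, so it has to be tuned against real traffic.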

However, there’s no magic in those sophisticated capabilities and they have to be configured properly to be effective. We tell clients their success will be based on their ability to understand what differentiates the sensitive data from the rest of the data and to choose products that can detect those differences in the use cases they need (crossing the enterprise boundary, segmented stored data, print, copy to removable media, etc). The vendors have no pixie dust to determine, on your behalf, what is sensitive in your environment.

So, the presence of sophisticated mechanisms does not magically solve the problems Larry raises in the original article. More powerful mechanisms are available, but that does not mean that they would have been effective at detecting the issue or that an appropriate policy or enforcement would have prevented the problem. DLP would certainly be beneficial in the capture of operations involving congresspersons' names, but that would have to be combined, in this case, with a policy that controlled the use of documents on a home computer that contained those names.

DLP is the dynamic application of policy, at the time of an operation, based on the presence of sensitive data. Before you talk to vendors you should have a good understanding of what your sensitive data looks like, what operations put it at risk, and what policy you will enforce. Do that before you talk to vendors and you will increase your probability of success. Listen to the vendors giving you comfort that they will take care of that for you with their powerful capabilities and you will run smack into the shortcomings that Larry describes. It’s not up to the product to solve your problem, it is up to you.

Expanding on the points above, standalone DLP technologies are approaching a very mature stage and could very well address scenarios such as the one concerning the Congressional ethics leak highlighted above. However, security organizations need to transform the “could” of being able to do something (such as identify and protect this kind of sensitive content), into real action where the sensitive content is actively protected. “Operationalizing” these capabilities is the key to an effective DLP/data protection solution.

Identity and Access Management (IAM) is an important consideration to operationalize DLP. In the example above, understanding that the person in possession of this document was a “low-level” aide provides significant context for DLP to correctly address the data residing on the aide’s laptop – or with data the aide is saving, printing, or uploading. For example if an aide is in possession of a file containing this kind of data, DLP could scan and delete that file as, arguably, an aide should never have that kind of file in their possession. On the other hand, if this kind of file was discovered on a senior Representative’s laptop, perhaps the file would be moved to a secure, encrypted location (vs. simply deleting the file). Understanding identity is critical to empowering DLP with the context to take the correct action and to be effective.
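The role-dependent handling described above can be sketched as a small policy table. This is an illustration only: the role names, the `Action` values, and the quarantine default for unknown roles are assumptions, not features of any particular DLP or IAM product.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    DELETE = "delete"
    MOVE_TO_ENCRYPTED = "move_to_encrypted"
    QUARANTINE = "quarantine"


@dataclass(frozen=True)
class Finding:
    owner_role: str      # role reported by the identity system (hypothetical)
    is_sensitive: bool   # verdict from the content-detection step


# Hypothetical mapping from the file owner's role to the response.
POLICY = {
    "aide": Action.DELETE,
    "representative": Action.MOVE_TO_ENCRYPTED,
}


def decide(finding: Finding) -> Action:
    """Pick a response based on content sensitivity and the owner's role."""
    if not finding.is_sensitive:
        return Action.ALLOW
    # Assumption: roles missing from the table are quarantined for review.
    return POLICY.get(finding.owner_role, Action.QUARANTINE)
```

The same sensitive file yields different actions depending on who holds it, which is the context that identity adds to content detection.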

IAM process integration is just as important as the technical integration. Implementing DLP within well-structured IAM processes enables DLP to be more effective and, in most cases, more easily managed. Ultimately, protecting sensitive data can be “done right” only when an organization can automate and integrate DLP into existing technologies, systems and processes.

Gijo Mathew
CA, Security Management
