Log management has emerged in the past few years as a must-do discipline in IT for complying with regulatory standards and protecting the integrity of critical IT assets. However, with millions of logs being spit out on a daily basis by firewalls, routers, servers, workstations, applications and other sources across a network, enterprises are deluged with log data and there is no stemming the tide. In fact, the tide is just beginning to come in. With always-on high-speed internet connectivity and an increasing number of servers and devices that an IT department has to manage, the task of collecting, storing and making sense of all this data is no mean feat. Adding to the confusion are regulatory requirements for logging and archiving that are vague on what an IT department must actually do, coupled with increasing pressure for data privacy. It is not surprising then that for many companies the default plan to keep the auditors happy is to simply collect and retain everything from every source. However, collecting and retaining every single log ever generated is often unnecessary from both a regulatory and a forensic standpoint, and retaining the data can itself represent a security or liability risk.
This confusion in the log management space is further compounded by vocal proponents in the vendor community who promote the "collect everything" approach as necessary for being compliant and secure. My experience is that the world is not a black and white place but a myriad of grays. If you dig a little deeper you might find a reason for the extreme position. It turns out that some vendors really sell capacity for storing logs, others have license fees tied to log volume, and still others have no ability to enforce central configuration of filters across a large installation.
OK, putting aside cynicism, are they actually right? Is this one of those rare cases where the broad statement is simply the correct statement ("don't smoke" immediately springs to mind)? Let's explore this in some more detail.
Firstly, many systems generate large amounts of log data that is often completely worthless from a compliance standpoint or even for a generic forensic inquiry. For instance, with system auditing levels set high on Common Criteria certified operating systems such as Windows, Solaris with BSM, or Linux with LAuS on the 2.6 kernel, logs are generated at a highly granular level to record every kernel action. If applied indiscriminately to all machines and all users and files, then routine actions such as listing directory contents generate a flood of events. While this is necessary for C2 certification, it is rarely useful in the common situation. In cases where such audit levels are needed, they must be applied with care, but beyond that it is necessary to consider whether these logs even need to be gathered and retained.
One fairly modest-sized utility company, concerned with compliance and driven by uncertainty, decided to err on the side of caution and collect everything. This resulted in a whopping 1.5 million events every 5 minutes, which quickly overwhelmed the security team. Realizing that most of the data was nothing but worthless noise, they decided to adjust their logging policy and collect only relevant data. Just by cutting out the noise, they were able to scale back to a much more manageable 150,000 events every 5 minutes. A number of other users have also said "enough already" and have elected to put in place a company policy on what gets generated, what gets collected and for how long log data gets stored. I like to think of this as maturation of the market.
The utility company above had one advantage going for them when they went to reduce their log volume. They had a fairly sophisticated collection method that allowed them to suppress the collection of events that were not required. Can you get to the right level of auditing for your business requirements by simply setting audit levels in the system? Sometimes, but often not. The Windows OS, for example, turns on auditing levels through coarse group settings. Some of the events in each group are extremely valuable but many are, for all practical purposes, entirely worthless. Take a system admin use case: if you want to collect status events for printers attached to Windows servers to monitor for print problems, you turn on print auditing. You get the status events, but you also get a message every time a user prints something. Again, it makes little sense to keep it all merely because the source sends it indiscriminately.
Another example: I was talking to the compliance team at a large health care provider the other day. The machines they were monitoring had the system audit policies set pretty high and were generating enormous quantities of log data, of which the compliance team needed only a fraction. Internal requirements for other constituencies prevented them from changing the audit levels. They had tried to collect all the log data in the pilot project and had quickly realized that this approach was simply going to kill them. They finally adopted granular suppression of event forwarding at the host level. By doing this, they were able to reduce both the network load and the processing load on the collection console (even discarding events requires processing). Having the ability to be flexible ultimately enabled them to save a significant amount of money in infrastructure costs. It took a bit more work up front to figure out what they needed, but it was well worth it for them in the end.
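The host-level suppression described above might look something like the following Python sketch. It is an illustration only, not the product or method either organization used: the `Event` fields, the `SUPPRESS` rule set, and the source and category names are all hypothetical, chosen to mirror the print auditing example.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Event:
    source: str    # hypothetical source name, e.g. "Print"
    category: str  # hypothetical category, e.g. "printer_status"
    message: str


# Hypothetical suppression rules: (source, category) pairs that the host
# agent drops before anything is sent over the network.
SUPPRESS = {("Print", "job_printed")}


def should_forward(event: Event) -> bool:
    """Return True if the event is not covered by a suppression rule."""
    return (event.source, event.category) not in SUPPRESS


def forward(events: Iterable[Event]) -> List[Event]:
    """Keep only the events worth sending to the collection console."""
    return [e for e in events if should_forward(e)]


sample = [
    Event("Print", "printer_status", "Printer is out of paper"),
    Event("Print", "job_printed", "User printed a document"),
    Event("Print", "job_printed", "User printed another document"),
]

kept = forward(sample)
print(f"forwarded {len(kept)} of {len(sample)} events")
```

Because the filter runs on the host, the audit level that other constituencies depend on stays untouched; only what gets forwarded changes.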
Secondly, the view that large amounts of raw log data must be stored for long periods of time for successful compliance is a misconception. Consider HIPAA as an example. It is often repeated that HIPAA requires all raw data from all transactions to be retained for 7 years to support auditor requirements. However, HIPAA requires only a review of audit data, not necessarily retention of the raw log data itself: "Information system activity review (Required). Implement procedures to regularly review records of information system activity, such as audit logs, access reports, and security incident tracking reports" [HIPAA 45 CFR 164.308(a)(1)(ii)(D)]. PCI-DSS, which is one of the more precise standards, calls for the retention of specific audit trails pertaining to unauthorized access, invalid logins, deletions and admin activity only. Sarbanes-Oxley is another case in point and is concerned mainly with data relating to financial controls and reporting. Nowhere do these standards state or imply that all log data from all sources within an enterprise needs to be collected and archived.
Some would argue that collecting everything is essential for establishing forensic reality when lawyers or auditors ask for evidence of who did what and when. However, it must be understood that these people are interested in who is touching your crucial data and, more importantly, who is modifying that data and whether this behavior complies with your security policy. In this case, as in the example above, log entries for directory listings are useless: they cannot prove that someone actually accessed a particular file, and they certainly cannot prove that the file was modified.
Thirdly, while it might seem logical to collect everything, operational impact and feasibility are also important to consider. While storage has certainly become cheaper, it is by no means free. In the above example, 1.5 million events every 5 minutes translate to roughly 100 MB of disk space, and this is with 90% compression on the logs! This can quickly add up to a significant amount of your IT budget being spent on storage. Also, despite what any vendor says regarding the robustness of their indexing, querying a multi-terabyte repository always takes longer than querying a multi-gigabyte one.
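To see how quickly this adds up, here is a rough back-of-the-envelope calculation in Python using the figures above. The 100 MB per 5 minutes comes from the example; the filtered figure is an assumption that storage scales linearly with event count, and 1 GB is taken as 1000 MB.

```python
MINUTES_PER_DAY = 24 * 60


def gb_per_day(mb_per_interval: float, interval_minutes: int = 5) -> float:
    """Daily storage in GB for a given per-interval volume (1 GB = 1000 MB)."""
    intervals_per_day = MINUTES_PER_DAY // interval_minutes  # 288 for 5 minutes
    return mb_per_interval * intervals_per_day / 1000


# Figure from the example: 1.5 million events per 5 minutes, roughly 100 MB.
full_mb = 100.0
# Assumption: storage scales linearly with event count, so 150,000 events
# per 5 minutes would need about one tenth of the space.
filtered_mb = full_mb * 150_000 / 1_500_000

full_daily = gb_per_day(full_mb)
filtered_daily = gb_per_day(filtered_mb)

print(f"collect everything: {full_daily:.1f} GB/day, "
      f"{full_daily * 365 / 1000:.1f} TB/year")
print(f"filtered:           {filtered_daily:.2f} GB/day, "
      f"{filtered_daily * 365 / 1000:.2f} TB/year")
```

Under these assumptions, collecting everything comes to roughly 29 GB a day, or more than 10 TB a year, while the filtered stream needs about a tenth of that.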
Another interesting perspective, which I mentioned in the introduction and which I hear from some companies, is that from a legal standpoint the collection and retention of log data is in itself a security issue. Their legal departments have decided that the logs themselves are sensitive data and should be collected and retained only on a need-to-have basis. I have talked to a number of companies that are collecting only what is explicitly required and are pretty aggressive in discarding it as quickly as possible. They save the proof that they have collected and examined the data, and perhaps the summary reports, but not the actual data itself.
All of these stories remind me of something I went through with my financial records. Over the years I have built up a number of accounts, mutual funds etc. (sadly they don't have a lot of money in them). I used to collect, read and store everything that came from the various institutions. But darn, it was a lot, and then it just started to be more and more as the financial reporting laws got tighter over the years. After a while I got so annoyed with the amount of boring mail I was getting that I started to simply toss it--often to the point of missing things that I really needed to know about. Also, discarding the account information in a safe manner was a real pain (and not something I did really well, to be honest). Plus, having all those records with account numbers and the like lying around was probably not a really smart thing, even though they were in my house.
Finally I got smart and told the financial institutions to stop sending the routine stuff. I still get too much but I do tend to pay some attention to it and make sure it does not lie around in large piles. I think the only entity not happy with my decision is the USPS as they got paid to carry all that mail.
Granted this is not completely a fair analogy--I suspect if I really needed information I could probably obtain it for a fee with a lot of effort, whereas with a log once it is overwritten it is gone. But the fact is I never have needed the information, and probably never will. And you know, even if I could not get the information after the fact, I would still be doing the same thing. I like to think I took a risk-based approach to my record retention. The cost in terms of convenience outweighed the advantages of having it all.
So my thought is that "Leave No Log Behind" as a strategy makes sense only at first blush or for small event volumes. For the bigger shops one of the first steps to smart and cost-effective log management is to figure out what is needed and what is not needed. This requires thought. Keep enough, and you will be able to demonstrate due diligence in meeting regulatory standards and defend critical IT assets from internal and external abuse. Keep too much, and the cost of storage and resources required to archive and analyze the data could easily skyrocket to unmanageable and unjustifiable levels. While the final determination of the value of log retention depends on each company's business requirements, having no option for reducing log volume beyond what the system vendor provides is neither beneficial nor cost-effective.
Steve Lafferty is VP of marketing at Prism Microsystems. www.PrismMicroSys.com
