Big data can mean even bigger problems for Security Operations (SecOps).
As platform providers promote the ability to monitor every single endpoint and network access avenue across your infrastructure with a unifying view of multiple tools, SecOps has more data than it can possibly process, let alone use to take intelligent action.
With attack surfaces constantly expanding for just about any enterprise, knowledge should be power. Instead, what SecOps faces is an unrelenting stream of collected data that is overwhelmingly unmanageable. The irony is that more data has led to less value. Having hundreds or more data sources to monitor isn’t what matters; what matters is how we use that data in our SecOps programs and how it gets reported to executive management.
What SecOps needs today is a “small data” movement.
Data is only valuable if you can make sense of it
Today’s business and marketing strategies that harness big data tend to collect everything they can and then aggregate it into data lakes that can be tapped by a variety of business users.
Unfortunately, this approach doesn’t address the inherent conflict between what IT operations and SecOps need from a shared infrastructure. Rather than getting just the data it needs, SecOps faces a massive onslaught of everything collected. It is not given the authority to tell IT staff to narrow collection to what matters for detection and response, which is what would let it generate deeper value via automation.
There’s no point in having a hundred data sources reporting into your Security Information and Event Management (SIEM) system if you only have rule logic applied to five or 10 of them. Big data is actually achieving the opposite of what was intended: it’s having a significantly negative impact on your actual monitoring effectiveness.
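One way to make that gap visible is to compare the sources feeding the SIEM against the sources your detection rules actually read. The following is a minimal sketch in Python, not tied to any particular SIEM product: the source names, the rule names, and the `uncovered_sources` helper are all hypothetical and exist only for illustration.

```python
def uncovered_sources(sources, rules):
    """Return the sources that no detection rule reads from, sorted by name."""
    # Flatten the per-rule source lists into one set of covered sources.
    covered = {src for rule_sources in rules.values() for src in rule_sources}
    return sorted(set(sources) - covered)


# Hypothetical inventory of sources reporting into the SIEM.
sources = ["firewall", "proxy", "edr", "router", "dns"]

# Hypothetical rules, each mapped to the sources it depends on.
rules = {
    "blocked_outbound_c2": ["firewall", "proxy"],
    "suspicious_process": ["edr"],
}

unused = uncovered_sources(sources, rules)
coverage = 1 - len(unused) / len(sources)

print("Sources with no rule logic:", unused)
print(f"{coverage:.0%} of sources have rule logic applied")
```

Anything that lands in `unused` is being collected and stored without contributing to detection, which makes it a natural candidate for review.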
Today the goal is to achieve analytical monitoring in as close to real time as possible, paying attention to performance, health status, availability, configuration, and security. But each of these places different requirements on the high volume of logs produced, and each needs to be analyzed differently too. SecOps needs to be able to easily see malicious signals, such as signatures, traffic patterns, or behaviors, among the vast amounts of data generated by logs.
Although big data can deliver on that requirement, historically it’s proven extremely difficult for humans to monitor at scale effectively—hence the critical need for automation, which can also address the budget and resource constraints of SecOps. Unfortunately, the “more is better” paradigm continues to win out because the belief is that even though we only process a small fraction of hundreds of data sources, we should continue to collect as much as possible just in case it’s useful forensically someday.
The constant onslaught of data generated by 24×7 monitoring can often lead to disengaged security analysts. One strategy to keep them engaged is to set aside time every week for a “retrospective hunt” to find the incidents that monitoring missed. These hunts, not the big data efforts, unveil the highest-quality incidents, and they’re far more valuable than console monitoring.
Even with successful SecOps automation, these retrospective hunts should continue. You won’t get the benefits of automation, though, if you’re automating on top of the onslaught of raw data that comes from log-producing infrastructure devices and sensors: network telemetry, security telemetry, application telemetry, host-based telemetry, cloud telemetry, and contextual telemetry. These logs and alerts are highly inconsistent and difficult to leverage for any purpose. Big data capabilities are ultimately producing a lot of useless noise that must be sifted through.
Network sources such as routers, switches, and other network infrastructure equipment can provide a great deal of data about their behavior, but most of it is not useful without additional tools and expertise. Because their log messages are highly repetitive, there’s a very high noise-to-signal ratio, and the signal that does come through isn’t all that useful to SecOps. Even security devices such as firewalls can only tell you they’ve dropped a packet; finding out why takes more work. Proxies, identity and access management systems, and intrusion prevention sensors are more relevant because they stream alerts about signatures or anomalies that might indicate suspicious activity from malicious hackers or code, but the sheer volume just buries SecOps under more data on consoles.
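To get a feel for how repetitive these messages are, you can collapse them into templates and count how often each one occurs. This is a rough sketch in Python under stated assumptions: the sample log lines are invented, and the `normalize` and `collapse` helpers are hypothetical. Real device logs vary widely in format, so the patterns would need tuning.

```python
import re
from collections import Counter


def normalize(message):
    """Replace volatile tokens (IPv4 addresses, then other numbers) with placeholders."""
    message = re.sub(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", "<ip>", message)
    return re.sub(r"\d+", "<n>", message)


def collapse(messages):
    """Count how many raw messages share each normalized template."""
    return Counter(normalize(m) for m in messages)


# Invented sample lines, for illustration only.
logs = [
    "Interface Gi0/1 changed state to up",
    "Interface Gi0/2 changed state to up",
    "Dropped packet from 10.0.0.5 port 443",
    "Dropped packet from 10.0.0.9 port 8443",
    "Interface Gi0/1 changed state to up",
]

summary = collapse(logs)
for template, count in summary.most_common():
    print(count, template)
```

In this sample, five raw lines reduce to two templates. On real network gear the reduction is typically far larger, which is the noise-to-signal problem in miniature.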
When you add in all the endpoint protection systems and growing cloud deployments, there’s far too much data being generated for anyone to use effectively, let alone to supply the context needed to take meaningful action and make business-prioritized risk decisions.
Quality over quantity
You could argue that all these data sources are somewhat narcissistic because they only talk about themselves, often repeating things that aren’t valuable. So, rather than collecting all their data, there needs to be a move toward small data that not only is selective, but also has a use in mind for it before it’s collected and stored.
Part of this shift is cultural—we need to get people to stop believing the myth that collecting a hundred data sources into a single platform is providing real value for SecOps or any of the other applications. Further, the small amounts of data we do collect need to be useful and have an obvious purpose—not every log from every router that repeats itself thousands of times a day.
The current approach is actually increasing your costs and volume of data while barely reducing your risk. Humans should not be monitoring consoles for high-noise, low-signal use cases because they will get overwhelmed, ignore things, and clear alerts just to keep up.
The small data movement means collecting only what matters and focusing on deriving deeper value from it. Successful SecOps automation means knowing what data to collect, knowing what to do with it, and being able to do it.