Blog | Apr 08, 2020

Operating a World-Class SOC – No Matter the Circumstance

Just like every parent worries about protecting their children and caring for their extended family during these uncertain COVID-19 times, I find myself having similar feelings for my teams. How can I ensure our staff and their families stay safe from this virus? How can I best support them in this time of need? How can we help give them as much peace of mind as possible under the circumstances?

It is even more worrisome when I think about our ability to deliver on our core value: a customer’s network can never be compromised. How do we ensure our staff members are safe and continue meeting eSentire’s customer obligations?

Through years of diligence and discipline, we have built a robust, resilient security operations center (SOC) capability. In good times, our SOCs are our secret sauce. In today’s environment, our ability to continue to operate a world-class SOC when our analysts are remote is even more critical.

Without giving away too much of our recipe, I thought it might be useful to have a look inside our global cybersecurity services organization to show how we operate (in general) and why we are well-positioned for business as (un)usual in this unprecedented situation.

The human element: SOC analysts

eSentire is known for pioneering a new form of security service: Managed Detection and Response (MDR). Essentially, MDR is a combination of advanced threat detection technologies, extensive processes to monitor and react to those technologies and, most importantly, cybersecurity experts who decide if and when a response is needed to attacks on customers. While we employ countless technologies to automatically locate that “needle in the haystack” of potential threats, there is no way we can make a response decision with 100 percent confidence without relying on some “grey matter” correlation.

The SOC analysts are indispensable to the MDR delivery model. We are in our nineteenth year of delivering this mode of cybersecurity, and one truism we have come to learn is that threat actors are always changing their game to evade state-of-the-art cybersecurity protective controls. Or, as the army generals will tell you, no battle plan survives contact with the enemy because the enemy still gets a vote.

The human analyst has always been (and will always be) key to effective MDR service delivery. We designed our SOC and supporting business processes with that in mind. In fact, we built a six-point playbook to ensure we can attract and retain skilled security analysts despite a global shortage of highly sought-after cybersecurity professionals.

Building a talent pipeline

Recruiting top analyst talent begins with the relationships we’ve cultivated with educational institutions in the regions around each SOC. We’re proud to count a dozen renowned universities and colleges as partners. Several members of our organization have board and advisory positions at these institutions and provide informed recommendations regarding their respective information security programs.

This proactive approach helps to ensure a steady stream of graduates who are prepared to step into our SOCs, equipped with strong foundations for analyzing today’s threats using modern tools and techniques.

Taking care of our team to prevent burnout

Operating an effective and efficient SOC takes operational maturity and a recognition of the importance of looking after your team. Once someone joins our organization, we do everything we can to make them successful because when they succeed, we succeed—and our customers are defended.

“SOC burnout” is a major problem facing most managed security service providers (MSSPs) and private SOCs. We take this threat very seriously and have instituted programs and processes to mitigate it:

  • We limit shift length to preserve analyst efficacy
  • We pair new analysts with a senior SOC resource, so there’s no “trial by fire”
  • We optimize our shift staffing based on historical data
  • We provide analysts with a minimum of three weeks of vacation, plus sick days, and we genuinely encourage and enable them to use this paid time off to decompress

Assuring quality

Within the SOC, we employ a variety of quality assurance techniques—including random sampling, auditing our automated alerts and continuous tracking—in pursuit of two key outcomes:

  • Helping analysts grow and improve by providing objective, continuous performance feedback and by informing new training requirements
  • Ensuring customers receive the best possible service experience

Investing in SOC efficiency

Running a world-class SOC isn’t a “set it and forget it” exercise, and this is what trips up many private SOCs and MSSPs and, ultimately, contributes to analyst burnout.

Let me show you what I mean.

Over the years, we’ve built up a full spectrum detection capability, with signals coming from networks, endpoints, clouds, logs and more. As we’ve won new customers and introduced new capabilities, the volume of signals coming into our SOCs has grown enormously. No amount of incremental adjustment or increased effort can accommodate this increase. It’s only through continuous investment in state-of-the-art tools—both proprietary and third party—that we are able to analyze and process the volume of threat signals that we take in on a daily basis.

This investment in technology ensures our SOC analysts have what they need to perform their jobs at a tremendously high level, without succumbing to the burnout which plagues the industry.

Continuous education and certification

Cybersecurity professionals are in high demand, which means they can pick and choose where to work. Money is often a secondary factor when it comes to making that choice.

One of the top things these professionals look for is the opportunity to grow their knowledge and skillset, and eSentire provides them with that. In fact, we have a dedicated team of learning professionals who manage SOC analyst onboarding and advancement. This program is a major reason why SOC analysts who join eSentire stay at eSentire.

Career advancement

The importance of career advancement opportunities is closely related to the previous point. Working for eSentire is attractive because we offer numerous paths for career development—whether someone is interested in deep technical growth, a managerial path or exploring other roles within our company.

Ensuring business continuity

We find ourselves in a global pandemic where governments are restricting movement and even forcing businesses to work remotely or not at all. Many companies are operating under suboptimal conditions, which affects their cybersecurity posture. I can tell you, based on the inquiries and calls I have had recently, they are worried.

We’ve got their backs. In our 19 years, this isn’t the first time we’ve had to defend our clients during a disaster.

Back in 2012, our client base in New York faced Hurricane Sandy, the deadliest storm of the season. The storm killed 233 people in eight countries, affected 24 U.S. states, caused major flooding in Manhattan streets and subway tunnels and caused $64 billion in damage.

Our SOC team studied traffic analytics for a three-month period around Hurricane Sandy. Data showed a 30 to 40 percent drop in network traffic across our client base located in New York City for the two weeks during and after the hurricane. However, the level of threats remained constant throughout this period. In fact, attacks spiked by 30 percent during the week following the hurricane. When a disaster strikes, chaos is often a result, and criminals know how to take advantage of chaos.

For our SOC, this raises the stakes even higher because cyberthreat activity increases while our customers are more vulnerable. Many are dealing with unprecedented remote working scenarios and the mental load of new professional and personal stressors.

Being prepared and decisive

As the reality of this pandemic set in at the end of February, we created a standing team of key leaders to do daily briefings on their parts of the business, to react to ever-changing conditions and to run “what if” scenario planning. This team made the recommendation to invoke the remote access portion of our business continuity plan (BCP) for all staff, which we implemented on Friday, March 13.

For us, this was an easy decision as we have been investing continuously over the years in having a best-in-class BCP. In addition to the key leaders’ daily briefs, individual teams within the organization also do daily stand-up meetings, which begin with a health and wellness check on team members.

Enabling the distributed SOC

So, what does life for a SOC analyst look like during remote access? The answer is pretty much the same as it was before we enacted our BCP. Let’s look at some of the tools and methods we use to enable this.

To start with, we need to look at our threat pooling model, which is core to how we manage threats. This model uses the equivalent of an automated call display, as found in call centers, to implement a first-in, first-out model for threat investigations. On average, our vast array of filtering, correlation and analytics models passes through one threat signal to be investigated per 1,000 raw event signals. The threat pooling model allows us to always assign the next available analyst to the next threat in the queue. If we have a surge of threats, we modulate the number of analysts analyzing threats versus “off board” duties. Typically, this hovers around 25 percent, but we have capacity to handle substantially more. We strive to triage threat signals within a couple of minutes against a service level objective of 20 minutes. Year over year, our average threat investigation is closed in about 10 minutes. Today, three weeks into remote access, we are averaging 8.7 minutes.
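
To make the first-in, first-out idea concrete, here is a minimal sketch of a threat pool in Python. It is an illustration only, not eSentire's actual implementation: the class name ThreatPool, its methods and the sample threat and analyst names are all hypothetical, and it assumes a single queue of threats and a single queue of idle analysts.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ThreatPool:
    """Hypothetical FIFO pool: oldest threat goes to the next available analyst."""
    threats: deque = field(default_factory=deque)        # threats waiting, oldest first
    idle_analysts: deque = field(default_factory=deque)  # analysts waiting, longest-idle first
    assignments: list = field(default_factory=list)      # (threat, analyst) pairs, in order

    def submit_threat(self, threat_id: str) -> None:
        """A filtered threat signal arrives and joins the back of the queue."""
        self.threats.append(threat_id)
        self._dispatch()

    def analyst_available(self, analyst: str) -> None:
        """An analyst finishes other work and becomes available."""
        self.idle_analysts.append(analyst)
        self._dispatch()

    def _dispatch(self) -> None:
        """Pair threats with analysts for as long as both queues are non-empty."""
        while self.threats and self.idle_analysts:
            threat = self.threats.popleft()
            analyst = self.idle_analysts.popleft()
            self.assignments.append((threat, analyst))


pool = ThreatPool()
pool.submit_threat("T1")
pool.submit_threat("T2")
pool.analyst_available("alice")  # alice takes T1, the oldest threat
pool.analyst_available("bob")    # bob takes T2
pool.submit_threat("T3")         # nobody is idle, so T3 waits in the queue
print(pool.assignments)          # [('T1', 'alice'), ('T2', 'bob')]
print(len(pool.threats))         # 1
```

In this sketch, handling a surge would amount to calling analyst_available for more analysts, which mirrors shifting people from other duties onto threat analysis.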

In the SOC, this all happens within the analysts’ cubicles, on the analysts’ workstations, all of which are laptops. When operating remotely, the analysts use exactly the same laptops. Within the office, the analysts connect to our AWS data pipeline via a zero trust, single-tunnel VPN connection. When operating remotely, they use the exact same connection. All analyst authentication is verified via a multi-factor method. When working in the SOC, analyst actions are continuously audited (at least 100 times a day) and this same auditing continues while operating remotely. Aside from the change of location, it really is business as usual when it comes to threat analysis.

For collaboration across our two primary SOC locations, the team standardized some time ago on particular tools for video conferencing, chat and remote assistance. The remote assistance tools allow senior analysts to coach less experienced analysts on complex or unique investigations; sometimes four eyes are better than two!

During this pandemic, these communication tools have taken on increased importance beyond enabling our SOC analysts to be effective. The sense of isolation that can arise during this period of WFH and social distancing can be a threat, and these tools let us maintain rich human contact.

We also don’t hit the pause button on training—threat actors are working hard to come up with new attack methods, and our training team works hard to ensure our analysts are kept up-to-date.

Business as (un)usual: defending our customers

Operating an effective and efficient SOC demands operational maturity and recognition of the importance of taking care of your team.

None of our practices change because of COVID-19. In fact, they become even more important. Everyone is adjusting to a new operating reality and these adjustments can be challenging. In challenging times, protecting our well-being is a major priority, and it’s a prerequisite for defending our customers.

Our rapid decision to isolate and protect our staff and their families is ultimately the best thing we can do to ensure we are living our core value: a customer’s network can never be compromised.

For more information on how our SOC helps us live our core value, “a customer’s network can never be compromised,” no matter the circumstance, view our on-demand webinar.

J. Paul Haynes

President & Chief Operating Officer

J. Paul Haynes is a professional engineer with a 25-year entrepreneurial track record of success. Since he joined the company in late 2010, J. Paul has led eSentire to grow to 10 times its size.