Navigating Splunk Implementation (with Enterprise Security): A Practical Approach

In this post, I’d like to go over a security-focused, on-premises Splunk project implementation. This is based on our experiences as consultants, where we’ve helped numerous organizations overcome common challenges and unlock the power of Splunk for enhanced security insights and streamlined operations. As security professionals ourselves, we understand the critical need for a robust security posture, and Splunk plays a vital role in achieving that. Our goal is to provide a reliable, scalable, and maintainable solution by understanding project requirements and following Splunk’s best practices. However, like any project, this kind of implementation comes with its own set of challenges. In this blog, we’ll explore the journey of implementing a Splunk project, from planning to overcoming hurdles and achieving success. I hope this high-level overview empowers other practitioners (security teams, architects, consultants, etc.) to navigate Splunk implementations successfully.

Assumption

Before diving into the intricacies of the project, it’s important to establish a few assumptions. Assuming there are no significant business or contractual issues, we can focus our attention on the key aspects of the implementation process. Note that Splunk has extensive resources and documentation available online (see References below); here I summarize our interpretation of them, based on years of experience implementing Splunk projects on-premises.

Project Plan

Each project is unique, but we rely on certain standards to provide better estimates of project timelines. At the very least, we need to know the general objectives, sizing (e.g. GB/day), number of data sources, and number of use cases. The more detail we have, the better the project plan and estimates become. Typically, a successful implementation takes anywhere from a few months to a year, depending on how we schedule joint time with the customer. Below are the steps of the project plan template that we use with our customers.

Note: While the plan outlines a general sequence, it doesn’t have to follow a strictly linear approach. We’ve found this method efficient in minimizing dependencies on other teams, which helps streamline the process. However, it’s important to acknowledge that this approach can also limit flexibility in later stages. Some phases, like data collection, application implementation, and use-case development, can be iteratively refined throughout the project.

1. Architecture and Planning Workshop

Like any other, a Splunk project kicks off with workshops to align stakeholders and define project goals, scope, and timelines. Most of the time, we have already determined the high-level requirements and desired architecture during the POC and pre-sales phases. In this phase, we dive deeper by referencing Splunk Validated Architectures, infrastructure resources, specifics of data inputs, and use cases.


2. Splunk Implementation

I have to admit that this is probably the easiest of all the steps in any project. We almost always gather server information (e.g. IPs and/or hostnames), desired ports for communication (e.g. replication ports, data inputs, etc.), and other requirements such as replication factors and data retention. Based on this information, we prepare the Splunk base configurations before meeting with the customer and make sure the necessary commands and scripts (e.g. Ansible playbooks) are ready to be deployed, so that during implementation we don’t have to reinvent the wheel but can simply review and copy/paste configurations and scripts. Depending on the environment, this step usually takes a few hours unless we are forced to do manual installations. Note that most of our installations are Distributed Clustered Deployment + SHC - Multi-Site (M4 / M14) as per Splunk Validated Architectures, which provides high availability for stored data and distributes ingest and search load for better performance.
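To make "base configurations" concrete, here is a sketch of what the clustering portion of server.conf on an indexer (cluster peer) in a multisite deployment can look like. All hostnames, ports, site names, and the secret below are invented placeholders, and setting names vary by Splunk version (for example, `manager_uri` superseded the older `master_uri`), so treat this as an illustration, not a drop-in config:

```ini
# server.conf on an indexer (cluster peer) -- all values are illustrative placeholders
[general]
site = site1

[replication_port://9887]
# port the peer listens on for bucket replication from other peers

[clustering]
mode = peer
manager_uri = https://cluster-manager.example.local:8089
pass4SymmKey = <shared-cluster-secret>
```

Having fragments like this templated per role (peer, search head, forwarder) is what lets the on-site implementation session stay a review-and-deploy exercise rather than a configuration-writing one.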


3. Data Collection

The goal at this stage is to make sure we’re collecting all the required data, with performance requirements in place, while ensuring Common Information Model (CIM) compliance as much as possible. CIM compliance is crucial so that we can use out-of-the-box use cases in the Enterprise Security app or develop new ones with (accelerated) data models. It is best to have a test or pre-production environment to verify the various data inputs; in practice, however, this is rarely the case. For standard data sources such as Windows event logs, Palo Alto syslog, etc., we don’t have to do much beyond installing the proper add-on and getting the data in. For custom or less standard data sources, we usually ask for masked data samples so that we can test and prepare the proper configurations in our dev/test environment rather than deploying directly to production.
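As an illustration of what CIM compliance work involves for a custom source, mapping a hypothetical firewall sourcetype (the name `acme:fw:traffic` and its vendor field names are invented) into the Network Traffic data model typically means aliasing vendor fields to CIM field names and tagging the events so the data model picks them up:

```ini
# props.conf -- alias vendor field names to CIM names (hypothetical sourcetype)
[acme:fw:traffic]
FIELDALIAS-acme_cim = src_ip AS src dest_ip AS dest dport AS dest_port

# eventtypes.conf -- define an eventtype covering the source
[acme_fw_traffic]
search = sourcetype="acme:fw:traffic"

# tags.conf -- the Network Traffic data model selects events tagged network+communicate
[eventtype=acme_fw_traffic]
network = enabled
communicate = enabled
```

Once events carry the right field names and tags, they flow into the accelerated data model and become usable by out-of-the-box Enterprise Security content without further per-source work.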


4. Enterprise Security Application Implementation

Installation of the application is pretty straightforward. For our projects, especially if the customer has experience with more classical or legacy SIEM systems, we need to discuss the urgency matrix and explain the importance of adding asset and identity information with proper prioritization. The challenge is usually not technical but departmental and policy-based, depending on the customer’s environment. The other important part is adding new or custom Threat Intelligence sources.
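For a sense of what "asset information with proper prioritization" looks like in practice, here is a sketch of a row in an ES assets lookup. The column names are standard Asset and Identity framework fields; the values (host names, owner, etc.) are made up. The `priority` column is what feeds the urgency matrix: the same notable event on a `critical` asset is escalated far more urgently than on a `low` one.

```csv
ip,mac,nt_host,dns,owner,priority,category,pci_domain
10.0.20.15,,FIN-SRV-01,fin-srv-01.example.local,finance-team,critical,finance|server,cardholder
```

Getting departments to agree on these priority and ownership values is usually the hard part, which is why we raise it early rather than treating it as a post-installation detail.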


5. Use-Case Development

Developing use cases on a clean slate is relatively easy, but I’d like to address the more common case where we transition from an existing SIEM system. It is very likely for an existing system to have hundreds of correlation searches; in the projects we’ve done so far, we usually face 200–600 correlation searches on old/legacy SIEM systems that need to be transferred to the new Splunk installation. Instead of diving directly into rule implementation, we usually spend 1–3 days (depending on content) going over Splunk’s approach to detection content. Up to this stage, working with Splunk to arrange workshops and training sessions for the team is crucial so that we can all speak the same language. This step can look quite different for each customer, since each environment is unique and may have varying requirements. We try to stick to general guidelines and out-of-the-box use cases/analytic stories as starting points and shape our approach accordingly.
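As a sketch of what a migrated rule can become once the data is CIM-compliant, here is a simple brute-force-style correlation search built on the accelerated Authentication data model. The stanza name, threshold, and schedule are invented for illustration; in real projects such searches are normally created and tuned through the ES Content Management UI rather than edited as raw stanzas:

```ini
# savedsearches.conf -- illustrative correlation search, not production-tuned
[Hypothetical - Excessive Failed Logins by Source]
search = | tstats summariesonly=true count from datamodel=Authentication where Authentication.action="failure" by Authentication.src, Authentication.user | where count > 20
cron_schedule = */15 * * * *
enableSched = 1
dispatch.earliest_time = -15m
dispatch.latest_time = now
```

The point we try to land in those 1–3 days is exactly this: many legacy rules collapse into short tstats-over-data-model searches, so translating them one-for-one is rarely the right approach.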


6. Project Closing

At this stage, the goal is to verify the implementation of the previous steps through testing, documentation, and knowledge transfer, ensuring a smooth transition to operational use. I find the documentation aspect in particular to be very important. It is always best to finish testing and documentation during implementation; still, in practice (e.g. due to time limitations), we may end up doing these tasks at the very end.


Challenges

As with any project, we face issues and challenges throughout the implementation. I’ve tried to list some of the major ones below:

Conclusion

Implementing a Splunk project requires meticulous planning, effective communication, and a proactive approach to challenges. By leveraging the platform’s capabilities and addressing these hurdles head-on, organizations can unlock valuable insights from their data, leading to improved operational efficiency and informed decision-making. While the specifics may differ, the core principles of planning, communication, and overcoming challenges can be applied to many data-oriented projects. I encourage you to share your own experiences and feedback in the comments below. This outline, based on our experience, can serve as a starting point for navigating similar projects and unlocking the power of data for your organization.


References: