Work Sample 2023

The goal of this document is to provide a quick sample of my Splunk knowledge. While not comprehensive, it should demonstrate my skills across searching, alerting, dashboarding, data feedback, and access control.

Searching with a focus on efficiency

My goal in any given search is to return the required results in the most efficient way for the search head and indexers to process.

Example:

  • I started with a |tstats command to look in firewall logs for the first destination IP for any given source IP starting with “0.133”.
    • Using this command on indexed logs rather than on a data model trades a negligibly slower search for the benefit of not having to run a constant background search to keep a data model up to date.
    • The exception is when field values contain major breakers (such as spaces), which do not play well with TERM() and PREFIX().
  • After removing the trailing “=” from field names, I evaluate a field that converts search results to TERM() values, then turn it all into one result with one column.
    • |return $search_field will return a new field called "search", populated with the previous value of search_field. This is a unique way of returning data from a subsearch, which can then be fed directly into the outer search, as seen below.
This search is equivalent to searching line 1 with the results of the subsearch (seen in the previous picture) as additional parameters.

The use of |tstats and a subsearch which returns IP addresses surrounded by TERM() will return results much faster, and with much less weight on the search head, than a standard search.
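A simplified sketch of this pattern (the index, field names, and values here are illustrative stand-ins, not the original search):

    index=firewall src=0.133*
        [| tstats min(_time) as first_seen
            where index=firewall
            by PREFIX(src=), PREFIX(dst=)
        | rename "src=" as src, "dst=" as dst
        | search src=0.133*
        | sort first_seen
        | dedup src
        | eval search_field="TERM(" . dst . ")"
        | stats values(search_field) as search_field
        | return $search_field]

The subsearch resolves to a set of TERM() values, so the outer search can match them against the index's lexicon directly instead of extracting fields at search time.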

Alerts and Saved Searches

My personal Splunk instance does not support alerting. In lieu of an example, here are some practices that have served me well:

  • Alerts and other saved searches should not all run at the same time.
    • Schedule searches on odd minutes, rather than on the hour or half-hour, to distribute the search load.
  • Be judicious in what you choose to alert on, to prevent alert fatigue for analysts.
    • Alert fatigue can be caused by too many alerts, a high false-positive rate, or lack of context in the alert message.
    • Communication with analysts is essential in tuning alerts so alert fatigue is prevented without letting suspicious traffic fall through the cracks.
  • Use saved searches to improve efficiency in other areas.
    • I use saved searches to populate a lookup table with IDS events, interesting port events, and more, with standard fields which are useful to analysts.
    • I then use this lookup to quickly populate a table on their landing page, floating these important events to the top of their pile, without the need to search for these events anew each time the table is refreshed.
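As a sketch, such a saved search might look like this in savedsearches.conf (the stanza name, schedule, and field names are illustrative):

    # savedsearches.conf
    [populate_analyst_events]
    enableSched = 1
    # run on "odd" minutes so scheduled load is not stacked on the hour
    cron_schedule = 7,37 * * * *
    dispatch.earliest_time = -30m
    dispatch.latest_time = now
    search = index=ids OR index=firewall \
        | table _time, src, dest, dest_port, signature \
        | outputlookup append=true analyst_events.csv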

Dashboarding and Data Feedback

In the previous section, I mentioned an analyst landing page. With just a few clicks and a couple of text boxes, analysts can act on the items in the table to:

  • Add information to the lookup, which informs our data scientists of what types of information are most useful in alerts
  • Automatically generate a document reporting on the events, with recommendations for remediation, which can be sent to their customers out-of-band.

Here is a simplified version of some of the techniques used:

Upon clicking the speech bubble in the “evaluate” column, another panel with a couple of buttons appears:

Yes, it is ugly. The code for this dashboard is intentionally as simple as possible, without added HTML/CSS.

Source code with comments:


After interacting with the dashboard, the lookup feeding it now looks like this, ready for the dstport_actionable field to be associated with the other parameters for use in machine learning.
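Since the dashboard source itself appears only as an image above, here is a rough Simple XML sketch of the technique (the token, lookup, and field names are assumptions): a row click sets a token, the token reveals a hidden panel, and the panel's buttons feed an outputlookup search.

    <form>
      <row>
        <panel>
          <table>
            <search>
              <query>| inputlookup analyst_events.csv</query>
            </search>
            <drilldown>
              <!-- clicking a row exposes the evaluation panel -->
              <set token="sel_port">$row.dest_port$</set>
            </drilldown>
          </table>
        </panel>
        <panel depends="$sel_port$">
          <input type="link" token="actionable">
            <label>Is port $sel_port$ actionable?</label>
            <choice value="yes">Yes</choice>
            <choice value="no">No</choice>
          </input>
          <table>
            <search>
              <query>| makeresults
        | eval dest_port="$sel_port$", dstport_actionable="$actionable$"
        | outputlookup append=true analyst_events.csv</query>
            </search>
          </table>
        </panel>
      </row>
    </form>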

Additionally, an alert can monitor this lookup table for changes. When a new change is found, it can send the relevant information wherever it needs to go, such as to external servers that fill the data into pre-formatted documents.
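For example, assuming the dashboard writes an epoch updated_time field with each change (the lookup and field names here are illustrative), a scheduled alert could flag rows changed since its last run:

    | inputlookup analyst_events.csv
    | where updated_time > relative_time(now(), "-15m")

Any results then trigger the alert's action, such as a webhook to the document-generation server.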

Access Control

The access control features in my personal Splunk instance are limited. However, some ways I have enforced least-necessary access are:

  • Customers have permissions only to view their own indexed data (index=customer_<custid>) within their role permissions, but we needed some results of their dashboard interactions to also be indexed.
    • I created a new index for these dashboard interactions (index=workflow) and set it within their role permissions as included, but not default.
    • In “role restrictions”, I set the search parameters of the parent of all customer roles to only allow index::customer_*
    • With this change, customers can still only search their own index, but using |collect to write to index=workflow works properly.
  • To mitigate the impact of resource-heavy real-time searches, I have implemented the following measures:
    • Within dashboards where real-time data is necessary, I limit the duration of such searches using Workload Management settings.
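The role settings described above might be sketched in authorize.conf like this (the role name is illustrative):

    # authorize.conf
    [role_customer_base]
    # workflow is included, so it is searchable when explicitly requested...
    srchIndexesAllowed = customer_*;workflow
    # ...but only the customer indexes are searched by default
    srchIndexesDefault = customer_*
    # restrict reads to customer indexes; |collect writes to
    # index=workflow are unaffected by this search-time filter
    srchFilter = index::customer_*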