Introduction

In this guide, we will go through the two main methods you can follow to delete events from Splunk.

The first method uses Splunk’s delete command to selectively delete events from Splunk. The second method leverages index retention settings to delete events.

Method 1 - Using Splunk’s delete Command

🔴 Please be sure to read Splunk’s documentation regarding the delete command before attempting to delete events. Once deleted, you CANNOT retrieve the data.

  1. Add the can_delete role to your account temporarily, or create an account to use specifically for event deletion.

    Before you can run the delete command, your account will need the can_delete role. I recommend adding this role for the duration it takes to delete events and removing it immediately after. This will help prevent unintentional event deletion. You can also create a separate Splunk user with the can_delete role that is only used for this specific purpose. Again, these are just measures to prevent accidental event deletion.

  2. Create a search to pull the events you want to delete. (Ex. index=myindex sourcetype="_json", run over the last 6 hours)

    Once you have the can_delete role, you need to create a search to pull all the events you want to delete. Here are some examples:

    Simple Search:
    index=myindex sourcetype="_json"

    Adding Additional Filters:
    index=myindex sourcetype="_json" log_level="DEBUG"

    Using Regex to Pull Events Matching a Pattern:
    index=myindex sourcetype="_json" | regex _raw = "\d{3}-\d{2}-\d{4}"

  3. Pipe the events into the delete command. (Ex. index=myindex sourcetype="_json" | delete, run over the last 6 hours)

    Once you have verified that your search retrieves only the events you want to delete AND that it covers only the time range you intend, you can pipe the events to the delete command. Here is what that looks like with the previous examples:

    Simple Search:
    index=myindex sourcetype="_json" | delete

    Adding Additional Filters:
    index=myindex sourcetype="_json" log_level="DEBUG" | delete

    Using Regex to Pull Events Matching a Pattern:
    index=myindex sourcetype="_json" | regex _raw = "\d{3}-\d{2}-\d{4}" | delete

  4. Remove the can_delete role from your account.

The process is complete and the matched events will no longer show up in searches. Make sure you don’t accidentally rerun the search over a different time range! Then remove the can_delete role from your account.
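One way to guard against the wrong time range in step 3 is to put the time bounds in the search string itself and do a dry run first. The search below is a minimal sketch that reuses the example index and filters from above; the 6-hour window and the output field names (events_to_delete, first_event, last_event) are only placeholders. It reports how many events match and the timestamps of the first and last one, without deleting anything.

```spl
index=myindex sourcetype="_json" log_level="DEBUG" earliest=-6h latest=now
| stats count AS events_to_delete, min(_time) AS first_event, max(_time) AS last_event
| convert ctime(first_event) ctime(last_event)
```

If the count and the time span look right, replace everything after the first line with | delete. Because earliest and latest in the search string take precedence over the time range picker, rerunning the search later should still only touch the window you wrote down.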

Keep in mind that this process does not delete the events from disk. Once events are indexed in Splunk, they are immutable. When we run the delete command, the matched events are simply “tagged” as deleted and will no longer show up in searches. There is no way to revert this process. If you want the events back, you will need to re-index them from the source. Since the data is not deleted from disk, it will continue to take up disk space until it is removed based on your index retention settings (or rolled to frozen, if you have that configured).
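If you want to see this for yourself, the dbinspect command reports per-bucket information for an index. The search below is a rough sketch, assuming the example index name myindex; it sums the on-disk size of the buckets by bucket state. The output field name size_on_disk_mb is just a label chosen here.

```spl
| dbinspect index=myindex
| stats sum(sizeOnDiskMB) AS size_on_disk_mb BY state
```

Running it before and after a delete should show roughly the same totals, since the events are only hidden from searches and the buckets themselves stay in place.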

Method 2 - Deleting the Data

What if you need the data deleted from disk for some reason? The delete command is not going to do that for you. Unfortunately, there is no good way to truly delete data from Splunk.

Here are two options, but I don’t recommend either of them: each is more of a nuclear option, and you risk losing data you need.

Option 1 - Reduce Index Retention to Delete Events

Reduce the index’s retention configuration to allow Splunk to automatically delete older events or roll them to frozen (you can delete the frozen buckets if needed).

For example, if the unwanted events are from 2 months ago, you can temporarily set the index’s retention period (frozenTimePeriodInSecs in indexes.conf) to 1 month and allow Splunk to remove the older buckets. Once they are gone, revert to your original retention settings.

The obvious issue is that you might be deleting old data that you still need. Again, I don’t recommend doing this unless you don’t care about the historical data in the index.
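Before lowering retention, it is worth previewing which buckets would be affected. Retention is applied per bucket, and a bucket only ages out once its newest event is older than the retention period, so a bucket that also holds recent events will stay. The search below is a sketch, assuming the example index myindex and a 30-day target retention; the two age fields are names made up for this example.

```spl
| dbinspect index=myindex
| eval newest_event_age_days = round((now() - endEpoch) / 86400, 1)
| eval oldest_event_age_days = round((now() - startEpoch) / 86400, 1)
| where newest_event_age_days > 30
| table bucketId state oldest_event_age_days newest_event_age_days eventCount sizeOnDiskMB
| sort - newest_event_age_days
```

Every bucket listed would be eligible to roll out under a 30-day retention, along with all the events inside it. If the events you want gone sit in a bucket that does not appear in the list, shortening retention to 30 days will not remove them yet.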

Option 2 - Delete and Re-Index

This is similar to Option 1, but it requires you to have a copy of the source data that you want to keep so that you can re-index it. Re-indexing counts against your license volume.

For this, you will need to set your index’s retention length to something super short so that all the data in the index rolls out and gets deleted. (If you have frozen archiving configured, you can either disable it temporarily while you are doing this or remove the frozen buckets later.) Once Splunk has deleted the buckets, you can start re-indexing the data from the source. Because Splunk normally skips files it has already read, you might need to use crcSalt and related configs from this Splunk doc to re-index the data.

Again, this has the same downsides as Option 1. You also need to have the original data that you want to keep ready for re-indexing, which you may not have.
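Before you start re-indexing, it helps to confirm that the index has actually emptied out. The search below is a minimal sketch using the rest search command against the /services/data/indexes endpoint, again assuming the example index myindex. The columns listed are the ones I would expect that endpoint to return, so treat the field list as an assumption and check it against your Splunk version.

```spl
| rest /services/data/indexes
| search title=myindex
| table splunk_server title frozenTimePeriodInSecs currentDBSizeMB totalEventCount
```

In a distributed environment you should get one row per indexer. Once totalEventCount has dropped to zero everywhere, set frozenTimePeriodInSecs back to its original value before re-indexing, otherwise the re-indexed data may roll out again almost immediately.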

Both of these options are only for cases where you absolutely need to get the events off the disk. You can also mix and match the methods we talked about to get the effect you are going for.

Conclusion

Using the delete command is a great way to “delete” events from Splunk so that they no longer show up in searches. The command does not remove the events from disk, so it will not save you any disk space. Method 2 involves either deleting data from the index based on time, or completely wiping an index and re-indexing the data you need if you have it on hand. I do not recommend following the second method, but your options are limited if you truly need to wipe the data from disk.