Splunk’s index retention settings might seem tricky because they involve various options. If you don’t fully understand these configuration options, you could encounter problems like data being deleted too early or not being removed as expected. Let’s explore some important index retention settings in indexes.conf.

What are Buckets?

In Splunk, we use “Indexes” as the primary way to organize data, much like folders. “Buckets” are essentially sub-folders within an index, but they represent a specific time window. These buckets can be found at the default location /opt/splunk/var/lib/splunk/ (on your Indexer in a distributed setup). Inside this directory, you’ll discover various indexes that the component is writing to. For instance, let’s take a look at the _internal index, which is structured as follows:

/opt/splunk/var/lib/splunk/_internaldb/
	- colddb/
	- datamodel_summary/
	- db/
	- thaweddb/

Please note that while the directory names I mentioned are common, environments can customize them, so your setup may look different. Keep an eye out for the colddb, db, and thaweddb directories: colddb holds cold buckets, db holds both hot and warm buckets, and thaweddb holds thawed buckets, as discussed below.

If you were to explore one of the db directories, you’d find buckets representing different time windows. Here’s an example of what you might see in the db/ directory:

/opt/splunk/var/lib/splunk/_internaldb/db/
	- db_1696034645_1695777367_112/
	- db_1696034853_1696034643_113/
	- db_1696035653_1696034852_114/
	- ...

In clustered environments, you may come across prefixes other than db_, such as rb_ for replicated buckets. The first number (e.g., 1696034645) is the epoch timestamp of the newest (latest) event in the bucket, the second number (e.g., 1695777367) is the epoch timestamp of the oldest (earliest) event in the bucket, and the final suffix (e.g., _112) is the bucket’s unique identifier within that index.
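To make the naming convention concrete, here is a minimal Python sketch that splits a bucket directory name into its parts. It assumes the prefix_newest_oldest_id layout, where the first timestamp is the larger (newer) one, as in the example values. The function and class names are my own and are not part of any Splunk tooling. Hot buckets, which use a different naming scheme, are not handled and raise an error.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class BucketName(NamedTuple):
    prefix: str        # e.g. "db" or "rb"
    newest: datetime   # timestamp of the latest event in the bucket
    oldest: datetime   # timestamp of the earliest event in the bucket
    bucket_id: int     # bucket identifier within the index


def parse_bucket_name(name: str) -> BucketName:
    """Parse a warm/cold bucket directory name (hypothetical helper)."""
    parts = name.rstrip("/").split("_")
    if len(parts) < 4:
        raise ValueError(f"unexpected bucket name: {name!r}")
    # Only the first four fields are used; any extra suffix is ignored.
    prefix, newest, oldest, bucket_id = parts[:4]
    return BucketName(
        prefix=prefix,
        newest=datetime.fromtimestamp(int(newest), tz=timezone.utc),
        oldest=datetime.fromtimestamp(int(oldest), tz=timezone.utc),
        bucket_id=int(bucket_id),
    )


bucket = parse_bucket_name("db_1696034645_1695777367_112/")
print(bucket.prefix, bucket.bucket_id, bucket.newest - bucket.oldest)
```

The printed time difference is the span of events the bucket covers, which is handy when checking how maxHotSpanSecs is affecting your buckets.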

Understanding Bucket States

Splunk’s buckets go through different states in a process often referred to as “rolling.” Let’s explore these states:

  • Hot Buckets: These buckets are actively written to and frequently read from during user searches. Typically located in the db/ directory, they benefit from high-performance SSDs with high IOPS.

  • Warm Buckets: While no longer written to, warm buckets are still frequently read from because they contain recent data. In clustered setups, they’re ready for replication and also reside in the db/ directory.

  • Cold Buckets: Cold buckets have aged and are infrequently searched. These buckets might reside on slower, cost-effective storage like SATA SSDs to accommodate their lower access frequency while still maintaining reasonable performance.

  • Frozen Buckets: By default, a bucket rolling to the frozen state means the data is deleted. Optionally, you can configure frozen storage (an archive). When a bucket is archived, its index files are stripped and only the compressed raw data is kept to save space. Data in this state is not searchable until you “thaw” it. Frozen storage is ideal for long-term, deep archiving, often using solutions like tape drives or Amazon S3 Glacier Deep Archive.

  • Thawed Buckets: Thawed buckets are previously frozen buckets that have been restored so they are searchable again. This process is manual: you copy the archived bucket into the index’s thaweddb directory, rebuild it with the splunk rebuild command, and wait for the rebuild to complete. Some teams stand up a separate Splunk instance for thawing and searching archived data so production is not disrupted. For instance, if a security incident requires searching Windows logs from a specific time frame, you can thaw just the Windows buckets covering that period and run your search. Keep in mind that thawing only one index or time window limits your ability to pivot to other indexes. You can thaw every index for that time frame instead, but that takes longer, depending on the volume of data. Thawing frozen buckets does not count against your daily ingest license.

Index Retention Settings

Now that we understand buckets and their states in Splunk, let’s explore the configurations that dictate your retention preferences. It’s worth noting that there are numerous index configurations available, which you can find in the Splunk Docs. However, we’ll focus on some key settings to keep in mind.

Here’s a diagram illustrating Splunk’s bucket stages and how various configurations impact them. This diagram was derived from one that originally came from another article. I am no longer able to find the original article to link here. You can find the original at the bottom of this article for reference.

Splunk Index Settings Info

Let’s take a look at each of the index configuration options. Keep in mind that some of these configurations can be set as a global default or per index. Make sure to reference the Splunk Docs before making any changes to your configurations:

maxDataSize

  • What: Max size (in MB) a hot bucket can reach before rolling to warm.
  • Auto-tuning: Use “auto” or “auto_high_volume” (10 GB on 64-bit systems) for self-configuration.
  • Default: “auto” (750 MB)

maxHotSpanSecs

  • What: Maximum time span (in seconds) of events that a single hot or warm bucket can cover.
  • Caution: Setting this too low can create many small hot/warm buckets. Use carefully.
  • Default: 7776000 seconds (approx. 90 days)

homePath.maxDataSizeMB

  • What: Max size (in MB) for ‘homePath’ containing hot and warm buckets.
  • Implication: When the limit is exceeded, the oldest warm buckets roll to cold.
  • Default: 0 (unlimited)

coldPath.maxDataSizeMB

  • What: Max size (in MB) for ‘coldPath’ containing cold buckets.
  • Implication: When the limit is exceeded, the oldest cold buckets are frozen.
  • Default: 0 (unlimited)

frozenTimePeriodInSecs

  • What: Time (in seconds) to roll indexed data to frozen.
  • Implication: Unless an archive is configured (coldToFrozenDir or coldToFrozenScript), frozen data is deleted.
  • Default: 188697600 seconds (approx. 6 years)

maxTotalDataSizeMB

  • What: Sets the max size (in MB) of an index.
  • Implication: When reached, oldest data is frozen.
  • Caution: Could override frozenTimePeriodInSecs, causing data loss.
  • Default: 500000 MB
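The interaction between the age limit and the size limit is easier to see in code. The following Python sketch is a simplified model of my own, not Splunk’s actual implementation: it freezes any bucket whose newest event is older than frozenTimePeriodInSecs, then freezes the oldest remaining buckets until the total fits under maxTotalDataSizeMB. The Bucket class, the function name, and the sample sizes are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Bucket:
    bucket_id: int
    newest_event: int  # epoch seconds of the latest event in the bucket
    size_mb: float


def buckets_to_freeze(
    buckets: List[Bucket],
    now: int,
    frozen_time_period_in_secs: int,
    max_total_data_size_mb: float,
) -> List[int]:
    """Return the IDs of buckets this simplified model would freeze."""
    to_freeze: List[int] = []
    kept: List[Bucket] = []
    # Age rule: a bucket freezes once its newest event is too old.
    for b in sorted(buckets, key=lambda b: b.newest_event):
        if now - b.newest_event > frozen_time_period_in_secs:
            to_freeze.append(b.bucket_id)
        else:
            kept.append(b)
    # Size rule: freeze the oldest remaining buckets until the index fits.
    total = sum(b.size_mb for b in kept)
    while kept and total > max_total_data_size_mb:
        oldest = kept.pop(0)
        total -= oldest.size_mb
        to_freeze.append(oldest.bucket_id)
    return to_freeze


DAY = 86400
sample = [
    Bucket(1, 10 * DAY, 300),  # newest event is 90 days old
    Bucket(2, 80 * DAY, 600),  # newest event is 20 days old
    Bucket(3, 95 * DAY, 600),  # newest event is 5 days old
]
print(buckets_to_freeze(sample, now=100 * DAY,
                        frozen_time_period_in_secs=30 * DAY,
                        max_total_data_size_mb=1000))  # prints [1, 2]
```

Bucket 1 is frozen because of its age, but bucket 2 is frozen only because the index is over its size cap, even though it is well inside the 30-day window. This is the data-loss scenario the maxTotalDataSizeMB caution describes.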

Here is an example of what these configs combined could look like:

[plex]
# Location for hot and warm buckets
homePath   = $SPLUNK_DB/plex/db

# Location for cold buckets
coldPath   = $SPLUNK_DB/plex/colddb

# Location for thawed buckets
# This is required even if you don't use it
thawedPath = $SPLUNK_DB/plex/thaweddb

# I want Splunk to choose the hot bucket size
maxDataSize = auto

# I'm leaving out maxHotSpanSecs so it goes
# with the default setting
# maxHotSpanSecs = 

# Max size for hot and warm buckets combined
homePath.maxDataSizeMB = 1000

# Max size for all cold buckets
coldPath.maxDataSizeMB = 10000

# Max time before buckets are rolled to frozen
# This means deleted for me since there is
# no frozen script configured for this index
# This example is 30-days
frozenTimePeriodInSecs = 2592000

# Max size I want the index to grow to
maxTotalDataSizeMB = 11000
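
A quick sanity check on a stanza like this can catch mismatched limits before they cause surprises. The sketch below uses Python’s standard configparser on a trimmed copy of the [plex] stanza above. It is only an illustration: real indexes.conf files may contain constructs that configparser does not handle, and the summarize function and its output keys are my own invention.

```python
import configparser

# Trimmed copy of the example stanza (comments removed for brevity).
STANZA = """
[plex]
homePath   = $SPLUNK_DB/plex/db
coldPath   = $SPLUNK_DB/plex/colddb
thawedPath = $SPLUNK_DB/plex/thaweddb
maxDataSize = auto
homePath.maxDataSizeMB = 1000
coldPath.maxDataSizeMB = 10000
frozenTimePeriodInSecs = 2592000
maxTotalDataSizeMB = 11000
"""


def summarize(conf_text: str, index: str) -> dict:
    """Summarize the retention limits of one index stanza (hypothetical helper)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep setting names case-sensitive
    parser.read_string(conf_text)
    stanza = parser[index]

    home_mb = int(stanza["homePath.maxDataSizeMB"])
    cold_mb = int(stanza["coldPath.maxDataSizeMB"])
    total_mb = int(stanza["maxTotalDataSizeMB"])
    return {
        "retention_days": int(stanza["frozenTimePeriodInSecs"]) / 86400,
        "home_plus_cold_mb": home_mb + cold_mb,
        # True when the per-path caps do not exceed the overall index cap.
        "fits_in_total": home_mb + cold_mb <= total_mb,
    }


print(summarize(STANZA, "plex"))
```

For this stanza the retention works out to 30 days, and the home and cold limits add up to exactly the 11000 MB index cap.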

Final Thoughts

I hope this article provides good insight into the various options that can be used to configure your indexes. Remember, you can utilize the dbinspect command to extract bucket information for in-depth analysis (more information about that here).

It’s crucial to monitor your index configurations continually to ensure they align with your retention requirements and available resources. Here are some key takeaways to keep in mind:

  • Make sure you have enough disk space to meet your index retention requirements.
  • Splunk will freeze data (which means deleting it unless you archive) based on volume or age, whichever limit is reached first. If your age limit is 90 days but the size limit for that index is 20 GB and it’s full, the oldest data will be removed before it reaches 90 days.
  • Leverage the Monitoring Console’s built-in searches or create custom alerts to proactively keep track of potential indexing storage issues.
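
The “whichever comes first” rule can be turned into a rough back-of-the-envelope estimate. This Python sketch assumes a steady daily on-disk growth rate, which is a simplification. The function name and the 0.5 GB/day figure are hypothetical and only serve to illustrate the 90-day, 20 GB example above.

```python
def effective_retention_days(
    age_limit_days: float,
    size_cap_gb: float,
    daily_disk_gb: float,
) -> float:
    """Estimate how many days of data an index actually keeps.

    daily_disk_gb is the space the index gains on disk per day,
    not the raw ingest volume.
    """
    if daily_disk_gb <= 0:
        raise ValueError("daily_disk_gb must be positive")
    days_until_full = size_cap_gb / daily_disk_gb
    # Data is removed by whichever limit is reached first.
    return min(age_limit_days, days_until_full)


# 90-day age limit, 20 GB cap, assumed 0.5 GB/day of on-disk growth
print(effective_retention_days(90, 20, 0.5))  # prints 40.0
```

In this scenario the index fills up after 40 days, so the 90-day age limit is never reached. Either the size cap needs to grow or the retention expectation needs to shrink.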

References

Here is the original chart. I’m not able to find the original creator. If I find any information about its origin, I’ll post it here.

rollbucket-blogscreen-021