Hey there! In my previous post, I shared how to set up a local S3 environment using Docker and LocalStack. We had some fun uploading and retrieving files with the AWS CLI—all without needing to touch any real AWS infrastructure!
Now, I’m excited to dive into a practical use case: automatically archiving frozen Splunk buckets to S3 with our handy LocalStack setup.
Splunk provides the option to create a coldToFrozenScript, a customized script that runs whenever a bucket transitions from cold to frozen.
In this project, we will use this script to copy the journal.zst file from each frozen bucket to our LocalStack-powered S3 bucket, following the path format <repo-name>/<index-name>/<bucket-name>/rawdata/journal.zst.
I’m excited to show you how everything comes together!
Before we dive in, let’s make sure you’ve got everything set up. If you haven’t already, check out the previous blog post here for some helpful instructions. Here’s a quick recap of what you need to do:
docker run --rm -it -p 4566:4566 -p 4571:4571 localstack/localstack
Friendly Reminder: Don’t forget to add your dummy credentials as environment variables! 😊
aws s3 mb s3://s3-frozen-test-bucket --endpoint-url=http://localhost:4566
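If you’d like to sanity-check the bucket from Python as well, here’s a minimal sketch. It reuses the dummy "test" credentials and the LocalStack endpoint from above; the check_bucket.py file name is just my own placeholder:

# check_bucket.py — quick sanity check that the LocalStack bucket exists.
# Assumes the dummy "test" credentials and LocalStack on http://localhost:4566.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:4566",
    aws_access_key_id="test",
    aws_secret_access_key="test",
    region_name="us-east-1",
)

# head_bucket raises a ClientError if the bucket is missing or inaccessible.
s3.head_bucket(Bucket="s3-frozen-test-bucket")
print([b["Name"] for b in s3.list_buckets()["Buckets"]])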
Now that you’re all set up, let’s move on to configuring Splunk and writing the script!
The coldToS3.py Script
Hey there! I’ve got a great idea for you: why not create a custom app to manage the coldToS3.py script? It’s a smart practice!
Quick Reminder: Don’t forget to make sure the coldToS3.py file is executable!
🗂 App Structure: Here’s how your app should look:
my_frozen_buckets_to_cloud_app/
├── bin/
│   └── coldToS3.py
├── local/
│   └── app.conf
└── metadata/
    └── local.meta
📄 app.conf
[install]
state = enabled
[package]
check_for_updates = false
[ui]
is_visible = false
is_manageable = false
📄 local.meta
[]
access = read : [ * ], write : [ admin ]
export = system
📄 coldToS3.py
#!/usr/bin/env python3
import os
import sys

import boto3
from botocore.exceptions import BotoCoreError, ClientError

# === CONFIGURATION ===
S3_BUCKET_NAME = "s3-frozen-test-bucket"
LOCALSTACK_ENDPOINT = "http://localhost:4566"


def archive_journal_to_s3(bucket_path, index_name, bucket_name):
    journal_path = os.path.join(bucket_path, "rawdata", "journal.zst")
    if not os.path.isfile(journal_path):
        print(f"[SKIP] journal.zst not found at {journal_path}")
        return

    s3_key = f"{index_name}/{bucket_name}/rawdata/journal.zst"
    print(f"Uploading {journal_path} → s3://{S3_BUCKET_NAME}/{s3_key}")

    try:
        print("[INFO] Using LocalStack endpoint:", LOCALSTACK_ENDPOINT)
        s3 = boto3.client(
            "s3",
            endpoint_url=LOCALSTACK_ENDPOINT,
            aws_access_key_id="test",
            aws_secret_access_key="test",
            region_name="us-east-1",
        )
        s3.upload_file(journal_path, S3_BUCKET_NAME, s3_key)
        print("[OK] Upload complete.")
    except (BotoCoreError, ClientError) as e:
        print(f"[ERROR] Upload failed: {e}")
        sys.exit(1)


# === ENTRY POINT ===
if __name__ == "__main__":
    if len(sys.argv) < 2:
        sys.exit("Usage: python coldToS3.py <bucket_path>")

    bucket_path = sys.argv[1]
    if not os.path.isdir(bucket_path):
        sys.exit(f"[ERROR] Invalid bucket path: {bucket_path}")

    # Splunk passes the frozen bucket path, e.g. .../<index-name>/colddb/<bucket-name>,
    # so walk two directories up to recover the index name.
    index_name = os.path.basename(os.path.dirname(os.path.dirname(bucket_path)))
    bucket_name = os.path.basename(bucket_path)

    archive_journal_to_s3(bucket_path, index_name, bucket_name)
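Before handing the script over to Splunk, you can give it a quick smoke test against a fake frozen bucket. This is only a sketch under my own assumptions: the /tmp path and the db_1_1_0 bucket name are made up, the journal.zst is an empty placeholder rather than real compressed raw data, and it assumes boto3 is available to your system python3. Still, it’s enough to confirm the upload path works end to end:

# smoke_test_coldToS3.py — fabricate a frozen-looking bucket and run the script.
# The /tmp path and db_1_1_0 bucket name are placeholders for this test only.
import pathlib
import subprocess

fake_bucket = pathlib.Path("/tmp/oyku_test/colddb/db_1_1_0")
(fake_bucket / "rawdata").mkdir(parents=True, exist_ok=True)
(fake_bucket / "rawdata" / "journal.zst").write_bytes(b"placeholder")

# Call the script exactly the way Splunk would: one argument, the bucket path.
subprocess.run(["python3", "coldToS3.py", str(fake_bucket)], check=True)

If everything is wired up correctly, an aws s3 ls on the bucket afterwards should show oyku_test/db_1_1_0/rawdata/journal.zst.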
indexes.conf
Let’s update your index stanza! For this example, I will use the oyku_test index, whose stanza should look like this:
[oyku_test]
coldPath = $SPLUNK_DB/oyku_test/colddb
homePath = $SPLUNK_DB/oyku_test/db
thawedPath = $SPLUNK_DB/oyku_test/thaweddb
coldToFrozenScript = "$SPLUNK_HOME/bin/python" "$SPLUNK_HOME/etc/apps/my_frozen_buckets_to_cloud_app/bin/coldToS3.py"
frozenTimePeriodInSecs = 10
maxHotSpanSecs = 10
maxHotBuckets = 1
maxWarmDBCount = 1
maxDataSize = auto_low_volume
This setup ensures that any bucket with data older than 10 seconds gets frozen, which triggers your script. You can see my configuration in the image below.
Figure 1: A snapshot of my Splunk setup.
Step 1: Start by generating some test events in the oyku_test index. Here’s a handy command:
curl -k https://localhost:8088/services/collector -H 'Authorization: Splunk <TOKEN>' -d '{"event": "demo", "sourcetype": "test", "index": "oyku_test"}'
Feel free to use any event ingestion method you prefer! 🐣
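For example, if you’d rather generate a batch of events from Python instead of curl, here’s a small sketch assuming you have the requests library installed. The <TOKEN> placeholder is still your own HEC token, and verify=False mirrors curl’s -k flag for the local self-signed certificate:

# send_test_events.py — push a handful of demo events into oyku_test over HEC.
# <TOKEN> is a placeholder for your HEC token; verify=False mirrors curl -k.
import requests

HEC_URL = "https://localhost:8088/services/collector"
HEADERS = {"Authorization": "Splunk <TOKEN>"}

for i in range(20):
    payload = {"event": f"demo {i}", "sourcetype": "test", "index": "oyku_test"}
    resp = requests.post(HEC_URL, headers=HEADERS, json=payload, verify=False)
    resp.raise_for_status()
    print(resp.json())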
Step 2: If you’re feeling a bit impatient, you can manually force roll hot to warm buckets with this command:
$SPLUNK_HOME/bin/splunk _internal call /data/indexes/oyku_test/roll-hot-buckets -auth admin:admin123
Just a little reminder: those are my local demo credentials, so let’s keep them between us! 🫶🏼😌
Step 3: Now, check your bucket in LocalStack:
aws --endpoint-url=http://localhost:4566 s3 ls s3://s3-frozen-test-bucket/oyku_test/
You should see your bucket’s journal.zst located at a path like this:
oyku_test/db_1761656400_1761656400_23/rawdata/journal.zst
Take a look at my results in Figure 2 below!
Figure 2: Terminal output showing the archived bucket in LocalStack.
Now you’re all set! You can use this data for downstream testing, restoring, or even to practice long-term archival scenarios. Happy coding! 🌟
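If you want to pull an archived journal.zst back down for that kind of downstream testing, here’s a minimal boto3 sketch. The object key below is just the example path from Figure 2, and fetch_archived_journal.py is my own placeholder name; swap in whichever key aws s3 ls showed you:

# fetch_archived_journal.py — download one archived journal.zst from LocalStack.
# The object key is the example from above; replace it with your own bucket’s key.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:4566",
    aws_access_key_id="test",
    aws_secret_access_key="test",
    region_name="us-east-1",
)

key = "oyku_test/db_1761656400_1761656400_23/rawdata/journal.zst"
s3.download_file("s3-frozen-test-bucket", key, "journal.zst")
print("Downloaded", key)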
Hey there! 🎉 Congratulations on making it this far! You’ve successfully set up automatic frozen bucket archiving in Splunk using the coldToFrozenScript and sent your aged data to a local S3-compatible bucket with LocalStack.
This is such an awesome achievement! Not only does it give you a powerful workflow that feels like production, but it also helps you avoid any real cloud costs. Perfect for testing, CI pipelines, or simulating air-gapped archival strategies!
Whether you’re looking to test recovery procedures, play around with object lifecycle policies, or simply want more control over your frozen data, this setup gives you all the flexibility and speed you could need.
Here’s a quick recap of what you’ve achieved:
- A local, S3-compatible bucket running on LocalStack
- A custom Splunk app with a coldToS3.py archiving script
- An indexes.conf stanza wired to a coldToFrozenScript that’s all set to go
- A journal.zst from a frozen bucket sitting safely in S3

In the next post, we’ll switch things up and talk about restoring those archived frozen buckets back into Splunk’s thawed path so you can start searching them again! 🔄
Got any questions or suggestions about this blog? I’d love to hear from you! Feel free to reach out to me on my LinkedIn account.
Until next time, happy testing! ☁️🐍
Hey there! In our next post, we’re excited to show you how to take those archived buckets and bring them back to life in Splunk’s thaweddb directory. We’ll walk you through the process of turning them into a readable format using the journal.zst file.
Thanks for joining us on this journey, and we can’t wait to share more with you soon! Stay tuned!