“wHaT tHe HeCK iS hEc?” - You, probably. I know it was me at some point.
HEC (HTTP Event Collector) is a super easy way to send data into Splunk. It lets you quickly update a script or application to send data into Splunk without having to install a forwarder or set up a syslog server.
In this guide, we’ll go through the process of setting up HEC and making a simple Python script to send data into Splunk.
Setting Up HEC
Let’s start by setting up HEC in Splunk.
- If it doesn’t exist yet, make a new index for the data you’re going to send.
- Go to Settings > Data Inputs > HTTP Event Collector.
- Click on Global Settings, make sure All Tokens is enabled, and click Save. This will allow HEC to accept data.
- Click on New Token. Fill in the details and click Next. (Unless you’re working with a data source that needs indexer acknowledgement, you can leave it unchecked.)
- In the Input Settings page, there are a couple of items to configure:
  - Source type: You can either pick an existing source type or create a new one. I’m going to make a sourcetype called bearlychilly:json and I’ll leave the props.conf stanza for it below as an example.
  - Select Allowed Indexes: You can leave this blank.
  - Default Index: Select the index you want to send the data to.
- Click Review, verify the settings, and click Submit.
- Take note of the token that’s generated. You’ll need it to send data to Splunk.
# Nothing fancy here, just a simple sourcetype for JSON data.
[bearlychilly:json]
KV_MODE = json
TRUNCATE = 9999
# The timestamp parsing is for the "advanced" example at the end of this guide and not the ISS example.
TIME_PREFIX = "time": "
MAX_TIMESTAMP_LOOKAHEAD = 26
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%f
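If you want to sanity-check the timestamp settings in that stanza before sending anything, here’s a small Python sketch. It assumes your events carry a timestamp shaped like the sample below (the value itself is only an illustration). Python’s strptime is just a convenient stand-in for checking the format string; Splunk’s timestamp parser is its own implementation.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"  # same string as in the props.conf stanza
MAX_TIMESTAMP_LOOKAHEAD = 26           # same value as in the props.conf stanza

# Illustrative timestamp (an assumption, not real data)
sample = "2024-10-25T22:12:57.496015"

# Does the format string actually parse the sample?
parsed = datetime.strptime(sample, TIME_FORMAT)

# Does the whole timestamp fit inside the lookahead window?
fits_lookahead = len(sample) <= MAX_TIMESTAMP_LOOKAHEAD

print(parsed)
print(len(sample), fits_lookahead)
```

If the timestamp were longer than the lookahead value, the end of it would fall outside the window Splunk scans.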
Gathering Additional Information
Now that we have the HEC token, let’s gather the rest of the information that we need to send data to Splunk. Let’s start with the URI. This doc from Splunk has a good explanation of the URI format: https://docs.splunk.com/Documentation/Splunk/latest/Data/UsetheHTTPEventCollector#Send_data_to_HTTP_Event_Collector.
There are different formats for the URI depending on Splunk Cloud or Splunk Enterprise (on-prem) and JSON or Raw data. Here is an example of the format used for Splunk Enterprise: <protocol>://<host>:<port>/<endpoint>. The <endpoint> part of the URI depends on what type of data you’re sending. For HEC formatted JSON data (check this doc out for more info on the HEC metadata formatting for events), the endpoint is /services/collector/event. For normal raw events (including JSON events that don’t follow the HEC format), the endpoint is /services/collector/raw. In this example, I’ll be sending JSON data that is not in the HEC format, so the endpoint will be /services/collector/raw.
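To make the difference between the two endpoints a bit more concrete, here’s a rough sketch of what “HEC formatted” means: the payload goes under an event key, and metadata such as sourcetype and index sits beside it. wrap_for_event_endpoint is a hypothetical helper name and the index name main is only a placeholder; the raw endpoint would take the payload as-is instead.

```python
import json

def wrap_for_event_endpoint(payload, sourcetype=None, index=None):
    """Wrap a payload in the HEC event envelope (hypothetical helper)."""
    envelope = {"event": payload}
    # Metadata keys are optional; only add the ones that were provided
    if sourcetype is not None:
        envelope["sourcetype"] = sourcetype
    if index is not None:
        envelope["index"] = index
    return json.dumps(envelope)

payload = {"message": "Hello, Splunk!"}

# Body you would POST to /services/collector/event
body_for_event = wrap_for_event_endpoint(
    payload, sourcetype="bearlychilly:json", index="main"  # "main" is a placeholder
)

# Body you would POST to /services/collector/raw (no envelope)
body_for_raw = json.dumps(payload)

print(body_for_event)
print(body_for_raw)
```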
This is what my URI looks like: https://splunk.bearlychilly.com:8088/services/collector/raw.
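If you’d rather build the URI in code than hardcode it, something like this minimal sketch would do. build_hec_uri is a hypothetical helper, and the defaults (https, port 8088) are simply taken from my own setup, so adjust them for yours.

```python
# Endpoint paths described above
ENDPOINTS = {
    "event": "/services/collector/event",  # HEC formatted JSON
    "raw": "/services/collector/raw",      # everything else
}

def build_hec_uri(host, kind="raw", protocol="https", port=8088):
    """Assemble <protocol>://<host>:<port><endpoint> (hypothetical helper)."""
    if kind not in ENDPOINTS:
        raise ValueError(f"unknown endpoint kind: {kind!r}")
    return f"{protocol}://{host}:{port}{ENDPOINTS[kind]}"

print(build_hec_uri("splunk.bearlychilly.com"))
```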
Testing the Connection
Now that we have the token and the URI, let’s hop onto the host you’re going to send data from and test the connection to Splunk. We’ll run the following command to send a test event to Splunk (Of course, replace the URI and token with your own moving forward):
Psspss: Use ChatGPT to convert this to your language of choice.
curl -k https://splunk.bearlychilly.com:8088/services/collector/raw -H "Authorization: Splunk <token>" -d '{"event": "Hello, Splunk!"}'
The response from Splunk should look like the following and you should be able to search for the event in Splunk:
{"text":"Success","code":0}
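In a script you’ll probably want to check that response instead of eyeballing it. Here’s a minimal sketch that treats a code of 0 as success and anything else (including a body that isn’t JSON at all) as a failure. hec_accepted is a hypothetical helper name.

```python
import json

def hec_accepted(response_body):
    """Return True if the HEC reply reports success (hypothetical helper)."""
    try:
        reply = json.loads(response_body)
    except json.JSONDecodeError:
        # Not JSON at all, e.g. an error page from a proxy
        return False
    return isinstance(reply, dict) and reply.get("code") == 0

print(hec_accepted('{"text":"Success","code":0}'))
```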
If that didn’t work, check the following:
Optional Troubleshooting
There are a whole bunch of things that could go wrong. Here are a few things to check:
- curl: (7) Failed to connect to splunk.bearlychilly.com port 8088 after 12 ms: Connection refused: Make sure that you went to Global Settings and enabled All Tokens. If you didn’t, HEC won’t accept data.
- nc -vv splunk.bearlychilly.com 8088: Run a netcat command to see if you can connect to the host.
  - If the command fails and says nc: getaddrinfo: Name or service not known, there’s likely a DNS issue.
  - If the command fails and says nc: connect to splunk.bearlychilly.com port 8088 (tcp) failed: Connection refused, there’s a network issue preventing you from reaching the host. Again, make sure you went to Global Settings and enabled All Tokens. It could also be a network firewall, a host-based firewall, or other networking issues.
- ping splunk.bearlychilly.com: Run a ping and make sure you can reach the host. If your network allows ICMP, you should at least see the IP address of the host and get a response.
- Oh wait, I also have this article you can check out to help with troubleshooting: https://bearlychilly.com/notes/basics-of-network-connectivity-troubleshooting/.
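If netcat isn’t installed on the host, roughly the same check can be done from Python with just the standard library. This is a sketch, not a full diagnostic: check_tcp is a hypothetical helper, and the labels it returns are my own shorthand for the DNS and connection-refused cases described above.

```python
import socket

def check_tcp(host, port, timeout=3.0):
    """Try a TCP connection and classify the result (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except socket.gaierror:
        # The name could not be resolved: likely a DNS issue
        return "dns"
    except ConnectionRefusedError:
        # The host answered but nothing accepted the connection on that port
        return "refused"
    except (socket.timeout, TimeoutError):
        # No answer at all, often a firewall silently dropping packets
        return "timeout"
    except OSError:
        # Anything else, e.g. no route to host
        return "unreachable"

# Usage (replace with your own host and port):
# print(check_tcp("splunk.bearlychilly.com", 8088))
```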
Sending Data to Splunk
Now that we have the connection working, if you are using an application that can send data to Splunk, you can configure it to send data to the URI we tested with the token we generated. If you’re working on adding HEC functionality to a script or application, here’s a simple Python example to send data to Splunk.
In this example, we are going to capture the current position of the International Space Station (ISS) and send it to Splunk. We’ll use the requests library to send the data to Splunk:
import requests
import json
# The URI we tested with
uri = "https://splunk.bearlychilly.com:8088/services/collector/raw"
# The token (use a better way to store this in production)
token = "your_token_here"
# The data we're going to send
response = requests.get("http://api.open-notify.org/iss-now.json")
data = json.dumps(response.json())
# Send the data to Splunk
# The verify=False is to ignore SSL verification. You should use a proper certificate in production.
response = requests.post(uri, headers={"Authorization": f"Splunk {token}"}, data=data, verify=False)
print(response.text)
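About that “use a better way to store this” comment: one simple option is reading the token from an environment variable so it never lands in the script. This is only a sketch; the variable names SPLUNK_HEC_TOKEN and SPLUNK_HEC_URI are made up for this example, and a proper secrets manager is a better fit for production.

```python
import os

# My example URI from earlier, used when no override is set
DEFAULT_URI = "https://splunk.bearlychilly.com:8088/services/collector/raw"

def load_hec_settings(env=None):
    """Read the HEC URI and token from environment variables (hypothetical names)."""
    env = os.environ if env is None else env
    token = env.get("SPLUNK_HEC_TOKEN")
    if not token:
        raise RuntimeError("SPLUNK_HEC_TOKEN is not set")
    uri = env.get("SPLUNK_HEC_URI", DEFAULT_URI)
    return uri, token

def auth_headers(token):
    """Build the Authorization header HEC expects."""
    return {"Authorization": f"Splunk {token}"}

# Usage:
# uri, token = load_hec_settings()
# requests.post(uri, headers=auth_headers(token), data=data)
```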
Some More Examples
These are just some example use cases and not meant to be used in production as is. There are much better and supported ways to send some of this data to Splunk.
“One-liner” example of using bash to read each line from a text file (e.g. /var/log/syslog) and send it to Splunk:
while read -r line; do curl -k https://splunk.bearlychilly.com:8088/services/collector/raw -H "Authorization: Splunk <token>" -d "$line"; done < /var/log/syslog
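That one-liner makes one HTTP request per line, which gets slow on a big file. As far as I know, the raw endpoint will take several newline-separated events in a single request and split them using the sourcetype’s line-breaking rules, so batching is an option. Here’s a sketch of just the batching part; batch_lines and the batch size of 100 are my own choices, so treat them as assumptions and test against your own sourcetype.

```python
def batch_lines(lines, batch_size=100):
    """Group lines into newline-joined payloads (hypothetical helper)."""
    batch = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip empty lines
        batch.append(line)
        if len(batch) == batch_size:
            yield "\n".join(batch)
            batch = []
    if batch:
        # Whatever is left over after the last full batch
        yield "\n".join(batch)

# Usage: each payload would be the body of one POST to the raw endpoint
# with open("/var/log/syslog") as f:
#     for payload in batch_lines(f):
#         ...
print(list(batch_lines(["a\n", "b\n", "\n", "c\n"], batch_size=2)))
```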
Send some Linux host info to Splunk:
- Current CPU usage
- Current memory usage
- Current Up Time
curl -k https://splunk.bearlychilly.com:8088/services/collector/raw \
-H "Authorization: Splunk <token>" \
-H "Content-Type: application/json" \
-d "$(jq -n --arg hostname "$(hostname)" \
--arg cpu_perc "$(top -b -n 1 | grep 'Cpu(s)' | awk '{print $2}')" \
--arg mem_free "$(free -m | awk '/^Mem:/{print $4}')" \
--arg uptime "$(uptime -p)" \
'{
"hostname": $hostname,
"cpu_perc": $cpu_perc,
"mem_free_mb": $mem_free,
"uptime": $uptime
}')"
Forward the output of any command to Splunk:
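# --data-binary keeps the newlines in the command output (-d strips them when reading from stdin)
command | curl -k https://splunk.bearlychilly.com:8088/services/collector/raw -H "Authorization: Splunk <token>" --data-binary @-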
You can even go a step further and set up a function or alias to send data to Splunk. With this example spl function, you can send the output of any command to Splunk by just piping it to spl:
# This function sends each line as a separate event and adds in the hostname. The output is in JSON format.
function spl { tee /dev/tty | while IFS= read -r line; do jq -n --arg hostname "$(hostname)" --arg output "$line" '{"hostname": $hostname, "output": $output}' | curl -s -k https://splunk.bearlychilly.com:8088/services/collector/raw -H "Authorization: Splunk <token>" -d @-; done; }
# Usage
ps aux | spl
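If you want the same idea inside a Python script, the wrapping step the function does with jq looks roughly like this. wrap_line and wrap_stream are hypothetical names, and the sketch only builds the JSON events; sending them works the same way as in the ISS example.

```python
import json
import socket

def wrap_line(line, hostname=None):
    """Wrap one line of output in the same JSON shape as the spl function."""
    hostname = socket.gethostname() if hostname is None else hostname
    return json.dumps({"hostname": hostname, "output": line.rstrip("\n")})

def wrap_stream(stream, hostname=None):
    """Yield one JSON event per line of the stream."""
    for line in stream:
        yield wrap_line(line, hostname)

# Usage: pass sys.stdin to process piped input
for event in wrap_stream(["first line\n", "second line\n"], hostname="web01"):
    print(event)
```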
A Bit More Advanced Python Example
The logging library in Python is a super easy way to send logs to Splunk. Here’s an example setup that uses the logging library to send logs to Splunk through HEC:
import logging
import requests
import json
from datetime import datetime

class SplunkHandler(logging.Handler):
    def __init__(self, uri, token, level=logging.NOTSET):
        super().__init__(level)
        self.uri = uri
        self.token = token

    def emit(self, record):
        try:
            # Prepare the log data with timestamp and level
            log_entry = {
                "time": datetime.utcnow().isoformat(),
                "level": record.levelname,
                "message": record.getMessage()
            }
            # Convert log entry to JSON format
            log_data = json.dumps(log_entry)
            # Send the JSON log data to Splunk
            headers = {"Authorization": f"Splunk {self.token}"}
            response = requests.post(self.uri, headers=headers, data=log_data, verify=False)
            if response.status_code != 200:
                print(f"Failed to send log to Splunk: {response.text}")
        except Exception as e:
            print(f"Exception while sending log to Splunk: {e}")

# Initialize logging
logging.basicConfig(level=logging.INFO)  # Adjust logging level as needed

# Initialize SplunkHandler
# Adjust the URI and token as needed
# Again, use a better way to store the token in production
splunk_uri = "https://splunk.bearlychilly.com:8088/services/collector/raw"
splunk_token = "your_token_here"
splunk_handler = SplunkHandler(splunk_uri, splunk_token)

# Attach SplunkHandler to the root logger
logging.getLogger().addHandler(splunk_handler)

# Example logging
logging.info("This is an info message")
logging.error("This is an error message")
The output of this in Splunk will look like this:
{"time": "2024-10-25T22:12:57.496015", "level": "ERROR", "message": "This is an error message"}
{"time": "2024-10-25T22:12:57.472257", "level": "INFO", "message": "This is an info message"}
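One small thing to watch for with timestamps like these: Python’s isoformat() leaves off the fractional part when the microseconds happen to be exactly 0, which would no longer match a TIME_FORMAT ending in .%f. Here’s a sketch of a formatting helper that always writes six digits and uses the time the record was actually created. format_record is a hypothetical name; it’s one way to build the JSON, not the only one.

```python
import json
import logging
from datetime import datetime, timezone

def format_record(record):
    """Build the JSON event for a log record (hypothetical helper)."""
    # record.created is the epoch time at which the record was created
    stamp = datetime.fromtimestamp(record.created, tz=timezone.utc).replace(tzinfo=None)
    return json.dumps({
        # timespec="microseconds" always writes six fractional digits
        "time": stamp.isoformat(timespec="microseconds"),
        "level": record.levelname,
        "message": record.getMessage(),
    })

# Illustrative record, built by hand just to show the output shape
demo = logging.LogRecord("demo", logging.ERROR, "demo.py", 1, "This is an %s message", ("error",), None)
print(format_record(demo))
```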
Conclusion
And that’s it! Implementing the HTTP Event Collector into your scripts or applications is as easy as that. Once you have it set up, it’s a super easy way of onboarding new data sources into Splunk. If I forgot to take out a token somewhere, don’t worry, it’s pretty much useless to you. I hope you learned something new in this guide. Anyway, it’s getting late. Happy learning! 🙂