Detecting Typosquatting with Splunk and the URL Toolbox App
Typosquatting is a common attack vector that is often overlooked. It involves the use of domain names that are similar to legitimate domain names, but with slight differences. This can be used to trick users into visiting malicious websites, or to steal sensitive information. In this post, we will explore how to detect typosquatting using Splunk and the URL Toolbox app.
Prerequisites
Before we get started, you will need access to a Splunk instance where you have permission to install apps, with the URL Toolbox app installed. You can install the app from Splunkbase or directly from the Splunk web interface.
The URL Toolbox app is available on Splunkbase here: https://splunkbase.splunk.com/app/2734/
Detecting Typosquatting
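In order to detect typosquatting, we need a way to compare domain names and identify those that are similar. One way to do this is by calculating the Levenshtein distance between domain names. The Levenshtein distance is the minimum number of single-character insertions, deletions, or substitutions needed to turn one string into the other, so a smaller distance means the two strings are more similar.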
For example, the Levenshtein distance between “example.com” and “exampIe.com” (a capital “I” in place of the lowercase “l”) is 1, because only one character needs to be changed to transform one into the other. Similarly, the Levenshtein distance between “example.com” and “examp1e.com” (the digit “1” in place of the “l”) is also 1.
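To make the definition concrete, here is a minimal Python sketch of the standard dynamic-programming way to compute the Levenshtein distance. It is only an illustration of the metric, not the code the URL Toolbox app uses, and the function name levenshtein is our own.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, or substitutions needed to turn `a` into `b`."""
    # previous[j] is the distance between the prefix of `a` processed
    # so far (initially empty) and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        # Turning the first i characters of `a` into "" takes i deletions.
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


# The two lookalike domains discussed above, compared case-sensitively.
print(levenshtein("example.com", "exampIe.com"))
print(levenshtein("example.com", "examp1e.com"))
```

Both calls print 1, matching the distances described above. The comparison is case-sensitive, which is why the capital “I” counts as a substitution.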
Using the URL Toolbox App
The URL Toolbox app for Splunk provides a macro called ut_levenshtein. This macro takes two inputs and calculates the Levenshtein distance between them. We can use this macro to compare domain names and identify typosquatting attempts.
Here is an example search that uses the ut_levenshtein macro to compare domain names:
index=your_index
| eval our_domain="example.com"
| `ut_levenshtein(our_domain, domain_name)`
| where ut_levenshtein<=2
| table _time, our_domain, domain_name, ut_levenshtein
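In this search, we first define our legitimate domain name in a field called our_domain. We then use the ut_levenshtein macro to calculate the Levenshtein distance between our domain name and the domain_name field of each event in our index; the result is written to a field that is also named ut_levenshtein. We filter the results to only show domain names with a Levenshtein distance of 2 or less, as these are likely to be typosquatting attempts. Note that a distance of 0 means the domain is an exact match for our own, so if your data contains legitimate traffic to your domain you may want to exclude those events as well.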
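The filtering step can also be tried outside Splunk. The following Python sketch mirrors the threshold logic of the search under a few assumptions of our own: the helper names levenshtein and flag_lookalikes are hypothetical, the observed list is made-up sample data, and exact matches (distance 0) are skipped, which the search above does not do.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance using single-character insert, delete, substitute."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def flag_lookalikes(our_domain, observed_domains, max_distance=2):
    """Return (domain, distance) pairs within max_distance of our_domain.

    Assumption: distance 0 is our own domain, so it is not flagged.
    """
    hits = []
    for domain in observed_domains:
        distance = levenshtein(our_domain, domain)
        if 0 < distance <= max_distance:
            hits.append((domain, distance))
    return hits


# Made-up sample data: our own domain, two lookalikes, one unrelated domain.
observed = ["example.com", "examp1e.com", "exaample.com", "splunk.com"]
print(flag_lookalikes("example.com", observed))
```

With this sample data, only examp1e.com and exaample.com are returned, each at distance 1. The exact match and the unrelated domain are filtered out.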
Here is another example you can use if you don’t have data to test with:
| makeresults
| eval domain_name="exaample.com;examplle.com;exampple.com;examplee.com;examp1e.com", our_domain="example.com"
| makemv delim=";" domain_name
| mvexpand domain_name
| `ut_levenshtein(our_domain, domain_name)`
| where ut_levenshtein<=2
| table _time, our_domain, domain_name, ut_levenshtein
In this example, we use the makeresults command to generate some test data. We define a list of domain names with intentional typos, and use the ut_levenshtein macro to calculate the Levenshtein distance between each domain name and our legitimate domain name. We then filter the results to only show domain names with a Levenshtein distance of 2 or less. Here is the output of the search:
_time | our_domain | domain_name | ut_levenshtein |
---|---|---|---|
2024-02-21 13:05:30 | example.com | exaample.com | 1 |
2024-02-21 13:05:30 | example.com | examplle.com | 1 |
2024-02-21 13:05:30 | example.com | exampple.com | 1 |
2024-02-21 13:05:30 | example.com | examplee.com | 1 |
2024-02-21 13:05:30 | example.com | examp1e.com | 1 |
Conclusion
By using the URL Toolbox app for Splunk and the ut_levenshtein macro, we can detect typosquatting attempts that differ from our legitimate domain by only a few characters. This can help us identify potential security threats and take appropriate action to protect our systems and data. Monitoring the domain names in our data and calculating their Levenshtein distance from our legitimate domain names gives us an early warning of lookalike domains.