An interesting question came up the other day: can a penetration tester use ChatGPT in a meaningful way to accomplish particular tasks?
The primary use case usually presented for ChatGPT in penetration testing is assisting with tasks like crafting compelling phishing emails or fake profiles for websites or social media. I wanted to see whether something more concrete and technical could also benefit from ChatGPT, so I asked it a very specific technical question:
The answer I got back was extremely unhelpful:
Being somewhat familiar with the concept of “prompt engineering,” however, I figured there was likely a way around this restriction. I decided the simplest path was to tell ChatGPT that I had permission; to make it more convincing, I also told it that I am a teacher looking to use the example with my students.
This convinced the AI that everything was cool, and it came back with the following response:
In this case, the answer contained a semi-useful Python script, but as a busy professional, I definitely don’t want to have to edit the script to change the usernames and passwords I want to try every time I need this functionality.
Kali Linux (among other resources) comes with a number of files that contain usernames and passwords that can be used for login attempts; couldn’t I just use those for this? I asked ChatGPT if there was a way:
Helpfully, the AI rewrote the script to do what I requested:
This code example would be much more useful to a penetration tester (so it’s included at the end of this post), but consider the question I asked carefully: it was about fuzzing.
Did ChatGPT give me a way to fuzz NTLM authentication? That depends on what “fuzzing” means.
ChatGPT certainly gave me a script to automate brute force login attempts – that is great, but I didn’t ask for help brute forcing logins, I wanted to fuzz NTLM authentication. These are not the same thing:
- Brute forcing sends a large number of username and password combinations in an attempt to find valid credentials.
- Fuzzing, on the other hand, sends unexpected data types in an attempt to find vulnerabilities in the system or application that processes this input.
In the case of NTLM authentication, fuzzing could include things like the following (a rough sketch is shown after this list):
- Sending LDAP injection strings to the Active Directory server that validates the credentials, attempting to break out of the authentication process and run arbitrary LDAP commands and queries
- Sending strange characters that would not typically be included in a username or password (things like Unicode strings, etc.) to determine whether the system handles these character types in an unsafe way that could be exploited
- Sending varying sizes of data chunks, to see if there are exploits in the way the system allocates memory to handle the input
- and so on.
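To make the distinction concrete, here is a rough sketch of what fuzzing NTLM authentication might look like. This is a minimal illustration, not a definitive fuzzer: the target URL is a placeholder, and the payload list is just a handful of examples of the input categories above.

import requests
from requests_ntlm import HttpNtlmAuth

# Placeholder target; substitute the system under test
url = "http://example.com"

# Illustrative fuzz inputs: malformed data rather than plausible credentials
payloads = [
    "A" * 10000,           # oversized input, probing memory handling
    "\u0000admin",         # embedded null byte
    "\u202eadmin\u202d",   # unexpected Unicode control characters
    "*)(uid=*))(|(uid=*",  # classic LDAP injection string
]

for payload in payloads:
    try:
        response = requests.get(url, auth=HttpNtlmAuth(payload, payload), timeout=10)
        # Anything other than a clean 401 may indicate mishandled input
        print(f"payload {payload[:20]!r} -> HTTP {response.status_code}")
    except requests.exceptions.RequestException as exc:
        # Connection resets or timeouts can signal a crash worth investigating
        print(f"payload {payload[:20]!r} -> exception: {exc}")

Even this toy version exercises very different code paths than a credential-guessing loop: the interesting result isn’t a 200, it’s a crash, a 500, or any response showing the input wasn’t handled safely.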
ChatGPT didn’t pick up on that nuance, so while it provided a useful penetration testing script with a great deal of confidence, it didn’t actually understand the assignment, and didn’t provide a solution to accomplish the task I was looking to perform.
This highlights one of the key dangers of ChatGPT: you can’t necessarily trust it, because it gives no indication of how correct (or incorrect) its responses are, and it words them in a compelling, engaging way that is easy to believe. In these ways, ChatGPT is a fantastic example of what a good social engineering dialog should look like.
For another example of this, I previously asked ChatGPT to perform a very similar exercise (“write a python script that tests NTLM authentication”), and in that case the Python code it provided was as follows:
import requests

url = "http://example.com/secure-page"
username = "user"
password = "pass"

ntlm_auth = requests.auth.HTTPDigestAuth(username, password)
response = requests.get(url, auth=ntlm_auth)

if response.status_code == requests.codes.ok:
    print("Authentication successful!")
else:
    print("Authentication failed.")
At first glance, this looks legit, and it’s even more convincing when it’s combined with a bunch of text explaining what NTLM authentication is, and what the Python requests library does.
However, close examination shows that the script doesn’t do NTLM authentication at all: it performs HTTP Digest authentication, which is not the same thing.
In fact, Python’s requests library does not natively support NTLM authentication – you have to use the requests_ntlm module to accomplish this task (which, I noticed in preparing this article, is what ChatGPT suggests now, so apparently it has realized this error and corrected its training accordingly).
That last sentence is important: the AI has changed its generated code to be more correct as a result of the training it’s getting.
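For reference, here is what the corrected approach looks like using requests_ntlm – a minimal sketch against the same placeholder URL, with an illustrative DOMAIN\user account:

import requests
from requests_ntlm import HttpNtlmAuth  # pip install requests-ntlm

url = "http://example.com/secure-page"
# NTLM credentials are typically supplied as DOMAIN\username
ntlm_auth = HttpNtlmAuth("DOMAIN\\user", "pass")
response = requests.get(url, auth=ntlm_auth)

if response.status_code == requests.codes.ok:
    print("Authentication successful!")
else:
    print("Authentication failed.")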
As more penetration testers turn to ChatGPT and help refine its training, its help will become more accurate and useful. It’s easy to forget how new AI and GPT are, because we see them everywhere; as a result, we tend to look at the current state and decide “is this useful or not” based on that snapshot.
Things change very fast with AI (and ChatGPT as part of that world) though, so while the answer today may be “meh, it’s sort of useful but wrong in nuanced ways that make it dangerous”, in a fairly short time it seems inevitable that this answer will become “yes, it’s extremely useful for ramping up and quickly generating new tools and methods”.
chatgpt_ntlm_fuzzer.py
import requests
from requests_ntlm import HttpNtlmAuth

# URL to the target web server
url = "http://example.com"

# Read the usernames and passwords from external files
with open("usernames.txt", "r") as usernames_file:
    usernames = usernames_file.read().splitlines()

with open("passwords.txt", "r") as passwords_file:
    passwords = passwords_file.read().splitlines()

# Loop over the usernames and passwords and attempt to authenticate
for username in usernames:
    for password in passwords:
        # Create a new session for each request
        session = requests.Session()
        session.auth = HttpNtlmAuth(username, password)
        response = session.get(url)

        # Check the response status code to determine if authentication succeeded
        if response.status_code == 200:
            print(f"Authentication succeeded with username '{username}' and password '{password}'")
        else:
            print(f"Authentication failed with username '{username}' and password '{password}'")