WHOIS is a simple, plaintext protocol used to retrieve information about a given domain. WHOIS servers listen on TCP port 43. The protocol is defined by RFC 3912, but that RFC says little about how WHOIS is actually used to look up information about domains.
The response from a WHOIS server is meant to be human-readable rather than machine-readable, but the fields you need to extract information from usually follow a Header name: Header data format. It is a good idea to convert all header names to lowercase when you are searching for a specific one.
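To illustrate the lowercase-lookup advice, here is a minimal Python sketch of a response parser. The function name `parse_headers` is hypothetical, and skipping lines that start with `%` or `#` is an assumption about how some servers mark comments, not something every server does. A header can appear more than once, so each name maps to a list of values.

```python
def parse_headers(response: str) -> dict[str, list[str]]:
    """Collect 'Header name: Header data' lines, keyed by lowercase name.

    parse_headers is a hypothetical helper, not part of any library.
    """
    headers: dict[str, list[str]] = {}
    for line in response.splitlines():
        stripped = line.strip()
        # Assumption: lines starting with '%' or '#' are server comments.
        if not stripped or stripped[0] in "%#":
            continue
        # Split on the first colon only, so values such as timestamps
        # or URLs keep their own colons.
        name, sep, value = stripped.partition(":")
        name = name.strip().lower()
        value = value.strip()
        if not sep or not name or not value:
            continue
        headers.setdefault(name, []).append(value)
    return headers
```

Looking up a field then becomes a dictionary access with a lowercase key, regardless of how the server capitalised the header name.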
WHOIS requests need to be terminated with a carriage return + line feed (\r\n).
- Connect to whois.iana.org. Send the top-level domain/TLD, followed by a newline. (e.g. Send “com” + “\r\n”)
- The WHOIS server for that TLD, along with a bunch of other data, will be sent in a header with the name
- Connect to that server on port 43, send the full domain name followed by a newline. (e.g. Send “example.com” + “\r\n”)
  - The response data you get from this server is the WHOIS data, but there’s usually more data you can get from another server.
  - This server’s address is sent to you in a header called