Email Authentication

Original article by David MacQuigg.  For other views see Wikipedia, the free encyclopedia.

Ensuring a valid identity on an email has become a vital first step in stopping spam, forgery, fraud, and even more serious crimes. Unfortunately, the Simple Mail Transfer Protocol (SMTP) that handles most email today was designed in an era when users of the Internet were mostly honest techies who expected others to be equally honest. This article will explain how email identities are forged and the steps that are being taken now to prevent it.

Mail Transfer Chaos

In a simple mail transfer, there are four key players: the author or originator of the email, the sender {1} or agent who first puts the email on the public Internet, the receiver or agent who gets the email from the Internet, and the recipient who is the person supposed to read the email.[1] (http://spf.pobox.com/mailflows.html) When we say Internet, with a capital I, we mean the world-wide network that shares a common set of IP addresses, not the internal networks before the sender or after the receiver. For example, the computer I am writing this article on shares a local network with other computers having addresses I can assign at will. My network connects via a router to the network of my Internet Service Provider (ISP), which can assign whatever addresses it wants within its network, including the address of my router. It is only where the ISP connects its router to the Internet that a real Internet IP address is needed.

Image:Email_Authentication_01a.png

Other than the sender's IP address, there is no verification of any information in an email. It is quite easy for a spammer to make an exact copy of an email from smithbarney.com, including a long complicated sequence of headers and a genuine logo in the body of an email, then change the content to send readers to a website that appears to be genuine, but is actually a "phishing" scam designed to capture names, passwords, and credit card numbers.

So why can't the sender's IP address be used to identify the spammer? There are two problems. One is that spammers often work through forwarders to hide their IP addresses (see below). Another is that the sender is often a zombie that has been infected by a computer virus, and is programmed to send spam without the owner even knowing about it. There are millions of insecure home computers, and they have now (March 2005) become a major source of spam.

Attempts to stop spam by blacklisting senders' IP addresses have failed to reverse its worldwide growth. Most IP addresses are dynamic, i.e. they change frequently. An ISP, or any organization directly connected to the Internet, is allocated a block of real Internet addresses and registers its domain names in the Domain Name System (DNS). Within that block, it assigns individual addresses to customers as needed. A dial-up customer may get a new IP address each time they connect. By the time that address appears on blacklists all over the world, the spammer will have new addresses for the next run. There are about 4 billion possible IP addresses on the Internet. The game of keeping up with these rapidly changing IP addresses has been facetiously called "whack-a-mole".
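The size of the address space, and the way a blacklist entry goes stale, can be illustrated with a short sketch. The addresses come from the 203.0.113.0/24 documentation range and stand in for an ISP's dial-up pool; the function name is invented for this example.

```python
import ipaddress

# The entire IPv4 address space expressed as one network: 2**32 addresses.
ALL_IPV4 = ipaddress.ip_network("0.0.0.0/0")
TOTAL_ADDRESSES = ALL_IPV4.num_addresses  # 4294967296, the "4 billion" above


def is_blacklisted(ip: str, blacklist: set) -> bool:
    """Return True if the connecting address is on the blacklist."""
    return ipaddress.ip_address(ip) in blacklist


# Yesterday's spam run came from one pool address, which was then listed.
blacklist = {ipaddress.ip_address("203.0.113.17")}

# Today the same machine reconnects and is handed a different address.
print(is_blacklisted("203.0.113.17", blacklist))  # the old address is listed
print(is_blacklisted("203.0.113.42", blacklist))  # the new address is not
```

The blacklist is correct about the old address and useless against the new one, which is the whole "whack-a-mole" problem in miniature.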

There are a number of things that ISPs have done to stop zombies and deliberate spamming by their customers. Infected computers can be cleared of viruses and patched to resist further infection. Outgoing email can be monitored for any sudden increase in flow or in content that is typical of spam. Some ISPs have been quite successful {2}, but others don't care to make the effort. With spam now over 80% of all email traffic, we can expect that there will always be ISPs who are willing to provide services for spammers.

Authenticating Senders

Email authentication greatly simplifies and automates the process of identifying senders. By quickly verifying the claimed domain name, it is possible to reject forgeries and block email from known spamming domains. It is also possible to "whitelist" email from known reputable domains, and bypass content-based filtering, which always loses some valid emails in the flood of spam. The fourth category, email from unknown domains, can be treated the same as we now treat all email – rigorous filtering, return challenges to the sender, etc. Success of a domain-rating system will encourage reputable ISPs to stop their outgoing spam and get a good rating.
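The four categories can be written down as a small decision function. This is only a sketch of the policy described above: the names `Action` and `classify`, and the idea of keeping the domain ratings in two plain sets, are assumptions made for illustration.

```python
from enum import Enum


class Action(Enum):
    REJECT = "reject"
    ACCEPT = "accept, bypassing content filters"
    FILTER = "rigorous filtering"


def classify(auth_passed: bool, domain: str,
             spam_domains: set, reputable_domains: set) -> Action:
    """Map an authentication result and a domain rating to a handling policy."""
    if not auth_passed:
        return Action.REJECT   # 1: the claimed domain name is a forgery
    if domain in spam_domains:
        return Action.REJECT   # 2: genuine, but a known spamming domain
    if domain in reputable_domains:
        return Action.ACCEPT   # 3: genuine and whitelisted
    return Action.FILTER       # 4: genuine but unknown, treat as today


# Hypothetical ratings, for illustration only.
spam_domains = {"bulk-offers.example"}
reputable_domains = {"example.com"}
print(classify(True, "newcomer.example", spam_domains, reputable_domains))
```

Only the fourth branch needs the expensive content-based filtering, which is where the savings come from.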

There are a number of ways to authenticate a sender's domain name (SPF[2] (http://spf.pobox.com), SenderID[3] (http://www.microsoft.com/mscorp/twc/privacy/spam/senderid/default.mspx), CSV[4] (http://mipassoc.org/csv/index.html), DomainKeys[5] (http://antispam.yahoo.com/domainkeys)). All are very effective in stopping the kind of forgery now prevalent. None excludes the use of other methods, although SPF, CSV, and SenderID appear to be competing for the same niche. The most widely used will likely be the ones that require the least effort on the part of ISPs and others currently operating public mail servers.

SPF, CSV, and SenderID authenticate just a domain name. DomainKeys uses a digital signature to authenticate both the domain name and the entire content of a message. SPF and CSV can reject a forgery before any data transfer. SenderID and DomainKeys must see at least the headers. SPF and SenderID may have a problem with forwarders (see below).

SPF, CSV, and SenderID work by tying a temporary IP address to a claimed domain name. Every incoming email has an IP address that cannot be forged {3}, a bunch of domain names in the email headers, and a few more in the commands from the sender's SMTP server. The methods differ in which of these names to use as the sender's domain name. All of them can be faked, but what cannot be faked is a domain name held by a DNS server for that section of the Internet {4}.
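To make the "which name?" question concrete, the sketch below pulls the candidate names out of one message. The mapping is a simplified reading of the respective proposals (SPF checks the domain in the SMTP MAIL FROM command, CSV the name given in HELO/EHLO, SenderID a "purported responsible address" chosen from the headers, DomainKeys the Sender or From header) and should be treated as an approximation. The sample values and function names are invented.

```python
from email import message_from_string
from email.utils import parseaddr


def domain_of(address: str) -> str:
    """Return the lower-cased domain part of an address such as 'A <a@b.c>'."""
    return parseaddr(address)[1].rpartition("@")[2].lower()


def candidate_identities(helo_name: str, mail_from: str, raw_headers: str) -> dict:
    """List the domain name each method would try to authenticate."""
    msg = message_from_string(raw_headers)
    # Simplified SenderID choice: the first of these headers that is present.
    pra_order = ("Resent-Sender", "Resent-From", "Sender", "From")
    pra = next((msg[h] for h in pra_order if msg[h]), "")
    return {
        "CSV": helo_name.lower(),
        "SPF": domain_of(mail_from),
        "SenderID": domain_of(pra),
        "DomainKeys": domain_of(msg["Sender"] or msg["From"] or ""),
    }


# Hypothetical forwarded message: four methods, four different names.
HEADERS = (
    "Resent-From: bob@forwarder.example\n"
    "From: Alice <alice@example.com>\n"
    "\n"
)
ids = candidate_identities("mail.example.net", "<bounces@example.org>", HEADERS)
for method, name in ids.items():
    print(method, name)
```

Every one of these strings is supplied by the sending side, which is why none of them means anything until it has been checked against DNS.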

Image:Email_Authentication_02a.png

The procedure to authenticate is basically simple. When a request to deliver an email arrives, the claimed sender's domain name is sent in a query to a high-level Domain Name Server. That DNS server, in turn, refers to lower-level servers until an answer is found that is authoritative for the domain in question. The answer returned to the receiver includes the information needed to authenticate the email. For SPF and SenderID, the query returns the IP addresses that are authorized to send mail on behalf of that domain. Typically there will be very few authorized SMTP sending addresses, even from a domain with millions of dynamically assignable IPs. For DomainKeys, the query returns the public key for the domain, which is then used to validate the signature in the email. A successful validation proves that the domain name is not faked, and that neither the headers nor the body of the email were altered on the way from the sender.
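The IP-matching step for SPF can be sketched in a few lines. The example assumes the record text has already been fetched from DNS, and it evaluates only the `ip4`, `ip6`, and `all` mechanisms; a real SPF checker also handles `a`, `mx`, `include`, and others. The function name and the sample record are made up for illustration.

```python
import ipaddress

# SPF qualifiers and the result each one produces when its mechanism matches.
RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def spf_check(record: str, client_ip: str) -> str:
    """Evaluate the ip4/ip6/all mechanisms of an SPF record, left to right."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # the domain publishes no usable SPF record
    ip = ipaddress.ip_address(client_ip)
    for term in terms[1:]:
        qualifier = "+"  # the default qualifier when none is written
        if term[0] in RESULTS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name == "all":
            return RESULTS[qualifier]
        if name in ("ip4", "ip6"):
            if ip in ipaddress.ip_network(value, strict=False):
                return RESULTS[qualifier]
        # Any other mechanism or modifier is skipped in this sketch.
    return "neutral"


# Hypothetical record: one small block of senders, everything else fails.
RECORD = "v=spf1 ip4:192.0.2.0/24 -all"
print(spf_check(RECORD, "192.0.2.10"))
print(spf_check(RECORD, "198.51.100.7"))
```

The first call matches the authorized block and the second falls through to `-all`, so a receiver could reject the second message before accepting any data.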

A spammer has no access to any of the connections between these DNS servers. Even if he were to falsify records in the DNS server for his own domain, he would not be able to forge someone else's domain name. When a spammer tries to send an email claiming to be from amazon.com, for example, the receiver queries the .com DNS server, then a server in a secure building at Amazon. The IP address on the message from the spammer won't match any of Amazon's authorized IP addresses, and the email can be rejected. Alternatively, a DomainKeys check will show that the signature in the email is invalid.
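The signature idea can be illustrated with textbook RSA. This is a toy and not the DomainKeys algorithm: real DomainKeys also canonicalizes the message and fetches the public key from DNS, and real keys are far larger. The key here is built from two well-known Mersenne primes purely so the example needs no third-party library, and all names are invented.

```python
import hashlib

# Toy key pair. (N, E) plays the role of the public key published in DNS;
# D is the private key that only the sending domain holds.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # modular inverse, needs Python 3.8+


def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Done by the sender, using the private exponent."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Done by the receiver, using only the public values N and E."""
    return pow(signature, E, N) == digest(message)


original = b"From: orders@example.com\r\n\r\nYour order has shipped."
signature = sign(original)
forged = original.replace(b"shipped", b"been cancelled, log in here")
print(verify(original, signature))
print(verify(forged, signature))
```

A spammer can copy the message and the signature, but any change to the content makes the check fail, and producing a fresh valid signature requires the private key.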

Use of the DNS database to register authentication information for a domain is relatively new. The new information is added to existing DNS records, and queries for this information are handled the same way as any other DNS query. Publishing authentication records in DNS is voluntary, and many domains probably won't bother. However, any legitimate domain, even those that don't intend to operate public mail servers, will most likely want to block others from using their name to forge emails. A simple code in their DNS record will tell the world, "Block all mail claiming to be from our domain. We have no public mail servers."
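In SPF, for instance, that code is a record containing nothing but the version tag and a `-all` term. The sketch below shows the record text and a check for it; the helper name is invented for illustration.

```python
# SPF record for a domain that sends no mail at all: no senders are
# authorized, and "-all" tells receivers to fail everything.
NO_MAIL_RECORD = "v=spf1 -all"


def declares_no_mail(record: str) -> bool:
    """Return True if the SPF record authorizes no senders whatsoever."""
    return record.lower().split() == ["v=spf1", "-all"]


print(declares_no_mail(NO_MAIL_RECORD))
print(declares_no_mail("v=spf1 ip4:192.0.2.0/24 -all"))
```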

The Problem with Forwarders

At this point, you probably know all you need to know about email authentication, but there are some additional details when an email forwarder is involved. Forwarders perform a useful service in allowing you to have one simple permanent address, even if you change jobs or ISPs. List servers perform a similar function, forwarding email to many receivers on behalf of one sender.{5} Forwarders can complicate an IP-authentication method like SPF or SenderID. They pose no problem for an end-to-end authentication method like DomainKeys. CSV limits its focus to one-hop authentications, and assumes a signature method will be used for end-to-end authentication.

Image:Email_Authentication_03d.png

Use of a forwarder prevents the receiver from directly seeing the sender's IP address. The incoming IP packets have only the forwarder's IP address. Two solutions are possible. Either you trust the forwarder to authenticate the sender, or you trust the forwarder to at least accurately record the incoming IP address and pass it on, so you can do your own authentication.
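The second option usually means reading the Received header that the forwarder added. The sketch below assumes the forwarder writes the connecting address in square brackets, a common but not universal layout, and that its header can be recognized by the host name after "by". Host names, addresses, and the function name are invented.

```python
import re
from email import message_from_string
from typing import Optional

BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def ip_recorded_by(raw_message: str, trusted_forwarder: str) -> Optional[str]:
    """Return the client IP noted in the Received header added by the forwarder."""
    msg = message_from_string(raw_message)
    for received in msg.get_all("Received", []):
        tokens = received.split()
        # Look for the adjacent pair of words "by <forwarder>".
        if ("by", trusted_forwarder) in zip(tokens, tokens[1:]):
            match = BRACKETED_IPV4.search(received)
            return match.group(1) if match else None
    return None


SAMPLE = (
    "Received: from forwarder.example.org (forwarder.example.org [198.51.100.5])\n"
    "\tby mx.receiver.example with SMTP; Tue, 1 Mar 2005 10:00:02 -0700\n"
    "Received: from mail.sender.example (mail.sender.example [192.0.2.10])\n"
    "\tby forwarder.example.org with SMTP; Tue, 1 Mar 2005 10:00:01 -0700\n"
    "From: author@sender.example\n"
    "\n"
    "Hello\n"
)
print(ip_recorded_by(SAMPLE, "forwarder.example.org"))
```

The address returned is only as good as the forwarder that wrote it. Headers below the trusted forwarder's own line could have been typed in by anyone.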

The situation gets complicated when there is more than one forwarder. A sender can explicitly authorize a forwarder to send on its behalf, in effect extending its boundary to the public Internet. A receiver can trust a forwarder that it pays to handle email, in effect designating a new receiver. There may be additional "MTA Relays" in the middle, however. These are sometimes used for administrative control, traffic aggregation, and routing control. All it takes is one broken link in the chain-of-trust from sender to receiver, and it is no longer possible to authenticate the sender.
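The chain-of-trust rule reduces to a simple test: every hop between sender and receiver must be one that either the sender has authorized or the receiver trusts. A minimal sketch, with invented host names and a made-up function name:

```python
from typing import Iterable, Optional


def first_broken_link(hops: Iterable[str], trusted: set) -> Optional[str]:
    """Return the first hop that nobody vouches for, or None if the chain holds."""
    for hop in hops:
        if hop not in trusted:
            return hop
    return None


# Hypothetical path: a forwarder the sender authorized, an unknown relay,
# and a forwarding service the receiver pays for.
trusted = {"alumni.sender.example", "inbound.receiver.example"}
path = ["alumni.sender.example", "relay.transit.example", "inbound.receiver.example"]
print(first_broken_link(path, trusted))
```

Here the unknown relay in the middle is enough to make the sender's IP address unverifiable, even though both ends of the path are trusted.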

Forwarders have one other responsibility, and that is to properly route Delivery Status Notifications (DSNs) and spam bounces. Normal DSNs should be sent straight to an address chosen by the sender. Spam bounces should not be sent to any address that may be forged. These bounces may go back by the same path they came, if that path has been authenticated.
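These routing rules can be summarized as a small decision function. It is a sketch of the policy as stated above, not of any particular mail server; the parameter names and the two-way split into "dsn" and "spam" are assumptions made for illustration.

```python
from typing import Optional


def bounce_target(kind: str, sender_chosen: str, previous_hop: str,
                  path_authenticated: bool) -> Optional[str]:
    """Decide where a forwarder should send a notification, if anywhere."""
    if kind == "dsn":
        # Normal delivery status: straight to the address the sender chose.
        return sender_chosen
    if kind == "spam":
        # Never bounce spam to an address that may be forged. Send it back
        # along the incoming path only if that path was authenticated.
        return previous_hop if path_authenticated else None
    raise ValueError(f"unknown notification kind: {kind!r}")


print(bounce_target("dsn", "bounces@sender.example", "mail.sender.example", False))
print(bounce_target("spam", "bounces@sender.example", "mail.sender.example", False))
```

Returning `None` for an unauthenticated spam bounce means the bounce is simply not sent, which keeps forged addresses from being flooded with other people's rejections.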

Footnotes

{1} The term sender has many meanings to email experts. Our use of it here is not an endorsement of any one point of view. See IETF draft-crocker-email-arch-04 for precise definitions.
{2} America Online claims to have eliminated outgoing spam. http://www.circleid.com/article/917_0_1_0_C/ A small sample of reports from SpamCop seems to validate this. http://forum.spamcop.net/forums/index.php?showtopic=3675&hl=whitelist&st=50
{3} IP address forgery is possible, but generally involves a lower level of criminal behavior (breaking and entering, wiretapping, etc.), and these crimes are neither exciting to a hacker nor sufficiently risk-free for a typical spammer.
{4} There have been attacks on DNS servers, but doing this on a large scale over a long period of time may be orders of magnitude more difficult than spreading zombie infections among millions of insecure home computers. The much smaller number of DNS servers could be upgraded to use DNSSEC if such attacks were to become commonplace.
{5} Forwarders and List Servers are examples of email Mediators and should not be confused with packet routers. See IETF draft-crocker-email-arch-04 for a thorough discussion and precise definitions. Routers don't change the IP addresses in the packets. They are completely transparent to mail protocols. Now, if a spammer could hijack a router ...

References

[1] How mail flows through the Internet http://spf.pobox.com/mailflows.html
[2] Sender Policy Framework (SPF) http://spf.pobox.com/
[3] SenderID http://www.microsoft.com/mscorp/twc/privacy/spam/senderid/default.mspx
[4] Certified Server Validation (CSV) http://mipassoc.org/csv
[5] DomainKeys homepage http://antispam.yahoo.com/domainkeys