Email Authentication
Original article by David MacQuigg. For other views see Wikipedia, the free encyclopedia.
Ensuring a valid identity on an email has become a vital first step in
stopping spam, forgery, fraud, and even more
serious crimes. Unfortunately, the Simple Mail Transfer Protocol (SMTP)
that handles most email today was designed in an era when users of the Internet
were mostly honest techies who expected others to be equally honest. This
article will explain how email identities are forged and the steps that are
being taken now to prevent it.
Mail Transfer Chaos
In a simple mail transfer, there are
four key players: the author or originator of the email, the sender
{1} or agent who first puts the email on the public Internet, the receiver
or agent who gets the email from the Internet, and the recipient who is
the person supposed to read the email.[1]
When we say Internet, with a capital I, we mean the world-wide network
that shares a common set of IP addresses, not the internal networks before the
sender or after the receiver. For example, the computer I am writing this
article on shares a local network with other computers having addresses I can assign
at will. My network connects via a router to the network of my Internet Service Provider (ISP),
and he can assign whatever addresses he wants within his network,
including the address of my router. It is only when he connects his router to
the Internet that a real Internet IP address is needed.
Other than the sender's IP
address, there is no verification of any information in an email. It is quite
easy for a spammer to make an exact copy of an email from smithbarney.com,
including a long complicated sequence of headers and a genuine logo in the body
of an email, then change the content to send readers to a website that appears
to be genuine, but is actually a "phishing" scam designed to capture
names, passwords, and credit card numbers.
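
To see how little of a message is verified, the sketch below builds one with Python's standard email library. The addresses, subject, and the helper name build_message are made up for illustration, and nothing is actually sent. The only point is that every header is a string the author types in.

```python
from email.message import EmailMessage


def build_message(claimed_from, to, subject, body):
    """Build an email whose headers are whatever the caller says they are."""
    msg = EmailMessage()
    msg["From"] = claimed_from   # accepted as given; nothing checks it
    msg["To"] = to
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


# Hypothetical addresses; example.com and example.net are reserved for documentation.
forged = build_message(
    claimed_from="Customer Service <service@example.com>",
    to="victim@example.net",
    subject="Please confirm your account",
    body="Click the link below to confirm your password.",
)
print(forged["From"])
```

Nothing in the library, or in SMTP itself, asks whether the author has any right to use the name in the From header.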
So why can't the sender's IP
address be used to identify the spammer? There are two problems. One is that
spammers often work through forwarders to hide their IP addresses (see below).
Another is that the sender is often a zombie, a computer that has been infected by a
virus and is programmed to send spam without the owner even knowing about
it. There are millions of insecure home computers, and they have now (March
2005) become a major source of spam.
Attempts to stop spam by blacklisting
senders' IP addresses have failed to reverse the worldwide growth of spam. Most IP
addresses are dynamic, i.e. they change frequently. An ISP, or any
organization directly connected to the Internet, is allocated a block of real Internet
addresses by an Internet address registry or an upstream provider. Within that block,
they assign individual addresses to customers as needed. A dial-up customer may
get a new IP address each time they connect. By the time that address appears
on blacklists all over the world, the spammer will have new addresses for the
next run. There are 4 billion possible IP addresses on the Internet. The game
of keeping up with these rapidly changing IP addresses has been facetiously
called "whack-a-mole".
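
The arithmetic and the weakness of exact-match blacklists can be shown in a few lines. This is only a sketch: the addresses come from ranges reserved for documentation, and the function name is_blacklisted is an assumption, not part of any real blacklist service.

```python
import ipaddress

# Size of the IPv4 address space: addresses are 32 bits long.
TOTAL_IPV4 = 2 ** 32


def is_blacklisted(ip, blacklist):
    """Exact-match lookup, as a simple IP blacklist would do."""
    return str(ipaddress.ip_address(ip)) in blacklist


# Hypothetical data: the address used for yesterday's spam run.
blacklist = {"192.0.2.10"}

print(TOTAL_IPV4)                                # 4294967296
print(is_blacklisted("192.0.2.10", blacklist))   # True: the old address is blocked
print(is_blacklisted("192.0.2.77", blacklist))   # False: the new dynamic address gets through
```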
There are a number of things that
ISPs have done to stop zombies and deliberate spamming by their customers.
Infected computers can be cleared of viruses and patched to resist further
infection. Outgoing email can be monitored for any sudden increase in flow or
in content that is typical of spam. Some ISPs have been quite successful {2},
but others don't care to make the effort. With spam now over 80% of all email
traffic, we can expect that there will always be ISPs who are willing to
provide services for spammers.
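
As a rough illustration of monitoring for a sudden increase in flow, an ISP might compare a customer's current hourly count with that customer's recent average. The function name, the factor of 5, and the floor of 50 messages are assumptions chosen for the example, not values any ISP is known to use.

```python
from statistics import mean


def is_sudden_increase(hourly_counts, current_count, factor=5.0, min_messages=50):
    """Return True if current_count far exceeds this customer's recent average.

    hourly_counts: messages sent in each recent hour (hypothetical data source).
    factor and min_messages are illustrative thresholds only.
    """
    if current_count < min_messages:
        return False                      # too little mail to be worth flagging
    baseline = mean(hourly_counts) if hourly_counts else 0.0
    return current_count > factor * max(baseline, 1.0)


print(is_sudden_increase([3, 5, 2, 4], 6))      # False
print(is_sudden_increase([3, 5, 2, 4], 900))    # True
```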
Authenticating Senders
Email authentication greatly
simplifies and automates the process of identifying senders. By quickly
verifying the claimed domain name, it is possible to reject forgeries and block
email from known spamming domains. It is also possible to "whitelist"
email from known reputable domains, and bypass content-based filtering, which
always loses some valid emails in the flood of spam. The fourth category, email
from unknown domains, can be treated the same as we now treat all email –
rigorous filtering, return challenges to the sender, etc. Success of a
domain-rating system will encourage reputable ISPs to stop their outgoing spam
and get a good rating.
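
The four categories above can be written as a small decision function. This is a sketch under simplifying assumptions: authentication is reduced to a yes/no result, and the two reputation lists are plain sets supplied by the caller. The names Action and classify are invented for the example.

```python
from enum import Enum


class Action(Enum):
    REJECT = "reject"    # forgery, or a known spamming domain
    ACCEPT = "accept"    # reputable domain: bypass content filtering
    FILTER = "filter"    # unknown domain: treat as all email is treated now


def classify(domain, authenticated, spam_domains, reputable_domains):
    """Map an authentication result plus reputation lists to an action."""
    if not authenticated:
        return Action.REJECT      # the claimed domain name failed verification
    if domain in spam_domains:
        return Action.REJECT      # verified, but a known spamming domain
    if domain in reputable_domains:
        return Action.ACCEPT      # verified and reputable
    return Action.FILTER          # verified but unknown
```

For example, classify("new.example", True, set(), set()) returns Action.FILTER, because the name is genuine but nothing is known about it yet.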
There are a number of ways to
authenticate a sender's domain name (SPF[2], SenderID[3], CSV[4],
DomainKeys[5]). All are very effective in stopping the kind of forgery now prevalent. None
exclude the use of other methods, although SPF, CSV, and SenderID appear to be
competing for the same niche. The most widely used will likely be the ones that
require the least effort on the part of ISPs and others currently operating
public mail servers.
SPF, CSV, and SenderID
authenticate just a domain name. DomainKeys uses a digital signature to authenticate domain names
and the entire content of a message. SPF and CSV can reject a forgery before
any data transfer. SenderID and DomainKeys must see at least the headers. SPF
and SenderID may have a problem with forwarders (see below).
SPF, CSV, and SenderID work by
tying a temporary IP address to a claimed domain name. Every incoming email has
an IP address that cannot be forged {3}, a bunch of domain names in the email
headers, and a few more in the commands from the sender's SMTP server. The
methods differ in which of these names to use as the sender's domain name.
All of them can be faked, but what cannot be faked is a domain name held by a
DNS server for that section of the Internet {4}.
The procedure to authenticate is
basically simple. When a request to deliver an email arrives, the claimed
sender's domain name is sent in a query to a high-level Domain Name Server. That
DNS server, in turn, refers to lower-level servers until an answer is found that
is authoritative for the domain in question. The answer returned to the
receiver includes the information to authenticate the email. For SPF and
SenderID, the query returns the IP addresses which are authorized to send mail
on behalf of that domain. Typically there will be very few authorized SMTP
sending addresses, even from a domain with millions of dynamically assignable
IPs. For DomainKeys, the query returns the public key for the domain, which
then validates the signature in the email. A successful validation proves that the
domain name is not faked, and that neither the headers nor the body of the email
was altered on its way from the sender.
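
For the IP-based methods, the comparison step might look like the sketch below. The table AUTHORIZED_SENDERS stands in for the answer a real DNS query would return; its contents and the function name ip_is_authorized are assumptions for illustration only.

```python
import ipaddress

# Stand-in for the DNS answer: domain -> networks authorized to send its mail.
# Hypothetical data; a real receiver would obtain this from a DNS query.
AUTHORIZED_SENDERS = {
    "example.com": ["192.0.2.0/28", "198.51.100.25/32"],
}


def ip_is_authorized(claimed_domain, sender_ip):
    """Return True or False if the domain publishes a list, None if it publishes nothing."""
    networks = AUTHORIZED_SENDERS.get(claimed_domain)
    if networks is None:
        return None                       # no record: the name cannot be authenticated
    address = ipaddress.ip_address(sender_ip)
    return any(address in ipaddress.ip_network(net) for net in networks)
```

A call such as ip_is_authorized("example.com", "192.0.2.200") returns False, because that address lies outside both authorized ranges, while a domain with no published list yields None rather than a verdict.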
A spammer has no access to any of
the connections between these DNS servers. Even if he were to falsify records
in the DNS server for his own domain, he would not be able to forge someone
else's domain name. When a spammer tries to send an email claiming to be from
amazon.com, for example, the receiver queries the .com DNS server, then a
server in a secure building at Amazon. The IP address on the message from the
spammer won't match any of Amazon's authorized IP addresses, and the email can
be rejected. Alternatively, the DomainKey will show the signature in the email
is invalid.
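
The referral chain can be modeled with a toy table in which each server either refers the query downward or answers it. Everything here (the server names, the SERVERS table, the resolve function) is hypothetical and far simpler than real DNS; it only shows why a spammer who controls none of these servers cannot change the answer for someone else's domain.

```python
# Toy delegation data: each "server" either refers to a lower-level server or answers.
SERVERS = {
    "root": {"refer": {"com": "com-server"}},
    "com-server": {"refer": {"example.com": "example-com-server"}},
    "example-com-server": {"answer": {"example.com": ["192.0.2.0/28"]}},
}


def resolve(domain, server="root", max_hops=10):
    """Follow referrals from the top down until a server answers for the domain.

    Returns (answer, path), where path lists the servers consulted in order.
    """
    path = []
    for _ in range(max_hops):
        path.append(server)
        zone = SERVERS[server]
        if domain in zone.get("answer", {}):
            return zone["answer"][domain], path
        # Pick the referral whose suffix matches the domain being looked up.
        next_server = None
        for suffix, target in zone.get("refer", {}).items():
            if domain == suffix or domain.endswith("." + suffix):
                next_server = target
        if next_server is None:
            return None, path             # nobody is authoritative for this name
        server = next_server
    return None, path


answer, path = resolve("example.com")
print(path)    # ['root', 'com-server', 'example-com-server']
print(answer)  # ['192.0.2.0/28']
```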
Use of the DNS database to
register authentication information for a domain is relatively new. The new
information is added to existing DNS records, and queries for this information
are handled the same way as any other DNS query. Publishing authentication
records in DNS is voluntary, and many domains probably won't bother. However,
any legitimate domain, even those that don't intend to operate public mail
servers, will most likely want to block others from using their name to forge
emails. A simple code in their DNS record will tell the world, "Block all
mail claiming to be from our domain. We have no public mail servers."
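
In SPF, such a "no mail from us" record is commonly written as v=spf1 -all. The sketch below evaluates a very small subset of SPF syntax (only the ip4 and all mechanisms, with + and - qualifiers) to show the effect. The function evaluate_spf is an illustration written for this article, not a real SPF library, and it ignores most of the actual specification.

```python
import ipaddress


def evaluate_spf(record, sender_ip):
    """Evaluate a tiny subset of SPF: 'ip4:' and 'all' mechanisms only.

    Returns "pass", "fail", or "neutral". Real SPF has more mechanisms,
    qualifiers, and result codes than this sketch handles.
    """
    terms = record.split()
    if not terms or terms[0] != "v=spf1":
        return "neutral"                       # not an SPF record
    address = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        qualifier = "+"
        if term[0] in "+-":
            qualifier, term = term[0], term[1:]
        if term == "all":
            matched = True
        elif term.startswith("ip4:"):
            matched = address in ipaddress.ip_network(term[4:], strict=False)
        else:
            continue                           # mechanism not supported in this sketch
        if matched:
            return "pass" if qualifier == "+" else "fail"
    return "neutral"


print(evaluate_spf("v=spf1 -all", "192.0.2.1"))                    # fail
print(evaluate_spf("v=spf1 ip4:192.0.2.0/24 -all", "192.0.2.1"))   # pass
```

With the record v=spf1 -all, every sending address fails, which is exactly the "block all mail claiming to be from our domain" instruction.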
The Problem with Forwarders
At this point, you probably know
all you need to know about email authentication, but there are some additional
details when an email forwarder is involved. Forwarders perform a useful
service in allowing you to have one simple permanent address, even if you
change jobs or ISPs. List servers perform a similar function, forwarding
email to many receivers on behalf of one sender.{5} Forwarders can complicate
an IP-authentication method like SPF or SenderID. They pose no problem for an
end-to-end authentication method like DomainKeys. CSV limits its focus to
one-hop authentications, and assumes a signature method will be used for
end-to-end authentication.
Use of a forwarder prevents the
receiver from directly seeing the sender's IP address. The incoming IP packets
have only the forwarder's IP address. Two solutions are possible. Either you
trust the forwarder to authenticate the sender, or you trust the forwarder to
at least accurately record the incoming IP address and pass it on, so you can
do your own authentication.
The situation gets complicated
when there is more than one forwarder. A sender can explicitly authorize a
forwarder to send on its behalf, in effect extending its boundary to the public
Internet. A receiver can trust a forwarder that it pays to handle email, in
effect designating a new receiver. There may be additional "MTA
Relays" in the middle, however. These are sometimes used for
administrative control, traffic aggregation, and routing control. All it takes
is one broken link in the chain-of-trust from sender to receiver, and it is no
longer possible to authenticate the sender.
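
One way to picture the chain of trust is as a walk backward from the receiver: as long as the connecting address belongs to a forwarder the receiver trusts, the receiver accepts the incoming address that forwarder recorded and looks at that one instead. The function and the data layout below are assumptions made for illustration; the addresses are from documentation ranges.

```python
def effective_sender_ip(connecting_ip, recorded_by):
    """Walk back through trusted forwarders to the address that must be authenticated.

    recorded_by: maps the IP of each forwarder the receiver trusts to the
                 incoming IP that forwarder recorded for this message.
    Returns the first address that is not a trusted forwarder.
    """
    ip = connecting_ip
    seen = set()
    while ip in recorded_by and ip not in seen:   # 'seen' guards against loops
        seen.add(ip)
        ip = recorded_by[ip]
    return ip


# Two trusted forwarders in a row: the walk reaches the original sender.
trusted = {"198.51.100.1": "198.51.100.2", "198.51.100.2": "203.0.113.5"}
print(effective_sender_ip("198.51.100.1", trusted))   # 203.0.113.5

# The second relay is not trusted: the walk stops at the relay itself.
broken = {"198.51.100.1": "198.51.100.2"}
print(effective_sender_ip("198.51.100.1", broken))    # 198.51.100.2
```

In the second case the address left to check is the untrusted relay's, which will not match the sender's authorized addresses, so the sender can no longer be authenticated.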
Forwarders have one other
responsibility, and that is to properly route Delivery Status Notifications (DSNs)
and spam bounces. Normal DSNs should be sent straight to an address
chosen by the sender. Spam bounces should not be sent to any address that may
be forged. These bounces may go back by the same path they came, if that path
has been authenticated.
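
These routing rules reduce to a short decision. In the sketch below, the function name and the return values are invented for the example, and discarding a spam bounce when the path is not authenticated is an assumption about what "should not be sent" means in practice.

```python
def route_notice(is_spam_bounce, return_address, path_authenticated):
    """Decide where a forwarder sends a notice about a message it could not deliver.

    Returns ("address", return_address), ("reverse_path", None), or ("drop", None).
    """
    if not is_spam_bounce:
        return ("address", return_address)   # normal DSN: go to the address the sender chose
    if path_authenticated:
        return ("reverse_path", None)        # send back along the authenticated path
    return ("drop", None)                    # the address may be forged: do not bounce to it
```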
Footnotes
{1} The term sender has many
meanings to email experts. Our use of it here is not an endorsement of any one
point of view. See IETF draft-crocker-email-arch-04 for precise
definitions.
{2} America Online claims to have eliminated outgoing spam. http://www.circleid.com/article/917_0_1_0_C/
A small sample of reports from SpamCop seems to validate this. http://forum.spamcop.net/forums/index.php?showtopic=3675&hl=whitelist&st=50
{3} IP address forgery is possible, but generally involves a lower level of
criminal behavior (breaking and entering, wiretapping, etc.), and these crimes
are neither exciting to a hacker, nor sufficiently risk-free for a typical
spammer.
{4} There have been attacks on DNS servers, but doing this on a large scale
over a long period of time may be orders of magnitude more difficult than
spreading zombie infections among millions of insecure home computers. The much
smaller number of DNS servers could be upgraded to use DNSSEC if such
attacks were to become commonplace.
{5} Forwarders and List Servers are examples of email Mediators
and should not be confused with packet routers. See IETF draft-crocker-email-arch-04
for a thorough discussion and precise definitions. Routers don't change the IP
addresses in the packets. They are completely transparent to mail protocols.
Now, if a spammer could hijack a router ...
References
[1] How mail flows through the Internet http://spf.pobox.com/mailflows.html
[2] Sender Policy Framework (SPF) http://spf.pobox.com/
[3] SenderID http://www.microsoft.com/mscorp/twc/privacy/spam/senderid/default.mspx
[4] Certified Server Validation (CSV) http://mipassoc.org/csv
[5] DomainKeys homepage http://antispam.yahoo.com/domainkeys