Spam Scenarios                                     5/6/05

In an effort to understand the fundamental problems with stopping spam, I'm putting together some scenarios showing how spam might be handled once email authentication is widely accepted.  I've assumed an IP-authentication method, like SPF or SenderID.  Equivalent scenarios should be done with DomainKeys.

I'm trying to stick fairly close to the terminology and entities as Dave Crocker has described them in Internet Mail Architecture (http://www.ietf.org/internet-drafts/draft-crocker-email-arch-03.txt).  In some places I've used words like sender, receiver, and forwarder, rather than the more formally defined source, destination, and mediator.  While these are not all-inclusive, I think the more commonly used words will facilitate discussion.

Here is a figure that we can use for discussion of the scenarios.  It is based on Fig. 3 in the above draft.

 

   +------------+                         +-----------+
   | Originator |                         | Recipient |
   +-----+------+                         +-----------+
         |                                      ^
         |                                      |
         V                 Return-Path          |
     +---------+    +--------+             +----+-----+
     |         |    | Notice |<------------+          |
     | Sender  +--->| Handler|     DSN     | Receiver |
     |         |    |        |<---+        |          |
     +----+----+    +--------+    |        +----------+
          |                       |             ^
          V                       |             |
     +---------+             +----+----+   +----+----+
     |Forwarder+-->  - -  -->|Forwarder+-->|Forwarder|
     +---------+             +----+----+   +----+----+
                                                ^
                         <-- Spam Bounce <--    |
            +---------+                    +---------+
            | Spammer +-->   - - - -    -->|Forwarder|
            +---------+                    +---------+

A.  Spammer trying to hide by using a forwarder

1.       Spammer creates a pile of headers that looks like a chain of legitimate Forwarders, using fake addresses everywhere, including the Return-Path.

2.       Spam is injected at an insecure Forwarder along a normal path known to the Spammer.

3.       Spoofed Forwarder authenticates Spammer's Forwarder, and records the IP and domain name in an authentication header.

4.       Spam is clever and gets past blocklists and filters all the way to Recipient.

5.       Recipient verifies it is spam, and decides to "Report as Spam", rather than just delete it.

6.       Receiver's sysadmin reviews the spam report and chooses any or all of the following options:

a.       Blocklist the domain for the Recipient.

b.      Blocklist it for the entire site.

c.       Bounce the spam to the previous Forwarder.

d.      Report it to any or all of the domain lists to which Receiver is subscribed.

7.       Forwarder receives the bounce, and adds a black mark to the tally for the domain of the previous Forwarder.

8.       Manager at Forwarder notices a large number of spams coming from a particular domain, and decides to investigate.

9.       Based on the result of that investigation, the Forwarder manager might block the domain, bounce the reports upstream, or report to the domain-lists.

10.   Domain list manager has 24/7 monitoring of spam sources worldwide.  Possible actions include:

a.       Downgrading a domain's rating as low and for as long as necessary.

b.      Notifying the domain owner to resolve the problem.
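
As a rough illustration of steps 7 and 8, here is a minimal Python sketch of the tally a Forwarder might keep.  The class name, method names, and threshold value are my own assumptions; the scenario only says "a large number of spams".

```python
from collections import Counter

# Hypothetical threshold: the scenario only says "a large number".
REVIEW_THRESHOLD = 100


class BounceTally:
    """Counts spam bounces against the authenticated upstream domain."""

    def __init__(self, threshold=REVIEW_THRESHOLD):
        self.threshold = threshold
        self.marks = Counter()

    def record_bounce(self, upstream_domain):
        """Add one black mark (step 7) and return the new total."""
        domain = upstream_domain.lower()
        self.marks[domain] += 1
        return self.marks[domain]

    def domains_to_investigate(self):
        """Domains whose tally warrants the manager's attention (step 8)."""
        return sorted(d for d, n in self.marks.items() if n >= self.threshold)


# Example with a deliberately small threshold.
tally = BounceTally(threshold=3)
for domain in ["bad.example", "ok.example", "bad.example", "BAD.example"]:
    tally.record_bounce(domain)
```

The tally only makes sense if it is keyed on the authenticated domain of the previous Forwarder, not on anything taken from unverified headers.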

Comments:

1.       Blocking and downgrading must be done only on authenticated domains, reported by trusted Forwarders or Receivers.  Often the true Spammer's domain will appear only as a Forwarder halfway down the list of headers, with plenty of legitimate-looking headers below that point.

2.       To detect fake bounces, Forwarders must keep a record of forwarded emails.  This could be as compact as a file with a hash code for each message.  If the hash code on a bounced message doesn't match a code at the right time-point in the record, something was altered.
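
The record described in comment 2 might look something like the Python sketch below.  It assumes the bounce returns the message bytes exactly as they were forwarded, and it uses SHA-256 and a five-minute tolerance; neither choice comes from the scenario, and all names are hypothetical.

```python
import hashlib
from datetime import datetime, timedelta


class ForwardLog:
    """Keeps one hash per forwarded message so that bounces can be checked."""

    def __init__(self, window=timedelta(minutes=5)):
        # Assumed tolerance between the logged time and the time the bounce claims.
        self.window = window
        self.entries = {}  # hex digest -> time the message was forwarded

    @staticmethod
    def digest(message_bytes):
        return hashlib.sha256(message_bytes).hexdigest()

    def record(self, message_bytes, when):
        """Log a message at the moment it is forwarded."""
        self.entries[self.digest(message_bytes)] = when

    def is_genuine_bounce(self, bounced_bytes, claimed_time):
        """True only if the hash is on record near the claimed time."""
        forwarded_at = self.entries.get(self.digest(bounced_bytes))
        if forwarded_at is None:
            return False  # never forwarded, or altered since
        return abs(forwarded_at - claimed_time) <= self.window


log = ForwardLog()
sent_at = datetime(2005, 2, 28, 10, 42, 29)
log.record(b"Subject: hello\r\n\r\nbody", sent_at)
```

A real Forwarder would have to decide which parts of the message to hash, since systems downstream add headers of their own.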

B. Spammer trying to get a domain name with a good rating

Assume a domain-rating company provides a list of domains with the following ratings:

A)  Proven good domains.  These can bypass spam filtering.

B)  Trusted, but not yet proven.  New domains that have not yet earned an A rating.

C)  Unknown.  This is the default rating (not an actual list), and would include the vast majority of domain names that don't need to operate public mail servers.

D)  Known spamming domains.  These can be blocked without even downloading.
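
A receiver might apply these ratings roughly as in the Python sketch below.  The function names and the action strings are assumptions for illustration.  Treating C the same as B (run it through the spam filter) is also my assumption; the list above only says what happens to A and D.

```python
from enum import Enum


class Rating(Enum):
    A = "proven good"
    B = "trusted, not yet proven"
    C = "unknown"
    D = "known spammer"


def rating_for(domain, listed):
    """Look up a domain; anything not on a list gets the default rating C."""
    return listed.get(domain.lower(), Rating.C)


def action_for(rating):
    """Map a rating to what the receiving MTA does with the message."""
    if rating is Rating.A:
        return "accept"   # bypass spam filtering
    if rating is Rating.D:
        return "reject"   # block without downloading
    return "filter"       # B and C both go through the spam filter (assumed for C)


# Hypothetical excerpt of a rating service's lists.
listed = {"good.example": Rating.A, "new.example": Rating.B, "spam.example": Rating.D}
```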

C-rated domain names are a dime a dozen.  The challenge for the spammer is to get a B-rating long enough that he can send out a profitable amount of spam before that domain is downgraded to D.

Upgrading to B is done by filling out an online application with the domain-rating service.  Most of these applications are from corporations registered in one or another state.  The approval process then requires only checking the corporate records and calling a representative of that corporation to verify that the application is not forged.  The cost of establishing a corporation is far larger than any benefit from a few hours of spamming, so this is an unlikely option for spammers.

That leaves individuals.  The spammer can't use his own name because of his criminal and credit history.  He must use a stolen ID, or a dummy (someone who applies in his own name without being aware of what he is doing).  Both of these are far more costly than the benefit of a few hours of spamming.  Most of the spam won't get through anyway, because B domains are filtered.

C.  ISP using domain ratings with forwarded mail

I'm an ISP with many subscribers who use forwarders like pobox.com on incoming mail.  How do I determine the domain to feed into my spam filter?

1.       Make a list of recognized forwarders that you trust to faithfully authenticate their incoming mail.

2.       When mail arrives from those forwarders, after first authenticating the forwarder, use their authentication header as if it were your own.
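
The two steps above can be sketched in Python as follows.  This assumes the authentication results have already been extracted into (domain, result) pairs, newest first, and that "PASS" is the only acceptable result; the function name and return convention are hypothetical.

```python
def domain_to_rate(auth_records, trusted_forwarders):
    """Pick the domain to feed into the spam filter.

    auth_records: (domain, result) pairs in top-down header order, so the
    first pair is the one this ISP's own incoming MTA recorded.
    Returns None when no authenticated, non-forwarder domain is found.
    """
    for domain, result in auth_records:
        if result != "PASS":
            return None  # authentication failed; nothing further down is usable
        if domain.lower() not in trusted_forwarders:
            return domain.lower()  # first authenticated domain that is not a trusted forwarder
        # Trusted forwarder: treat its authentication header as our own and keep going.
    return None


trusted = {"pobox.com"}
chain = [("pobox.com", "PASS"), ("amazon.com", "PASS"), ("smithbarney.com", "PASS")]
```

With this chain the function returns "amazon.com", the domain that the trusted forwarder says it authenticated.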

D. Authentication Headers

Here are the headers from a typical phishing scam.  Authentication headers have been added.  Drawing a line under each authentication header shows the boundaries of each administrative domain.

Return-Path: <SRS0=BF6T=RK=amazon.com=forged@bounce2.pobox.com>
Delivered-To: dmx@gainusa.com
Received: ( **SaniMail 47655 invoked from network** ); 28 Feb 2005 15:35:36 -0000
Received: from unknown (HELO gold.pobox.com) (208.210.124.73)
  by mail5.mailsystem.us with SMTP; 28 Feb 2005 15:35:36 -0000
Authent: 208.210.124.73 pobox.com CSV1 PASS
Received: from gold.pobox.com (localhost [127.0.0.1])
  by gold.pobox.com (Postfix) with ESMTP id 6CFD67111A
  for <dmx@gainbroadband.com>; Mon, 28 Feb 2005 10:42:29 -0500 (EST)
Delivered-To: dmx@pobox.com
Received: from gold (localhost [127.0.0.1])
  by gold.pobox.com (Postfix) with ESMTP id 4DE60710B8
  for <dmx@pobox.com.07422030.000.icgmh>; Mon, 28 Feb 2005 10:42:29 -0500 (EST)
X-Pobox-Pass: forged@amazon.com is whitelisted
Received: from amazon.com (unknown [216.183.71.194])
  by gold.pobox.com (Postfix) with SMTP id 51DD271180
  for <dmx@pobox.com>; Mon, 28 Feb 2005 10:42:26 -0500 (EST)
Authent: 216.183.71.194 amazon.com SPF1 PASS
Received: from [219.130.16.181] smithbarney.com (helo=67.15.58.29 verified)
  by server5.emwd.com with smtp (Exim 4.44) id 1Cxh3v-0006Ab-8W
  for cdx-design@cdx-design.org; Sun, 06 Feb 2005 02:42:52 -0500
Authent: 219.130.16.181 smithbarney.com SenderID PASS
FCC: mailbox://identifdep_id7@smithbarney.com/Sent
X-Identity-Key: id1
Date: Sun, 06 Feb 2005 06:36:53 -0100
From: SmithBarney <identifdep_id7@smithbarney.com>
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: cdx-design@cdx-design.org
Subject: Attention To AII Smith Barney CIients
Content-Type: multipart/related; x-Spamnix=checked;
  boundary="------------060705020202080904070006"
X-Antivirus-Scanner: Clean mail though you should still use an Antivirus
Message-Id: 20050228154226.51DD271180@gold.pobox.com

Authentication headers are added only when an email crosses an administrative boundary.  We really don't care about all the intermediate systems within email forwarders like pobox.com.  The authentication header appears immediately below the Received line for the incoming MTA in each administrative domain.  It includes only authenticated information (IP address and domain name), and keywords indicating the authentication method and result.

Unlike Received headers, an authentication header is an affirmative statement – "I authenticated this domain, using this protocol, and got this result".  False information here is evidence of lying or gross negligence, not just forgivable errors in processing a complex pile of headers.  This will greatly reduce the administrative burden in deciding whether to downgrade a domain.

A simple standard format for authentication headers will allow a spam filter to quickly sort through piles of headers, skipping past multiple Received lines and other clutter within its own domain and any domains of trusted forwarders, and getting straight to the domain that needs to be blocked or filtered.
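
To show how little work the filter has to do, here is a minimal Python sketch that pulls the authentication headers out of a pile of headers.  It assumes the four-field "Authent:" layout used in the example above (IP address, domain, method, result), which is this document's illustration and not a published standard; the function and type names are hypothetical.

```python
import re
from typing import NamedTuple


class Authent(NamedTuple):
    ip: str
    domain: str
    method: str
    result: str


# Matches lines such as "Authent: 208.210.124.73 pobox.com CSV1 PASS".
AUTHENT_RE = re.compile(r"^Authent:\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s*$")


def parse_authent_headers(raw_headers):
    """Return the authentication headers in top-down order, ignoring everything else."""
    records = []
    for line in raw_headers.splitlines():
        match = AUTHENT_RE.match(line)
        if match:
            records.append(Authent(*match.groups()))
    return records


sample = """\
Received: from unknown (HELO gold.pobox.com) (208.210.124.73)
  by mail5.mailsystem.us with SMTP; 28 Feb 2005 15:35:36 -0000
Authent: 208.210.124.73 pobox.com CSV1 PASS
Received: from amazon.com (unknown [216.183.71.194])
  by gold.pobox.com (Postfix) with SMTP id 51DD271180
Authent: 216.183.71.194 amazon.com SPF1 PASS
"""
records = parse_authent_headers(sample)
```

Because the records come out in top-down order, a filter can walk them from its own domain outward and stop at the first domain it does not already trust.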

In the example above, the domains amazon.com and smithbarney.com were forged.  The statement that amazon.com was authenticated by pobox.com is false.  (In this case it really was incompetence, due to initial setup confusion at pobox.com, which ignored Amazon's SPF policy clearly saying to reject any email that doesn't come from Amazon's authorized senders.)

The lines below pobox.com, including the smithbarney.com authentication, were added by the spammer and are a complete fabrication.  Neither EMWD nor Smith Barney is responsible.  Nothing below the first false authentication header should be trusted.

In this case, the Spam Bounce path is  pobox.com --> amazon.com --> smithbarney.com

The spam complaint would go to abuse@pobox.com, as it should.  Hopefully at that point, PoBox would realize its mistake and not bother Amazon with the complaint.

E.  DNS Loading             4/14/05

Assume:
      2,000 zombies, widely distributed
     50,000 emails from each zombie
100,000,000 recipient addresses, widely distributed
    100,000 recipient domains
          3 hops from sender to receiver
Then:
  2000 senders --> 3 hops -->  100,000 receivers
     approx. 150,000 MTAs needing to authenticate
 
Scenario E1:  All DNS queries to bigdomain.com

    bigdomain.com has one authentication record for entire domain
    Total 150,000 queries, cached for 48 hours
 
Scenario E2: 

   DNS queries to 1000 servers, distributed over bigdomain.com
   Typical server:  serv138.austin.bigdomain.com
   150,000 MTAs x 1000 servers = 150,000,000 queries !!
   Client caches are 1000X larger, and 1000X less
     likely to hit.
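
The arithmetic behind E1 and E2 can be checked with a few lines of Python.  This is only a sketch: it takes the estimate of 150,000 authenticating MTAs as given, and it assumes each MTA makes one uncached query per distinct name it has to look up.  The function names are hypothetical.

```python
ZOMBIES = 2_000
EMAILS_PER_ZOMBIE = 50_000
AUTHENTICATING_MTAS = 150_000  # the estimate stated above, taken as given


def total_emails(zombies, emails_per_zombie):
    """Total messages sent in the attack."""
    return zombies * emails_per_zombie


def worst_case_queries(mtas, distinct_names):
    """One query per MTA per distinct name; later lookups are served from cache."""
    return mtas * distinct_names


emails = total_emails(ZOMBIES, EMAILS_PER_ZOMBIE)    # 100,000,000
e1 = worst_case_queries(AUTHENTICATING_MTAS, 1)      # 150,000
e2 = worst_case_queries(AUTHENTICATING_MTAS, 1_000)  # 150,000,000
```

The factor of 1000 between E1 and E2 comes entirely from the number of distinct names that have to be resolved, not from the volume of mail.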

 

Scenario E2a:

   bigdomain.com has 75 valid third-level subdomains, and SPF records
     are provided at that level.

Scenario E2a1:

   Queries all include valid subdomains.

F. Nuclear Option

With an effective domain-rating system, spammers will have only two options: try to "fly under the radar" with flows small enough that they are not detected, or "go nuclear" with a full-force strike that is completed before the system can react.  The slow-flow scenario is least likely to succeed.  Without a history of legitimate mail from the domain, the assumption will be that it is pure spam.  Even if it takes a day to downgrade the domain, there won't be enough output to cover the cost of registration.

The more worrisome scenario is a spammer who might hijack a reputable name, buy access to a major pipeline, and pump out maybe 10 million spams per hour.  To ensure a quick reaction, rating services will need a well-distributed network of spam detectors (honeypots) and 24/7 monitoring to detect the leading edge.  This is what spam-blocking services do now with the satellite feeds that are used to distribute IP blocklists.  Updating a DNS-based domain-name registry should be even quicker.

 

===  End of File  ===