Policy

The Spam Wars

How should the Internet deal with junk mail?


Junk e-mail is like seasickness: If you don't get it, you don't really understand how bad it is. In 1997, when I proposed my first article on spam, the English editor I approached insisted it was an "American problem." One of his colleagues and I convinced him to take it seriously by ganging up on him: For a week we both sent him a copy of every piece of junk we received.

We couldn't do that now. Heavy Internet users can get hundreds of these messages every day. America Online reported in March that the company filters an average of 22 junk e-mail messages a day per account—up to 780 million messages a day in all. Brightmail, which supplies spam-blocking services and products to Internet service providers (ISPs) and enterprises, counted 1.9 million spam campaigns in November 2001 and 7 million in April 2003. By June 2003, the company says, spam was 48 percent of Internet e-mail traffic, up from 8 percent in January 2001. Spam volume is likely to grow further: The longer people are on the Net, the more spam they tend to receive. Aggregating all of my addresses, some of which get no junk and some of which get nothing else, 56 percent of my incoming e-mail is already spam.

Originally, the rage against spammers had more to do with the sense that a private space previously open only to friends and relatives had been invaded. Now the issue is that e-mail is becoming unusable. Even with postal mail, volume matters. If you get 10 pieces of junk postal mail a day, you put a wastebasket next to the front door and trash the unwanted mail on arrival. If you get 200 pieces, finding your bills and personal letters becomes a time-consuming chore. Your postal address has become virtually unusable.

This is the situation many e-mail users are now in. With dozens, if not hundreds, of messages a day, "just press the delete key" is too time-consuming. Spam now has far more in common with viruses and other malicious attacks than it does with any other Net phenomenon. As Brad Templeton, director of the Electronic Frontier Foundation, has said, "It sits at the intersection of three important rights—free speech, private property, and privacy." One of the most astonishing phenomena, he says, is the rage spam provokes, turning libertarians into regulators, advocates of distributed networking into supporters of centralized control, and ardent defenders of the right to free, open e-mail systems and anonymity into people who demand that every e-mail sender be forced to identify himself and pay for the privilege.

Why Spam?

The term spam did not originate with junk e-mail. Hormel's ham-based meat product (and a memorable Monty Python skit) aside, Templeton's and others' research suggests its use originated in the game-playing sites known as MUDs (for "multi-user dungeons"), where it applied to several different types of abusive behavior. One of these was floods of repetitive messages, so the term moved on to cover mass posting to the worldwide collection of electronic bulletin boards known as Usenet. Junk e-mail, when it first surfaced, was technically known as unsolicited bulk e-mail, or UBE. Now everyone calls that spam, too, and Hormel is unhappy but resigned.

It makes more sense to define spam in terms of behavior than in terms of content. Bulk-sending millions of identical, unwanted messages can feel just as antisocial when the purpose is to promote a charity (say, the Royal Society for the Prevention of Cruelty to Animals) as when the aim is to promote a seedy financial scam (Nigeria must secretly be a very rich country). So the key defining characteristics are "bulk" and "unsolicited." Some spam relates to activities that are already illegal—bilking people out of their money, selling Viagra without a prescription—but the problem of spam is distinct from the question of prosecuting people for those activities.

The chief reason people send spam is that it's incredibly cheap to do so. The response rates are tiny compared to those seen in other types of direct marketing, but there are no printing costs, minimal telecommunications costs, almost no labor costs, and no publisher reviewing the content of your ad. It is, to be sure, socially unacceptable behavior. ISPs that let people send spam through their servers may find themselves blacklisted and their customers' e-mail blocked by other ISPs, and companies that send it themselves or hire third parties to do it for them may find themselves boycotted.

One of the key objections to banning spam is that it amounts to censorship: No one, the argument goes, should have the right to interfere with a person's private e-mail or decide who can or cannot send e-mail or what it may contain. What is often forgotten is that spam itself can be a form of censorship. Many e-mail services have limits on the amount of e-mail that can be stored in a user's inbox at one time. Fill up that space with an unexpected load of junk, and wanted e-mail gets bounced.

Similarly, consider the advice that's often given to people to help avoid getting on the spammers' lists: Hide your e-mail address. The advice makes some sense. In March the Center for Democracy and Technology released the results of a six-month study on how spammers get people's addresses; the most popular method was to harvest them from Usenet or the Web. But hiding addresses has many undesirable social consequences. If there's no visible e-mail address, you can't tell someone there's an error on his Web site or ask for more information. Businesses have to choose between making staff less accessible and making e-mail less productive.

On the brighter side, the study also found that replacing part of your e-mail address with human-readable or HTML equivalents—typing out the word "at" in the place of the @ sign, for example—could keep it out of spammers' hands. (The automated programs, called bots, that they use to harvest addresses don't recognize such substitutions.) But spammers are beginning to adopt much more invasive techniques to get their messages through. So-called dictionary attacks send identical messages to endless combinations of letters at a single domain, hoping some will get through to valid users. My skeptic.demon.co.uk domain, which I've had since 1993, gets a lot of these—hundreds of identical messages over a single weekend. In April the British ISP Blueyonder, which supplies broadband cable access to tens of thousands of subscribers, suffered a dictionary attack of such ferocity that it was unable to deliver e-mail to its paying customers for two days. That's not free speech—it's a denial-of-service attack.
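The gap between a plain address and a spelled-out one is easy to see in practice. Here is a minimal sketch, in Python, of the kind of simple pattern a harvesting bot of the era might use; the addresses and the pattern are illustrative, not taken from any real bot. It picks up the literal address but sails right past the "at"/"dot" version.

```python
# Illustrative sketch: a naive harvester-style pattern matches a literal
# address but misses one written out with "at" and "dot" in place of symbols.
import re

EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

page_text = """
Contact: jane.doe@example.com
Webmaster: jane.doe at example dot com
"""

harvested = EMAIL_PATTERN.findall(page_text)
print(harvested)  # ['jane.doe@example.com'] -- the obfuscated form is not found
```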

The next trend is for spammers to use a type of malicious program known as a Trojan horse to get ordinary people's computers to send their mail for them. In the past, this approach has been used to mount distributed denial-of-service attacks on everything from commercial Web sites to hobbyist Internet Relay Chat networks. The computer's actual owner may not even know it's infected; some Web pages are designed to infect unwary visitors.

A couple of these e-mail Trojans already exist; they're known as Jeem and Proxy-Guzu. Both open up the infected computer for use as a "spambot," that is, a machine churning out spam like a robot. That leaves the innocent owner to face the consequences, which may include being blacklisted so that ISPs block that person's legitimate outbound correspondence. To combat this type of attack, users need not only to protect their machines with anti-virus software but also to install firewalls to protect their Internet connections. Learning to configure a firewall isn't easy if you are not technically literate, but these devices will become increasingly necessary. Indeed, one downside to the rollout of broadband is that computers with fast connections that are always online can be used to mount far more destructive attacks than their dial-up forebears could.

What Is to Be Done?

Most solutions to spam fall into three classes: technical, economic, and legal. All three have major drawbacks, and even without those none would provide a total solution.

The technical solutions are probably the most familiar because they're the things you can do for yourself. Primarily, these solutions involve filtering the junk out of the stream of e-mail. There are several places along the path from sender to recipient where filtering can be carried out: at the sending ISP, at the receiving ISP, and through the recipient's own e-mail software. Early in the Internet's history, any mail server was available to send e-mail on behalf of any user: You just specified the machine in your e-mail software, and the server relayed the mail for you. When junk e-mail became a problem, these relays were one of the first things to go. Running an open relay, once seen as a social contribution, became socially irresponsible. The cost was that closing those relays down made it more difficult for travelers and guests to send e-mail while out of range of their own systems.

A number of ISPs offer filtering services for their customers using a third-party service like Brightmail or SpamAssassin, and these vary from discarding the junk on your behalf to marking suspected junk in such a way that you can set your e-mail software to filter it into a separate location. The advantage to the first scheme is that you never see the junk; the advantage to the second is that you can see what's being discarded, and if a legitimate message is incorrectly marked you have a way of retrieving it.

San Francisco's conferencing system, the WELL, employs the second approach; it uses SpamAssassin, which marks the junk for filtering. SpamAssassin assigns each message a score that, in its estimation, indicates the likelihood that the message is spam. A Web interface provided by the WELL lets users set the threshold above which a message is marked. At the default settings, the WELL's implementation catches about two-thirds of the junk. It's not enough: WELL accounts tend to attract a lot of spam even if they haven't been used outside the WELL itself.
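For the curious, here is a minimal sketch of the scoring-and-threshold idea. It is not the WELL's or SpamAssassin's actual code, and the header name, subject tag, and default threshold are illustrative assumptions; the point is that suspect mail is marked rather than destroyed, so a misfiled message can still be retrieved.

```python
# Simplified sketch: each message gets a numeric score, and anything at or
# above the user's chosen threshold is tagged so the recipient's own mail
# software can file it into a junk folder instead of silently discarding it.
DEFAULT_THRESHOLD = 5.0  # hypothetical default; users can raise or lower it

def mark_if_spam(message: dict, score: float, threshold: float = DEFAULT_THRESHOLD) -> dict:
    """Tag the message rather than delete it, so mistakes can be recovered."""
    if score >= threshold:
        message["headers"]["X-Spam-Flag"] = "YES"
        message["subject"] = "*****SPAM***** " + message["subject"]
    return message

msg = {"subject": "Make money fast", "headers": {}}
print(mark_if_spam(msg, score=8.2))  # comes back tagged as spam
```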

At the user level, a number of companies make plug-ins for standard e-mail software, some free, some commercial. These sit between your e-mail software and your ISP, and examine your messages as they arrive, marking or deleting anything they can identify as spam. Internally, they all work slightly differently. Some of these filters check the origins of messages against one or more Realtime Blackhole Lists and eliminate anything that comes from known spam-tolerant ISPs. These blacklists do weed out a lot of junk, but again there's a price, since it's always possible for an innocent domain to get listed by mistake or malice. Of course, the same is true of system administrators who put filters in place; they've been known to block whole countries (including the U.K.).
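The blacklist check itself is a simple trick of the Domain Name System. The sketch below assumes a hypothetical list zone called rbl.example.org (real lists publish their own zone names): the sending machine's IP address is reversed and looked up as a hostname under that zone, and any answer at all means the address is listed.

```python
# Minimal sketch of a Realtime Blackhole List lookup against a hypothetical
# DNS zone. An answer (typically an address like 127.0.0.2) means "listed";
# a failed lookup means the sender's IP is not on this particular list.
import socket

def is_blacklisted(sender_ip: str, zone: str = "rbl.example.org") -> bool:
    query = ".".join(reversed(sender_ip.split("."))) + "." + zone
    try:
        socket.gethostbyname(query)   # any record at all means the IP is listed
        return True
    except socket.gaierror:           # no record: not on this list
        return False

print(is_blacklisted("192.0.2.1"))    # documentation address; almost certainly False
```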

Collaborative filtering systems, such as Cloudmark and Spamcop.net, collect reports from the first people who get a particular spam and apply them to the entire user base so that everyone gets less. SpamAssassin, which is built into several different e-mail client plug-ins, uses a type of statistical analysis known as Bayesian filtering to help it learn from existing spam to identify unfamiliar spam more accurately, theoretically getting better and better over time.
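A toy version of the Bayesian idea, using a made-up six-word training corpus rather than anything SpamAssassin actually ships, shows how the learning works: words that turn up mostly in mail the user has already flagged as spam push a new message's score up, and everyday words pull it down.

```python
# Toy illustration of Bayesian filtering, not a real filter: estimate from
# already-classified mail how strongly each word suggests spam, then combine
# the evidence for the words in a new message into a single probability.
import math
from collections import Counter

spam_words = Counter("viagra winner free cash free offer".split())
ham_words  = Counter("meeting agenda lunch report project offer".split())

def spam_probability(message: str) -> float:
    score = 0.0
    for word in message.lower().split():
        # Laplace-smoothed likelihood of the word appearing in spam vs. ham
        p_spam = (spam_words[word] + 1) / (sum(spam_words.values()) + 2)
        p_ham  = (ham_words[word]  + 1) / (sum(ham_words.values())  + 2)
        score += math.log(p_spam / p_ham)   # positive log-ratios push toward spam
    return 1 / (1 + math.exp(-score))       # squash summed evidence into [0, 1]

print(round(spam_probability("free cash offer"), 2))         # 0.86: leans spam
print(round(spam_probability("project meeting agenda"), 2))  # 0.11: leans legitimate
```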

One significant strand of technical development is challenge-and-response systems such as SpamArrest and iPermitMail. With these, you white-list known correspondents and always accept e-mail from them. Unknown correspondents are sent a challenge that a human being can read and answer but a spambot can't (yet, anyway). When the response is received, the original e-mail is let through to its destination. A complex variant of this approach, developed at AT&T and marketed to corporations as Zoemail, uses unique addresses for each correspondent; at the first hint of spam to an address, the correspondent's address is voided and replaced. One problem with challenge-and-response is that many legitimate correspondents find it hostile and don't bother to respond. Another is that it's disruptive to mailing lists, which rely on automated systems.
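In outline, a challenge-and-response gatekeeper is just a whitelist with a holding pen. The sketch below is schematic, with invented function names and addresses; it is not the workings of SpamArrest, iPermitMail, or Zoemail.

```python
# Schematic sketch of challenge-and-response: mail from known correspondents
# goes straight through; mail from strangers is held until the sender answers
# a challenge a human can pass but a spambot (so far) cannot.
whitelist = {"friend@example.com", "colleague@example.org"}
pending = {}  # messages held until the sender answers the challenge

def receive(sender: str, message: str) -> str:
    if sender in whitelist:
        return "delivered"
    pending.setdefault(sender, []).append(message)
    return "challenge sent"           # e.g. a reply asking the sender to confirm

def challenge_answered(sender: str) -> list:
    whitelist.add(sender)             # future mail from this sender passes directly
    return pending.pop(sender, [])    # release anything held in the meantime

print(receive("friend@example.com", "Lunch?"))        # delivered
print(receive("stranger@example.net", "Hello"))       # challenge sent
print(challenge_answered("stranger@example.net"))     # ['Hello'] now released
```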

As the trend toward virus techniques shows, this is a technological arms race. There is already a company, Habeas, whose mission in life is to sell direct marketers a product to help their messages get past spam filters. So far it's been impossible to get ahead of the spammers for long, but a lot more brainpower is being trained on the problem than in the past; MIT even hosted a technical conference on the subject earlier this year.

Law and Economics

From time to time, someone proposes an economic solution to spam. There are a number of variations, but they all boil down to one idea: You should pay, literally, for all the e-mails you send. This is a popular idea because even a tiny charge that wouldn't cost individual users very much would impose a substantial burden on spammers. At a penny per e-mail, for instance, sending 1 million messages would cost $10,000. At the very least, such a fee would get spammers to clean their lists.

There are several problems with this idea. First and foremost, no ISP in the world is set up to charge this way. It would require an entirely new infrastructure for the industry. In addition, charging for e-mail would kill free services such as Yahoo! and Hotmail in a single stroke and, with greater social costs, make today's many valuable mailing lists economically unfeasible.

If we had micropayments—that is, the technical ability to manage transactions of a penny or even fractions of a penny—we'd have more flexibility to consider charging schemes with fewer social costs. If, for example, you could require that unknown correspondents attach one cent to an e-mail message, you could void the payment for wanted e-mail, leaving only the spammers to pay it.
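As a sketch, such a scheme might look like the following. It is purely hypothetical, since no payment infrastructure of this kind exists: each message from an unknown sender carries a one-cent bond that the recipient refunds for wanted mail and forfeits otherwise.

```python
# Hypothetical sketch of the "attach a cent" idea: unknown senders post a
# small bond with each message; the recipient voids it for wanted mail and
# keeps it otherwise, so only senders of unwanted bulk mail bear the cost.
BOND_CENTS = 1

def deliver(sender_balance: int, recipient_wants_it: bool) -> tuple:
    if sender_balance < BOND_CENTS:
        return sender_balance, "rejected: no bond attached"
    sender_balance -= BOND_CENTS                 # bond escrowed with the message
    if recipient_wants_it:
        sender_balance += BOND_CENTS             # wanted mail: the cent is voided
        return sender_balance, "delivered, bond refunded"
    return sender_balance, "delivered, bond forfeited"

print(deliver(100, recipient_wants_it=True))   # (100, 'delivered, bond refunded')
print(deliver(100, recipient_wants_it=False))  # (99, 'delivered, bond forfeited')
```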

But we don't have micropayments and we have little immediate prospect of getting them. Given the costs to the industry of altering its billing infrastructure, the only way a pay-per-message scheme would work is if it were legally mandated—and even then, such a mandate could not be imposed worldwide.

In one of the biggest turnarounds in Net history, many people who formerly opposed the slightest hint of government regulation online are demanding anti-spam legislation. So far, the European Union has made spam illegal, 34 states in the U.S. have banned it, and a number of competing federal bills are in front of Congress, which has considered such legislation before. The various proposed federal laws would require labels, opt-out instructions, and physical addresses; ban false headers; establish a do-not-mail registry; or ban all unsolicited advertising outright. Most of the state laws require labeling and opt-out mechanisms.

Not everyone is happy with the U.S. legislation's provisions, however: Steve Linford, head of Europe's Spamhaus Project, says America's opt-out approach will legalize flooding the world with spam. He notes that the world's 200 biggest spammers are all based in Florida. With an opt-out system, anyone would have the right to put you on any list at any time, as long as they remove you if you request it. Linford believes instead that "opt in"—prohibiting companies from adding addresses to lists unless their owners have given their specific consent—is the key to effective anti-spam legislation.

Whatever the merits of Linford's and others' proposals, there's an important point to remember: None of the anti-spam laws passed so far has been effective, and that's not likely to change. Lots of spam includes opt-out instructions that don't work; the key is getting businesses to honor them. A do-not-mail registry would double as a free address registry for spammers based offshore. And requiring a physical address for the sender would, like any mandated identification system, make anonymous speech on the Net illegal. Just about everyone is against spam, but most people are for anonymous speech and its ability to let whistleblowers and other vulnerable people speak their minds. Existing and proposed legislation seriously threatens anonymity, raising legitimate worries about censorship.

The ultimate problem with legislation is that spam is a global problem, not a state or federal one. A patchwork of conflicting laws will do nothing to improve the ease of use of e-mail communications, and none of the laws passed so far has diminished the amount of spam flooding the Net. Lawrence Lessig, a Stanford law professor and the author of Code and Other Laws of Cyberspace, believes the problem is enforcement; his proposal is for the government to pay a bounty to any geek who can track down and identify a spammer. He's even offered to quit his job if this scheme is tried for a year and fails.

The Usenet Experience

There's one more approach to the spam problem that we should consider. For lack of a better term, we might call it the community solution. Alternatively, we could call it the Usenet approach.

Created in 1979, Usenet is in many respects still the town square of the Internet. It played that role even more in 1994, when the Web was still in its "before" stage and two Arizona lawyers, Martha Siegel and Laurence Canter, sent out an infamous spam advertising their services, provoking a furious reaction. The technical method used to post the message meant that you couldn't mark it read in one newsgroup and then not see it in the others, so anyone reading a number of Usenet newsgroups saw the message in every single group.

When the uproar eventually settled, a new hierarchy of ad-friendly newsgroups was created, each beginning with the prefix "biz." But this approach never really worked, because the kind of people who advertise anti-cellulite cream, get-rich-quick schemes, and cable descramblers don't care if they annoy people; they just want maximum eyeballs. In the ad hoc newsgroup news.admin.net-abuse.usenet, users and administrators discussed and developed a system that took advantage of the cancellation features built into Usenet's design. These are primarily designed so people can cancel their own messages, but a number of public-spirited people hacked them so third parties could use them to cancel spam.

By now spam has died out in many newsgroups, partly because the system worked and partly because the spammers simply moved to e-mail's wider audience. But the worst spam period cost Usenet many of its most valuable and experienced posters, who retreated to e-mail lists and more private forms of communication and have never come back.

The key to making this system work was community standards that defined abuse in terms of behavior rather than content. Spam was defined as substantively identical messages, posted to many newsgroups (using a threshold calculated with a mathematical formula) within a specified length of time. The content of the message was irrelevant. These criteria are still regularly posted and can be revised in response to community discussion. Individual communities (such as newsgroups run by companies or ISPs) can set their own standards. It is easy for any site that believes canceling spam threatens free speech to block the cancels and send an unfiltered newsfeed.
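The formula in question is usually identified as the Breidbart Index: take the square root of the number of newsgroups each substantively identical copy was posted to, and add those up; once the sum crosses the community's agreed cutoff within the time window, the batch qualifies for cancellation. The sketch below uses a cutoff of 20, the figure most commonly cited, though the exact number was always the community's to set.

```python
# Sketch of the behavioral threshold described above, using the commonly
# cited Breidbart Index: sum the square roots of each copy's crosspost count
# and treat the batch as cancelable spam once the sum passes the cutoff.
import math

def breidbart_index(crosspost_counts: list) -> float:
    """crosspost_counts[i] = number of newsgroups the i-th copy was posted to."""
    return sum(math.sqrt(n) for n in crosspost_counts)

CUTOFF = 20.0  # commonly cited community threshold; an assumption here

copies = [9, 9, 16, 4]            # four copies, crossposted to 9, 9, 16, and 4 groups
index = breidbart_index(copies)   # 3 + 3 + 4 + 2 = 12.0
print(index, index >= CUTOFF)     # 12.0 False -- below the cancellation threshold
```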

The issues raised by Usenet spam were identical to those raised by junk e-mail today. The community, albeit a much smaller one, managed to create standards supported by consensus, and it came up with a technical scheme subject to peer review. A process like this might be the best solution to the spam e-mail problem. The question is whether it's possible given the much more destructive techniques spammers now use and given the broader nature of the community.

Some working schemes for blocking spam are based on community efforts—in which the first recipients of a particular spam send it in, for example, so it can be blocked for other users in the group. In addition, the Net has a long tradition of creating tools for one's own needs and distributing them widely so they can be used and improved for the benefit of all. As in the Usenet experience, there is very little disagreement on what spam is; that ought to make it easier to develop good tools. I can't create those tools, but I can offer less technical friends a spam-filtered e-mail address on my server, which has SpamAssassin integrated into it (after a month of work to get it running), to help them get away from the choked byways of Hotmail or AOL. If everyone with the technical capability to run a server offered five friends free, filtered e-mail, many consumers would be able to reclaim their inboxes. Some ISPs are beginning to offer—and charge for—such a service.

In the end, the ISPs are crucial to this fight. In the Usenet days, system administrators would sometimes impose the ultimate sanction, the Usenet Death Penalty—a temporary block on all postings from an ISP that had been deaf to all requests to block spam sent from its servers. It usually took only a couple of days for the offending ISP to put better policing in place—the customers would demand it. That's what the Realtime Blackhole Lists do, constructing their databases of known spam sources from pooled reports. But the bigger and richer ISPs, such as Hotmail and AOL, can take the lead by taking legal action, as they are beginning to do. AOL filed five anti-spam lawsuits last spring alone.

The Usenet experience shows that the Net can pull together to solve its own problems. I don't think we're anywhere near the limits of human technical ingenuity to come up with new and more effective ways of combating spam, any more than I think e-mail is the last electronic medium that spammers will use. (There have been a couple of cases of "blogspam," where robot scripts have posted unwanted advertising to people's blogs.) The problem of spam may be a technical arms race, but it's one that's likely to be much easier to win than a legislative arms race.

When I spoke with Danny Meadows-Klue, head of the U.K.'s Interactive Advertising Bureau, he told me, "Spam is the biggest threat to the Internet." But he didn't mean what you think he meant. He was talking about the destructiveness of so many efforts to stop it.