April 26, 2004

Fighting spam - surgical strike or carpet-bombing?

I get a lot of spam. A whole load of spam. Metric bucketloads of the stuff. I've been getting spam ever since it was invented. If you want to see exactly how much spam I get, take a look at the sight which greeted me upon logging in to Gmail a couple days ago. Bear in mind that what's visible there as spam is only what made it through the spam filters. Yes, I know that there are people who get more spam than me, but I don't think I could really claim that I don't get much. Spam, in short, is a major issue to me.

I spent quite a long time active on various anti-spam mailing lists until a couple of years ago. Back then people seemed to be having good ideas about ways of coping with the spam problem, but today it seems to me that people are running out of ideas and there's more division than unity in the fight against spam.

I'm a sysadmin at a large university. The primary complaint we got about our mail service from users until recently wasn't the limited quota size, perceived instability of the service (it's pretty stable anyway), lack of a central address book or anything else. Our users were complaining that they Got Too Much Spam. Now, I doubt any of them got quite as much as me, but for most people one piece of spam is too much, and I agree totally. The only acceptable amount of spam is... no spam. None more spam, as Nigel Tufnel would put it.

Unfortunately for sysadmins, an even more important fact of life is that the only acceptable number of false positives when doing spam filtering is also zero. I've never regarded collateral damage when filtering spam as acceptable. The job of a mail service is to permit legitimate communication, not to block it. So what's a hard-pressed sysadmin to do?

The simple and obvious "solution" is blacklisting. Unfortunately (and boy, the word "unfortunately" is appearing a lot here), careful, conservative blacklisting is hard to do, as Paul Vixie's MAPS RBL discovered. It requires careful maintenance of the blacklists, verification, and at least one or two attempts to communicate with offending sites before they're blacklisted. It also requires careful monitoring so that sites which have fixed problems as a result of blacklisting get un-blacklisted as soon as possible. While all this verification is happening the spammers are still spamming, which sometimes leads people to use more aggressive "one strike and you're out" blacklists which often cause more damage than they fix.

A while ago I had a confused phone call from a user who wasn't able to send mail to another educational institution in the UK. A little investigation showed that our primary mail relays had been blacklisted by SpamCop's blacklist because they had - surprise, surprise - been used as intermediate relays in the delivery of mail to a user at our site, who had then reported it as spam. Well, that's exactly what those machines are there for - they receive mail inbound to our site and shuffle it around for delivery to the right machine for the user to pick it up. SpamCop's code was too stupid to work this out and blacklisted us automatically. I had to send a shirty mail message to SpamCop who, fortunately, didn't use their own blacklist and to their credit we were unblocked within an hour or so. We should never have been blacklisted in the first place, however, for the simple reason that our site wasn't sending spam, only delivering spam which someone elsewhere had sent to one of our users.
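To make the point concrete, here is a minimal sketch of the check that was missing. It is not SpamCop's code, just an illustration of the idea: when a user reports a message, skip the relays that belong to the reporter's own site and blame the first host outside it. The TRUSTED_NETWORKS list, the function names and the addresses (taken from the reserved documentation ranges) are all made up for the example, and it assumes the relay addresses have already been pulled out of the Received headers, newest hop first.

```python
import ipaddress

# Hypothetical: the reporting site's own inbound relays.
# 192.0.2.0/24 is a reserved documentation range, not a real site.
TRUSTED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_trusted(ip):
    """Return True if the address belongs to the reporter's own site."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in TRUSTED_NETWORKS)


def host_to_blame(received_ips):
    """Pick the host a spam report should be counted against.

    received_ips lists the connecting relay addresses taken from the
    Received headers, newest hop first. The reporter's own relays are
    skipped; the first address outside them is the host that handed the
    message to the site. Returns None if every hop was internal.
    """
    for ip in received_ips:
        if not is_trusted(ip):
            return ip
    return None


# Two internal hops, then the outside host that actually delivered the spam.
print(host_to_blame(["192.0.2.10", "192.0.2.25", "203.0.113.7"]))
```

With a check like this, a report from one of our users would count against the outside host and not against the relays that merely delivered the message to them.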

In the most extreme circumstances, frustration at the growing tide of spam leads people to do things which I believe are exceptionally unwise and are actively causing damage to the Internet. Some sites blacklist mail from entire countries, usually countries like Korea where some sysadmins don't speak English and therefore don't understand when people from the English-speaking world mail them to complain when their open relay gets exploited. This is a terrible thing - people are being punished because they don't speak English? How about the anti-spam community working with people who do speak Korean to get the message out, or maybe providing standard mail messages in English and Korean which can be just cut-and-pasted and sent to offending sites? Simply choosing to ignore entire countries because they are perceived to be common sources of spam is unproductive, prejudiced and selfish as it sends the message that no communication from that country could ever be of any interest. Attitudes like this are more destructive and damaging to the Internet community than the spam epidemic itself.

Possibly worst of all are the systems that require senders of mail to visit a web page or go through some other form of handshaking process to send mail to a specific recipient. What these systems tell the sender is that the recipient is far more important than they are, and if they really want to communicate with them they'd better jump through these hoops. This system is the ultimate head-in-the-sand solution to spam - "hey, I'm not getting any spam, so everything's fine and we don't need to worry any more!". This is neither a sustainable nor a scalable attitude - if I send mail to five people, I don't want to have to jump through five separate hoops to get it delivered. It's simply not something I want to waste time on, and if they don't want to waste time deleting spam that gets through the filters, I don't want to waste my time for their personal convenience. It's incredibly destructive as far as electronic mail's core function as a quick and easy way of communicating with people is concerned, and it's also exceptionally arrogant.

My favourite anti-spam tool at the moment is SpamAssassin. Many people will be familiar with it already, but for those who aren't it's a content-based filter which looks for common patterns which often mark out spam and assigns a score according to how much a message looks like spam. This works very well, and with a well-tuned SpamAssassin and a sensible threshold score it's possible to almost entirely eliminate spam without more than a very, very few false positives. It's really the only way to go - we combine it with a number of very conservative blacklists and the difference since we put this system in place has been enormous. We have even received praise from users, which is an almost unheard-of event - as is well known, most people are quick to complain but slow to compliment. To us, intelligent spam filtering is key rather than nuking entire continents.
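For anyone who hasn't seen score-based filtering, the idea can be sketched in a few lines. This is a toy, not SpamAssassin itself: the rule names, patterns, scores and the threshold of 5.0 are all invented for the illustration, and a real rule set is far larger and tuned against real mail.

```python
import re

# Hypothetical rules: (name, pattern, score). Illustrative only.
RULES = [
    ("SUBJECT_ALL_CAPS", re.compile(r"^Subject: [^a-z\n]{10,}$", re.MULTILINE), 1.5),
    ("CLICK_HERE", re.compile(r"click here", re.IGNORECASE), 1.0),
    ("MENTIONS_VIAGRA", re.compile(r"viagra", re.IGNORECASE), 2.5),
    ("MONEY_BACK", re.compile(r"money[- ]back guarantee", re.IGNORECASE), 2.0),
]

# Assumed threshold: a message scoring at or above this is treated as spam.
THRESHOLD = 5.0


def score_message(text):
    """Return (total score, names of the rules that matched)."""
    hits = [(name, score) for name, pattern, score in RULES if pattern.search(text)]
    total = sum(score for _, score in hits)
    return total, [name for name, _ in hits]


def is_spam(text, threshold=THRESHOLD):
    """A message is flagged only when enough rules fire together."""
    total, _ = score_message(text)
    return total >= threshold


spam = "Subject: BUY NOW!!! CHEAP\n\nClick here for Viagra with a money-back guarantee."
ham = "Subject: Lunch on Friday?\n\nClick here for the menu."

print(score_message(spam), is_spam(spam))
print(score_message(ham), is_spam(ham))
```

The point of the design is that no single pattern condemns a message. A legitimate mail that happens to say "click here" picks up a small score and still gets delivered, which is what keeps the false positives down.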

This solution works for us. We don't have to blacklist entire countries (a good thing, as our researchers and administrative staff are in daily contact with people all over the world, not just in the UK, Europe and US) or do any other damaging things. So why don't other people think it would work for them?

So what's gone wrong in the anti-spam world? I put these woes down to two things:

1) Technical woes: There are too many self-appointed vigilantes determined that their solution will kill off spam entirely. Whether it's automatic scattergun blacklisting, the belief that SMTP can be quickly replaced with something else that's less spoofable (not going to happen overnight, chaps) or Bayesian filtering in mail clients, none of these are an instant fix to the spam problem. Take a tip from me - what the anti-spam world really doesn't need right now is yet more blacklisting or yet another proposed replacement for SMTP. What the anti-spam world does need is to work together and collaborate on specific projects rather than constantly reinventing the wheel and squabbling about whose solution is the best.

2) Social woes: Neglect of the social issues surrounding spam and why it gets sent. Most of the anti-spam legislation that's out there has been utterly ineffective because legislators either water it down to the point of uselessness, make assumptions that spam to users in the state concerned can be simply legislated away, or both. The social approach to spam needs to combine informed legislation, dialogue with legislators and dialogue with the Internet community as a whole. The "I'm blacklisting Korea!" stuff is almost all down to failure to communicate rather than down to Korea actually being notable for being a spamhaven. Individual hosts and networks are spamhavens, open relays and spam-relaying zombies. Entire nations are not.

A couple of years ago I was actually optimistic - it looked like the war against spam was being won due to a good combination of technical and social initiatives, as well as a healthy amount of informed debate. Now, things look much darker and the spammers are winning. The current mixture of misguided, ego-driven vigilantism and scorched-earth, head-in-the-sand blackholing does little to tackle the problem of spam at its source, and in many cases only makes life worse for real people trying to communicate with each other. Until a sensible dialogue can be re-established across technical, social and legislative boundaries, the spammers will always be one step ahead.

It is time to get the fight against spam back on track. Until the battle is won, though, there is one unavoidable fact - whatever filters you put in place and whatever blackhole lists you subscribe to, some spam will always get through. The only way to make it go away completely and still allow legitimate communication is for the best technical and social-policy heads on the Internet to work together rather than constantly pulling in different directions.

Posted by mpk at April 26, 2004 1:42 PM
Comments

Amen brother!

Posted by: Michael Underwood at August 13, 2004 5:35 PM