Spam protection technologies

A modern spam mailing of hundreds of thousands of messages can be distributed within a few minutes. Most often, spam comes from zombie networks made up of many user computers infected by malicious programs. What can be done to resist these attacks? The IT security industry currently offers many solutions, and anti-spam developers have a range of technologies in their arsenal. However, none of these technologies is a ‘silver bullet’ in the fight against spam; a universal solution simply does not exist. Most state-of-the-art products combine several technologies, because any single technology on its own is not very effective.

The best-known and most widely used technologies are described below.

Blacklisting

DNSBL (DNS-based Blackhole List) is one of the oldest anti-spam technologies. It blocks mail traffic coming from IP addresses on a specified list.

  • Advantages: A blacklist blocks all mail traffic coming from the sources on the list.
  • Disadvantages: The level of false positives is rather high, so this technology must be used carefully.
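A DNSBL lookup is an ordinary DNS query: the octets of the connecting IPv4 address are reversed and prepended to the name of the list's zone. The sketch below illustrates this under some assumptions: the zone name dnsbl.example.org is a placeholder rather than a real list, and treating any successful answer as "listed" is a simplification, since real lists document their own return codes.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    # Validate the address, then reverse its octets: 192.0.2.99 -> 99.2.0.192
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the zone returns an address record for this IP.

    Requires network access. Any successful answer is treated as a listing,
    which is a simplification made for this sketch.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        # No such name (or the lookup failed): treat the IP as not listed.
        return False


if __name__ == "__main__":
    # dnsbl.example.org is a hypothetical zone used only for illustration.
    print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

Because a failed lookup is treated as "not listed", a DNS outage lets mail through instead of blocking it, which is the cautious choice given the false-positive risk noted above.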

Detecting bulk emails (DCC, Razor, Pyzor)

This technology detects identical or slightly varying bulk emails in mail traffic. An efficient ‘bulk email’ analyzer needs huge traffic flows, so this technology is offered by major vendors that have considerable traffic volumes to analyze.

  • Advantages: If this technology works, it guarantees detection of bulk emailing.
  • Disadvantages: Firstly, a ‘big’ mass mailing can consist of completely legitimate messages (for example, ozon.ru and subscribe.ru send out thousands of messages that are nearly identical but are not spam). Secondly, spammers can break through this defense with software that generates different content (text, graphics, etc.) in each spam message.
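One way to recognize "slightly varying" copies of a message is to compare sets of overlapping word sequences (shingles) instead of exact text. The sketch below uses Jaccard similarity over three-word shingles. It is an illustrative technique only and is not the algorithm used by DCC, Razor or Pyzor; the function names and the shingle size are assumptions.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word n-grams of a normalized message."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        # Very short messages become a single shingle.
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str) -> float:
    """Jaccard similarity of two messages' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a), shingles(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


if __name__ == "__main__":
    first = "Buy cheap watches now, best price guaranteed today"
    second = "Buy cheap watches now, best price guaranteed tonight"
    print(similarity(first, second))
```

Changing one word leaves most shingles intact, so the two messages still score as near-duplicates. A spammer who rewrites every sentence defeats the comparison, which is the second disadvantage listed above.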

Scanning of Internet message headers

Spammers write special programs that generate spam messages and distribute them instantaneously. Sometimes spammers make mistakes in constructing the headers, so the messages do not always meet the requirements of the RFC standard for header format. These mistakes make it possible to detect a spam message.

  • Advantages: The process of detecting and filtering spam is transparent, regulated by standards and fairly reliable.
  • Disadvantages: Spammers learn fast and make fewer and fewer mistakes in the headers. Used alone, this technology detects only about one-third of all spam messages.
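A header check of this kind can be sketched with Python's standard email package. The example assumes the relevant standard is RFC 5322, which requires exactly one Date and one From header; the function name header_problems and the choice of checks are illustrative, and a real filter would test many more rules.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def header_problems(raw_message: str) -> list:
    """Return a list of header format problems found in a raw message."""
    msg = message_from_string(raw_message)
    problems = []
    # RFC 5322 requires exactly one Date and exactly one From header.
    for name in ("Date", "From"):
        count = len(msg.get_all(name, []))
        if count == 0:
            problems.append(f"missing {name}")
        elif count > 1:
            problems.append(f"duplicate {name}")
    # A Date header that is present must also be parseable.
    date_value = msg.get("Date")
    if date_value is not None:
        try:
            parsedate_to_datetime(date_value)
        except (TypeError, ValueError):
            # Older Python versions raise TypeError, newer ones ValueError.
            problems.append("unparseable Date")
    return problems


if __name__ == "__main__":
    sample = "Subject: hello\nDate: not a date\n\nbody text"
    print(header_problems(sample))
```

Each reported problem could add to a message's spam score rather than reject it outright, since legitimate but sloppy senders make the same mistakes.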

Content filtration

Content filtration is another time-tested technology. Spam messages are scanned for specific words, text fragments, pictures and other spam features. Initially, content filtration analyzed only the subject of the message and the text contained within it (plain text, HTML, etc.). Today spam filters scan all parts of the message, including image attachments.

The analysis may result in the creation of a text signature or calculation of the ‘spam weight’ of the message.

  • Advantages: Flexibility, and the possibility to fine-tune the settings. Systems utilizing this technology can easily adapt to new types of spam and rarely make mistakes in distinguishing spam from legitimate email traffic.
  • Disadvantages: Regular updates are required. Specialists, and sometimes even anti-spam labs, are needed to set up spam filters. Such support is rather expensive, which affects the cost of the spam filter itself. Spammers also invent special tricks to bypass this technology. For example, they may include random ‘noise’ in spam messages, which impedes the evaluation and detection of the spam features of the message; they may insert or substitute non-alphanumeric characters, so that the word viagra appears as vi_a_gra or vi@gr@; or they may generate color-varying backgrounds within images.
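The ‘spam weight’ calculation and the character tricks described above can be sketched as follows. The weights, the substitution table and the function names are hypothetical values chosen for illustration; a production filter would rely on a far larger, regularly updated rule set.

```python
import re

# Hypothetical feature weights, for illustration only.
WEIGHTS = {"viagra": 5.0, "winner": 2.0, "free": 1.0}

# Hypothetical table of character substitutions to undo.
SUBSTITUTIONS = str.maketrans({"@": "a", "$": "s"})


def normalize(text: str) -> str:
    """Undo simple substitutions and strip separators placed inside words."""
    text = text.lower().translate(SUBSTITUTIONS)
    # Remove underscores or hyphens that sit between two letters (vi_a_gra).
    return re.sub(r"(?<=[a-z])[_\-](?=[a-z])", "", text)


def spam_weight(text: str) -> float:
    """Sum the weights of all known spam words found in the message."""
    words = re.findall(r"[a-z]+", normalize(text))
    return sum(WEIGHTS.get(word, 0.0) for word in words)


if __name__ == "__main__":
    print(normalize("vi_a_gra"), normalize("vi@gr@"))
    print(spam_weight("Cheap VI@GR@ for every winner"))
```

A message would be classified as spam once its weight crosses a configured threshold. Each new obfuscation trick needs a new normalization rule, which is why this approach depends on regular updates.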

Content filtration: Bayes

Statistical Bayesian algorithms are another approach to the analysis of content. Bayesian filters do not require constant adjustment; all they need is initial ‘training’. The filter ‘learns’ the subjects of emails typical for a particular user. For example, if a user works in education and often holds training sessions, emails about training will not be detected as spam. If a user does not normally receive training invitations, the statistical filter will detect this type of message as spam.

  • Advantages: The filter is tuned to the individual user.
  • Disadvantages: It works best on an individual user's email traffic and less well on corporate servers that handle many different types of email. If a user does not ‘teach’ the filter, the technology will not be effective. Spammers look for ways to bypass Bayesian filtration and, in general, are quite successful.
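The ‘teaching’ step can be illustrated with a very small naive Bayes classifier. This is a minimal sketch, not the design of any particular product: the class name BayesFilter, the tokenization and the add-one smoothing are all assumptions made for the example.

```python
import math
import re
from collections import Counter


class BayesFilter:
    """A tiny per-user naive Bayes spam filter with add-one smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text):
        return re.findall(r"[a-z']+", text.lower())

    def train(self, text, label):
        """Teach the filter one message; label is 'spam' or 'ham'."""
        self.word_counts[label].update(self.tokens(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam | text) from the messages seen so far."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed class prior, so an untrained class never yields log(0).
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in self.tokens(text):
                count = self.word_counts[label][word]
                # The extra +1 reserves probability mass for unseen words.
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_score[label] = score
        # Clamp the difference so math.exp cannot overflow on long messages.
        diff = max(-700.0, min(700.0, log_score["ham"] - log_score["spam"]))
        return 1.0 / (1.0 + math.exp(diff))


if __name__ == "__main__":
    user_filter = BayesFilter()
    user_filter.train("Invitation to our training session next week", "ham")
    user_filter.train("Win a free prize now", "spam")
    print(user_filter.spam_probability("Training session invitation"))
```

Here the user has marked training invitations as legitimate, so a new invitation receives a low spam probability. A user who marked the same messages as spam would get the opposite result from identical code, which is the sense in which the filter is individual.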

Greylisting

Greylisting temporarily refuses to accept a message. The refusal includes an error code understood by all email systems, so a legitimate sender will normally resend the message. However, the programs used by spammers do not resend emails once they have been refused.

  • Advantages: It filters out spam sent by programs that never retry delivery.
  • Disadvantages: Email delivery is delayed, which many users find unacceptable.
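The greylisting decision can be sketched as a small lookup table keyed by who is sending to whom. The details here are assumptions for illustration: the key is the (client IP, sender, recipient) triplet used by many implementations, the minimum delay is five minutes, and the reply codes are SMTP's 451 (temporary failure) and 250 (accepted).

```python
import time


class Greylist:
    """Temporarily refuse the first delivery attempt from each new triplet."""

    def __init__(self, min_delay_seconds=300.0, clock=time.time):
        self.min_delay = min_delay_seconds
        self.clock = clock  # injectable so the logic can be tested without waiting
        self.first_seen = {}

    def check(self, client_ip, sender, recipient):
        """Return 451 to ask the sender to retry later, or 250 to accept."""
        key = (client_ip, sender, recipient)
        now = self.clock()
        # Remember when this triplet was first seen; keep the earliest time.
        first = self.first_seen.setdefault(key, now)
        if now - first < self.min_delay:
            return 451
        return 250


if __name__ == "__main__":
    greylist = Greylist()
    print(greylist.check("192.0.2.10", "alice@example.com", "bob@example.org"))
```

A legitimate mail server queues the message and retries after the delay has passed, at which point it is accepted. That waiting period is the delivery delay named as the disadvantage above.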
