Contemporary spammer technologies

Spammers use dedicated programs and technologies to generate and transmit the billions of spam emails sent every day, which account for 60% to 90% of all mail traffic. This requires a significant investment of both time and money.

Spammer activity can be broken down into the following steps:

  1. Collecting and verifying recipient addresses; sorting the addresses into target groups
  2. Creating platforms for mass mailing (servers and/or individual computers)
  3. Writing mass mailing programs
  4. Marketing spammer services
  5. Developing texts for specific campaigns
  6. Sending spam

Each step in the process is carried out independently of the others.

Collecting and verifying addresses; creating address lists

The first step in running a spammer business is creating an email database. Entries consist of more than email addresses: each entry may contain additional information such as geographical location, sphere of activity (for corporate entries) or interests (for personal entries). A database may contain addresses from specific mail providers, such as Yandex, Hotmail or AOL, or from online services such as PayPal or eBay.

There are a number of methods spammers typically use to collect addresses:

  • Guessing addresses using common combinations of words and numbers – john@, destroyer@, alex-2@
  • Guessing addresses by analogy – if there is a verified joe.user@hotmail.com, then it’s reasonable to search for a joe.user@yahoo.com, @aol.com, PayPal etc.
  • Scanning public resources including web sites, forums, chat rooms, Whois databases, Usenet News and so forth for word combinations (i.e. word1@word2.word3, with word3 being a top-level domain such as .com or .info)
  • Stealing databases from web services, ISPs etc.
  • Stealing users’ personal data using computer viruses and other malicious programs

Topical databases are usually created using the third method, since public resources often contain information about user preferences along with personal information such as gender, age etc. Stolen databases from web services and ISPs may also include such information, enabling spammers to further personalize and target their mailings.

Stealing personal data such as mail client address books is a recent innovation, but is proving to be highly effective, as the majority of addresses will be active. Unfortunately, recent virus epidemics have demonstrated that there are still a great many systems without adequate antivirus protection; this method will continue to be successfully used until the vast majority of systems have been adequately secured.

Once email databases have been created, the addresses need to be verified before they can be sold or used for mass mailing.

  • Initial test mailing. A test message with a random text which is designed to evade spam filters is sent to the entire address list. The mail server logs are analysed for active and defunct addresses and the database is cleaned accordingly.
  • Once addresses have been verified, a second message is often sent to check whether recipients are reading messages. For instance, the message may contain a link to a picture on a designated web server. Once the message is opened, the picture is downloaded automatically and the website will log the address as active.
  • A more successful method of verifying whether an address is active is a social engineering technique. Most end users know that they have the right to unsubscribe from unsolicited and/or unwanted mailings. Spammers take advantage of this by sending messages with an ‘unsubscribe’ link. When a user clicks on the link, a message purportedly unsubscribing the user is sent. In reality, the spammer receives confirmation that the address in question is not only valid but that the user is active.

However, none of these methods is foolproof, and any spammer database will always contain a large number of inactive addresses.
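The second verification method works only if the mail client downloads remote pictures automatically. The following is a minimal defensive sketch, not part of the original article, of how a mail client or filter might list the remote images in an HTML message body so that they can be blocked before display. The names RemoteImageFinder and find_remote_images and the sample URL are hypothetical, and only http and https sources are checked.

```python
from html.parser import HTMLParser


class RemoteImageFinder(HTMLParser):
    """Collect the URLs of images that would be fetched from a remote server."""

    def __init__(self):
        super().__init__()
        self.remote_images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        # Only http(s) sources cause a request to an external web server;
        # "cid:" sources refer to attachments carried inside the message.
        if src.lower().startswith(("http://", "https://")):
            self.remote_images.append(src)


def find_remote_images(html_body):
    """Return the remote image URLs found in an HTML message body."""
    finder = RemoteImageFinder()
    finder.feed(html_body)
    return finder.remote_images


# Hypothetical message body: one remote picture and one embedded attachment.
sample = (
    '<p>Hello</p>'
    '<img src="http://tracker.example/p.gif?id=12345">'
    '<img src="cid:logo">'
)
print(find_remote_images(sample))
```

A client that refuses to load the listed URLs until the user agrees gives the sender no signal that the message was opened.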

Creating platforms for mass mailing

Today’s spammers use one of these three mass mailing methods:

  • Direct mailing from rented servers
  • Using open relays and open proxies – servers which have been poorly configured and are therefore freely accessible
  • Bot networks – networks of zombie machines infected with malware, usually a Trojan, which allow spammers to use the infected machines as platforms for mass mailings without the knowledge or consent of the owner.

Renting servers is problematic, since anti-spam organizations monitor mass mailings and are quick to add servers to blacklists. Most ISPs and anti-spam solutions use blacklists as one method to identify spam: this means that once a server has been blacklisted, it can no longer be used by spammers.
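The article does not say how blacklists are queried. One widely used mechanism is the DNS-based blacklist (DNSBL), where the receiving server looks up a name built from the sender's IPv4 address with its octets reversed, followed by the blacklist's zone. The sketch below only builds that name and performs no network lookup; dnsbl_query_name is a hypothetical helper and dnsbl.example.org is a placeholder zone, not a real service.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name that would be looked up for an IPv4 address in a DNSBL zone."""
    # IPv4Address raises ValueError for anything that is not a valid IPv4 address.
    addr = ipaddress.IPv4Address(ip)
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# 192.0.2.99 is an address reserved for documentation.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

If a DNS lookup for the resulting name returns an address record, the sending server is listed; if the name does not exist, it is not.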

Using open relay and open proxy servers is also time-consuming and costly. First, spammers need to write and maintain robots that search the Internet for vulnerable servers. Then the servers need to be penetrated. Very often, however, these servers will also be detected and blacklisted after a few successful mailings.

As a result, today most spammers prefer to create or purchase bot networks. Professional virus writers use a variety of methods to create and maintain these networks:

  • Pirate software is a favorite vehicle for spreading malicious code. Since these programs are often spread via file-sharing networks, such as Kazaa, eDonkey and others, the networks themselves are penetrated and even users who do not use pirate software will be at risk.
  • Exploiting vulnerabilities in Internet browsers, primarily MS Internet Explorer. There are a number of browser vulnerabilities which make it possible to penetrate a computer from a site being viewed by the machine’s user. Virus writers exploit such holes and write Trojans and other malware to penetrate victim machines, giving malware owners full access to, and control over, these infected machines. For instance, pornographic sites and other frequently visited semi-legal sites are often infested with such malicious programs. In 2004 a large number of sites running under MS IIS were penetrated and infected with Trojans. These Trojans then attacked the machines of users who believed that these sites were safe.
  • Using email worms and exploiting vulnerabilities in MS Windows services to distribute and install Trojans: MS Windows systems are inherently vulnerable, and hackers and virus writers are always ready to exploit this. Independent tests have demonstrated that a Windows XP system without either a firewall or antivirus software will be attacked within approximately 20 minutes of being connected to the Internet.

Modern malware is technologically sophisticated – the authors of these programs spare neither time nor effort to make detection of their creations as difficult as possible. Trojan components can behave like Internet browsers, asking websites for instructions on whether to launch a DoS attack, start a spam mailing, etc. (the instructions may even contain information about the time and the ‘place’ of the next instruction). IRC is also used to get instructions.

Spammer Software

An average mass mailing contains about a million messages. The objective is to send the maximum number of messages in the minimum possible time. There is a limited window of opportunity before anti-spam vendors update signature databases to deflect the latest types of spam.

Sending a large number of messages within a limited timeframe requires appropriate technology. There are a number of resources available that are developed and used by professional spammers. These programs need to be able to:

  • Send mail over a variety of channels including open relays and individual infected machines.
  • Create dynamic texts.
  • Spoof legitimate message headers.
  • Track the validity of an email address database.
  • Detect whether individual messages are delivered or not, and resend them from alternative platforms if the original platform has been blacklisted.

These spammer applications are available as subscription services or as stand-alone applications for a one-off fee.

Marketing spammer services

Strangely enough, spammers advertise their services using spam. In fact, the advertising which spammers use to promote their services constitutes a separate category of spam. Spammer-related spam also includes advertisements for spammer applications, bot networks and email address databases.

Creating the message body

Today, anti-spam filters are sophisticated enough to instantly detect and block large numbers of identical messages. Spammers therefore make sure that the emails in a mass mailing are almost, but not exactly, identical, with each text very slightly altered. They have developed a range of methods to mask the similarity between messages in each mailing:

  • Inclusion of random text strings, words or invisible text. This may be as simple as including a random string of words and/or characters or a real text from a real source at either the beginning or the end of the message body. An HTML message may contain invisible text – tiny fonts or text which is colored to match the background. All of these tricks interfere with the fuzzy matching and Bayesian filtering methods used by anti-spam solutions. However, anti-spam developers have responded by developing quotation scanners, detailed analysis of HTML encoding and other techniques. In many cases spam filters simply detect that such tricks have been used in a message and automatically flag it as spam.
  • Graphical spam. Sending text in graphics format hindered automatic text analysis for a period of time, though today a good anti-spam solution is able to detect and analyze incoming graphics.
  • Dynamic graphics. Spammers now use complex graphics padded with extra information to evade anti-spam filters.
  • “Fragmented” images. The image actually consists of several smaller images, but the user sees it as complete text. Animation is another type of fragmentation, in which the image is split into frames that are layered over each other so that the end result is complete text.
  • Paraphrasing texts. A single advertisement can be endlessly rephrased, making each individual message appear to be a legitimate email. As a result, anti-spam filters have to be configured using a large number of samples before such messages can be detected as spam.

A good spammer application will utilize all of the above methods, since different potential victims use different anti-spam filters. Using a variety of techniques ensures that a commercially viable number of messages will escape filtration and reach the intended recipients.
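The fuzzy matching mentioned in the first item can be illustrated from the filter's side. The sketch below is an assumption about one simple way to do it, not the method of any particular anti-spam product: each message is reduced to a set of overlapping three-word sequences ("shingles"), and two messages are compared by the share of shingles they have in common (Jaccard similarity). The function names and sample messages are hypothetical.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        # Very short texts are treated as a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a, b):
    """Jaccard similarity of two texts' shingle sets, between 0.0 and 1.0."""
    sa, sb = shingles(a), shingles(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


first = "Buy cheap watches today at our online store limited offer"
# Same advertisement with a random string appended to the end.
second = "Buy cheap watches today at our online store limited offer xkq zzt"
unrelated = "The quarterly meeting has been moved to Thursday afternoon"

print(similarity(first, second))
print(similarity(first, unrelated))
```

A random string at the end of a message changes only the last few shingles, so the score stays high. Paraphrasing changes most of them, which is one reason rephrased mailings are harder to group.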

Spam and psychology

Sending messages quickly and getting them past all filters to the recipient is an important part of the spamming process, but there’s more to it than that. Spammers also need to ensure that a user will read the message and do what the spammer wants (i.e., call a designated number, click on a link, etc.).

In 2006, spammers continued to master the psychological methods used to manipulate spam recipients. In particular, in order to hook a user into reading an email, spammers tried to persuade recipients that messages were actually personal correspondence, not spam. At the beginning of the year, spammers mainly used primitive approaches, such as adding RE or FW at the beginning of a subject line to indicate that a message was a reply to a previous email or that it had been sent from a known address. By the middle of the year, spammers had begun to use more subtle tactics.
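The primitive subject-line trick can be checked against the message headers. The sketch below is a hedged heuristic, not a description of how any real filter works: it flags a message whose subject starts with a reply prefix but which carries neither an In-Reply-To nor a References header, both of which genuine replies normally include. Forwarded messages often lack these headers legitimately, so only the reply prefix is checked. The function name and sample messages are hypothetical.

```python
import re
from email import message_from_string

# Matches "Re:" at the start of a subject, in any letter case.
REPLY_PREFIX = re.compile(r"^\s*re\s*:", re.IGNORECASE)


def looks_like_fake_reply(raw_message):
    """Return True if the subject claims to be a reply but no reply headers exist."""
    msg = message_from_string(raw_message)
    subject = msg.get("Subject", "")
    if not REPLY_PREFIX.match(subject):
        return False
    return msg.get("In-Reply-To") is None and msg.get("References") is None


fake = "From: a@example.com\nSubject: Re: your order\n\nHi"
genuine = (
    "From: a@example.com\n"
    "Subject: Re: your order\n"
    "In-Reply-To: <1@example.com>\n"
    "\n"
    "Hi"
)

print(looks_like_fake_reply(fake))
print(looks_like_fake_reply(genuine))
```

On its own such a signal is weak, since some mail software omits these headers, so it is better treated as one input among several than as a verdict.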

Spammers began working on their message texts. Today, some spam message texts are stylistically and lexically designed to look like personal correspondence. There are some convincing examples that might even fool an expert at first glance, not to mention less experienced users. This kind of spam is often highly impersonal (it either addresses no one in particular or uses words like ‘girlfriend’ or ‘sweety’), so that any recipient can believe the email was intended for them alone. Sometimes names are used in faked personal correspondence. Whatever the case, the user’s curiosity will be piqued and s/he may well read the message to find out who it came from, whether s/he should forward it, etc.

Another spammer trick that relies on social engineering is the use of hot news themes (sometimes thought up by the spammers themselves) in spam messages.

The structure of a spammer business

The steps listed above require either a team of different specialists or the outsourcing of certain tasks. The spammers themselves, i.e. the people who run the business and collect money from clients, usually purchase or rent the applications and services they need to conduct mass mailings.

Spammers are divided into professional programmers and virus writers who develop and implement the software needed to send spam, and amateurs who may not be programmers or IT people, but simply want to make some easy money.

Future Trends

The spam market today is valued at several hundred million dollars annually. How is this figure reached? Divide the number of messages detected every day by the number of messages in a standard mailing, then multiply the result by the average cost of a standard mailing and by the number of days in a year: 30 billion (messages) divided by 1 million (messages), multiplied by US $100 and by 365 (days), gives an estimated annual turnover of $1,095 million.
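The estimate can be reproduced step by step. The figures below are the ones given in the text; only the variable names are introduced here for illustration.

```python
# Figures taken from the estimate in the text.
messages_per_day = 30_000_000_000   # spam messages detected every day
messages_per_mailing = 1_000_000    # size of a standard mailing
price_per_mailing_usd = 100         # average cost of a standard mailing
days_per_year = 365

mailings_per_day = messages_per_day // messages_per_mailing
annual_turnover_usd = mailings_per_day * price_per_mailing_usd * days_per_year

print(f"{mailings_per_day:,} mailings per day")
print(f"${annual_turnover_usd:,} per year")  # $1,095,000,000 per year
```

The result is directly proportional to each input, so halving the assumed price per mailing halves the estimated turnover.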

Such a lucrative market encourages full-scale companies which run the entire business cycle in-house in a professional and cost-effective manner. There are also legal issues: collecting personal data and sending unsolicited correspondence is currently illegal in most countries of the world. However, the money is good enough to attract the interest of people who are willing to take risks and potentially make a fat profit.

The spam industry is therefore likely to follow in the footsteps of other illegal activities: go underground and engage in a prolonged cyclic battle with law enforcement agencies.