
SpamAssassin on mail.pa.msu.edu

Part 2: How SpamAssassin Works


  1. SpamAssassin Overview

    When SpamAssassin filtering is set up for your account on mail.pa.msu.edu, each incoming E-mail message is passed to the SpamAssassin program, which runs a series of tests. Each test result adds points to or subtracts points from the message's score, according to whether the result makes it more or less likely that the message is spam. At the end the points are added up, and a message whose total exceeds a set threshold is considered "probable spam". In our configuration, a message whose total exceeds a second, higher threshold is simply called "spam".
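
    The scoring described above can be sketched in a few lines of Python. This is only an illustration: the threshold values, the test names, and the point values are assumptions made up for the example, not the settings used on mail.pa.msu.edu.

```python
from typing import Dict

# Hypothetical thresholds, for illustration only; the real
# mail.pa.msu.edu values are not stated on this page.
PROBABLE_SPAM_THRESHOLD = 5.0
SPAM_THRESHOLD = 10.0


def classify(test_points: Dict[str, float]) -> str:
    """Add up the points from every test that fired and compare the
    total against the two thresholds."""
    total = sum(test_points.values())
    if total > SPAM_THRESHOLD:
        return "spam"
    if total > PROBABLE_SPAM_THRESHOLD:
        return "probable spam"
    return "not spam"


# Hypothetical test names and point values: two tests make spam more
# likely (positive points), one makes it less likely (negative points).
example = {"phrase_act_now": 2.5, "sender_on_list": 3.5, "resembles_ham": -0.5}
print(classify(example))
```

    With these made-up numbers the total is 5.5, which is over the lower threshold but not the higher one, so the message would be labeled "probable spam".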

    The tests fall into three basic categories.

    1. Tests against known patterns of text.
    These tests check for common text patterns seen in spam of various types ("act now", "your mortgage has been approved", "my <male relative> was <government position> in <name of African country>", etc.). Each match earns the message points toward its total score.

    2. Checks of sending IP addresses against lists of known spam originators.
    These lists are maintained on the network by various organizations as a public service, and tend to be kept reasonably up to date. On the other hand, spammers are constantly finding new systems (with or without their owners' knowledge) from which to send unsolicited E-mail messages. You may receive a spam message today from a system which is not yet on one of the "spammer address" lists, and which therefore does not score high enough to be called spam. The same message from the same system the next day, after others have reported the offending site to the list, will score additional points, increasing the chances that its total score crosses the threshold to be called spam.

    3. Analysis of patterns within the message and comparison to the patterns of your previously received messages.
    This category incorporates "learning" behavior, based on the particular messages received by each individual user. The analysis yields a probability based on how similar the message is to messages previously identified as spam, and how similar it is to messages previously identified as "not spam" (or "ham"). If the database of previously identified messages is large enough (over 200 each of spam and non-spam), this probability is converted to points that are added to or subtracted from the total.
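
    The third category can be sketched as follows. The 200-message minimum comes from the description above; the linear mapping from probability to points, the max_points value, and the function name are assumptions chosen only to show the idea, and SpamAssassin's own mapping may differ.

```python
MIN_TRAINED = 200  # from the text: over 200 each of spam and non-spam


def learned_points(prob_spam: float, n_spam: int, n_ham: int,
                   max_points: float = 3.0) -> float:
    """Turn a learned spam probability (0.0 to 1.0) into points.

    Returns 0.0 when too few messages have been identified, so the
    learned result is ignored until the database is large enough.
    The linear mapping and max_points are hypothetical.
    """
    if n_spam <= MIN_TRAINED or n_ham <= MIN_TRAINED:
        return 0.0
    # 0.5 means "no opinion"; above it adds points, below it subtracts.
    return (prob_spam - 0.5) * 2 * max_points


print(learned_points(1.0, n_spam=500, n_ham=500))   # looks like past spam
print(learned_points(0.0, n_spam=500, n_ham=500))   # looks like past ham
print(learned_points(0.99, n_spam=150, n_ham=500))  # database too small
```

    A probability near 1 pushes the total up, a probability near 0 pulls it down, and a user with too few identified messages gets no contribution at all from this category.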


  2. Local 'probable-virus' filter

    On mail.pa.msu.edu, a message that has been scored by SpamAssassin and has not been diverted to the IN.spam or IN.probable-spam folder then passes through a set of additional filters, which look for text patterns common in messages carrying E-mail viruses. If any of these patterns is found, the message is diverted to the IN.probable-virus folder.

    If a message is not diverted to one of these specialized mail folders, it goes into your normal mail spool file (generally accessed as "Inbox").
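
    The order of these decisions can be summarized in a short Python sketch. The folder names are the ones given above; the function name, the label strings, and the single virus pattern are assumptions for illustration, since the actual patterns used by the local filter are not listed here (and the real work is done by procmail, not Python).

```python
import re

# Hypothetical pattern; the actual probable-virus patterns are not
# listed on this page.
VIRUS_PATTERNS = [re.compile(r"\.(pif|scr)\b", re.IGNORECASE)]


def choose_folder(label: str, body: str) -> str:
    """Pick the destination folder for one message.

    label is the SpamAssassin result: "spam", "probable spam",
    or "not spam".
    """
    # The spam folders are decided first, from the SpamAssassin score.
    if label == "spam":
        return "IN.spam"
    if label == "probable spam":
        return "IN.probable-spam"
    # Only messages not already diverted reach the virus filter.
    if any(p.search(body) for p in VIRUS_PATTERNS):
        return "IN.probable-virus"
    # Everything else goes to the normal mail spool.
    return "Inbox"


print(choose_folder("not spam", "Please open invoice.pif now"))
```

    Note that a message already diverted as spam never reaches the virus check, which matches the order described above.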

  3. Notes

    People who request custom filters may also have a mail folder named IN.checkspam, where messages matching the custom filter are placed.

    The program which manages the application of the filters is called procmail. Its settings are in the file ".procmailrc", and a log of each incoming message and where it ends up may be found in a file named by one of the .procmailrc settings (typically the file .procmail.log in your login area).

    The procmail program is invoked by an entry in the .forward file in your login area. This means that other uses of the .forward file, namely forwarding E-mail to a different address and the vacation auto-reply program, must be handled in a different way. Both can still be done, just not with the default procedure. The most straightforward configuration for either of these uses in a SpamAssassin-enabled environment will forward or auto-reply only to non-spam messages, which is usually what you want anyway. Contact helpdesk@pa.msu.edu for assistance with either of these tasks.


Questions not covered in this FAQ? Make sure to send them in!

Last Updated: Tuesday, 21 March 2005 by G J Perkins