SpamAssassin on mail.pa.msu.edu
Part 2: How SpamAssassin Works
- SpamAssassin Overview
When you have SpamAssassin filtering set up on mail.pa.msu.edu,
each of your E-mail messages is given as input to the SpamAssassin
program, which performs a series of tests. Each test result contributes
a positive or negative number of points to the E-mail message's score,
according to whether the result makes it more or less likely that the
message is spam. At the end, the points are added up, and messages
scoring over a certain threshold number are considered "probable spam"
(and, in our configuration, those messages above an even higher
threshold score are just called "spam").
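The scoring step can be sketched in a few lines of Python. This is only
an illustration: the test names, the point values, and both threshold
numbers are made up for the example and are not the values configured
on mail.pa.msu.edu.

```python
# Hypothetical thresholds; the real values live in the local configuration.
PROBABLE_SPAM_THRESHOLD = 5.0
SPAM_THRESHOLD = 10.0


def classify(test_points):
    """Add up the points from every test and label the message."""
    total = sum(test_points.values())
    if total > SPAM_THRESHOLD:
        label = "spam"
    elif total > PROBABLE_SPAM_THRESHOLD:
        label = "probable spam"
    else:
        label = "not spam"
    return total, label


# Made-up results: positive points suggest spam, negative points suggest ham.
example = {"SUSPICIOUS_PHRASE": 2.5, "SENDER_ON_LIST": 4.0, "LOOKS_LIKE_HAM": -1.0}
print(classify(example))
```

Note that a single test never decides the outcome by itself; only the
total is compared against the thresholds.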
The tests fall into three basic categories.
1. Tests against known patterns of text.
These tests look for text patterns commonly seen in spam of various
types ("act now", "your mortgage has been approved", "my
<male relative> was <government position> in <name of
African country>", etc.). Each match adds points to the message's
total score.
2. Checks of sending IP addresses against lists of known spam originators.
These lists are maintained on the network by various organizations as a
public service, and are generally kept up to date. On the other hand,
spammers are constantly finding new systems (with or without their
owners' knowledge) from which to send unsolicited E-mail messages. You
may receive a spam message today from a system which is not yet on one
of the "spammer address" lists, so the message does not score high
enough to be called spam. The same message from the same system the
next day, after others have reported the offending site to the list,
will score additional points, increasing the chances of its total score
crossing the threshold to be called spam.
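This page does not say how the lists are consulted. Public lists of
this kind are commonly published through DNS: the octets of the sending
address are reversed, the list's zone name is appended, and the
resulting name is looked up. The sketch below assumes that convention;
the zone name "dnsbl.example.org" is a placeholder, not a list used on
mail.pa.msu.edu.

```python
import socket


def blocklist_query_name(ip_address, zone):
    """Build the DNS name used to ask a list about an IPv4 address."""
    octets = ip_address.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted IPv4 address")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip_address, zone):
    """Return True if the list's zone has a DNS entry for the address."""
    try:
        socket.gethostbyname(blocklist_query_name(ip_address, zone))
        return True
    except OSError:
        # No record found (or the lookup failed): treat as not listed.
        return False


# 192.0.2.1 is a documentation-only address; the zone is hypothetical.
print(blocklist_query_name("192.0.2.1", "dnsbl.example.org"))
```

Because the answer comes from a live lookup, the same address can be
unlisted one day and listed the next, which is the behavior described
above.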
3. Analysis of patterns within the message and comparison to your previously received messages' patterns.
This incorporates "learning" behavior and acts upon it, based on
the particular spam messages received by each individual user. This
analysis yields a final probability number based on how similar the
message is to messages previously called spam, and how similar it is to
messages previously identified as "not spam" (or "ham"). If the
database of previously identified messages is large enough (over 200
each of spam and non-spam), this probability will be assigned a score
to be added to or subtracted from the total.
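As a rough illustration of this kind of learning, the sketch below uses
a plain naive Bayes estimate with equal prior odds. It is not
SpamAssassin's actual algorithm, and the function name, the smoothing,
and the data layout (a count of trained messages containing each word)
are assumptions made for the example. Only the "over 200 of each"
requirement comes from the text above.

```python
import math

MIN_TRAINED = 200  # the text above requires more than this many of each kind


def spam_probability(tokens, spam_counts, ham_counts, n_spam, n_ham):
    """Estimate P(spam) for a message, or None if too little has been learned."""
    if n_spam <= MIN_TRAINED or n_ham <= MIN_TRAINED:
        return None
    log_odds = 0.0  # equal prior odds of spam and ham (an assumption)
    for token in set(tokens):
        # Add-one smoothing keeps unseen words from giving zero probabilities.
        p_spam = (spam_counts.get(token, 0) + 1) / (n_spam + 2)
        p_ham = (ham_counts.get(token, 0) + 1) / (n_ham + 2)
        log_odds += math.log(p_spam / p_ham)
    # Clamp so that math.exp cannot overflow on very long messages.
    log_odds = max(-50.0, min(50.0, log_odds))
    return 1.0 / (1.0 + math.exp(-log_odds))
```

A result near 1 would add points to the total, a result near 0 would
subtract points, and a result of None means the database is still too
small for this test to contribute anything.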
- Local 'probable-virus' filter
On the mail.pa.msu.edu system, a message that has gone through the
SpamAssassin filters and received a score, but has not been diverted to
the IN.spam or IN.probable-spam folder, then passes through an
additional set of filters which look for text patterns common in
messages carrying E-mail viruses. If any of these patterns is found,
the message is diverted to the IN.probable-virus folder.
If a message is not diverted to one of these specialized
mail folders, it goes into your normal mail spool file (generally
accessed as "Inbox").
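The delivery order described above can be summarized as a small
decision function. This is a sketch only: the function and argument
names are invented for the example, and the real decisions are made by
the mail filter rules rather than by Python code.

```python
def choose_folder(label, matches_virus_pattern):
    """Pick the destination folder for a message that has been scored."""
    # Spam decisions come first; such messages never reach the virus check.
    if label == "spam":
        return "IN.spam"
    if label == "probable spam":
        return "IN.probable-spam"
    # Only messages that were not diverted as spam get the virus-pattern check.
    if matches_virus_pattern:
        return "IN.probable-virus"
    return "Inbox"
```

One consequence of this order is that a spam message which also carries
a virus pattern ends up in a spam folder, not in IN.probable-virus.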
- Notes
People who request custom filters may also have a mail folder named
IN.checkspam, where messages matching the custom filter are placed.
The program which manages the application of the filters is called
procmail. Its settings are in the file .procmailrc, and a log of each
incoming message and where it ends up may be found in a file named by
one of the .procmailrc settings (typically the file .procmail.log in
your login area).
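To see where your mail has been going, the log can be summarized with a
short script. The sketch below assumes the log uses procmail's usual
abstract format, in which each delivery ends with a line of the form
"Folder: <name> <size>"; the sample lines are invented, and your own
log may look different depending on your .procmailrc settings.

```python
import re
from collections import Counter

# Matches lines such as "  Folder: IN.spam        4821".
FOLDER_LINE = re.compile(r"^\s*Folder:\s+(\S.*?)\s+(\d+)\s*$")


def count_deliveries(log_lines):
    """Count how many logged messages were delivered to each folder."""
    counts = Counter()
    for line in log_lines:
        match = FOLDER_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented sample entries in the assumed format.
sample_log = [
    "From someone@example.com  Tue Mar 22 09:15:02 2005",
    " Subject: Your mortgage has been approved",
    "  Folder: IN.spam\t\t\t\t\t   4821",
    "From other@example.com  Tue Mar 22 09:20:41 2005",
    " Subject: act now",
    "  Folder: IN.spam\t\t\t\t\t   3310",
    "From third@example.com  Tue Mar 22 09:31:10 2005",
    " Subject: document attached",
    "  Folder: IN.probable-virus\t\t\t  51200",
]
print(count_deliveries(sample_log))
```

To run it against a real log, pass an open file in place of sample_log,
for example open(".procmail.log") from your login area if that is where
your log is kept.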
The procmail program is invoked by an entry in the .forward file in
your login area. This means that other uses of the .forward file must
be handled differently. These include forwarding E-mail to a different
address and using the vacation auto-reply program. Both can still be
done, but not with the default procedure. The most straightforward
configuration for either of these uses in a SpamAssassin-enabled
environment will forward or auto-reply to only non-spam messages, which
is usually more in line with what you want to happen anyway! Contact
helpdesk@pa.msu.edu for assistance with either of these tasks.
- Links
Questions not covered in this FAQ? Make sure to send them in!
Last Updated: Tuesday, 21 March 2005 by G J Perkins