Bayesian Spam Filtering: What You Need to Know

Bayesian spam filtering is an important process with yet another strange-sounding name. “Bayesian” refers to a theorem of probability first put forth by the Reverend Thomas Bayes. In a nutshell, Bayes’ theorem describes how the probability that something is true changes in light of new evidence. Bayesian spam filtering puts that rule to work.
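To make the idea concrete, here is a minimal sketch of Bayes’ rule applied to a single word. The function name and all of the numbers are hypothetical, chosen only for illustration: it assumes half of all incoming mail is spam, and that a given word appears in 40% of spam but only 1% of legitimate mail.

```python
def posterior_spam(prior_spam, p_word_given_spam, p_word_given_ham):
    """Bayes' rule: P(spam | word) from the prior and the two likelihoods."""
    spam_evidence = p_word_given_spam * prior_spam
    ham_evidence = p_word_given_ham * (1.0 - prior_spam)
    return spam_evidence / (spam_evidence + ham_evidence)


# Hypothetical inputs: 50% of mail is spam; the word shows up in
# 40% of spam messages and 1% of legitimate ("ham") messages.
p = posterior_spam(0.5, 0.40, 0.01)
print(round(p, 3))  # 0.976
```

Under those assumed numbers, seeing the word moves the estimate from 50% to roughly 98% spam. A word that is equally common in both kinds of mail would leave the estimate unchanged.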

As a user marks certain emails as spam, the Bayesian filter compares those marked emails against the user’s legitimate emails and builds a database of what makes the marked emails look spammy. For example, a spammy email may be loaded with certain words such as Cialis, Viagra, or debt relief, while the user’s legitimate emails never, or rarely, contain those words.
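The “database” described above can be pictured as two tallies of words, one for marked spam and one for legitimate mail. The sketch below is an assumption about how such a tally might be kept, not a description of any particular filter; `tokenize`, `train`, and the sample messages are made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(spam_messages, ham_messages):
    """Count how many messages of each kind contain each word."""
    spam_counts, ham_counts = Counter(), Counter()
    for msg in spam_messages:
        # set() so a word repeated within one message is counted once
        spam_counts.update(set(tokenize(msg)))
    for msg in ham_messages:
        ham_counts.update(set(tokenize(msg)))
    return spam_counts, ham_counts


# Made-up training mail
spam = ["Cheap Viagra now", "Viagra and debt relief offers"]
ham = ["Lunch on Friday?", "Notes from the budget meeting"]

spam_counts, ham_counts = train(spam, ham)
print(spam_counts["viagra"], ham_counts["viagra"])  # 2 0
```

Each time the user marks another message, the matching tally is updated, which is how the filter keeps learning.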

As new emails come in and are marked as spam, they give the filter new evidence to weigh. In this example, if the newly marked spam messages also contain the same spammy words, the new evidence supports the spam filter’s original theory that any message containing those words is probably spam. Bayesian spam filters learn, compare, and adapt. They assign scores to certain triggers, and if an email’s combined score is too high, that message is marked and treated as spam.
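The scoring step can be sketched as follows. This is one common way to combine per-word evidence (a naive Bayes calculation with add-one smoothing), not necessarily what any given filter does. The counts, the function name, and the 0.9 cutoff are all hypothetical, and the sketch assumes at least one spam and one legitimate message have been seen.

```python
import math


def spam_probability(words, spam_counts, ham_counts, n_spam, n_ham):
    """Combine per-word evidence into P(spam | words), naive-Bayes style."""
    # Start from the prior odds implied by how much mail of each kind was seen.
    log_odds = math.log(n_spam / n_ham)
    for word in set(words):
        # Add-one smoothing keeps never-seen words from producing log(0).
        p_spam = (spam_counts.get(word, 0) + 1) / (n_spam + 2)
        p_ham = (ham_counts.get(word, 0) + 1) / (n_ham + 2)
        log_odds += math.log(p_spam / p_ham)
    # Convert log-odds back to a probability between 0 and 1.
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical tallies: number of messages containing each word,
# out of 100 spam and 100 legitimate messages.
spam_counts = {"viagra": 60, "mortgage": 30, "meeting": 2}
ham_counts = {"viagra": 0, "mortgage": 10, "meeting": 40}
THRESHOLD = 0.9  # assumed cutoff; real filters choose their own

for words in (["viagra", "mortgage"], ["mortgage"], ["meeting"]):
    p = spam_probability(words, spam_counts, ham_counts, 100, 100)
    print(words, round(p, 3), "spam" if p > THRESHOLD else "ok")
```

With these made-up tallies, a message containing both “viagra” and “mortgage” lands above the cutoff, while “mortgage” on its own raises the score without crossing it. Each trigger carries its own weight, and the weights add up.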

As a computer user, you likely appreciate how Bayesian spam filtering keeps your inbox clear of the latest spam messages touting everything from male enhancement to the latest pleas from Nigerian princes needing help converting their fortunes. As an email marketer, Bayesian spam filtering may cause you to worry about your messages getting filtered unnecessarily. By understanding how Bayesian filters work, you can avoid many of the triggers that could penalize your legitimate messages.

What traits do your messages have that could raise a red flag in the Bayesian spam filtering process?
Below are a few practical examples:
– Using the word “FREE,” especially in all caps. Your FREE offer could look like all those other FREE offers that previous users have marked as spam.
– Using a bold red font or yellow highlighting. These attributes are commonly incorporated in spammy messages, and Bayesian spam filters know it. As a result, if your email message uses similar tactics, expect the filters to ding you for it.
– Using words like Cialis, Viagra, and debt relief. Spammers love sending offers for Cialis, Viagra, male enhancement products, offshore gambling, debt relief, and mortgage-related services. While marked spam frequently contains these words, legitimate email messages rarely do. Bayesian spam filters have the databases to back up their theories, and new evidence continues to validate them, so it is wise to avoid words commonly associated with spam. Because Bayesian filters use scores, you may not be dinged as heavily for the word mortgage as you would for the word Viagra.
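To see why “FREE” in all caps can hurt more than a lowercase “free,” consider the sketch below. It assumes the filter keeps capitalization when it splits a message into tokens; whether a particular filter does so is an implementation detail. The sample messages and the `spam_share` helper are made up for illustration.

```python
import re


def tokenize(text):
    """Split text into word tokens, keeping the original capitalization."""
    return re.findall(r"[A-Za-z']+", text)


def spam_share(token, spam_msgs, ham_msgs):
    """Smoothed share of the messages containing `token` that were spam."""
    s = sum(token in tokenize(m) for m in spam_msgs)
    h = sum(token in tokenize(m) for m in ham_msgs)
    # Add-one smoothing so rarely seen tokens do not score exactly 0 or 1.
    return (s + 1) / (s + h + 2)


# Made-up training mail
spam = ["Get your FREE pills", "FREE offer inside", "FREE FREE FREE"]
ham = ["Feel free to call me", "Are you free on Friday", "The FREE sample arrived"]

print(round(spam_share("FREE", spam, ham), 2))  # 0.67
print(round(spam_share("free", spam, ham), 2))  # 0.25
```

In this toy data the all-caps token appears mostly in spam, while the lowercase word appears only in ordinary conversation, so the two earn very different scores.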

Finally, when you are using SendBlaster, always remember to hit the Spam Check button before sending out your email. You will automatically get your “spam score” based on SpamAssassin filter rules.