[R] Off topic: Spam on R-help increase?
Marc Schwartz
marc_schwartz at comcast.net
Sat Mar 10 17:20:38 CET 2007
On Sat, 2007-03-10 at 10:17 -0500, François Pinard wrote:
> [Marc Schwartz]
>
> >The "Human Spam Filter" (aka Martin) [...]
>
> The R mailing list has, indeed, been remarkably spam-free, and
> well-managed as far as I can see. I do hope, however, that Martin
> does not have to do the filtering himself -- it would be just daunting!
>
> In any case, Martin, a lot of thanks from me!
The comment was somewhat "tongue-in-cheek".
While a major proportion of spam can be filtered using automated tools,
it takes a significant amount of manual effort to configure the tools to
achieve the level of cleansing that we observe here.
On my system (laptop running FC6 Linux), I am using SpamAssassin with
Bayesian filtering enabled, along with remote spam checks such as DCC,
Razor, Pyzor and some RBLs.
I also recently started using FuzzyOCR (as a plug-in to SA) to enhance
the filtering of spam containing only graphic content. These e-mails are
of course specifically designed to defeat text-based
spam filtering.
However, I still get some that come through despite the above. There are
also 'borderline' e-mails that require manually running the spam/ham
learning scripts.
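As a rough illustration of what Bayesian filtering and the spam/ham learning step do, here is a minimal Python sketch. It is a toy naive Bayes classifier, not SpamAssassin's actual implementation; the class name `TinyBayes`, its methods, and the sample messages are all hypothetical.

```python
import math
import re
from collections import Counter


class TinyBayes:
    """Toy naive Bayes spam classifier (illustrative only, not SA's code)."""

    def __init__(self):
        # Per-class token counts and message counts.
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Lowercase word tokens; real filters also look at headers, URLs, etc.
        return re.findall(r"[a-z0-9']+", text.lower())

    def learn(self, text, label):
        """Record one message as 'spam' or 'ham' (the manual learning step)."""
        self.tokens[label].update(self.tokenize(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Return an estimate of P(spam | text) using add-one smoothing."""
        vocab = set(self.tokens["spam"]) | set(self.tokens["ham"])
        total = sum(self.messages.values())
        scores = {}
        for label in ("spam", "ham"):
            # Smoothed class prior.
            score = math.log((self.messages[label] + 1) / (total + 2))
            denom = sum(self.tokens[label].values()) + len(vocab) + 1
            for tok in self.tokenize(text):
                score += math.log((self.tokens[label][tok] + 1) / denom)
            scores[label] = score
        # Convert the two log scores to a probability; clamp to avoid overflow.
        diff = max(min(scores["ham"] - scores["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


bayes = TinyBayes()
bayes.learn("cheap pills buy now", "spam")
bayes.learn("buy cheap watches now", "spam")
bayes.learn("lm model residuals question", "ham")
bayes.learn("question about data frame merge", "ham")
```

A borderline message that the filter gets wrong is handled by calling `learn` on it again with the correct label, which shifts the token counts for later messages.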
To reach the level of filtering we see here, I
would have to spend a fair amount of time writing custom rules for SA,
and this is where, I have no doubt, Martin spends a lot of his time on
list management.
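To give a sense of what custom rules involve, the following Python sketch mimics rule-based scoring: each rule is a pattern with a score, and a message is flagged once its total reaches a threshold. This is only an analogy, not SA's rule syntax or engine. The rule names, patterns, and scores are hypothetical; the 5.0 threshold mirrors SA's default `required_score`.

```python
import re

# Hypothetical rules: (name, compiled pattern, score).
# Negative scores make a message look more legitimate.
RULES = [
    ("MENTIONS_PILLS", re.compile(r"\b(cheap|discount) (pills|meds)\b", re.I), 3.0),
    ("SUBJECT_SHOUTING", re.compile(r"^Subject:.*!{3,}", re.M), 2.5),
    ("LIST_TAG", re.compile(r"^Subject:.*\[R\]", re.M), -1.0),
]

THRESHOLD = 5.0  # assumption: same value as SA's default required_score


def score_message(raw):
    """Return (total score, names of the rules that matched)."""
    hits = [(name, pts) for name, rx, pts in RULES if rx.search(raw)]
    return sum(pts for _, pts in hits), [name for name, _ in hits]


def is_spam(raw, threshold=THRESHOLD):
    """Flag the message when its total score reaches the threshold."""
    return score_message(raw)[0] >= threshold


junk = "Subject: BUY NOW!!!\n\nCheap pills shipped overnight"
legit = "Subject: [R] lm residuals\n\nHow do I extract residuals?"
```

The manual effort lies in choosing patterns and scores so that real list traffic stays well under the threshold while new spam variants go over it.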
HTH,
Marc Schwartz