Path: news.net.uni-c.dk!newsfeeds.net.uni-c.dk!howland.erols.net!dispose.news.demon.net!demon!btnet-peer0!btnet!ctb-nntp1.saix.net!not-for-mail
From: goose
Newsgroups: comp.lang.basic.visual.misc,comp.lang.beta,comp.lang.c,comp.lang.c++
Subject: Re: RACIST RADIO HOST PIG - Off Topic
Date: Mon, 21 May 2001 15:44:08 GMT
Organization: The South African Internet Exchange
Lines: 46
Message-ID: <3b0937c8.7551@bolder.com.co.za>
References: <9e8cnc$mv14475@news.qualitynet.net> <3B0803BB.9D17B87E@eton.powernet.co.uk> <3B09337E.B27906D5@eton.powernet.co.uk>
NNTP-Posting-Host: 196.25.192.192
X-Trace: ctb-nnrp2.saix.net 990459779 22438 196.25.192.192 (21 May 2001 15:42:59 GMT)
X-Complaints-To: abuse@saix.net
NNTP-Posting-Date: 21 May 2001 15:42:59 GMT
User-Agent: tin/1.4.5-20010409 ("One More Nightmare") (UNIX) (Linux/2.2.16 (i686))
Xref: news.net.uni-c.dk comp.lang.basic.visual.misc:478860 comp.lang.beta:12888 comp.lang.c:526209 comp.lang.c++:593099

In comp.lang.c++ Richard Heathfield sucked his thumb and expounded:
> Zy Baxos wrote:
>>
>> I've always wondered what the random-looking garbage at
>> the end of stupid spam posts was - I always assumed it
>> was rot13 or a variant thereof. How does that work as a
>> counter-counter-spam measure?
>
> Some newsfeeds will filter out duplicate articles. The pseudo-random
> gibberish inhibits this process.
>
>>
>> Ob on topic question:
>>
>> How would one program a counter^3-spam measure against
>> this in VB?
>
> Given the number of newsgroups to which this is cross-posted, I cannot
> imagine an answer which is both comprehensive and topical. As a minor
> observation, however, one possible avenue to explore would be to look up
> all words longer than, say, N characters, and check them against an
> English dictionary. If fewer than P percent are found, then reject the
> article, or at least pass it on for human inspection.
> The discovery of appropriate values of N and P is left as an exercise
> for the discerning ISP's anti-spam team. :-)
>

wouldn't it be a good idea if you could look up a table which stores
frequencies of words found in postings, and then reject an article if
more than (say) 10 words, each with a frequency of less than (e.g.)
0.5%, are found adjacent to each other? looking at the original post,
it would certainly fail. the only difficulty would be getting it to
run fast enough, so that checking every single posting doesn't slow
the server down.

of course, in retaliation, the spammer could then choose his random
words out of some sort of an english dictionary, and we'd be back
where we started.

-- 
goose
cyrnfr hfr ebg13 gb ernq guvf ...
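for what it's worth, richard's dictionary check could be sketched
roughly like this (python for brevity, not VB; the defaults N=6 and
P=50 and the word set are invented for illustration, nobody's real
filter):

```python
def looks_like_spam(article, dictionary, n=6, p=50.0):
    """Return True if fewer than p percent of the words longer
    than n characters appear in the given dictionary (a set of
    lowercase words)."""
    long_words = [w.lower() for w in article.split()
                  if len(w) > n and w.isalpha()]
    if not long_words:
        return False          # nothing long enough to judge by
    found = sum(1 for w in long_words if w in dictionary)
    return 100.0 * found / len(long_words) < p
```

the tuning problem richard mentions is real: set n too low and
ordinary short words drown the signal, set p too high and posts full
of jargon or proper nouns get bounced to a human.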
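and the frequency-table idea might look something like this (again
just a sketch; the 10-word run and the 0.5% threshold are the numbers
from the post, everything else is made up):

```python
from collections import Counter

def rare_run_reject(article, freq, total, run_limit=10, rare_pct=0.5):
    """Reject when more than run_limit adjacent words are all 'rare',
    i.e. each accounts for less than rare_pct percent of the total
    word count in the frequency table (a Counter)."""
    run = 0
    for word in article.lower().split():
        pct = 100.0 * freq[word] / total if total else 0.0
        if pct < rare_pct:
            run += 1
            if run > run_limit:
                return True   # too many rare words in a row: reject
        else:
            run = 0           # a common word breaks the run
    return False
```

building and updating the Counter from the server's article stream is
the expensive part worried about above; keeping the table on disk and
decaying old counts periodically would be one way to stop every
posting from slowing the server down.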