Path: news.net.uni-c.dk!newsfeeds.net.uni-c.dk!howland.erols.net!dispose.news.demon.net!demon!btnet-peer0!btnet!ctb-nntp1.saix.net!not-for-mail
From: goose
Newsgroups: comp.lang.basic.visual.misc,comp.lang.beta,comp.lang.c,comp.lang.c++
Subject: Re: RACIST RADIO HOST PIG - Off Topic
Date: Mon, 21 May 2001 15:44:08 GMT
Organization: The South African Internet Exchange
Lines: 46
Message-ID: <3b0937c8.7551@bolder.com.co.za>
References: <9e8cnc$mv14475@news.qualitynet.net> <3B0803BB.9D17B87E@eton.powernet.co.uk> <3B09337E.B27906D5@eton.powernet.co.uk>
NNTP-Posting-Host: 196.25.192.192
X-Trace: ctb-nnrp2.saix.net 990459779 22438 196.25.192.192 (21 May 2001 15:42:59 GMT)
X-Complaints-To: abuse@saix.net
NNTP-Posting-Date: 21 May 2001 15:42:59 GMT
User-Agent: tin/1.4.5-20010409 ("One More Nightmare") (UNIX) (Linux/2.2.16 (i686))
Xref: news.net.uni-c.dk comp.lang.basic.visual.misc:478860 comp.lang.beta:12888 comp.lang.c:526209 comp.lang.c++:593099

In comp.lang.c++ Richard Heathfield sucked his thumb and expounded:
> Zy Baxos wrote:
>>
>> I've always wondered what the random-looking garbage at
>> the end of stupid spam posts was - I always assumed it
>> was rot13 or a variant thereof. How does that work as a
>> counter-counter-spam measure?
>
> Some newsfeeds will filter out duplicate articles. The pseudo-random
> gibberish inhibits this process.
>
>>
>> Ob on topic question:
>>
>> How would one program a counter^3-spam measure against
>> this in VB?
>
> Given the number of newsgroups to which this is cross-posted, I cannot
> imagine an answer which is both comprehensive and topical. As a minor
> observation, however, one possible avenue to explore would be to look up
> all words longer than, say, N characters, and check them against an
> English dictionary. If fewer than P percent are found, then reject the
> article, or at least pass it on for human inspection.
> The discovery of appropriate values of N and P is left as an exercise
> for the discerning ISP's anti-spam team. :-)
>

wouldn't it be a good idea if you could look up a table which stores
frequencies of words found in postings, and then reject an article if
more than (say) 10 words, each with a frequency of less than (e.g.)
0.5%, are found adjacent to each other? looking at the original post,
it would certainly fail. the only difficulty would be getting it to
run fast enough, so that checking every single posting doesn't slow
the server down.

of course, in retaliation, the spammer could then choose his random
words out of some sort of an english dictionary, and we'd be back
where we started.

-- 
goose
cyrnfr hfr ebg13 gb ernq guvf ...
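for what it's worth, richard's dictionary check could be sketched
roughly like this (python for brevity, not VB; the defaults N=6 and
P=50 and the word set are invented for illustration, nobody's real
filter):

```python
def looks_like_spam(article, dictionary, n=6, p=50.0):
    """Return True if fewer than p percent of the words longer
    than n characters appear in the given dictionary (a set of
    lowercase words)."""
    long_words = [w.lower() for w in article.split()
                  if len(w) > n and w.isalpha()]
    if not long_words:
        return False          # nothing long enough to judge by
    found = sum(1 for w in long_words if w in dictionary)
    return 100.0 * found / len(long_words) < p
```

the tuning problem richard mentions is real: set n too low and
ordinary short words drown the signal, set p too high and posts full
of jargon or proper nouns get bounced to a human.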
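and the frequency-table idea might look something like this (again
just a sketch; the 10-word run and the 0.5% threshold are the numbers
from the post, everything else is made up):

```python
from collections import Counter

def rare_run_reject(article, freq, total, run_limit=10, rare_pct=0.5):
    """Reject when more than run_limit adjacent words are all 'rare',
    i.e. each accounts for less than rare_pct percent of the total
    word count in the frequency table (a Counter)."""
    run = 0
    for word in article.lower().split():
        pct = 100.0 * freq[word] / total if total else 0.0
        if pct < rare_pct:
            run += 1
            if run > run_limit:
                return True   # too many rare words in a row: reject
        else:
            run = 0           # a common word breaks the run
    return False
```

building and updating the Counter from the server's article stream is
the expensive part worried about above; keeping the table on disk and
decaying old counts periodically would be one way to stop every
posting from slowing the server down.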