Topic: Spam today on the web
These days, many forums implement CAPTCHAs to fight spam bots and other web pollutants.
I believe there is a smarter and more robust solution, and a lighter one to implement, since it needs no graphics library.
Besides being unreadable for some people, CAPTCHAs have another problem: they do nothing against bad guys who register manually and use forums to link to their own web sites.
Often the site in question is pornographic, or sells financial services or drugs. And I think this is the real problem: when you counter spam bots, you are fighting the soldiers, not the command.
The next step beyond fighting spam bots is fighting the people behind them, by analysing the content and the links they provide.
PHP can download a web page. Coupled with its powerful regular expression functions, you can do a lot of things:
- Use a customisable list of banned words, compare it against the content of the linked web site, and compute a spam score.
- Track JavaScript redirections, detect pop-up openers, viruses, spyware, etc.
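To make the idea concrete, here is a minimal sketch of the first two bullets. The function names (`spam_score`, `has_js_redirect`), the word list, and the sample page are all examples I made up, not an existing library:

```php
<?php
// Sketch only: the helper names and the banned-word list are
// examples, to be replaced by the forum's own configuration.
function spam_score(string $html, array $bannedWords): int
{
    // Strip tags so only the visible text is scanned.
    $text = strtolower(strip_tags($html));
    $score = 0;
    foreach ($bannedWords as $word) {
        // Count every occurrence of each banned word.
        $score += preg_match_all('/' . preg_quote($word, '/') . '/i', $text);
    }
    return $score;
}

// Naive detector for the second bullet: JavaScript redirections.
function has_js_redirect(string $html): bool
{
    return (bool) preg_match('/window\.location|location\.(href|replace)/i', $html);
}

$banned = ['viagra', 'casino', 'porn'];   // customisable list
$page   = '<html><body>Cheap viagra! Best casino online.</body></html>';
// In real use, the page would come from: $page = file_get_contents($url);

echo spam_score($page, $banned); // 2
?>
```

A threshold on the score (tuned per forum) would then decide whether the inscription is rejected or flagged for a moderator.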
Of course, it is possible to change PHP's User-Agent string to emulate an ordinary browser, so the spammer's site serves the same page it would serve a visitor.
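For example, a stream context passed to `file_get_contents()` can carry a browser-like User-Agent; `ini_set('user_agent', ...)` works too. The UA string and URL below are placeholders:

```php
<?php
// Sketch: present a browser-like User-Agent when fetching a page.
// The UA string and URL are examples only.
$context = stream_context_create([
    'http' => [
        'method'  => 'GET',
        'header'  => "User-Agent: Mozilla/5.0 (Windows NT 10.0; rv:109.0) Gecko/20100101 Firefox/115.0\r\n",
        'timeout' => 10,   // don't let a slow spam host block the forum
    ],
]);
// Real use:
// $html = file_get_contents('http://example.com/', false, $context);

// Alternative: set the default User-Agent globally for this script.
ini_set('user_agent', 'Mozilla/5.0 (compatible; ForumSpamCheck/0.1)');
?>
```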
We can also use online databases of blacklisted web site addresses, like those used against e-mail spam.
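Such blacklists are typically queried over DNS. A rough sketch of the mechanism, where the function name and the blacklist zone are hypothetical (real services have their own zones and usage policies):

```php
<?php
// Sketch: check a host against a DNS blacklist, the same mechanism
// mail servers use. 'dnsbl.example.net' is a placeholder zone, not
// a real service.
function is_blacklisted(string $host, string $dnsblZone): bool
{
    $query = $host . '.' . $dnsblZone;
    // Convention of DNS blacklists: a listed host resolves to an
    // address inside the zone; an unlisted one does not resolve.
    return checkdnsrr($query, 'A');
}

// Example call (placeholder names, so this returns false):
// is_blacklisted('spammydomain.example', 'dnsbl.example.net');
?>
```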
Running regular expressions on the URLs themselves is another possibility.
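For instance, suspicious URLs can often be rejected before downloading anything at all. The function name and patterns below are examples, to be tuned per forum:

```php
<?php
// Sketch: cheap regex filters applied to URLs posted in a message,
// before fetching anything. Patterns are illustrative only.
function url_looks_spammy(string $url): bool
{
    $patterns = [
        '/\b(porn|casino|viagra)\b/i',   // banned words inside the URL itself
        '/\d{1,3}(\.\d{1,3}){3}/',       // raw IP address instead of a hostname
    ];
    foreach ($patterns as $p) {
        if (preg_match($p, $url)) {
            return true;
        }
    }
    return false;
}

var_dump(url_looks_spammy('http://best-casino.example/'));  // bool(true)
var_dump(url_looks_spammy('http://php.net/manual/'));       // bool(false)
?>
```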
What do you think about that?