The Geeklog Spam-X plugin was created to fight the problem of comment spam for Geeklog systems. If you are unfamiliar with comment spam, you may want to read the Comment Spam Manifesto.
Spam protection in Geeklog is mostly based on the Spam-X plugin, originally developed by Tom Willet. It has a modular architecture that allows it to be extended with new modules to fight the spammer's latest tricks, should the need arise.
Geeklog and the Spam-X plugin will check the following for spam:
The Spam-X plugin was built to be extensible so that it can easily adapt to changes the comment spammers might make. There are three types of modules: Examine, Action, and Admin. Each module is contained in a single file that can simply be dropped in, and it will be added to the plugin.
Geeklog ships with the following examine modules:
The Personal Blacklist module lets you add keywords and URLs that typically exist in spam posts. When you're being hit by spam, make sure to add the URLs of those spam posts to your Personal Blacklist so that they can be filtered out automatically, should the spammer try to post them again.
This will also help you get rid of spam that made it through, as you can then use the Mass Delete Comments and Mass Delete Trackbacks modules to easily remove large numbers of spam posts from your database.
The Personal Blacklist also has an option to import the Geeklog censor list and ban all comments which contain one of those words. This or an expanded list might be useful for a website that caters to children. Then no comments with offensive language could be posted.
Sometimes you will encounter spam that is coming from one or only a few IP addresses. By simply adding those IP addresses to the IP Filter module, any posts from these IPs will be blocked automatically.
In addition to single IP addresses, you can also add IP address ranges, either in CIDR notation or as simple from-to ranges.
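The three kinds of entries (single address, CIDR block, from-to range) can be illustrated with a minimal Python sketch. This is not the plugin's code: the function name `ip_matches` and the `from-to` syntax with a hyphen as separator are assumptions made for the example.

```python
import ipaddress


def ip_matches(ip, entry):
    """Return True if `ip` matches a filter entry.

    `entry` may be a single address, a CIDR block ("192.168.1.0/24"),
    or a from-to range ("10.0.0.1-10.0.0.9"). The hyphen syntax is an
    assumption for this sketch, not necessarily what Spam-X expects.
    """
    addr = ipaddress.ip_address(ip)
    if "/" in entry:
        # CIDR notation; strict=False tolerates host bits being set.
        return addr in ipaddress.ip_network(entry, strict=False)
    if "-" in entry:
        start, end = (ipaddress.ip_address(p.strip()) for p in entry.split("-", 1))
        # IPv4 and IPv6 addresses cannot be ordered against each other.
        if start.version != addr.version or end.version != addr.version:
            return False
        return start <= addr <= end
    return addr == ipaddress.ip_address(entry)
```

For example, `ip_matches("192.168.1.77", "192.168.1.0/24")` is true, while `ip_matches("10.0.0.10", "10.0.0.1-10.0.0.9")` is false.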
Please note that IP addresses aren't really a good filter criterion. While some ISPs and hosting services are known to host spammers, it won't help much to block an IP address belonging to one of the well-known ISPs. Often, the spammer will get a new IP address the next time they connect to the internet, while the blocked IP address will be reused and may then be assigned to some innocent user.
As of Geeklog 2.2.1, IPv6 addresses are supported.
This module is only useful in a few special cases. Here you enter the IP address of a webserver that hosts domains for which you see spam. Some spammers have a lot of their sites on only a few webservers, so instead of adding lots of domains to your blacklist, you only add the IP addresses of those webservers. The module will then check all the URLs in a post to see if any of them is hosted on one of those blacklisted webservers.
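The idea can be sketched in a few lines of Python: extract the URLs from a post, resolve each hostname, and compare the result against the blacklisted server addresses. The function name, the URL pattern, and the injectable `resolve` parameter are all assumptions for illustration; the plugin's actual implementation may differ.

```python
import re
import socket
from urllib.parse import urlparse

# Simplified URL pattern, good enough for a sketch.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def hosted_on_blacklisted_server(text, blacklisted_ips, resolve=socket.gethostbyname):
    """Return True if any URL in `text` resolves to a blacklisted IP."""
    for url in URL_RE.findall(text):
        host = urlparse(url).hostname
        if not host:
            continue
        try:
            ip = resolve(host)
        except OSError:
            # Unresolvable hosts are skipped rather than treated as spam.
            continue
        if ip in blacklisted_ips:
            return True
    return False
```

Passing the resolver as a parameter keeps the check testable without network access; by default it uses the system's DNS lookup.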
This module lets you filter for certain HTTP headers. Every HTTP request sent to your site is accompanied by a series of headers identifying, for example, the browser that your visitor uses, their preferred language, and other information.
With the Header filter module, you can block HTTP requests with certain headers. For example, some spammers are using Perl scripts to send their spam posts. The user agent (browser identification) sent by Perl scripts is usually something like "libwww-perl/5.805" (the version number may vary). So to block posts made by this user agent, you would enter:
| Field | Value |
|---|---|
| Header | User-Agent |
| Content | ^libwww-perl |
This would block all posts from user agents beginning with "libwww-perl".
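The effect of such a rule can be tried out with a small Python sketch. It assumes that the Content field is treated as a regular expression (as the leading `^` suggests) and that matching is case-sensitive; the function name `header_blocked` is hypothetical.

```python
import re


def header_blocked(headers, rules):
    """Return True if any (header name, pattern) rule matches the request.

    `headers` maps header names to values; `rules` is a list of
    (name, regex) pairs. Header names are compared case-insensitively,
    as HTTP requires; case-sensitive pattern matching is an assumption.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    for name, pattern in rules:
        value = lowered.get(name.lower())
        if value is not None and re.search(pattern, value):
            return True
    return False


rules = [("User-Agent", "^libwww-perl")]
print(header_blocked({"User-Agent": "libwww-perl/5.805"}, rules))
```

Because of the `^` anchor, a user agent that merely contains "libwww-perl" somewhere in the middle would not be blocked by this rule.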
Stop Forum Spam (SFS) is a centralized, server-based service that provides lists of IP addresses, usernames, and email addresses of known forum and blog spammers. With this module enabled, the IP address and email address of each new user registration are checked against the SFS database. If either is found, the Geeklog user account will not be created.
SFS is a free service and can be found at www.stopforumspam.com.
Privacy Notice: Enabling SFS means that user information (IP address and email address) from your site is sent to a third party. In some jurisdictions you may have to inform your users about this fact, so please check your local privacy laws.
With this module enabled, you can limit the number of links that may appear in a post or user profile. To enable the module and set the number of links, update the Spam-X configuration. If the module is enabled, you should allow at least 1 link, since Homepage is a default user field that is filled in when a user creates a profile.
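A link limit of this kind boils down to counting URLs and comparing the count with the configured maximum. The following Python sketch shows the principle; the URL pattern and the function names are assumptions, and the plugin may count links differently (for example, by looking at HTML anchors).

```python
import re

# Simplified pattern: anything starting with http:// or https://.
LINK_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def count_links(text):
    """Count the URLs that appear in `text`."""
    return len(LINK_RE.findall(text))


def too_many_links(text, max_links=5):
    """Return True if `text` contains more than `max_links` links."""
    return count_links(text) > max_links
```

With a limit of 1, a profile that contains only a homepage URL passes, while a post with two links is rejected.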
With this module enabled, you can use the Akismet service provided at https://akismet.com/ to check whether submitted content is spam. To enable this module, sign up with Akismet, get your API key, and enter it under Configuration > Spam-X > Modules > Akismet > API Key. The module takes the author name, IP address, and text of the submitted content and sends them to the Akismet service, which returns a response indicating whether it considers the content spam.
Once one of the examine modules detects a spam post, the action modules will decide what to do with the spam. Most of the time, you will simply want to delete the post then, so this is what the Delete Action module does.
As the name implies, the Mail Admin Action module sends an email to the site admin when a spam post is encountered. Since this can cause quite a lot of emails to be sent, it is disabled by default.
Action modules have to be enabled specifically before they are used (examine modules, on the other hand, are activated by simply dropping them into the Spam-X directory). Every action module has a unique number. To enable action modules, add up the numbers of all the modules you want to use and enter the sum as the value of the spamx config variable in Geeklog's main configuration.
The Delete Action module has the value 128, while the Mail Admin Action module has the value 8. So to activate both modules, add 128 + 8 = 136 and enter that in the Configuration admin panel.
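Since both values are powers of two, the sum can be read as a set of bit flags. The Python sketch below illustrates the arithmetic under that assumption; the helper names are hypothetical and the plugin's own check may be implemented differently.

```python
DELETE_ACTION = 128      # value given above for the Delete Action module
MAIL_ADMIN_ACTION = 8    # value given above for the Mail Admin Action module


def combine(*module_values):
    """Combine module numbers into a single config value."""
    total = 0
    for value in module_values:
        total |= value  # same as adding, as long as each value is used once
    return total


def is_enabled(config_value, module_value):
    """Return True if the module's bit is set in the config value."""
    return (config_value & module_value) == module_value


spamx = combine(DELETE_ACTION, MAIL_ADMIN_ACTION)
print(spamx)  # 136
```

With the default value of 128, `is_enabled(128, MAIL_ADMIN_ACTION)` is false, which matches the Mail Admin Action module being disabled by default.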
Some examine modules, such as the SNL (Spam Number of Links) Examine module, are complemented by a matching action module, here the SNL Action module, which ensures that SNL is notified of spam posts caught by other examine modules. These modules can be enabled or disabled in the Spam-X configuration.
Some modules may "piggyback" on the Delete Action module, i.e. when you activate the Delete Action module, you also enable the related action modules.
The Admin modules for the Personal Blacklist, IP Filter, IP of URL Filter, and HTTP Header Filter modules provide you with a form to add new entries. To delete an existing entry, simply click on it.
The Log View module lets you inspect and clear the Spam-X logfile. The logfile contains additional information about the spam posts, e.g. which IP address they came from, the user id (if posted by a logged-in user), and which of the examine modules caught the spam post.
In case a large number of spam posts made it through without being caught, the Mass Delete Comments and Mass Delete Trackbacks modules will help you get rid of them easily. Before you use these modules, make sure to add the URLs or keywords from those spam posts to your Personal Blacklist.
MT-Blacklist was a blacklist, i.e. a listing of URLs that were used in spam posts, originally developed for Movable Type (hence the name) and maintained by Jay Allen.
Maintaining a blacklist is a lot of work, and you're continually playing catch-up with the spammers. Therefore, Jay Allen eventually discontinued MT-Blacklist on the assumption that new and better methods to detect spam are now available.
Starting with Geeklog 1.4.1, Geeklog no longer uses MT-Blacklist. All MT-Blacklist entries are removed from the database when you upgrade to Geeklog 1.4.1 and the MT-Blacklist examine and admin modules are no longer included.
Trackbacks are also run through Spam-X before they will be accepted by Geeklog. There are also some additional checks that can be performed on trackbacks: Geeklog can be configured to check if the site that supposedly sent the trackback actually contains a link back to your site. In addition, Geeklog can also check if the IP address of the site in the trackback URL matches the IP address that sent the trackback. Trackbacks that fail any of these tests are usually spam. Please refer to the documentation for the configuration for more information.
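The link-back check described above can be sketched in Python as follows. The sketch works on HTML that has already been fetched and only looks at `href` attributes of anchor tags; the class and function names are hypothetical, and Geeklog's own check may be stricter or more lenient.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href values of all <a> tags in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_back(html, site_url):
    """Return True if `html` contains a link that starts with `site_url`."""
    collector = LinkCollector()
    collector.feed(html)
    return any(href.startswith(site_url) for href in collector.hrefs)
```

A trackback whose source page yields `links_back(page_html, "https://mysite.example")` as false would fail this test and could be rejected.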
The Spam-X plugin's configuration can be changed from the Configuration admin panel:
| Variable | Default Value | Description | 
|---|---|---|
| logging | true | Whether to log recognized spam posts in the spamx.log logfile (if set to true) or not (false). | 
| timeout | 5 | Timeout (in seconds) for contacting external services such as Akismet and SFS. | 
| notification_email | $_CONF['site_mail'] | Email address to which spam notifications are sent when the Mail Admin action module is enabled. | 
| spamx_action | 128 | This only exists as a fallback in case $_CONF['spamx'] in Geeklog's main configuration is not set, i.e. $_CONF['spamx'] takes precedence. | 
| max_age | 0 | The maximum age in days to keep Spam-X records since their last update (0 = infinite). | 
| records_delete | 'email', 'IP' | The Spam-X record types to delete when max age is reached. Default types include: | 

| Variable | Default Value | Description | 
|---|---|---|
| sfs_enabled | true | Whether the Stop Forum Spam (SFS) module is enabled or not. If enabled, the email and IP addresses of new user registrations are checked with StopForumSpam.com to see if they belong to known spammers. For more information see the SFS introduction. | 
| sfs_confidence | 25 | The threshold for the Stop Forum Spam confidence score (as a percentage). The score is a reasonably good indicator that the field under test would result in unwanted activity. The value must be between 1 and 100. | 
| snl_enabled | true | Whether the Spam Number of Links (SNL) module is enabled or not. If enabled it will only allow a specified number of links in a post or when a user creates or updates their profile. For more information see the SNL introduction. | 
| snl_num_links | 5 | The maximum number of links allowed in a post or profile before it is considered spam. | 
| akismet_enabled | false | Whether the Akismet module is enabled or not. If enabled, it checks with the Akismet service to determine whether the submitted content is spam (using the IP address, author name, and text of the content). For more information see the Akismet introduction. | 
| akismet_api_key | (none) | The API key you obtained when signing up for Akismet. | 
Further information as well as a support forum for the Spam-X plugin can be found on the Spam-X Plugin's Homepage and in the Geeklog Wiki.