NOTICE: This package has been removed from pkgsrc.
mail/lua-OSBF
Lua C module for text classification
Branch: CURRENT
Version: 2.0.4
Package name: osbf-lua-2.0.4
Maintainer: shannonjr
OSBF-Lua (Orthogonal Sparse Bigrams with confidence Factor) is a Lua C module
for text classification. It is a port of the OSBF classifier implemented in
the CRM114 project. By using Lua, a powerful yet lightweight and fast
scripting language, this implementation keeps the focus on the classification
task itself and makes it easier to build and test more elaborate filters and
training methods.
The OSBF algorithm is a typical Bayesian classifier enhanced with two
techniques originally developed for the CRM114 project: Orthogonal Sparse
Bigrams (OSB), for feature extraction, and Exponential Differential Document
Count (EDDC, a.k.a. Confidence Factor), for automatic feature selection.
Combined, these two techniques produce a highly accurate classifier. OSBF
was developed with two classes in mind, SPAM and NON-SPAM, so performance
with more than two classes may not be the same.
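To illustrate the feature-extraction idea behind OSB, here is a minimal Python sketch. It is not the module's C implementation. It assumes whitespace tokenization and a five-token sliding window; both are assumptions made for illustration, as is the function name osbf_features-style helper `osb_features`. Each token is paired with each of the up to four tokens preceding it, and the number of skipped tokens is kept as part of the feature so that pairs at different distances stay distinct.

```python
def osb_features(tokens, window=5):
    """Return sparse bigrams as (earlier_token, skipped_count, current_token).

    Illustrative sketch only: the window size of 5 is an assumption,
    not a value taken from the OSBF-Lua sources.
    """
    features = []
    for i, current in enumerate(tokens):
        # Pair the current token with each earlier token inside the window.
        for distance in range(1, window):
            j = i - distance
            if j < 0:
                break  # ran off the start of the token list
            # distance - 1 is the number of tokens skipped between the pair.
            features.append((tokens[j], distance - 1, current))
    return features


# Hypothetical usage: whitespace tokenization of a short message.
for feature in osb_features("buy cheap pills now".split()):
    print(feature)
```

A classifier built on such features would then count how often each feature occurs per class; that counting and the EDDC weighting are not shown here.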
spamfilter.lua is an anti-spam filter written in Lua using the OSBF-Lua
module. It takes special advantage of EDDC to introduce TONE-HR, a highly
effective training method. Enhancing a classical Bayesian classifier with
OSB, EDDC and TONE-HR produced the best spam filtering performance in
TREC's Spam Track 2006 and the CEAS 2008 Live Spam Filter Challenge.
Master sites:
SHA1: 6fd4fb6496c20e9340cdcff4820c50a793e2ea27
RMD160: ba808072739de2bcb40ce81f0177ef7588508670
Filesize: 80.413 KB
Version history:
- (2008-10-14) Package deleted from pkgsrc
- (2008-09-08) Package added to pkgsrc.se, version osbf-lua-2.0.4 (created)
CVS history:
2008-10-13 13:34:35 by John R. Shannon | Files touched by this commit (6)
Log message:
Remove lua-OSBF - renamed OSBF-lua

2008-10-13 08:03:52 by OBATA Akio | Files touched by this commit (1)
Log message:
lib/lua/5.1 is owned by lang/lua.

2008-10-10 15:24:58 by Thomas Klausner | Files touched by this commit (1)
Log message:
local is not a valid CATEGORY, use "mail".

2008-09-08 12:31:23 by John R. Shannon | Files touched by this commit (6)
Log message:
OSBF-Lua (Orthogonal Sparse Bigrams with confidence Factor) is a Lua C module
for text classification. It is a port of the OSBF classifier implemented in
the CRM114 project. This implementation attempts to put focus on the
classification task itself by using Lua as the scripting language, a powerful
yet light-weight and fast language, which makes it easier to build and test
more elaborated filters and training methods.
The OSBF algorithm is a typical Bayesian classifier but enhanced with two
techniques originally developed for the CRM114 project: Orthogonal Sparse
Bigrams - OSB, for feature extraction, and Exponential Differential Document
Count - EDDC (a.k.a Confidence Factor), for automatic feature selection.
Combined, these two techniques produce a highly accurate classifier. OSBF
was developed focused on two classes, SPAM and NON-SPAM, so the performance
for more than two classes may not be the same.
spamfilter.lua is an anti-spam filter written in Lua using the OSBF-lua
module. It takes special advantage of EDDC to introduce TONE-HR, a highly
effective training method. The combination of OSB, EDDC and TONE-HR to
enhance a classical Bayesian classifier resulted in the best spam filtering
performance in TREC's Spam Track 2006 and the CEAS 2008 Live Spam Filter
Challenge.