Archive for January, 2008

Spamassassin config

Update to a previous post of mine, in which I posted my user_prefs file for SpamAssassin.

Some notes:

  • SpamAssassin usually queries RBLs (Realtime Blackhole Lists); you can switch these checks off with skip_rbl_checks. If you use them, you might want a caching DNS server like dnsmasq.
  • The Bayesian filter works great here.
  • Check out auto-whitelisting.
  • The SpamAssassin wiki is really good.
  • If you have more than one user, spamd is probably worth checking out.
  • sa-compile compiles the rules for faster performance.
  • Language/locale checking sounds cool. I found the option in a config generator. You will find other options explained there too, like report_safe.
use_auto_whitelist 0
score DATE_IN_FUTURE_12_24      5
score HTML_MESSAGE              3.3
score MIME_HTML_ONLY            5
score MULTIPART_ALT_NON_TEXT    3
score BAYES_00 -4
score BAYES_05 -2
score BAYES_95 6
score BAYES_99 9
score GAPPY_SUBJECT             5
score FROM_EXCESS_BASE64        3
score FROM_HAS_ULINE_NUMS       3
score UPPERCASE_25_50           3
score UPPERCASE_50_75           4
score SUBJ_ALL_CAPS             2
body EVIL_WORDS         /(software|petroleum|casino|kurs|profit|symbol)/i
score EVIL_WORDS                2
describe EVIL_WORDS     evil_words
body EVIL_WORDS2        /(price|oil|bonus|prognose|borse|preis)/i
score EVIL_WORDS2               2
describe EVIL_WORDS2    evil_words2
body EVIL_WORDS3        /(casino|lottery|lotto|euro|free|cheap|health|pharmacy)/i
score EVIL_WORDS3               3
describe EVIL_WORDS3    evil_words3
body DOLLAR_SIGN        /[\$]/
score DOLLAR_SIGN       3
body PERCENT_SIGN       /[\%]/
score PERCENT_SIGN      3
rewrite_header subject         ***SPAM***
report_safe             0
#skip_rbl_checks         1
ok_languages            en de
# Mail using locales used in these country codes will not be marked
# as being possibly spam in a foreign language.
ok_locales              en de
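
Before adding word lists like EVIL_WORDS to the config, it can help to try the pattern on a few sample lines. The following is only a rough sketch: grep -E stands in for the Perl regex engine that SpamAssassin actually uses, and the helper name matches_rule is made up for this example.

```shell
#!/bin/bash
# Succeed if the text on stdin matches the given pattern, ignoring case.
# Assumption: grep -E behaves like SpamAssassin's Perl regex for a plain alternation.
matches_rule() {
    grep -Eiq -- "$1"
}

# The same alternation as in the EVIL_WORDS body rule above.
EVIL_WORDS='(software|petroleum|casino|kurs|profit|symbol)'

if echo 'Visit our CASINO today' | matches_rule "$EVIL_WORDS"; then
    echo 'EVIL_WORDS would hit'
fi

# There are no word boundaries in the pattern, so substrings hit as well.
if echo 'a symbolic link' | matches_rule "$EVIL_WORDS"; then
    echo 'EVIL_WORDS also hits on "symbolic"'
fi
```

If the substring hits are not wanted, wrapping the group in \b...\b inside the body rule restricts it to whole words. After editing the file, spamassassin --lint reports syntax errors in the configuration.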


Spamassassin and Sylpheed

Sylpheed is a really cool mail program. It has fewer icons than Thunderbird, but it is very fast and gets the job done well.

Junk mail filtering is not built in, which makes sense, since there are great programs out there that we should reuse.

Sylpheed can call external programs, and a preset fills in the commands for bsfilter and bogofilter. However, my favorite, SpamAssassin, is not among the presets.

Learning command for Junk: sa-learn --spam
Learning command for Not Junk: sa-learn --ham
Classifying command: (see following text)

As a classifying command, Sylpheed expects these exit codes (like bogofilter's):
0 for spam; 1 for non-spam; 2 for unsure
However, spamassassin -e exits with:
0 for non-spam, 5 for spam

So, save the following script as /home/user/.spamassassin/spamcheck.sh:

#!/bin/bash
# Sylpheed expects the exit code
#       0 for spam; 1 for non-spam; 2 for unsure; 3 for I/O or other errors.
# spamassassin -e exits with
#       0 for non-spam, 5 for spam
spamassassin -e "$1"
RETURN="$?"
[[ "$RETURN" == "0" ]] && exit 1
[[ "$RETURN" == "5" ]] && exit 0
# any other exit code is reported as unsure
exit 2

Set it in Sylpheed as the classifying command:
bash /home/user/.spamassassin/spamcheck.sh
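
If spamd is running anyway, a variant based on spamc might be faster, because the rules are not loaded on every call. This is an untested sketch under some assumptions: spamd is reachable with spamc's defaults, Sylpheed passes the message file as the first argument (as in the script above), and the function name to_sylpheed_code is made up here. spamc -c exits with 1 for spam and 0 otherwise.

```shell
#!/bin/bash
# Translate a spam checker's exit code into what Sylpheed expects:
#   0 = spam, 1 = non-spam, 2 = unsure
# $1: exit code of the checker, $2: the code that checker uses for "spam"
to_sylpheed_code() {
    case "$1" in
        0)    echo 1 ;;   # checker says non-spam
        "$2") echo 0 ;;   # checker says spam
        *)    echo 2 ;;   # anything else: treat as unsure
    esac
}

# Only classify when a message file is given, so the function
# can also be tried on its own.
if [ -n "$1" ]; then
    # spamc reads the message from stdin; -c prints score/threshold, which we drop
    spamc -c < "$1" > /dev/null
    exit "$(to_sylpheed_code "$?" 1)"
fi
```

Note that spamc -c also exits with 0 when processing fails, so with this variant a spamd that is down looks like non-spam rather than unsure.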

If you want to save 0.1 seconds per call, compile the attached C file spamcheck (c, 2 KB) with

gcc -Wall -Wextra -Werror -ansi -pedantic -o spamcheck spamcheck.c

and use the resulting spamcheck executable instead.


Rip audio files from Flash videos (youtube etc.)

This describes how to rip the audio from FLV video files, as found on YouTube and similar sites, and save it as Ogg or MP3.
You need:

  • youtube-dl, a Python script for fetching the FLV file from YouTube (optional if you already have the FLV file)
  • mplayer for extracting the audio data from the video
  • oggenc for encoding the raw audio to an Ogg audio file, or
  • lame for encoding the raw audio to an MP3 audio file

The commands:

python youtube-dl.py -t 'http://www.youtube.com/watch?v=<videohash>'
mplayer -ao pcm -vo null '<yourvideo>.flv'
oggenc audiodump.wav -o '<yourtargetoggfile>.ogg'
lame audiodump.wav '<yourtargetmp3file>.mp3'
rm audiodump.wav
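
For several videos, the steps above could be wrapped in a small function. This is a sketch I have not run against real files; the names ogg_name and rip_audio are made up, and it assumes GNU mktemp plus the mplayer suboption pcm:file=, which sets the output name instead of the default audiodump.wav.

```shell
#!/bin/bash
# Derive the target name: "clip.flv" -> "clip.ogg"
ogg_name() {
    printf '%s\n' "${1%.flv}.ogg"
}

# Extract and encode the audio of one FLV file (needs mplayer and oggenc).
rip_audio() {
    local wav status
    wav=$(mktemp --suffix=.wav) || return 1
    # dump the audio track to the temporary WAV file, then encode it
    mplayer -really-quiet -vo null -ao "pcm:file=$wav" "$1" &&
        oggenc -Q "$wav" -o "$(ogg_name "$1")"
    status=$?
    rm -f "$wav"
    return "$status"
}
```

A loop like for f in *.flv; do rip_audio "$f"; done would then convert every FLV file in the current directory.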


keeping old revisions

One might be too lazy to set up or use a revision control system like CVS, SVN or git for small projects, or when working on two or three simple files.

This script creates a directory “backup” and places an archive of the files in the current directory (not recursive) there, numbered in increasing order.

E.g. if you start it in a folder “Bsp-3”, it will create “backup/Bsp-3-1.tar.bz2”, next time “backup/Bsp-3-2.tar.bz2”, etc.
It also cleans up duplicates, so you cannot call it too often!

Small, clean, easy. I love it!

backup-point.sh

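#!/bin/bash
# Archive the files of the current directory (not recursive) as
# backup/<dirname>-<n>.tar.bz2 and delete archives with identical content.
DIRNAME=$(basename "$PWD")
EXT=tar.bz2
mkdir -p backup
cleanup_unneeded(){
    cd backup || exit 1
    # with nullglob, an empty backup directory yields no names instead of the literal pattern
    shopt -s nullglob
    FILES=(*."$EXT")
    if [ "${#FILES[@]}" -gt 0 ]; then
        # keep one file name per checksum (the first 32 characters of each md5sum line)
        UNIQ=$(md5sum "${FILES[@]}" | sort | uniq --check-chars=32 | cut -d ' ' -f 3-)
        for i in "${FILES[@]}"
        do
            # -F -x: compare the name as a fixed string against whole lines
            if ! echo "$UNIQ" | grep -Fxq "$i"
            then
                echo "deleting unnecessary $i."
                rm "$i"
            fi
        done
    fi
    cd ..
}
cleanup_unneeded
for((i=1;i<200;i++)); do
    FILENAME="backup/$DIRNAME-$i.$EXT"
    if ! test -f "$FILENAME"; then
        # NUL-separated names keep file names with spaces intact
        find . -maxdepth 1 -type f -print0 | xargs -0 tar -cjf "$FILENAME"
        echo "backup-point $i made."
        cleanup_unneeded
        exit
    fi
done
echo '200 points reached! Clean up a bit?'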
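
The duplicate detection is the least obvious part of the script, so here is the pipeline on its own as a small sketch. The function name unique_by_content is made up for this example; it relies on GNU uniq and on md5sum printing a 32-character checksum, two spaces and the file name.

```shell
#!/bin/bash
# Print one file name per distinct content among the given files.
unique_by_content() {
    # sort groups equal checksums together, uniq keeps the first line of
    # each group, and cut strips the checksum and the two spaces after it
    md5sum "$@" | sort | uniq --check-chars=32 | cut -d ' ' -f 3-
}
```

Called as unique_by_content backup/*.tar.bz2, it lists the archives that are kept; every archive not in that list has the same content as a listed one. To go back to an old state, unpack the wanted archive with tar -xjf.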
