I’ve finally cleaned up and released my Irssi irclog merging script. Find it at http://hyse.org/irssi-log-merge/ .
It reads two or more Irssi log directories, usually found in ~/irclogs/ on whichever server you run your Irssi clients on. If you have been using Irssi for a while, perhaps on different computers, you end up with more than one such log directory. If you prefer to keep your logs for later (unspecified) use, then merging them into a single log directory may make things easier.
The problem is that you have overlapping filenames (channels/privmsgs) in some log directories, so just copying them into the same directory will overwrite some of those precious log lines.
Here comes irssi-log-merge.py. Supply it with the two or more different log directories you have lying about, and let it sort their contents chronologically. The finished product is a single file per channel/dcc/query with all your carefully chosen words and insults. Perfect for storing for later or grepping in.
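To give an idea of the merging step, here is a minimal sketch. It is not the code from irssi-log-merge.py: it assumes each log has already been parsed into a chronologically sorted list of (timestamp, text) pairs, and the function name `merge_logs` and the sample lines are made up for illustration.

```python
import heapq
from datetime import datetime


def merge_logs(*logs):
    """Merge already-sorted iterables of (timestamp, text) pairs.

    Hypothetical helper: heapq.merge walks all inputs in parallel and
    always yields the smallest (earliest) entry next, so the result is
    sorted as long as every input is sorted on its own.
    """
    return list(heapq.merge(*logs))


# Two made-up logs for the same channel, from two different machines.
log_a = [
    (datetime(2008, 3, 4, 20, 13), "<alice> hi"),
    (datetime(2008, 3, 4, 20, 15), "<alice> bye"),
]
log_b = [
    (datetime(2008, 3, 4, 20, 14), "<bob> hello"),
]

merged = merge_logs(log_a, log_b)
for stamp, text in merged:
    print(stamp.isoformat(), text)
```

Lines that share a timestamp end up ordered by their text here, which is arbitrary. A real merge would probably want to keep the original order within each file.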
It’s pretty novel when done, but the UTF-8 vs. latin-1 vs. US-ASCII (vs. double-encoded UTF-8, which I call WTF-8) character set issues weren’t much fun to figure out. Now all logs are read to the best of the script’s ability and stored as UTF-8. Also, since Irssi writes dates according to the current locale, there are some nasty hacks in there to make them readable by time.strptime(). I don’t know how well this will work for others, but it works for my logs stored with en_EN and no_NO.
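For the curious, the character set guessing can be sketched roughly as below. This is an illustration of the general idea, not the script's actual code. The function name `decode_line` is hypothetical, and it assumes the double encoding happened via latin-1 (in practice it can also happen via Windows-1252, which this does not handle).

```python
def decode_line(raw: bytes) -> str:
    """Best-effort decoding of one log line (hypothetical helper).

    1. Try UTF-8 (this also covers plain US-ASCII).
    2. If that fails, fall back to latin-1, which accepts any byte.
    3. If UTF-8 worked, check whether the text is really UTF-8 that was
       misread as latin-1 and encoded as UTF-8 once more, and undo that.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")

    try:
        # Reverse one round of double encoding. This only succeeds when
        # every character fits in latin-1 and the resulting bytes are
        # valid UTF-8, which is rare for text that was correct already.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


word = "blåbær"
double_encoded = word.encode("utf-8").decode("latin-1").encode("utf-8")

for raw in (word.encode("utf-8"), word.encode("latin-1"), double_encoded):
    print(decode_line(raw))
```

The repair step is a heuristic: a line that legitimately contains a sequence like "Ã¥" would be "fixed" by mistake, so it trades a small risk of false positives for readable old logs.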
As always, comments are welcome.