Sunday·08·March·2009
How to make identi.ca talk //at 15:08 //by abe
Listeners of yesterday’s episode of Venty’s Hackerfunk radio show on Radio LoRa have already heard it: we made identi.ca talk, and we did it with the help of other microbloggers. (The podcast version of this Hackerfunk episode will be online in a few days, too. I will link it here, and either Venty or I will post it on identi.ca as soon as it’s published.)
A few weeks ago we thought about how we could “show” microblogging on the radio. identi.ca’s Jabber (XMPP) interface gives us real-time access, and so the idea was born to pipe all incoming ‘dents into a speech synthesis system.
Then we tried to figure out which tools would be appropriate. Quite quickly, people on identi.ca as well as on the LUGS IRC (e.g. bones0) pointed us to festival and espeak. We found no support for German in festival, so we went for espeak, although festival would have had the advantage that a festival plugin exists for the popular multiprotocol messenger Pidgin.
The next step was more difficult than expected: how do you do a “tail -f” on incoming XMPP messages? Something like rsstail, just for XMPP. Using the IM-to-IRC gateway Bitlbee (as I do myself) and running “tail -f” (or better, “inotail -f”) on the IRC client’s log file (ii comes to mind for such purposes) would have been an option, but nobody had that idea at the time.
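For the log-file route we didn't take, the “tail -f” part could be sketched in Python roughly like this. It is only a minimal sketch: the function name `follow` and its parameters are my own invention, it polls instead of using inotify, and it does not cope with log rotation or truncation.

```python
import os
import time


def follow(path, poll_interval=0.5, from_start=False):
    """Yield complete lines appended to the file at `path`, roughly like `tail -f`."""
    with open(path, encoding="utf-8", errors="replace") as log:
        if not from_start:
            # Skip what is already in the file, as tail -f does (minus the last lines).
            log.seek(0, os.SEEK_END)
        pending = ""
        while True:
            chunk = log.readline()
            if not chunk:
                # Nothing new yet: wait a little and try again.
                time.sleep(poll_interval)
                continue
            pending += chunk
            if pending.endswith("\n"):
                # Only hand out a line once the writer has finished it.
                yield pending.rstrip("\n")
                pending = ""
```

Each yielded line could then be handed to the speech synthesizer, just like the output of xmpptail.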
And since @deepspawn conjured up xmpptail in less than two hours, we happily took it. xmpptail (tar.gz) is written in Python and uses Twisted Words (Debian package python-twisted-words) as its XMPP library.
I had to patch xmpptail slightly, as follows, to get unbuffered I/O and Unicode support and to remove things we don’t want to hear on the radio, but otherwise it worked more or less out of the box.
--- xmmptail.py 2009-02-25 20:47:48.000000000 +0100
+++ xmpptail.py 2009-03-07 18:48:57.000000000 +0100
@@ -1,4 +1,4 @@
-#!/usr/bin/python
+#!/usr/bin/python -u
 # -*- coding: utf-8 -*-
 # author: Carlos A. Perilla
 # This file is part of Jance bot.
@@ -65,7 +65,8 @@
             body = unicode(e.__str__())
             break
-    print("%s: %s" % (from_id,body))
+#    print("%s: %s" % (from_id,body))
+    print("%s" % (body.encode('utf-8')))
 def authfailedEvent(xmlstream):
@@ -80,9 +81,9 @@
     dprint('Got something: %s -> %s' % (el.name, str(el.attributes)))
 if __name__ == '__main__':
-    print "Starting"
+    #print "Starting"
     execfile('tailconf')
-    print USER_HANDLE
+    #print USER_HANDLE
     me = USER_HANDLE + "/xmpptail"
     myJid = jid.JID(me)
     server = USER_HANDLE[USER_HANDLE.find('@')+1:]
So after configuring xmpptail to use the hackerfunk Jabber account, we successfully ran the following script during the radio show:
./xmpptail.py | while read -r LINE; do
    # skip lines that consist only of the word "empty"
    if [ "$LINE" = "empty" ]; then continue; fi
    echo "$LINE"
    echo "$LINE" | tee -a xmpp-espeak.log | espeak --stdin -v de
done
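For comparison, here is roughly the same loop as a Python sketch. This is not what we ran during the show; the names `dents_to_speak`, `speak` and `main` are made up for this example, and it assumes an `espeak` binary on the PATH that accepts `--stdin` and `-v de`, as in the shell version.

```python
import subprocess
import sys


def dents_to_speak(lines, skip="empty"):
    """Yield stripped lines, dropping those that are exactly the word `skip`."""
    for line in lines:
        text = line.strip()
        if text == skip:
            continue
        yield text


def speak(text, voice="de"):
    """Feed one line to espeak via stdin, like `echo ... | espeak --stdin -v de`."""
    subprocess.run(
        ["espeak", "--stdin", "-v", voice],
        input=text.encode("utf-8"),
        check=False,  # a failing espeak call should not stop the whole pipeline
    )


def main():
    # Append everything that gets spoken to the same log file the shell loop uses.
    with open("xmpp-espeak.log", "a", encoding="utf-8") as log:
        for text in dents_to_speak(sys.stdin):
            print(text, flush=True)
            log.write(text + "\n")
            log.flush()
            speak(text)
```

Saved to a file with a call to `main()` appended at the end, it would take the place of the `while read` loop on the right-hand side of the pipe after `./xmpptail.py`.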
At the end of the show, @rebugger found this howto, which describes in great detail how to get festival working together with the non-free (“non-free” as in DFSG) MBROLA project, which also offers the appropriate files for German. But because of how much work it would take to get this running, I currently prefer to stay with espeak for German speech synthesis.
The next step would be to use mnoGosearch’s mguesser to detect the language of a ‘dent and to run espeak (or whatever text-to-speech system suits the guessed language) with the appropriate options for that language, because otherwise many ‘dents sound really funny. ;-)
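How such a language switch might look, as a rough Python sketch: note that this does not call mguesser at all. The stopword counting in `guess_language` is only a crude stand-in for a real language guesser, and all names here (`STOPWORDS`, `guess_language`, `espeak_args`) are made up for illustration. The only thing taken from the setup above is that espeak selects its voice with `-v` and reads text with `--stdin`.

```python
# Tiny stopword lists: a crude stand-in for a real guesser such as mguesser.
STOPWORDS = {
    "de": {"der", "die", "das", "und", "ist", "nicht", "ich", "ein", "mit"},
    "en": {"the", "and", "is", "not", "of", "to", "with", "this", "that"},
}


def guess_language(text, default="de"):
    """Return the language whose stopwords occur most often, or `default`."""
    words = [word.strip(".,;:!?\"'()") for word in text.lower().split()]
    scores = {
        lang: sum(word in stopwords for word in words)
        for lang, stopwords in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    # Without any stopword hit there is no evidence, so fall back to the default.
    return best if scores[best] > 0 else default


def espeak_args(text, default="de"):
    """Build the espeak command line for the guessed language of `text`."""
    return ["espeak", "--stdin", "-v", guess_language(text, default)]
```

The resulting argument list would then be used instead of the fixed `-v de` when feeding a ‘dent to espeak.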
Update, 15:02: Venty gave the whole system the name “Identibla”.
Tagged as: Bitlbee, bones0, deepspawn, DFSG, espeak, festival, Hackerfunk, identi.ca, identibla, ii, IM, inotail, IRC, Jabber, language detection, LoRa, MBROLA, mguesser, microblogging, mnoGosearch, non-free, Pidgin, pipe, Python, radio, rebugger, speech synthesis, tail, text to speech, tts, Twisted Words, Venty, XMPP, xmpptail