
Sunday·08·March·2009

How to make identi.ca talk //at 15:08 //by abe

from the microblogging-to-speech dept.

Listeners of yesterday’s episode of Venty’s Hackerfunk radio show on Radio LoRa have already heard it: we made identi.ca talk, and we did it with the help of other microbloggers. (The podcast version of this Hackerfunk episode will be online in a few days, too. I will link to it here, and either Venty or I will post it on identi.ca as soon as it’s published.)

A few weeks ago we thought about how we could “show” microblogging on the radio. identi.ca’s Jabber (XMPP) interface gives us real-time access, and so the idea was born to pipe all incoming ‘dents into a speech synthesis system.

Then we tried to figure out which tools would be appropriate. Quite quickly, people on identi.ca as well as on the LUGS IRC channel (e.g. bones0) pointed us to festival and espeak. We found no support for German in festival, so we went for espeak, even though festival would have had the advantage that a festival plugin exists for the popular multiprotocol messenger Pidgin.

The next step was more difficult than expected: how do you get a “tail -f” of incoming XMPP messages? Something like rsstail, just for XMPP. One option would have been to use the IM-to-IRC gateway Bitlbee (as I do myself) and run “tail -f” (or better, “inotail -f”) on the IRC client’s log file (ii comes to mind for such purposes), but nobody had that idea at the time.
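
For illustration, the “tail -f” on a log file mentioned above can be sketched in a few lines of Python. This is only a minimal sketch, not something we used on the show: the function name follow, the polling interval and the log path are assumptions, and it simply polls the file rather than using inotify as inotail does.

```python
import os
import time


def follow(path, poll_interval=0.5):
    """Return a generator yielding lines appended to `path` after this call.

    The file is opened and positioned at its end right away, so only
    lines written later are reported, similar to `tail -f -n 0`.
    """
    handle = open(path, "r", encoding="utf-8", errors="replace")
    handle.seek(0, os.SEEK_END)

    def lines():
        with handle:
            while True:
                line = handle.readline()
                if line:
                    yield line.rstrip("\n")
                else:
                    # Nothing new yet: wait a little and try again.
                    time.sleep(poll_interval)

    return lines()
```

A loop such as `for line in follow("irc.log"): print(line, flush=True)` (with a hypothetical log file name) would then behave roughly like “tail -f”. A line that is only partially written when it is read may be reported in two pieces.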

And since @deepspawn conjured up xmpptail in less than two hours, we happily took it. xmpptail (tar.gz) is written in Python and uses Twisted Words (Debian package python-twisted-words) as its XMPP library.

It worked more or less out of the box; I only had to patch xmpptail slightly for unbuffered I/O, for Unicode support, and to remove things we don’t want to hear on the radio, as follows:

--- xmmptail.py 2009-02-25 20:47:48.000000000 +0100
+++ xmpptail.py 2009-03-07 18:48:57.000000000 +0100
@@ -1,4 +1,4 @@
-#!/usr/bin/python
+#!/usr/bin/python -u
 # -*- coding: utf-8 -*-
 # author: Carlos A. Perilla 
 # This file is part of Jance bot.
@@ -65,7 +65,8 @@
          body = unicode(e.__str__())
          break
 
-    print("%s: %s" % (from_id,body))
+#    print("%s: %s" % (from_id,body))
+    print("%s" % (body.encode('utf-8')))
 
 
 def authfailedEvent(xmlstream):
@@ -80,9 +81,9 @@
   dprint('Got something: %s -> %s' % (el.name, str(el.attributes)))
 
 if __name__ == '__main__':
-    print "Starting"
+    #print "Starting"
     execfile('tailconf')
-    print USER_HANDLE
+    #print USER_HANDLE
     me = USER_HANDLE + "/xmpptail"
     myJid = jid.JID(me)
     server = USER_HANDLE[USER_HANDLE.find('@')+1:]
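
Why the -u matters: when standard output is a pipe instead of a terminal, Python collects output in larger blocks, so the program reading from the pipe would hear nothing until the buffer fills up. The patch above is for Python 2; as a rough sketch of the same two ideas (explicit UTF-8 encoding and immediate flushing) in Python 3, assuming a hypothetical helper called emit that is not part of xmpptail:

```python
import sys


def emit(body, out=None):
    """Write one message as a UTF-8 encoded line and flush immediately.

    `out` is any binary stream; by default the raw byte stream behind
    standard output is used, which avoids depending on the locale.
    """
    if out is None:
        out = sys.stdout.buffer
    out.write(body.encode("utf-8") + b"\n")
    # Flushing after every message keeps a reader on the other end of
    # a pipe in step with the incoming messages.
    out.flush()
```

Calling `emit("Grüezi aus Zürich")` then prints the line right away, even when the output is piped into another program.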

So after configuring xmpptail to use the hackerfunk Jabber account, we successfully ran the following script during the radio show:

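./xmpptail.py | while IFS= read -r LINE; do
        # Skip lines that consist only of the word "empty"
        if [ "$LINE" = "empty" ]; then
                continue
        fi
        # Show the dent on the terminal, then log it and speak it in German
        printf '%s\n' "$LINE"
        printf '%s\n' "$LINE" | tee -a xmpp-espeak.log | espeak --stdin -v de
done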
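
The same loop could also be written in Python, for example to extend it later. The following is only a sketch and not what ran during the show: the names filter_dents, speak and run are made up for illustration, and it assumes espeak is installed and accepts -v and --stdin as in the shell script. Unlike the shell loop, it also skips blank lines.

```python
import subprocess
import sys


def filter_dents(lines, skip=("empty",)):
    """Yield stripped lines, dropping blank ones and those listed in `skip`."""
    for line in lines:
        text = line.strip()
        if text and text not in skip:
            yield text


def speak(text, voice="de"):
    """Feed one dent to espeak on standard input (assumes espeak is on PATH)."""
    subprocess.run(["espeak", "-v", voice, "--stdin"],
                   input=text.encode("utf-8"), check=False)


def run(stream, logfile="xmpp-espeak.log", speaker=speak):
    """Print, log and speak every dent read from `stream`."""
    with open(logfile, "a", encoding="utf-8") as log:
        for dent in filter_dents(stream):
            print(dent, flush=True)
            log.write(dent + "\n")
            log.flush()
            speaker(dent)
```

Saved together with a final call to `run(sys.stdin)` in a file of your choice, it would take the place of the while loop: pipe the output of ./xmpptail.py into it. The speaker argument exists so that the loop can be tried without any sound output.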

At the end of the show, @rebugger found this howto, which describes in great detail how to get festival working together with the non-free (“non-free” as in DFSG) MBROLA project, which also offers the appropriate files for German. But because of how much work it would be to get this running, I currently prefer to stay with espeak for German speech synthesis.

The next step would be to use mnoGosearch’s mguesser to detect the language of a ‘dent and run espeak (or whatever text-to-speech system is appropriate for the guessed language) with the appropriate options for that language, because otherwise many ‘dents sound really funny. ;-)
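
To give an idea of how such a language switch might look, here is a deliberately naive sketch. It does not use mguesser at all: the tiny stop-word lists and the names guess_voice and espeak_args are assumptions made purely for illustration, and a real setup would replace guess_voice with a call to a proper language guesser.

```python
import re

# Very small stop-word lists, keyed by espeak voice name.
# A stand-in for a real language guesser, not a serious detector.
STOPWORDS = {
    "de": {"der", "die", "das", "und", "ist", "nicht", "ich", "ein", "mit", "auf"},
    "en": {"the", "and", "is", "not", "you", "of", "to", "with", "this", "that"},
}


def guess_voice(text, default="de"):
    """Pick the voice whose stop words occur most often in `text`."""
    words = re.findall(r"\w+", text.lower())
    scores = {voice: sum(word in stop for word in words)
              for voice, stop in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    # Fall back to the default voice if no stop word matched at all.
    return best if scores[best] > 0 else default


def espeak_args(text):
    """Build the espeak command line for the guessed voice."""
    return ["espeak", "-v", guess_voice(text), "--stdin"]
```

The resulting argument list can be passed to subprocess.run together with the ‘dent as input, so that each ‘dent is spoken with the voice guessed for it.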

Update, 15:02: Venty gave the whole system the name “Identibla”.

About...

This is the blog or weblog of Axel Stefan Beckert (aka abe or XTaran), who thought he would never start blogging... (He also once thought that there was no reason to switch to this new ugly Netscape thing because Mosaïc works fine. That was about 1996.) Well, times change...

He was born in 1975 in Villingen-Schwenningen, took his Abitur in Schwäbisch Hall, studied Computer Science with a minor in Biology at the University of Saarland in Saarbrücken (Germany) and now lives in Zürich (Switzerland), working at the IT Support Group (ISG) of the Department of Physics at ETH Zurich.

Times are CET or CEST respectively (that is, GMT +0100 or +0200).


This content is licensed under a Creative Commons License (SA 3.0 DE). Some rights reserved.