Stoppt die Vorratsdatenspeicherung! Jetzt klicken &handeln! Willst du auch an der Aktion teilnehmen? Hier findest du alle relevanten Infos und Materialien:
Jump to menu and information about this site.


deepgrep: grep nested archives with one command //at 02:00 //by abe

from the grep-revisited dept.

Several months ago, I wrote about grep everything and listed grep-like tools which can grep through compressed files or specific data formats. The blog posting sparked several magazine articles and talks by Frank Hofmann and me.

Frank recently noticed that we though missed one more or less mighty tool so far. We missed it, because it’s mostly unknown, undocumented and hidden behind a package name which doesn’t suggest a real recursive “grep everything”:


deepgrep is part of the Debian package strigi-utils, a package which contains utilities related to the KDE desktop search Strigi.

deepgrep especially eases the searching through tar balls, even nested ones, but can also search through zip files and documents (which are actually zip files).

deepgrep seems to support at least the following archive and compression formats:

  • tar
  • ar, and hence deb
  • rpm (but not cpio)
  • gzip/gz
  • bzip2/bz2
  • zip, and hence jar/war and documents
  • MIME messages (i.e. files attached to e-mails)

A search in an archive which is deeply nested looks like this:

$ deepgrep bar

deepgrep though neither seems to support any LZMA based compression (lzma, xz, lzip, 7z), nor does it support lzop, rzip, compress (.Z suffix), cab, cpio, xar, or rar.

Further current drawbacks of deepgrep:

  • Nearly no commandline options, especially none of the common grep options
  • No man-page or other documentation
  • Exit code not related to search results, you have to check the output to see if something has been found


If you just need the file names of the files in nested archives, the package also contains the tool deepfind which does nothing else than to list all files and directories in a given set of archives or directories:

$ deepfind

As with deepgrep, deepfind does not implement any common options of it’s normal sister tool find.

[The following part has been added on 17-Nov-2012]

As with deepgrep, it also doesn’t seem to support any of the more modern or more exotic compression formats, i.e. it fails on modern debian binary packages which use xz compression on the data part:

deepfind xulrunner-18.0_18.0\~a2+20121109042012-1_amd64.deb

[End of part added at 17-Nov-2012]


The package strigi-utils doesn’t pull in the complete Strigi framework (i.e. no daemon), just a few libraries (libstreams, libstreamanalyzer, and libclucene). On Wheezy it also pulls in some audio/video decoding libraries which may make some server administrators less happy.


Both tools are quite limited to some basic use cases, but can be worth a fortune if you have to work with nested archives. Nevertheless the claim in the Debian package description of strigi-utils that they’re “enhanced” versions of their well known counterparts is IMHO disproportionate.

Most of the missing features and documentation can be explained by the primary purpose of these tools: Being backend for desktop searches. I guess, there wasn’t much need for proper commandline usage yet. Until now. ;-)

And yes, I was curious enough to let deepfind have a look at (the one from SecurityFocus, unzip seems not able to unpack from due a missing version compatibility) and since it just traverses the archive sequentially, it has no problem with that, needing just about 5 MB of RAM and a lot of time:

deepfind  11644.12s user 303.89s system 97% cpu 3:24:02.46 total

I though won’t try deepgrep on ;-)

Tag Cloud

2CV, aha, Apache, APT, aptitude, ASUS, Automobiles, autossh, Berlin, bijou, Blogging, Blosxom, Blosxom Plugin, Browser, BSD, CDU, Chemnitz, Citroën, CLI, CLT, Conkeror, CX, deb, Debian, Doofe Parteien, E-Mail, eBay, EeePC, Emacs, Epiphany, Etch, ETH Zürich, Events, Experimental, Firefox, Fläsch, FreeBSD, FVWM, Galeon, Gecko, git, GitHub, GNOME, GNU, GNU Coreutils, GNU Screen, Google, GPL, grep, grml, gzip, Hackerfunk, Hacks, Hardware, Heise, HTML,, IRC, irssi, Jabber, JavaShit, Kazehakase, Lenny, Liferea, Linux, LinuxTag, LUGS, Lynx, maol, Meme, Microsoft, Mozilla, Music, mutt, Myon, München, nemo, Nokia, nuggets, Open Source, Opera, packaging, Pentium I, Perl, Planet Debian, Planet Symlink, Quiz, Rant, ratpoison, Religion, RIP, Sarcasm, Sarge, Schweiz, screen, Shell, Sid, Spam, Squeeze, SSH, Stöckchen, SuSE, Symlink, Symlink-Artikel, Tagging, Talk, taz, Text Mode, ThinkPad, Ubuntu, USA, USB, UUUCO, UUUT, VCFe, Ventilator, Vintage, Wahlen, Wheezy, Wikipedia, Windows, WML, Woody, WTF, X, Xen, zsh, Zürich, ÖPNV


Mo Tu We Th Fr Sa Su

Tattletale Statistics

Blog postings by posting time
Blog posting times this month


Advanced Search


Recent Postings

0 most recent of 0 postings total shown.

Recent Comments

Hackergotchi of Axel Beckert


Debian GNU/Linux is my favourite Linux distribution, being stable, flexible, consistent and having a great community. Although I'm not the biggest bug report writer, I try to contribute by staffing the Debian booth at events, carrying the necessary hardware there or even organising the whole booth.

RSS Feeds

Identity Archipelago

Picture Gallery

Button Futility

Valid XHTML Valid CSS
Valid RSS Any Browser
This content is licensed under a Creative Commons License (SA 3.0 DE). Some rights reserved. Hacker Emblem
Get Mozilla Firefox! Powered by Linux!
Typed with GNU Emacs Listed at Tux Mobil
XFN Friendly Button Maker


People I know personally

Other blogs I like or read

Independent News

Interesting Planets

Web comics I like and read

Stalled Web comics I liked

Blogging Software

Blosxom Plugins I use

Bedside Reading

Just read

  • Bastian Sick: Der Dativ ist dem Genitiv sein Tod (Teile 1-3)
  • Neil Gaiman and Terry Pratchett: Good Omens (borrowed from Ermel)

Currently Reading

  • Douglas R. Hofstadter: Gödel, Escher, Bach
  • Neil Gaiman: Keine Panik (borrowed from Ermel)

Yet to read

  • Neil Stephenson: Cryptonomicon (borrowed from Ermel)

Always a good snack

  • Wolfgang Stoffels: Lokomotivbau und Dampftechnik (borrowed from Ermel)
  • Beverly Cole: Trains — The Early Years (getty images)