Tuesday·05·June·2012
Automatically hardlinking duplicate files under /usr/share/doc with APT //at 20:43 //by abe
On my everyday netbook (a very reliable first generation ASUS EeePC 701 4G) the disk (4 GB as the product name suggests :-) is nearly always close to full.
TL;DWTR? Jump directly to the HowTo. :-)
So I came up with a few techniques to save some more disk space. Installing localepurge was one of the earliest. Another one was to implement aptitude filters to do interactively what deborphan does non-interactively. Yet another one is to use du and friends a lot – ncdu has definitely become my favourite du-like tool by now.
Using du and friends I often noticed how much disk space /usr/share/doc takes up. But since I value the contents of /usr/share/doc a lot, I condemn how Nokia solved that on the N900: They let APT delete all files and directories under /usr/share/doc (including the copyright files!) via some package named docpurge. I also dislike Ubuntu’s “solution” to truncate the shipped changelog files (you can still get the remainder of the files on the web somewhere) as they’re an important source of information for me.
So when aptitude showed me that some package suddenly wanted to use up considerably more disk space, I noticed that the new package version included the upstream changelog twice. So I started searching for duplicate files under /usr/share/doc.
There are quite a few tools to find duplicate files in Debian. hardlink seemed most appropriate for this case.
First I just looked for duplicate files per package. Even on the less than four gigabytes of installation on my EeePC, this turned up nine packages which shipped at least one file twice.
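A per-package search like this can be approximated with standard tools. The following is only a sketch and not the command used for the numbers above: the function name list_doc_dupes is made up for illustration, and it assumes md5sum prints the checksum as the first field of each line.

```shell
#!/bin/sh
# list_doc_dupes DIR (hypothetical helper): print every file below DIR
# whose content is identical to at least one other file below DIR.
list_doc_dupes() {
    # Checksum all regular files, sort so equal checksums end up adjacent,
    # then let awk print only lines whose checksum occurs more than once.
    find "$1" -type f -exec md5sum {} + | sort | awk '
        { sum = $1 }
        sum == prev {
            if (!shown) print prevline   # first member of the group
            print                        # current member of the group
            shown = 1
            next
        }
        { prev = sum; prevline = $0; shown = 0 }'
}
```

Called once per package directory, for example with `for d in /usr/share/doc/*/; do list_doc_dupes "$d"; done`, it only reports duplicates that live within the same package.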
As recommended, I opted for a corresponding Lintian check instead (see bugs). Niels Thykier kindly implemented such a check in Lintian, and its findings are reported as the tags “duplicate-changelog-files” (Severity: normal, from Lintian 2.5.2 on) and “duplicate-files” (Severity: minor, experimental, from Lintian 2.5.0 on).
Nevertheless, some source packages generate several binary packages and all of them (of course) ship the same, in some cases quite large (Debian) changelog file. So I found myself running hardlink /usr/share/doc now and then to gain some more free disk space. But as I run Sid and package upgrades happen more than daily, I came to the conclusion that I should run this command more or less after each aptitude run, i.e. automatically.
Taking localepurge’s APT hook as an example, I added the following content as /etc/apt/apt.conf.d/98-hardlink-doc to my system:
// Hardlink identical docs, changelogs, copyrights, examples, etc
DPkg
{
  Post-Invoke {"if [ -x /usr/bin/hardlink ]; then /usr/bin/hardlink -t /usr/share/doc; else exit 0; fi";};
};
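The guard inside the hook can also be tried by hand before wiring it into APT. The sketch below wraps the same logic in a shell function; the function name hardlink_docs and its two optional parameters are assumptions added here so that a different binary or directory can be passed for testing. As far as the hardlink manual page goes, -t makes it ignore differing modification times.

```shell
#!/bin/sh
# hardlink_docs [BINARY] [DIRECTORY] (hypothetical helper)
# Mirrors the Post-Invoke command: run hardlink only if it is installed,
# otherwise succeed silently so that dpkg runs are never broken by the hook.
hardlink_docs() {
    bin=${1:-/usr/bin/hardlink}
    dir=${2:-/usr/share/doc}
    if [ -x "$bin" ]; then
        "$bin" -t "$dir"
    else
        # The hook uses "exit 0" here; inside a function "return 0"
        # has the same effect without leaving the calling shell.
        return 0
    fi
}
```

Running `hardlink_docs` without arguments as root does the same as the hook. On a copy of the directory it can be tried without touching the real system.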
So now installing a package which contains duplicate files looks like this:
~ # aptitude install perl-tk
The following NEW packages will be installed:
  perl-tk
0 packages upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 2,522 kB of archives. After unpacking 6,783 kB will be used.
Get: 1 http://ftp.ch.debian.org/debian/ sid/main perl-tk i386 1:804.029-1.2 [2,522 kB]
Fetched 2,522 kB in 1s (1,287 kB/s)
Selecting previously unselected package perl-tk.
(Reading database ... 121849 files and directories currently installed.)
Unpacking perl-tk (from .../perl-tk_1%3a804.029-1.2_i386.deb) ...
Processing triggers for man-db ...
Setting up perl-tk (1:804.029-1.2) ...
Mode: real
Files: 15423
Linked: 3 files
Compared: 14724 files
Saved: 7.29 KiB
Duration: 4.03 seconds
localepurge: Disk space freed in /usr/share/locale: 0 KiB
localepurge: Disk space freed in /usr/share/man: 0 KiB
localepurge: Disk space freed in /usr/share/gnome/help: 0 KiB
localepurge: Disk space freed in /usr/share/omf: 0 KiB
Total disk space freed by localepurge: 0 KiB
Sure, that wasn’t the most space saving example, but on some installations I saved around 100 MB of disk space that way – and I still haven’t found a case where this caused unwanted damage. (Use this advice at your own risk, though. Pointers to potential problems welcome. :-)
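To check what has been linked so far, find's -links test can list files that have more than one name. This is a small sketch only; the function name count_hardlinked is invented for this example.

```shell
#!/bin/sh
# count_hardlinked DIR (hypothetical helper): count the file names below
# DIR that refer to an inode with more than one hard link.
count_hardlinked() {
    # -links +1 matches files whose link count is greater than one;
    # tr strips the padding some wc implementations add.
    find "$1" -type f -links +1 | wc -l | tr -d ' '
}
```

Note that `count_hardlinked /usr/share/doc` counts names, not inodes, so two files linked together show up as two.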
Tagged as: APT, aptitude, ASUS, changelog, docpurge, du, duff, duplicate, duplicates, EeePC, hardlink, HowTo, Lintian, localepurge, N900, ncdu, nemo, Netbook, Nokia, recursive, Ubuntu