Stoppt die Vorratsdatenspeicherung! Jetzt klicken &handeln! Willst du auch an der Aktion teilnehmen? Hier findest du alle relevanten Infos und Materialien:
Jump to menu and information about this site.

Wednesday·02·October·2013

How to make wget honour Content-Disposition headers //at 16:12 //by abe

from the DWIM dept.

Download links often point to CGI scripts which actually generate (or just fetch, i.e. proxy) the actual file to be downloaded, e.g. URLs like http://www.example.com/download.cgi?file=foobar.txt.

Most of such CGI scripts send the real file name in the Content-Disposition header as specified in the MIME Specification.

All browsers I know (well, at least those I use regularily :-) handle that perfectly and propose the file name sent in the Content-Disposition header as file name for saving the downloaded name which is usually exactly what I want.

All browsers do that, …, just not my favourite commandline download tool GNU Wget … Downloading the above URL with wget would look like this with default settings:

$ wget 'http://www.example.com/download.cgi?file=foobar.txt'
--2013-10-02 16:04:16--  http://www.example.com/download.cgi?file=foobar
Resolving www.example.com (www.example.com)... 93.184.216.119, 2606:2800:220:6d:26bf:1447:1097:aa7
Connecting to www.switch.ch (www.example.com)|2606:2800:220:6d:26bf:1447:1097:aa7|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2020 (2.0K) [text/plain]
Saving to: `download.cgi?file=foobar.txt'

100%[============================================>] 2,020       --.-K/s   in 0s

2013-10-02 16:04:24 (12.5 MB/s) - `download.cgi?file=foobar.txt' saved [2020/2020]

Meh!

But luckily Wget can do that, it’s just not enabled by default — because it’s an experimental and possibly buggy feature, at least according to the man page. Well, works for me! :-)

You can easily enabled it by default for either your user or the whole system by placing the following line in your ~/.wgetrc or /etc/wgetrc:

content-disposition = on

Given the CGI script sends an appropriate Content-Disposition header, the above output now looks like this:

$ wget 'http://www.example.com/download.cgi?file=foobar.txt'
--2013-10-02 16:04:16--  http://www.example.com/download.cgi?file=foobar
Resolving www.example.com (www.example.com)... 93.184.216.119, 2606:2800:220:6d:26bf:1447:1097:aa7
Connecting to www.switch.ch (www.example.com)|2606:2800:220:6d:26bf:1447:1097:aa7|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2020 (2.0K) [text/plain]
Saving to: `foobar.txt'

100%[============================================>] 2,020       --.-K/s   in 0s

2013-10-02 16:04:24 (12.5 MB/s) - `foobar.txt' saved [2020/2020]

Now Wget does what I mean!

You can also set this as flag on the commandline, but typing wget --content-disposition … everytime is surely not what I want. ;-)

Tag Cloud

2CV, aha, Apache, aptitude, ASUS, Automobiles, autossh, Berlin, bijou, Blogging, Blosxom, Blosxom Plugin, Browser, BSD, CDU, Chemnitz, Citroën, CLI, CLT, Conkeror, CX, deb, Debian, Doofe Parteien, E-Mail, eBay, EeePC, Emacs, Epiphany, Etch, ETH Zürich, Events, Experimental, Firefox, Fläsch, FreeBSD, FVWM, Galeon, Gecko, git, GitHub, GNOME, GNU, GNU Coreutils, GNU Screen, Google, GPL, grep, grml, gzip, Hacks, Hardware, Heise, HTML, identi.ca, IRC, irssi, Jabber, JavaShit, Kazehakase, Lenny, Liferea, Linux, LinuxTag, LUGS, Lynx, maol, Meme, Microsoft, Mozilla, Music, mutt, Myon, München, nemo, Nokia, nuggets, Open Source, Opera, Pentium I, Perl, Planet Debian, Planet Symlink, Quiz, Rant, ratpoison, Religion, RIP, Sarcasm, Sarge, Schweiz, screen, Shell, Sid, Spam, Squeeze, SSH, Stöckchen, SuSE, Symlink, Symlink-Artikel, Tagging, Talk, taz, Text Mode, ThinkPad, Ubuntu, USA, USB, UUUCO, UUUT, VCFe, Ventilator, Vintage, Wahlen, Wheezy, Wikipedia, Windows, WML, Woody, WTF, X, Xen, zsh, Zürich, ÖPNV

Calendar

 2013 
Months
Oct
 October 
Mo Tu We Th Fr Sa Su
  2
     

Tattletale Statistics

Blog postings by posting time
Blog posting times this month



Search


Advanced Search


Categories


Recent Postings

0 most recent of 0 postings total shown.


Recent Comments

Hackergotchi of Axel Beckert

About...

This is the blog or weblog of Axel Stefan Beckert (aka abe or XTaran) who thought, he would never start blogging... (He also once thought, that there is no reason to switch to this new ugly Netscape thing because Mosaïc works fine. That was about 1996.) Well, times change...

He was born 1975 at Villingen-Schwenningen, made his Abitur at Schwäbisch Hall, studied Computer Science with minor Biology at University of Saarland at Saarbrücken (Germany) and now lives in Zürich (Switzerland), working at the IT Support Group (ISG) of the Departement of Physics at ETH Zurich.

Links to internal pages are orange, links to related pages are blue, links to external resources are green and links to Wikipedia articles, Internet Movie Database (IMDb) entries or similar resources are bordeaux. Times are CET respective CEST (which means GMT +0100 respective +0200).


RSS Feeds


Identity Archipelago


Picture Gallery


Button Futility

Valid XHTML Valid CSS
Valid RSS Any Browser
GeoURL
This content is licensed under a Creative Commons License (SA 3.0 DE). Some rights reserved. Hacker Emblem
Get Mozilla Firefox! Powered by Linux!
Typed with GNU Emacs Listed at Tux Mobil
XFN Friendly Button Maker

Blogroll

Blog or not?


People I know personally


Other blogs I like or read


Independent News


Interesting Planets


Web comics I like and read

Stalled Web comics I liked


Blogging Software

Blosxom Plugins I use

Bedside Reading

Just read

  • Bastian Sick: Der Dativ ist dem Genitiv sein Tod (Teile 1-3)
  • Neil Gaiman and Terry Pratchett: Good Omens (borrowed from Ermel)

Currently Reading

  • Douglas R. Hofstadter: Gödel, Escher, Bach
  • Neil Gaiman: Keine Panik (borrowed from Ermel)

Yet to read

  • Neil Stephenson: Cryptonomicon (borrowed from Ermel)

Always a good snack

  • Wolfgang Stoffels: Lokomotivbau und Dampftechnik (borrowed from Ermel)
  • Beverly Cole: Trains — The Early Years (getty images)

Postponed