GNU Wget is a nice tool for downloading resources from the internet. The basic usage is wget followed by the URL you want to fetch.
The power of wget is that you can download sites recursively, meaning you also get all pages (and images and other data) linked from the front page:
wget -r http://linuxreviews.org/
But many sites do not want you to download their entire site. To prevent this, they check how browsers identify themselves. Many sites refuse the connection or send a blank page if they detect that you are not using a web browser. You might get a message like:
Sorry, but the download manager you are using to view this site is not supported. We do not support use of such download managers as flashget, go!zilla, or getright
There is a very handy -U option for sites like this. Use -U followed by a browser name, for example -U Mozilla, to tell the site you are using some commonly accepted browser:
wget -r -p -U Mozilla http://www.stupidsite.com/restricedplace.html
A website owner will probably get upset if you attempt to download the entire site using a simple recursive wget command. However, the owner is far less likely to notice you if you limit the download transfer rate and pause between fetching files.
To reduce the risk of being manually added to a blacklist, the most important command line options are --wait and --limit-rate.
To pause 20 seconds between retrievals, add --wait=20, and to limit the download rate use something like --limit-rate=20K. This option defaults to bytes per second; add K to set KB/s.
wget --wait=20 --limit-rate=20K -r -p -U Mozilla http://www.stupidsite.com/restricedplace.html
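If you use these options often, a small shell function can assemble the command for you. The following is only a sketch: polite_wget_cmd is a hypothetical name, the defaults of 20 seconds and 20K are simply the values used above, and the function prints the command instead of running it so you can check it first.

```shell
#!/bin/sh
# Hypothetical helper (not part of wget): prints a "polite" recursive
# wget command line. Nothing is downloaded by this script.
#   $1 = URL to fetch
#   $2 = seconds to wait between retrievals (default: 20)
#   $3 = download rate limit (default: 20K, i.e. 20 KB/s)
polite_wget_cmd() {
    url=$1
    wait_seconds=${2:-20}
    rate=${3:-20K}
    printf 'wget --wait=%s --limit-rate=%s -r -p -U Mozilla %s\n' \
        "$wait_seconds" "$rate" "$url"
}

# With the defaults this prints the same command as shown above.
polite_wget_cmd http://www.stupidsite.com/restricedplace.html
```

Copy the printed line into your terminal once it looks right. Pass a second and third argument to change the pause and the rate limit.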
A very handy option that guarantees wget will not download anything from the folders above the folder you want to acquire is --no-parent.
Use this to make sure wget does not fetch more than it needs to if you just want to download the files in a folder.
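As a rough illustration, the wget option that keeps a recursive download from climbing into parent folders is --no-parent. The sketch below shows the command in a comment and then imitates the rule with a small function. The start folder is just an example URL, stays_below is a hypothetical name, and the prefix check is a simplification of what wget really does.

```shell
#!/bin/sh
# Folder to download (example URL, chosen for illustration).
start="http://linuxreviews.org/quicktips/"

# The actual command; commented out so this sketch runs without a network.
# wget -r --no-parent "$start"

# Illustration only: a link is followed when its URL begins with the
# start folder, and skipped when it lies above or outside that folder.
stays_below() {
    case "$1" in
        "$start"*) echo "fetch" ;;
        *)         echo "skip"  ;;
    esac
}

stays_below "http://linuxreviews.org/quicktips/wget/index.html"
stays_below "http://linuxreviews.org/index.html"
```

The first link is inside the start folder and would be fetched; the second is in the parent folder and would be skipped.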
The original version of this how-to is available at http://linuxreviews.org/quicktips/wget/wget.en.pdf
Copyright (c) 2000-2004 Øyvind Sæther. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".
It’s still useful.
it's awesome i like it ;)
@troyM - GNU Wget is a command line utility that downloads files. By using only the terminal, the best browsers are Links2 and ELinks.
Also, you should know that GNU Wget can work in the background, even while the user is not logged on. This means that you can start a retrieval and disconnect from the system while wget finishes the download. By contrast, most web browsers require the user's constant presence.
Alexio, what browser(s) utilize this GNU Wget best? Thanks for the tutorial.
Pretty awesome, this helped me a lot, ty good sir :)