How to Use the Wget Command in Crontab on Ubuntu
The wget command is a powerful utility for downloading files from the web, making it an essential tool for users of all experience levels. This command-line program is widely used for its versatility and robustness, supporting HTTP, HTTPS, and FTP protocols. In this comprehensive guide, we will explore how to use the wget command on Ubuntu, including basic usage, advanced features, and practical examples.
Introduction to Wget
Wget, whose name comes from “World Wide Web” and “get,” is a free software package for retrieving files using HTTP, HTTPS, and FTP. It is designed for robustness over slow or unstable network connections. Wget downloads files non-interactively, which means it can run in the background and is well suited to scripts and cron jobs.
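As a quick illustration of the cron use case, the crontab entry below (added with crontab -e) would fetch a file every night at 2:00 AM. The URL and download directory are placeholders; -q silences output and -P sets the target directory, both covered later in this guide:
# m h dom mon dow   command
0 2 * * * /usr/bin/wget -q -P /home/user/downloads https://example.com/file.txt
Because cron runs jobs with a minimal environment, using the absolute path to Wget (find it with which wget) avoids PATH-related surprises.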
Installing Wget on Ubuntu
Wget comes pre-installed on many Linux distributions, including Ubuntu. To check whether it is installed, open a terminal and type:
wget --version
If Wget is not installed, you can install it using the package manager:
sudo apt update
sudo apt install wget
Basic Usage of Wget
The basic syntax of the wget command is:
wget [options] [URL]
To download a file, simply provide the URL. For example:
wget https://example.com/file.txt
This command downloads file.txt to your current directory.
Advanced Usage of Wget
Wget provides a variety of options to customize your download experience. Here are some of the most useful and advanced features:
Downloading to a Specific Directory
By default, Wget saves files to the current directory. To specify a different directory, use the -P option:
wget -P /path/to/directory https://example.com/file.txt
Resuming Downloads
If a download is interrupted, you can resume it using the -c option:
wget -c https://example.com/file.txt
This is especially useful for downloading large files.
Downloading Multiple Files
To download multiple files, you can use a text file containing a list of URLs. Create a file named urls.txt with one URL per line, then use the -i option:
wget -i urls.txt
Limiting Download Speed
To limit the download speed, use the --limit-rate option:
wget --limit-rate=100k https://example.com/file.txt
This limits the download speed to 100 KB/s.
Background Downloads
To download files in the background, use the -b option:
wget -b https://example.com/file.txt
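When started with -b, Wget detaches from the terminal and, by default, writes its progress to a file named wget-log (or wget-log.1 and so on if that name is already taken) in the current directory. You can follow it with:
tail -f wget-log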
Recursive Downloads
Wget can download files recursively. To download an entire website, use the -r option:
wget -r https://example.com
You can limit the depth of recursion with the -l option:
wget -r -l 2 https://example.com
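Recursive downloads can easily pull in far more than intended, so they are often constrained. The options below are standard Wget flags; the URL and file extensions are only illustrative:
wget -r -l 2 --no-parent -A pdf,zip https://example.com/docs/
Here --no-parent prevents Wget from ascending above the starting directory, and -A limits downloads to the listed file extensions.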
Mirroring Websites
To create a local mirror of a website, use the -m option:
wget -m https://example.com
This option enables recursion and time-stamping, among other settings.
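For reference, the Wget manual describes -m as roughly equivalent to enabling this combination of options:
wget -r -N -l inf --no-remove-listing https://example.com
The -N (time-stamping) flag means that on repeated runs, only files that have changed on the server are downloaded again.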
Handling Authentication
Some websites require authentication to access certain files. Wget supports basic HTTP authentication:
wget --user=username --password=password https://example.com/secure-file.txt
For more complex authentication scenarios, you might need to handle cookies. First, save the cookies from your browser to a file, then use the --load-cookies option:
wget --load-cookies=cookies.txt https://example.com/secure-file.txt
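Alternatively, Wget can sometimes obtain the cookies itself by submitting the login form and saving the resulting session cookies. The login URL and form field names below are assumptions about the site and will differ in practice:
wget --save-cookies=cookies.txt --keep-session-cookies --post-data="username=myuser&password=mypass" https://example.com/login
wget --load-cookies=cookies.txt https://example.com/secure-file.txt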
Working with FTP
Wget also supports FTP downloads. To download a file from an FTP server:
wget ftp://example.com/file.txt
If the FTP server requires authentication, provide the username and password:
wget --ftp-user=username --ftp-password=password ftp://example.com/file.txt
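Wget also accepts credentials embedded directly in the URL, although they will then appear in your shell history and process list; the username and password here are placeholders:
wget ftp://username:password@example.com/file.txt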
Setting User Agent
Some servers might block non-browser user agents. You can change the user agent string to mimic a web browser:
wget --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3" https://example.com
Handling Redirects
Wget automatically follows redirects. However, you can limit the number of redirects with the --max-redirect option:
wget --max-redirect=5 https://example.com
Timeout and Retry Options
To set a timeout for your downloads, use the -T option:
wget -T 30 https://example.com/file.txt
This sets a timeout of 30 seconds. If a download fails, you can specify the number of retries using the --tries option:
wget --tries=5 https://example.com/file.txt
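For particularly unreliable servers, the timeout and retry options can be combined with Wget's backoff-related flags; the values below are only examples:
wget -T 30 --tries=5 --waitretry=10 --retry-connrefused https://example.com/file.txt
Here --waitretry spaces the retries out, backing off by up to 10 seconds between attempts, and --retry-connrefused treats a refused connection as a transient error rather than a fatal one.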
Logging Downloads
To save the output of Wget to a log file, use the -o option:
wget -o download.log https://example.com/file.txt
For quieter output, you can use the -q option:
wget -q https://example.com/file.txt
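For unattended runs such as cron jobs, a useful middle ground is to reduce verbosity and append to a shared log; the log file name here is arbitrary:
wget -nv -a download.log https://example.com/file.txt
The -nv flag prints one short line per file plus any errors, and -a appends to the log instead of overwriting it the way -o does.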
Practical Examples
Downloading a File
wget https://example.com/file.zip
Downloading to a Specific Directory
wget -P /home/user/downloads https://example.com/file.zip
Resuming a Download
wget -c https://example.com/largefile.zip
Downloading Multiple Files
Create files.txt with the following content:
https://example.com/file1.zip
https://example.com/file2.zip
https://example.com/file3.zip
Then run:
wget -i files.txt
Limiting Download Speed
wget --limit-rate=50k https://example.com/file.zip
Downloading in the Background
wget -b https://example.com/largefile.zip
Check the status with:
tail -f wget-log
Recursive Download
wget -r https://example.com/directory/
Mirroring a Website
wget -m https://example.com
Downloading with Authentication
wget --user=myusername --password=mypassword https://example.com/protected-file.zip
Setting a Custom User Agent
wget --user-agent="Mozilla/5.0" https://example.com
Setting a Timeout and Retries
wget -T 30 --tries=5 https://example.com/unstable-file.zip
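Scheduling a Download with Cron
Since the point of running Wget from crontab is unattended operation, combine low-verbosity output with an appended log. The schedule, paths, and URL below are placeholders; add the line with crontab -e:
0 3 * * * /usr/bin/wget -nv -a /home/user/wget.log -P /home/user/downloads https://example.com/backup.zip
This fetches backup.zip into /home/user/downloads at 03:00 every day and appends one status line per run, plus any errors, to wget.log.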
Conclusion
Wget is an invaluable tool for downloading files from the web, offering a plethora of features to handle various download scenarios. Whether you are downloading a single file, mirroring a website, or scripting automated downloads, Wget provides the flexibility and power needed to accomplish the task efficiently.
By mastering the options and capabilities of Wget, you can enhance your productivity and ensure smooth and reliable file retrievals from the web. This guide has covered the essentials and advanced usage of Wget, making it a comprehensive resource for anyone looking to leverage this powerful tool on Ubuntu.