Using curl and wget to Download Files
1. Introduction to curl
Ever been in a situation during a job interview where they ask you to download a file from the internet using the terminal, and you're just staring at the screen, realizing you forgot how to do it? Today, we'll learn how to use curl and wget, which will become your trusty companions for working with network data.
These tools let you download web pages, fetch files, send HTTP requests, work with APIs, and even perform automation scripts. We’ll dig into their functionality, advantages, and common use cases.
curl is a command-line tool for transferring data over network protocols. Its main strength lies in its flexibility: it supports over 20 protocols (HTTP, HTTPS, FTP, SCP, and even SMTP). As its developers put it, it's a true "Swiss Army knife" for working with the internet.
Basic curl Syntax
curl [options] URL
In simple terms: you type the curl command, specify the desired address, and enjoy the results. Let's figure out how it works.
Downloading a Web Page
Let’s say you want to download the main page of Google. Here’s how you do it:
curl http://www.google.com
You’ll see the HTML code of the page on your screen. Pretty useful if you want to explore a site’s structure or automate something related to it.
Saving Content to a File
If all that text output is annoying, you can save the result to a file:
curl -o google.html http://www.google.com
The -o (output) flag tells curl that we want to redirect the output to a file. Now the HTML code of the page is saved in google.html. Want a laugh? Hand the file to a friend and say you've downloaded "the entire internet."
Downloading a File
Imagine you need to download a file from the internet (like some .zip file). curl handles it like a champ:
curl -O http://example.com/file.zip
Unlike -o, the -O flag saves the file with the original name specified in the URL. This is handy if you're downloading multiple files from the same source.
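In fact, curl accepts several -O/URL pairs in a single command, so you can grab a batch of files in one go (the file names below are made up):
curl -O http://example.com/file1.zip -O http://example.com/file2.zip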
HTTP Authentication
Sometimes, access to a file or an API resource is protected by a username and password. In this case, use curl with the -u flag:
curl -u username:password http://example.com/private-data
This is especially useful for working with private APIs, like GitHub or Docker Registry.
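As a quick sketch, here's how you could fetch your own GitHub profile, assuming you've already created a personal access token (GitHub accepts the token in place of a password):
curl -u your-username:your-token https://api.github.com/user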
Downloading via API
One of the coolest features of curl is working with APIs. Let's say you need to send a request to a server that returns data in JSON format:
curl -X GET "https://api.exchangerate-api.com/v4/latest/USD"
Here, the -X flag specifies the HTTP request method (GET, POST, DELETE, etc.). It's a gem for automating integration with external services.
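For instance, a POST request with a JSON body usually needs a Content-Type header (-H) plus the payload itself (-d). The endpoint here is just a placeholder:
curl -X POST -H "Content-Type: application/json" -d '{"name": "test"}' https://example.com/api/items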
2. What is wget?
If curl is like a Swiss Army knife, then wget is a bulldozer. Its main purpose is to download files. The key difference from curl is that wget is specifically designed for reliable downloading of large files, and it supports resuming downloads, which is awesome for unstable connections.
Basic Syntax of wget
wget [options] URL
Simple File Download
wget http://example.com/file.zip
This command will download the file and save it with its original name in the current directory. Simple and effective.
Save with a Different Name
If you're not a fan of the original file name, you can give it your own:
wget -O newfile.zip http://example.com/file.zip
Resume Download
Let's say you're downloading a massive file, but your connection drops. No worries: just use the -c (continue) flag:
wget -c http://example.com/largefile.iso
wget will pick up where it left off. This even works days later, as long as the server supports it.
Download an Entire Website
Yep, you can use wget to download an entire website (or a copy of it). Just use the --mirror option:
wget --mirror http://example.com
This command downloads the website while keeping its directory structure intact. Now you have a "mirror" of the website for offline use.
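In practice, --mirror is often combined with a few extra flags: --convert-links rewrites links so they work offline, --page-requisites pulls in images and CSS, and --no-parent keeps wget from wandering above the starting directory:
wget --mirror --convert-links --page-requisites --no-parent http://example.com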
3. Comparison of curl and wget
| Feature | curl | wget |
|---|---|---|
| Support for a large number of protocols | Yes | Only HTTP/HTTPS and FTP |
| Automatic download resumption | No (manual, via -C -) | Yes |
| Working with APIs | Yes | No |
| File management simplicity | Average | Excellent |
| Downloading entire websites | No | Yes |
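To be fair to curl: it can resume interrupted downloads too, just not automatically. The -C - option tells it to work out the resume offset on its own:
curl -C - -O http://example.com/largefile.iso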
Basically, if you need to work with APIs or something specific, go with curl. But if you just need to download files, wget is the better choice.
4. Practical Application
Downloading and Processing a File
Combining wget with our text processing skills:
wget -O data.txt http://example.com/data.txt
grep "keyword" data.txt | awk '{print $2, $4}'
Here, we downloaded a file, filtered lines by a keyword, and extracted the needed columns.
Working with APIs
Downloading exchange rates using curl and finding the needed currency:
curl -s "https://api.exchangerate-api.com/v4/latest/USD" | grep "EUR"
This is useful if you want to build an automated currency exchange system.
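If you have jq installed, you can extract just the number instead of grepping raw JSON. This sketch assumes the API keeps returning its rates inside a rates object:
curl -s "https://api.exchangerate-api.com/v4/latest/USD" | jq '.rates.EUR'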
Automating Updates
Imagine you need to download file updates every day. Here's an example of a simple script:
#!/bin/bash
wget -O updates.zip http://example.com/daily-updates.zip
unzip -o updates.zip -d /path/to/updates
Save the script and add it to cron. Now it'll run automatically. Nice, right?
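Assuming you saved the script as /path/to/update-script.sh and made it executable, a crontab entry (added via crontab -e) to run it every day at 06:00 could look like this:
0 6 * * * /path/to/update-script.sh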
5. Common Mistakes and Features
Error 403 (Forbidden):
This happens if the server requires additional headers (e.g., User-Agent). Fix it like this:
curl -A "Mozilla/5.0" http://example.com
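wget has an equivalent option if you run into the same problem there:
wget --user-agent="Mozilla/5.0" http://example.com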
Redirects:
If the server redirects you to another URL, add the -L flag to curl:
curl -L http://example.com
SSL Errors:
Sometimes wget or curl might complain about SSL certificates. You can disable certificate verification (but it's not safe!):
wget --no-check-certificate https://example.com
curl -k https://example.com
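A safer alternative, if you have the server's CA certificate, is to point the tools at it explicitly (the path below is a placeholder):
curl --cacert /path/to/ca.pem https://example.com
wget --ca-certificate=/path/to/ca.pem https://example.com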
With this powerful toolkit, you're ready to conquer the internet via the terminal. No file or API will escape you now – it's time to download, process, and automate!