Curl

Curl is a command-line tool for transferring data to and from servers, and it is most often used to download files from the internet. It frequently appears in software installation instructions. Curl can also be used to perform basic reconnaissance, such as banner grabbing to determine server software versions.

Learning Objectives

You should be able to:

Use curl to download files from the internet
Use curl for banner grabbing

Curl for Downloading

Connect to your Linux terminal.
Run the following command to ensure that you are in your home directory.

cd ~

Run the following command to make a new temporary directory called dl (download).

mkdir dl

Run the following command to change directories to the new dl directory.

cd dl

Run the following curl command to download google.com's web page.

curl google.com

Note the contents. Some of the text is human readable. You will see the Hypertext Markup Language (HTML) that describes how the page should be displayed.

Curl download of google.com

Notice that the web page says "The document has moved." Google wants to send us to www.google.com instead of letting us access google.com (without the www at the beginning).
Run the following command to display www.google.com in the terminal.

curl www.google.com
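
The "document has moved" reply is an HTTP redirect, and curl does not follow redirects unless it is asked to. As a small sketch of an alternative to typing the www address by hand, the -L option tells curl to follow the redirect itself. The function name follow_status is made up for this example; -s, -L, -o, and -w are standard curl options.

```shell
# follow_status is a hypothetical helper name, not part of curl.
# -s hides the progress meter, -L follows redirects,
# -o /dev/null discards the page body, and -w prints the final
# HTTP status code and the URL that curl ended up at.
follow_status() {
    curl -s -L -o /dev/null -w '%{http_code} %{url_effective}\n' "$1"
}
```

Running follow_status google.com should print a status code followed by the address that the redirect led to. The exact output depends on what the server returns at the time.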

Notice that the web page is much more complete. But there is an issue. The contents are too large to display on one terminal screen.
Run the following command to display the contents page by page by piping the output into the more command.

curl www.google.com | more

Press enter to move down line by line, or the space bar to move down page by page.
The less command is a more advanced version of more. Try it out.

curl www.google.com | less

You can use the arrow keys to move up and down, or press Enter to go down line by line. The space bar pages through the content.
You can search the same way you would in vim. For example, type /goo and press enter to find instances of goo in the content.
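
When curl's output goes into a pipe, as in the more and less steps above, curl also prints a progress meter to the terminal, which can clutter the first screen. The sketch below shows one way to avoid that with the -s (silent) option. The function name page_quietly is invented for illustration, and PAGER is a conventional environment variable that may or may not be set on your system.

```shell
# page_quietly is a hypothetical helper name used for illustration.
# -s hides the progress meter that curl prints when its output
# goes to a pipe. PAGER is a common environment variable naming the
# preferred pager; less is used when it is not set.
page_quietly() {
    curl -s "$1" | ${PAGER:-less}
}
```

For example, page_quietly www.google.com opens the page in less without the progress meter.
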
Run ls to show that no files were created. Curl just displayed the web page in the terminal.
Run the following command to download the www.google.com page to a file.

curl -o google.html www.google.com

The -o option tells curl to output the download to the file google.html.
You can display the file in the terminal with cat, though unless you are familiar with HTML and JavaScript, it will be difficult to read.

cat google.html
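
The download step can be made a little more robust. The following is a minimal sketch, assuming a helper named save_page (an invented name, not a curl feature). It combines -o with -f, which makes curl report an error on HTTP failure responses instead of saving the error page, and -L, which follows redirects.

```shell
# save_page URL FILE: download URL into FILE and report the result.
# save_page is a hypothetical helper name, not a curl feature.
save_page() {
    # -f: fail on HTTP error responses (400 and above)
    # -s: hide the progress meter, -S: still show error messages
    # -L: follow redirects, -o: write the body to the named file
    if curl -fsSL -o "$2" "$1"; then
        echo "saved $1 to $2"
    else
        echo "download of $1 failed" >&2
        return 1
    fi
}
```

For example, save_page google.com google.html followed by ls should show the new file in the dl directory.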

Curl for Banner Grabbing

Banner grabbing is the act of requesting information from a server to discover information about the server's configuration. Ethical hackers (and malicious hackers) use banner grabbing in the reconnaissance phase of an attack. Curl can be used to gather some basic information about the server.

In the terminal, run the following command. The -I option sends a HEAD request, so curl prints only the HTTP response headers (the information that your web browser hides from you because it is not necessary for rendering a web page).

curl -I google.com

One of the most useful pieces of information is the Server header, which shows that google.com is running the "gws" (Google Web Server) web server. This is a custom web server built by Google and run in-house.
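
When checking several sites, it can help to pull out just the Server line instead of reading every header. The sketch below is one possible way to do that. The function names server_header and server_banner are hypothetical, and the approach assumes the server sends a Server header at all, which some sites hide or omit.

```shell
# server_header reads raw HTTP response headers on standard input and
# prints the value of the Server header, if there is one.
# Both function names are hypothetical helpers for this example.
server_header() {
    # HTTP header lines end in CRLF, so strip the carriage returns.
    # Header names are case-insensitive, so match in lowercase.
    tr -d '\r' | awk 'tolower($0) ~ /^server:/ {
        sub(/^[^:]*:[ \t]*/, "")   # drop the "Server:" prefix
        print
        exit
    }'
}

# server_banner HOST: send a HEAD request (-I) and print the Server value.
server_banner() {
    curl -sI "$1" | server_header
}
```

For example, server_banner google.com is expected to print the same server name seen in the full header listing. A site that does not send a Server header produces no output.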

Challenge

Determine what web servers the following websites use:
- microsoft.com
- apple.com
- cnn.com
- foxnews.com
Download your school's web page to a local file on your computer. Open the file in a web browser.
wget is another tool used to download files from the internet. Search online for differences between the two commands. Test them in the Linux terminal.

Reflection

Why would it be useful to download files from a command-line interface?
Should companies try to hide the technologies they use? For example, should they try to block banner grabbing?

Key Terms

Banner Grabbing: A technique used to gather information about a computer system or network service by capturing and analyzing the banners that are returned by services during connection attempts. Banners often contain details such as the software version and operating system, which can be useful for network reconnaissance and vulnerability assessment.
curl (Linux): A command-line tool used to transfer data to or from a server using various protocols, including HTTP, HTTPS, FTP, and more. curl is widely used for downloading files, testing APIs, and performing network requests. It is known for its versatility and ability to handle complex tasks such as authentication, proxy support, and data manipulation.