Requests is a versatile Python HTTP library with many applications. One of them is downloading a file from the Internet given the file's URL.
Installation: First of all, you need to install the requests library. You can install it directly with pip by entering the following command:
pip install requests
Or download it directly from here and install it manually.
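The original snippet for a basic download did not survive; here is a minimal sketch of the approach the article describes. The URL and filename are placeholders, not values from the original.

```python
import requests

# Placeholder URL -- substitute the image source URL copied from your browser.
IMAGE_URL = "https://www.example.com/images/picture.jpg"

def download_file(url, filename):
    """Fetch `url` and write the response body to `filename`."""
    r = requests.get(url)
    r.raise_for_status()      # stop on 4xx/5xx instead of saving an error page
    with open(filename, "wb") as f:
        f.write(r.content)    # r.content holds the raw bytes of the response

# Usage (performs a real network request):
# download_file(IMAGE_URL, "picture.jpg")
```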
This little piece of code above will download the image from the Internet. Now check your local directory (the folder where the script is located) and you will find the image there.
All we need is the URL of the image source. (You can get it by right-clicking on the image and selecting the View Image option.)
Download Large Files
The HTTP response content ( r.content ) is nothing more than a single bytes object in which the file's data is stored. For large files it is therefore impractical to hold all of the data in memory at once. To work around this, we make some changes to our program:
r = requests.get (URL, stream = True)
Setting the stream parameter to True downloads only the response headers and keeps the connection open. This avoids reading the entire content into memory at once for large responses. A fixed-size chunk is loaded on each iteration of r.iter_content.
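The streaming version can be sketched as follows; the 8192-byte chunk size is an arbitrary choice for this example, not a value from the original.

```python
import requests

def download_large_file(url, filename, chunk_size=8192):
    """Stream `url` to disk in fixed-size chunks instead of buffering it all."""
    with requests.get(url, stream=True) as r:  # only headers are fetched here
        r.raise_for_status()
        with open(filename, "wb") as f:
            # Each iteration pulls at most `chunk_size` bytes off the connection,
            # so memory use stays flat no matter how large the file is.
            for chunk in r.iter_content(chunk_size=chunk_size):
                f.write(chunk)
```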
Here’s an example:
In this example, we are interested in downloading all the video lectures available on this web page. All archives for this lecture are available here. So, first we scrape the web page to extract all the links to the videos, and then we download the videos one by one.
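The scrape-then-download flow can be sketched with the standard library's HTMLParser (the original article may well have used a dedicated scraping library instead; the page URL and the .mp4 extension here are assumptions):

```python
import requests
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Gather the href attribute of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

def extract_video_links(html, base_url, ext=".mp4"):
    """Return absolute URLs of every link on the page ending in `ext`."""
    parser = LinkCollector()
    parser.feed(html)
    return [urljoin(base_url, h) for h in parser.hrefs if h.endswith(ext)]

def download_all(page_url):
    """Scrape `page_url` and stream each linked video into the current
    directory, one file at a time."""
    r = requests.get(page_url)
    r.raise_for_status()
    for url in extract_video_links(r.text, page_url):
        filename = url.rsplit("/", 1)[-1]
        print("Downloading", filename)
        with requests.get(url, stream=True) as video:
            video.raise_for_status()
            with open(filename, "wb") as f:
                for chunk in video.iter_content(chunk_size=8192):
                    f.write(chunk)
```

Splitting the link extraction into its own function keeps the parsing logic testable without a network connection.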
Benefits of using the requests library to download web files:
- You can easily download web directories by recursively crawling the website.
- The method is browser-independent and much faster.
- You can simply scrape the web page to get all the file URLs on it, and hence download all the files with a single command.