Download and extract a tarball in Python

A simple method for downloading a tarball from the internet and then extracting its contents in Python.

This procedure requires the urllib and tarfile modules, both of which are part of Python's standard library.

Firstly, download the file of interest from the internet. In this example, I am targeting a training set of flower images for TensorFlow, distributed as a TGZ file (a gzip-compressed tarball).

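from urllib.request import urlretrieve

# Remote archive to download
url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
# Local path where the downloaded archive will be saved
target = "/home/adamdimech/.keras/datasets/flower_photos.tgz"
# Parent directory of target; it must already exist for the download to succeed
dir = "/home/adamdimech/.keras/datasets/"
# Download the archive from url and save it to target
urlretrieve(url, target)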

The target variable holds the full path of the file to be saved, whilst dir holds its parent directory, which is where the contents of the tarball will be extracted.
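The download step can also be wrapped in a small helper that creates the destination directory first, since urlretrieve raises an error when the parent directory of the target file does not exist. The following is only a minimal sketch: download_file and its parameters are hypothetical names chosen for illustration, and the local file name is assumed to be the last component of the URL path.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve


def download_file(url, dest_dir):
    """Download url into dest_dir and return the path of the saved file.

    Hypothetical helper for illustration; only urlretrieve and urlparse
    come from urllib.
    """
    dest_dir = Path(dest_dir).expanduser()
    # urlretrieve does not create missing directories, so create them first.
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: name the local file after the last component of the URL path.
    filename = Path(urlparse(url).path).name
    target = dest_dir / filename
    urlretrieve(url, str(target))
    return target
```

With the values from the example above, download_file(url, "~/.keras/datasets") would save the archive as flower_photos.tgz inside that directory and return its full path.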

Next, extract the contents of the tarball:

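import tarfile

# Open the downloaded archive; the with block closes it automatically
with tarfile.open(target) as archive:
    # Extract every member of the archive into dir
    archive.extractall(dir)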

In this case, the files will be saved to /home/adamdimech/.keras/datasets/flower_photos, because the archive contains a top-level flower_photos directory.
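Because extractall writes whatever paths the archive contains, it can be worth checking a tarball obtained from the internet before unpacking it. The sketch below shows one possible approach, assuming Python 3.9 or later: extract_tarball is a hypothetical helper (not part of tarfile) that refuses to extract if any member name resolves outside the destination directory, and then reports the top-level entries that were unpacked. It checks member names only, not symbolic link targets; on Python 3.12 and later, the extraction filters built into tarfile (for example filter="data") are an alternative.

```python
import tarfile
from pathlib import Path


def extract_tarball(archive_path, dest_dir):
    """Extract archive_path into dest_dir and return its top-level names.

    Hypothetical helper for illustration, not part of tarfile.
    """
    dest_dir = Path(dest_dir).resolve()
    with tarfile.open(archive_path) as archive:
        members = archive.getmembers()
        for member in members:
            # Reject names such as "../evil.txt" that would land outside
            # dest_dir once extracted.
            member_path = (dest_dir / member.name).resolve()
            if not member_path.is_relative_to(dest_dir):
                raise ValueError(f"unsafe path in archive: {member.name}")
        archive.extractall(dest_dir, members=members)
    # Collect the first path component of every member,
    # e.g. "flower_photos" for the archive used in this article.
    top_level = {Path(m.name).parts[0] for m in members if Path(m.name).parts}
    return sorted(top_level)
```

Called as extract_tarball(target, dir) with the variables defined earlier, this would unpack the archive into the same place as the plain extractall call.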

   
