Chapter #9: Compression and Archiving in Linux
In this chapter, you’ll learn how to use tar in Linux to archive, compress, extract, and back up files using gzip, bzip2, and xz.

A file archiving tool groups a set of files into a single standalone file that we can back up to several types of media, transfer across a network, or send via email. The most frequently used archiving utility in Linux is tar.
Combining an archiving utility with a compression tool reduces the disk space needed to store the same files and information.
The tar Utility
tar bundles a group of files together into a single archive (commonly called a tar file or tarball). The name originally stood for tape archiver, but we can use this tool to archive data to any kind of writable media, not just tapes.
tar is normally used with a compression tool such as gzip, bzip2, or xz to produce a compressed tarball.
Basic Syntax:
tar [options] [path ...]
Where ... represents the expression used to specify which files should be acted upon.
Most Commonly Used tar Commands
Long Option | Abbreviation | Description |
---|---|---|
--create | c | Creates a tar archive |
--concatenate | A | Appends tar files to an archive |
--append | r | Appends files to the end of an archive |
--update | u | Appends files newer than copy in archive |
--diff / --compare | d | Find differences between archive and filesystem |
--file ARCHIVE | f | Use archive file or device ARCHIVE |
--list | t | Lists the contents of a tarball |
--extract / --get | x | Extracts files from an archive |
--delete | (none) | Deletes specified files from archive (uncompressed only) |
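As a rough illustration of the operations in the table, the sketch below creates, appends to, lists, extracts, and deletes from an archive inside a temporary directory. It assumes GNU tar (the --delete operation, for instance, is not available in every tar implementation); the archive name files.tar and the directory restored are made up for the example.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing outside it is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create three small sample files.
for n in 1 2 3; do
    echo "sample data $n" > "file$n"
done

# c = create a new archive, f = name of the archive file to write.
tar -cf files.tar file1 file2

# r = append another file to the end of the existing archive.
tar -rf files.tar file3

# t = list the members currently stored in the archive.
tar -tf files.tar

# x = extract; here the members are unpacked into a separate directory.
mkdir restored
tar -xf files.tar -C restored

# --delete removes a member in place (uncompressed archives only, GNU tar).
tar --delete -f files.tar file2
tar -tf files.tar
```

Note that the single-letter forms are grouped after one dash, with f placed last so that the archive name follows it directly.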
Common tar Operation Modifiers
Long Option | Abbreviation | Description |
---|---|---|
--directory dir | C | Changes to directory dir before performing operations |
--same-permissions | p | Preserves original permissions |
--verbose | v | Lists all files read or extracted. When combined with --list, also shows file size, owner, and timestamps |
--verify | W | Verifies the archive after writing it |
--exclude=pattern | (none) | Excludes files matching pattern from the archive |
--exclude-from=file | X | Excludes files matching the patterns listed in file |
--gzip / --gunzip | z | Processes an archive through gzip |
--bzip2 | j | Processes an archive through bzip2 |
--xz | J | Processes an archive through xz |
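The following sketch shows how a few of these modifiers might be combined in practice. The directory layout (project, logs, out) and the file names are invented for the example, and it assumes gzip is installed alongside tar.

```shell
#!/bin/sh
# Build a small sample tree in a temporary directory (hypothetical names).
workdir=$(mktemp -d)
mkdir -p "$workdir/project/logs" "$workdir/out"
echo "main content" > "$workdir/project/main.txt"
echo "debug noise"  > "$workdir/project/logs/debug.log"

# -z compresses through gzip, -v prints each member as it is added.
# --exclude skips anything matching the pattern.
# -C switches to $workdir first, so member names start with "project/"
# instead of carrying the full temporary path.
tar -czvf "$workdir/project.tar.gz" --exclude='*.log' -C "$workdir" project

# List the compressed archive to confirm what was stored.
tar -tzf "$workdir/project.tar.gz"

# Extract into another directory, keeping the stored permissions (-p).
tar -xzpf "$workdir/project.tar.gz" -C "$workdir/out"
```

Using -C on creation keeps the paths inside the archive relative, which makes the tarball easier to unpack somewhere else.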
The gzip Utility
gzip is the oldest of these three compression tools and generally provides the least compression, while bzip2 provides improved compression. xz is the newest and usually gives the best compression.
The advantage of better compression comes at a price: the time it takes to complete the operation and the system resources used during the process.
Normally, tar files compressed with these utilities have .gz, .bz2, or .xz extensions, respectively.
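To get a feel for the trade-off, one option is to archive the same data with each tool and compare the resulting sizes. The sketch below is only an illustration: sample.txt is a generated, highly repetitive file, so the ratios it produces will not match real-world data, and any compressor that is not installed is skipped.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Generate repetitive sample data (20000 identical lines).
yes "the quick brown fox" | head -n 20000 > sample.txt

# Uncompressed archive as the baseline.
tar -cf sample.tar sample.txt

# Each entry holds: tar flag, file extension, required tool.
for spec in "z gz gzip" "j bz2 bzip2" "J xz xz"; do
    set -- $spec
    flag=$1
    ext=$2
    tool=$3
    if command -v "$tool" >/dev/null 2>&1; then
        # For example, flag "z" expands to: tar -czf sample.tar.gz sample.txt
        tar -c"$flag"f "sample.tar.$ext" sample.txt
    else
        echo "skipping $tool (not installed)"
    fi
done

# Show the size in bytes of every archive that was produced.
wc -c sample.tar*
```

Prefixing each tar command with time would additionally show how long each compressor takes on the same input.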
In the following examples we will be using these files: file1, file2, file3, file4, and file5.