Solve real-world shell scripting problems with over 110 simple but incredibly effective recipes
gzip is a commonly used compression format on GNU/Linux platforms. The gzip, gunzip, and zcat utilities handle gzip-compressed files. gzip compresses individual files only; it cannot bundle directories or multiple files into a single archive, so we first create a tar archive and then compress it with gzip. When several files are given as input, gzip produces a separate compressed (.gz) file for each. Let’s see how to work with gzip.
In order to compress a file with gzip use the following command:
$ gzip filename
$ ls
filename.gz
gzip removes the original file and leaves a compressed file called filename.gz in its place.
Extract a gzip compressed file as follows:
$ gunzip filename.gz
This removes filename.gz and produces the uncompressed file, filename.
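The following is a minimal sketch of that round trip, run in a scratch directory so nothing important is touched. The file name sample.txt and the variable names are placeholders chosen for illustration.

```shell
#!/bin/sh
# Round-trip sketch: compress a throwaway file, then restore it.
set -eu
workdir=$(mktemp -d)                 # scratch directory (hypothetical name)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf 'A test file\n' > sample.txt
cksum_before=$(cksum < sample.txt)

gzip sample.txt                      # replaces sample.txt with sample.txt.gz
after_gzip=$(ls)

gunzip sample.txt.gz                 # replaces sample.txt.gz with sample.txt
after_gunzip=$(ls)
cksum_after=$(cksum < sample.txt)

echo "after gzip:   $after_gzip"
echo "after gunzip: $after_gunzip"
```

Comparing the checksums before and after is a quick way to confirm that the restored file matches the original.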
In order to list out the properties of a compressed file use:
$ gzip -l test.txt.gz
compressed uncompressed ratio uncompressed_name
        35            6 -33.3% test.txt
The gzip command can read a file from stdin and also write a compressed file into stdout.
To read from stdin and write the compressed data to stdout, use:
$ cat file | gzip -c > file.gz
The -c option is used to specify output to stdout.
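As a sketch of how this is typically used, the example below compresses both a redirected file and the output of another command. The file names are placeholders; gzip -dc is used to decompress to stdout for checking.

```shell
#!/bin/sh
# Sketch: gzip as a filter between stdin and stdout.
set -eu
workdir=$(mktemp -d)                 # scratch directory (hypothetical name)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf 'line one\nline two\n' > notes.txt

# Same effect as `cat notes.txt | gzip -c`, without the extra process.
gzip -c < notes.txt > notes.txt.gz

# Any command's output can be compressed on the fly.
seq 1 1000 | gzip -c > numbers.gz

# gzip -dc decompresses to stdout and leaves the .gz file in place.
restored=$(gzip -dc notes.txt.gz)
line_count=$(gzip -dc numbers.gz | wc -l | tr -d ' ')

echo "$restored"
echo "numbers.gz holds $line_count lines"
```

Because the output goes to stdout, the input file notes.txt is not removed.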
We can specify the compression level for gzip. Use the --fast or --best option for the lowest and highest compression levels, respectively.
The gzip command is often used with other commands. It also has advanced options to specify the compression ratio. Let’s see how to work with these features.
We usually use gzip with tarballs. A tarball can be compressed by passing the -z option to the tar command while archiving and extracting.
You can create gzipped tarballs using either of the following methods.
Method 1: archive and compress in one step:
$ tar -czvvf archive.tar.gz [FILES]
Or:
$ tar -cavvf archive.tar.gz [FILES]
The -a option tells tar to choose the compression format from the file extension.
Method 2: first create an uncompressed tarball:
$ tar -cvvf archive.tar [FILES]
Compress it after tarballing as follows:
$ gzip archive.tar
If many files (a few hundred or more) need to be archived in a tarball and compressed, we use Method 2 with a few changes. The problem with passing many files as command-line arguments to tar is that the total length of a command line is limited, so only a limited number of file names fit. To work around this, we can build the tar file by adding files one at a time in a loop with the append option (-r), as follows:
FILE_LIST="file1 file2 file3 file4 file5"
for f in $FILE_LIST;
do
    tar -rvf archive.tar "$f"
done
gzip archive.tar
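The append loop can be sketched end to end as follows. This variant iterates over a shell glob instead of a space-separated string, which is an assumption made here so that file names containing spaces survive. The file names and the count of 50 are placeholders.

```shell
#!/bin/sh
# Sketch: build a tarball one file at a time, then compress it.
set -eu
workdir=$(mktemp -d)                 # scratch directory (hypothetical name)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Create 50 small sample files (hypothetical names).
i=1
while [ "$i" -le 50 ]; do
    printf 'file %s\n' "$i" > "file_$i.txt"
    i=$((i + 1))
done

# One tar invocation per file, so no single command line grows large.
for f in file_*.txt; do
    tar -rf archive.tar "$f"
done

# Compress last: -r appends only to uncompressed archives.
gzip archive.tar
entry_count=$(tar -tzf archive.tar.gz | wc -l | tr -d ' ')
echo "archive.tar.gz contains $entry_count entries"
```

The glob in the for loop is expanded inside the shell, so it is not subject to the argument-length limit that applies when a command is executed.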
In order to extract a gzipped tarball, use the following:
$ tar -xzvvf archive.tar.gz -C extract_directory
Or:
$ tar -xavvf archive.tar.gz -C extract_directory
In the above command, the -a option is used to detect the compression format automatically.
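A minimal sketch of creating, listing, and extracting a gzipped tarball is shown below. The directory and file names are placeholders. The target of -C must already exist, so the sketch creates it first.

```shell
#!/bin/sh
# Sketch: create a gzipped tarball, list it, and extract it elsewhere.
set -eu
workdir=$(mktemp -d)                 # scratch directory (hypothetical name)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

mkdir project
printf 'hello\n' > project/readme.txt

tar -czf archive.tar.gz project      # -z filters the archive through gzip
listing=$(tar -tzf archive.tar.gz)   # -t lists entries without extracting

mkdir -p extract_directory           # tar does not create the -C target
tar -xzf archive.tar.gz -C extract_directory
extracted=$(cat extract_directory/project/readme.txt)

echo "$listing"
echo "extracted content: $extracted"
```

Listing the archive with -t before extracting is a cheap way to check what paths it will create.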
zcat writes the decompressed contents of a .gz file to stdout without extracting it to disk. The .gz file itself is left unchanged, as follows:
$ ls
test.gz
$ zcat test.gz
A test file
# file test contains a line “A test file”
$ ls
test.gz
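One common use of zcat is to search a compressed file without unpacking it. The sketch below assumes a GNU/Linux zcat and uses a made-up log file named app.log; gzip -dc is shown alongside as an equivalent.

```shell
#!/bin/sh
# Sketch: read a compressed file through a pipe without extracting it.
set -eu
workdir=$(mktemp -d)                 # scratch directory (hypothetical name)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf 'ok start\nerror disk full\nok done\n' > app.log
gzip app.log                         # leaves only app.log.gz

# Count matching lines straight from the compressed file.
error_lines=$(zcat app.log.gz | grep -c 'error')

# gzip -dc produces the same stream as zcat.
same_count=$(gzip -dc app.log.gz | grep -c 'error')

echo "lines containing 'error': $error_lines"
```

After the pipeline runs, app.log.gz is still the only file in the directory.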
We can specify the compression level, which ranges from 1 to 9, where 1 gives the least compression but is the fastest, and 9 gives the best compression but is the slowest.
Any level in this range can be specified as follows:
$ gzip -9 test.img
This will compress the file to the maximum.
bzip2 is another compression tool that is very similar to gzip. bzip2 typically produces smaller (more compressed) files than gzip. It is available on most Linux distributions. Let’s see how to use bzip2.
In order to compress with bzip2 use:
$ bzip2 filename
$ ls
filename.bz2
bzip2 removes the original file and leaves a compressed file called filename.bz2 in its place.
Extract a bzipped file as follows:
$ bunzip2 filename.bz2
It will remove filename.bz2 and produce an uncompressed version of filename.
bzip2 can read a file from stdin and also write a compressed file into stdout.
In order to read from stdin and write the compressed data to stdout, use:
$ cat file | bzip2 -c > file.bz2
-c is used to specify output to stdout.
We usually use bzip2 with tarballs. A tarball can be compressed by using the -j option passed to the tar command while archiving and extracting.
Creating a bzipped tarball can be done by using the following methods:
$ tar -cjvvf archive.tar.bz2 [FILES]
Or:
$ tar -cavvf archive.tar.bz2 [FILES]
The -a option tells tar to detect the compression format from the file extension.
Alternatively, first create an uncompressed tarball:
$ tar -cvvf archive.tar [FILES]
Compress it after tarballing:
$ bzip2 archive.tar
If we need to add hundreds of files to the archive, the above commands may fail. To fix that issue, use a loop to append files to the archive one by one using the -r option, and compress the archive afterwards.
Extract a bzipped tarball as follows:
$ tar -xjvvf archive.tar.bz2 -C extract_directory
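In this command, -x extracts the archive, -j selects bzip2 as the compression format, and -C names the directory to extract into.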
Or, you can use the following command:
$ tar -xavvf archive.tar.bz2 -C extract_directory
-a will automatically detect the compression format.
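The bzip2 tarball workflow can be sketched as below, assuming bzip2 is installed alongside tar. The directory and file names are placeholders.

```shell
#!/bin/sh
# Sketch: create, list, and extract a bzip2-compressed tarball.
set -eu
workdir=$(mktemp -d)                 # scratch directory (hypothetical name)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

mkdir docs
printf 'alpha\n' > docs/a.txt
printf 'beta\n' > docs/b.txt

tar -cjf archive.tar.bz2 docs        # -j filters the archive through bzip2
entry_list=$(tar -tjf archive.tar.bz2)

mkdir -p extract_directory           # the -C target must already exist
tar -xjf archive.tar.bz2 -C extract_directory
restored=$(cat extract_directory/docs/b.txt)

echo "$entry_list"
echo "restored: $restored"
```

The only difference from the gzip workflow is the -j option in place of -z.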
bzip2 has several additional options to carry out different functions. Let’s go through a few of them.
By default, bzip2 and bunzip2 remove the input file after producing the output file. We can prevent them from removing the input file by using the -k option.
For example:
$ bunzip2 test.bz2 -k
$ ls
test test.bz2
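As a small sketch of -k in both directions, the example below keeps the input when compressing and again when decompressing. The file name test mirrors the example above; the other names are placeholders.

```shell
#!/bin/sh
# Sketch: keep the input file with -k when compressing and decompressing.
set -eu
workdir=$(mktemp -d)                 # scratch directory (hypothetical name)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf 'A test file\n' > test

bzip2 -k test                        # creates test.bz2, keeps test
kept_after_compress=$(ls | tr '\n' ' ')

rm test                              # simulate having only the archive
bunzip2 -k test.bz2                  # recreates test, keeps test.bz2
kept_after_extract=$(ls | tr '\n' ' ')

echo "after bzip2 -k:   $kept_after_compress"
echo "after bunzip2 -k: $kept_after_extract"
```

Without -k, each step would leave only one of the two files behind.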
We can specify the compression level, which ranges from 1 to 9 (where 1 gives the least compression but is fast, and 9 gives the highest possible compression but is much slower).
For example:
$ bzip2 -9 test.img
This command provides maximum compression.
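To compare the two tools on the same input, the sketch below compresses one file with gzip -9 and bzip2 -9 and prints the sizes. The input is generated with seq for illustration only; which tool wins, and by how much, depends on the data.

```shell
#!/bin/sh
# Sketch: compare gzip and bzip2 at their highest compression level.
set -eu
workdir=$(mktemp -d)                 # scratch directory (hypothetical name)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

seq 1 50000 > data.txt               # compressible sample input
original_size=$(wc -c < data.txt | tr -d ' ')

gzip_size=$(gzip -9 -c data.txt | wc -c | tr -d ' ')
bzip2_size=$(bzip2 -9 -c data.txt | wc -c | tr -d ' ')

echo "original: $original_size bytes"
echo "gzip -9:  $gzip_size bytes"
echo "bzip2 -9: $bzip2_size bytes"

# Confirm the bzip2 stream decompresses back to the original bytes.
bzip2 -9 -c data.txt | bzip2 -dc | cmp - data.txt
```

For bzip2, the level selects the block size (100k for -1 up to 900k for -9), so higher levels mainly help on larger files.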