Data compression is the conversion of data from a representation that occupies a large amount of storage into one that occupies less. It reduces the overall size of the data using one of several available techniques. The need for it arose as data volumes grew rapidly while storage capacity could not keep pace, so compression was introduced to hold large amounts of data on small hard drives. It is also useful for transmitting large files over low-bandwidth networks in less time.
There are two broad types of data compression: lossy and lossless. Lossy compression reduces file size by removing bits of information that are considered unnecessary, while lossless compression removes no information; the data is simply re-encoded more compactly using one of several algorithms.
Lossless compression algorithms exploit statistical redundancy to represent data without losing any information, so the process is fully reversible. For example, if an image contains the same red color over an entire region, then instead of storing "red" for each pixel, the data can be encoded as "279 red pixels". This is a very basic example of reducing file size by eliminating redundancy, known as run-length encoding. The Lempel-Ziv (LZ) family is among the most popular lossless compression methods. LZ methods use table-based compression, in which strings of data are replaced by references to entries in a table, and the same table is used to decompress the data. DEFLATE, a variation of LZ optimized for decompression speed and compression ratio, is used in PNG, gzip, and PKZIP.
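The "279 red pixels" idea can be sketched as a tiny run-length encoder and decoder. This is an illustrative toy, not how PNG or DEFLATE work internally; the function names are invented for this example.

```python
# Minimal run-length encoding sketch: collapse runs of identical
# values into (count, value) pairs, and expand them back.

def rle_encode(values):
    """Return a list of (count, value) pairs for runs in `values`."""
    runs = []
    for v in values:
        if runs and runs[-1][1] == v:
            runs[-1][0] += 1          # extend the current run
        else:
            runs.append([1, v])       # start a new run
    return [(count, v) for count, v in runs]

def rle_decode(runs):
    """Expand (count, value) pairs back into the original sequence."""
    out = []
    for count, v in runs:
        out.extend([v] * count)
    return out

# A row of 279 red pixels followed by 3 blue ones compresses to
# just two pairs, and decoding recovers the row exactly.
image_row = ["red"] * 279 + ["blue"] * 3
encoded = rle_encode(image_row)
```

Because no information is discarded, `rle_decode(encoded)` reproduces `image_row` exactly, which is what makes the scheme lossless.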
Lossy compression is another technique, in which unessential bits that humans perceive least are removed. JPEG image compression is a well-known example.