What happens when files that contain large amounts of info are zipped (compressed)?

What happens, at the technical level, when files that contain large amounts of info are zipped (compressed)?

2006-10-30 06:40:25 · 9 answers · asked by Carissa P 1 in Computers & Internet ➔ Programming & Design

9 answers

This has to be very simplified, but imagine that you have some data (an image or a spreadsheet, for example) where most of the bytes are zeroes. If, instead of storing a file full of zeroes, you simply write the equivalent of 'put 1000 zeroes here', you have saved a great deal of space!

This is more or less what happens when you 'zip' a file. The zip software reads the file and, wherever it recognises repeated patterns, writes a new, smaller file that lists each pattern along with how many times and where it occurs.

A database or a spreadsheet consists of a great deal of redundant or duplicated information. Think of all the empty cells in a spreadsheet or, in a database, all the spaces needed to pad out a field to its set width: they have to be there in case one record needs that many characters, even though most records do not. A picture may have thousands of pixels representing white or black. When the file is zipped, instead of writing out all those identical values, all that is written is a few bytes that say 'x number of bytes of value n'.
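The 'x number of bytes of value n' idea is known as run-length encoding. Here is a minimal Python sketch of it, only to illustrate the principle; the function names are made up for this example, and real zip tools typically use the more elaborate Deflate method rather than plain run-length encoding.

```python
def rle_encode(data: bytes) -> list[tuple[int, int]]:
    """Collapse each run of identical bytes into a (count, value) pair."""
    runs: list[tuple[int, int]] = []
    for value in data:
        if runs and runs[-1][1] == value:
            # Same byte as the previous one: extend the current run.
            runs[-1] = (runs[-1][0] + 1, value)
        else:
            # Different byte: start a new run.
            runs.append((1, value))
    return runs


def rle_decode(runs: list[tuple[int, int]]) -> bytes:
    """Expand the (count, value) pairs back into the original bytes."""
    return b"".join(bytes([value]) * count for count, value in runs)


data = bytes(1000) + b"\x07\x07\x07" + bytes(500)  # mostly zeroes
runs = rle_encode(data)
print(runs)  # three runs stand in for 1503 separate bytes
assert rle_decode(runs) == data
```

Note that this only pays off when the data really does contain long runs; on data with no runs, the list of pairs would be larger than the input.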

Technically, it is far more complicated than this (a file that has been zipped may be inspected again to see if further patterns exist which can be written more concisely) but that is the basic principle.

2006-10-30 06:47:31 · answer #1 · answered by Owlwings 7 · 1⤊ 0⤋

Michael D is wrong. MP3 and other audio/video compression techniques reduce the size at the expense of quality: they remove some of the information from the file. Most of it you can't hear anyway, but some you could, and that's why quality is lost.

Regular data compression is completely different, because it is lossless, i.e., you can restore the exact same file from the compressed version, without losing any information that was originally there.
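To see what "lossless" means in practice, here is a small sketch using Python's standard `zlib` module, which implements the same Deflate method that zip files commonly use. The sample text is just made up for the demonstration.

```python
import zlib

# Made-up sample data: the same sentence repeated many times.
original = b"The quick brown fox jumps over the lazy dog. " * 200

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "bytes before,", len(compressed), "bytes after")
assert restored == original  # byte-for-byte identical to the input
```

A lossy format such as MP3 could not pass that final check, because the discarded information cannot be recovered.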

What happens when you compress the file is that the archiver program analyzes the data in it, and replaces various patterns with a shorter way to say the same thing.
For example, instead of "aaaaaaaaa", you could just say "9*a", which is 3 times shorter. Or instead of 1,000,000,000,000,000 you could say 10^15, which is about 4 times shorter.

This is not what the archiver really does, but that's the idea. In real life, it works even better, because it replaces not only direct repetitions like in my examples, but any common patterns in general. For example, if it notices that, say, the word "information" occurs 100000 times in some large file, it might decide to replace that whole word with a single character, and save you about a megabyte on that. It needs to keep a mapping table (which it usually inserts at the beginning of the compressed file) that will let it "uncompress" the data later: it just tells the uncompressing program to replace every occurrence of such-and-such symbol with "information", etc.
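The mapping-table idea can be sketched in a few lines of Python. This is a toy illustration, not how a real archiver is implemented: the function names are invented, the list of words to replace is supplied by hand (a real tool finds patterns on its own), and the replacement symbols are taken from Unicode's private-use range on the assumption that they never appear in the text.

```python
def compress_words(text: str, words: list[str]) -> tuple[dict[str, str], str]:
    """Swap each listed word for a one-character token; return (table, packed text)."""
    table: dict[str, str] = {}
    packed = text
    for i, word in enumerate(words):
        token = chr(0xE000 + i)  # private-use code point, assumed unused
        if token in text:
            raise ValueError("token already occurs in the text")
        table[token] = word       # remember what the token stands for
        packed = packed.replace(word, token)
    return table, packed


def decompress_words(table: dict[str, str], packed: str) -> str:
    """Use the mapping table to put the original words back."""
    for token, word in table.items():
        packed = packed.replace(token, word)
    return packed


text = "information about information theory is still information. " * 1000
table, packed = compress_words(text, ["information"])
print(len(text), "characters before,", len(packed), "after")
assert decompress_words(table, packed) == text
```

The table has to travel with the packed text, otherwise nobody could tell what the tokens mean.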

2006-10-30 07:00:46 · answer #2 · answered by n0body 4 · 0⤊ 0⤋

File compression uses a variety of algorithms (some of which are proprietary and require paid licensing to use) that look for repetitions of bit patterns in a file. The compression tool then keeps track of how many of which pattern can be found where in the file and removes all the duplicate patterns, keeping one copy along with the locations for all the repetitions. This usually happens several times using a variety of patterns. The larger the patterns and the more repetitions there are, the more the file can be compressed. When the file is uncompressed, the tool looks at its index of repeated patterns and copies them all back to their proper locations within the file. Raw text files usually give you the best compression ratios (lots of patterns and lots of repetition) while complicated data like audio and video give you the worst (not nearly so many patterns or repetitions). This doesn't necessarily take into account special audio and video compression techniques or lossy compression formats that sacrifice data quality or accuracy to achieve smaller file sizes (like MPEG and JPEG).
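The difference between data with lots of repetition and data with almost none is easy to measure. The sketch below uses Python's standard `zlib` module; the helper name `ratio` is invented for this example, and random bytes are used as a stand-in for data without repeated patterns (already-compressed audio and video tend to behave similarly, but that is an assumption here, not something this code measures).

```python
import os
import zlib


def ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller means better compression)."""
    return len(zlib.compress(data, 9)) / len(data)


# Highly repetitive text: the same line over and over.
repetitive_text = b"the cat sat on the mat and the dog sat on the log\n" * 2000

# Random bytes of the same length: no patterns for the compressor to find.
random_bytes = os.urandom(len(repetitive_text))

print(f"repetitive text: {ratio(repetitive_text):.3f}")
print(f"random bytes:    {ratio(random_bytes):.3f}")
```

The repetitive text shrinks to a tiny fraction of its size, while the random bytes come out at roughly the size they went in.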

2016-05-22 12:01:41 · answer #3 · answered by Anonymous · 0⤊ 0⤋

OK, it's complicated, but the simple explanation is this:

Say, for example, that you have a Word page: space can be saved through code that groups all similar characters, e.g. all the 'a's in a document are reduced to just one 'a', and a record is kept of where this one 'a' should be duplicated when the document opens.

That's why some files (including words and music) can compress really well, to as little as half their former size, whereas others hardly compress at all.

TO SEE WHAT TYPICAL CHARACTERS LOOK LIKE

If you want to see what the typical characters in a music file look like, for example, just drag and drop one into an open text (.txt) document, but make sure that you use a copy of the file if you'll need to listen to it later.

- Once you open it, you will see how many similar and non-similar characters are in the document, and you can better understand how well it can be compressed by assessing the ratio between the two (similar and non-similar characters).
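Rather than eyeballing a file in a text editor, you can count the characters with a short script. This is a rough sketch: `byte_profile` is a made-up helper, and the sample data is built in memory so the example runs anywhere (to look at a real file you would read its bytes instead). Bear in mind that counts alone do not tell the whole story, since the order of the bytes matters to a compressor too.

```python
from collections import Counter


def byte_profile(data: bytes, top: int = 3) -> tuple[int, list[tuple[int, int]]]:
    """Return (number of distinct byte values, the `top` most frequent (value, count) pairs)."""
    counts = Counter(data)
    return len(counts), counts.most_common(top)


# Made-up sample: a few very common bytes plus one of every possible byte value.
sample = b"aaaaaaaabbbbcc" * 100 + bytes(range(256))

distinct, common = byte_profile(sample)
print(distinct, "distinct byte values")
for value, count in common:
    print(f"byte {value:#04x} appears {count} times")
```

A file dominated by a handful of byte values is usually a good candidate for compression; one where every value is about equally common usually is not.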

It gets even more complicated once you start talking about the differences between algorithms and lossless/lossy file compression, so ...

... if you want more technical info on general topics like hard drives, encryption, the different types of file compression, then let me direct you to

http://atschool.eduweb.co.uk/dyffryn/ICT%20Resources.htm

or if you want more specific info on this then this should cover everything i haven't elaborated on:

http://www.howstuffworks.com/file-compression.htm

2006-10-30 06:58:33 · answer #4 · answered by Can I Be Your Pet? 6 · 0⤊ 0⤋

When you zip a file, you make it smaller for purposes such as attaching it to an email. A common program for doing that is WinZip. Of course, to use the file again, you have to unzip it. The result will be in a folder with the same name on your Windows drive.
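For anyone who would rather script it than click through WinZip, here is a small sketch of the same zip-then-unzip round trip using Python's standard `zipfile` module. The file name `report.txt` and its contents are invented for the example, and everything happens in a temporary folder that is deleted afterwards.

```python
import tempfile
import zipfile
from pathlib import Path

content = "quarterly figures\n" * 5000  # made-up, very repetitive content

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    source = folder / "report.txt"  # hypothetical file name
    source.write_text(content)

    # "Zip" the file into report.zip using the Deflate method.
    archive = folder / "report.zip"
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(source, arcname=source.name)

    # "Unzip" it into a folder named after the archive.
    out_dir = folder / "report"
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(out_dir)

    original_size = source.stat().st_size
    zipped_size = archive.stat().st_size
    restored = (out_dir / "report.txt").read_text()

print(original_size, "bytes original,", zipped_size, "bytes zipped")
assert restored == content  # the unzipped copy matches the original
```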

2006-10-30 06:52:48 · answer #5 · answered by Gorgeous 1 · 0⤊ 1⤋

It's just a compression rate, much like an MP3. MP3 files are taken off a CD, which has a high quality rate, and the compression finds bits and pieces that match and combines them into one bit. It's very complex.

2006-10-30 06:43:16 · answer #6 · answered by Michael D 1 · 0⤊ 2⤋

The junk is deleted (you know, things like duplication and unneeded instructions). What is left is not the original file but a set of instructions describing the file and how to reproduce it. For instance, you do not need font instructions for each individual 'A'.

2006-10-30 06:42:00 · answer #7 · answered by fact checker 3 · 0⤊ 1⤋

you'll need to be more clear than that.

2006-10-30 06:52:00 · answer #8 · answered by Anonymous · 0⤊ 1⤋

They save you disk space... lol...

2006-10-30 06:41:45 · answer #9 · answered by Anonymous · 0⤊ 2⤋