File Size vs. Size on Disk: Why such a difference?

Deozaan:
Last night I learned something new that the geek in me found interesting: the difference between SI prefix names and IEC prefix names. The details are summarized in Ubuntu's Units Policy, but the only part relevant to this post is that it made me curious about a discrepancy I see when viewing a file's (or folder's) properties: the size and the size on disk are sometimes quite different.

For example, I have several PortableApps on a 2GiB USB drive and I wanted to see how much space they took up. So viewing the PortableApps' folder properties, it shows:

Size: 602 MB (631,544,356 bytes)
Size on disk: 993 MB (1,041,301,504 bytes)

So my question is: Can anyone explain to me why the files take up about 65% more space on disk than their actual size (put another way, roughly 40% of the space used on disk holds no file data)? Are they retaining water? Wearing a girdle? Did they have plastic surgery?

wraith808:
I think it has to do with block size. At least, that's what I always attributed it to.

Update: I thought more about it and figured I should expand on my explanation. The block size is the smallest unit of space that can be allocated on a drive. A file smaller than the block size still takes up a whole block, i.e. if you store a 200 byte file but the block size is 1024 bytes, you lose the other 824 bytes because the file has to take a whole 1024 bytes. Also, since space is allocated in whole blocks, any file whose size is not an exact multiple of the block size wastes part of its last block. That's what I've always attributed the difference to, and looking on Wikipedia at least, it seems to be borne out by how they write to NAND drives.
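To make the rounding concrete, here is a minimal Python sketch of the arithmetic described above. The function names `size_on_disk` and `slack` are made up for illustration, and treating an empty file as using no blocks is an assumption; real file systems also keep directory entries and other metadata that this ignores.

```python
def size_on_disk(file_size: int, block_size: int) -> int:
    """Bytes reserved for a file when space is handed out in whole blocks."""
    if file_size == 0:
        return 0  # assumption: an empty file reserves no data blocks
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


def slack(file_size: int, block_size: int) -> int:
    """Bytes reserved for the file but not holding any of its data."""
    return size_on_disk(file_size, block_size) - file_size


print(size_on_disk(200, 1024))  # 1024: a 200 byte file still takes a whole block
print(slack(200, 1024))         # 824: the unused remainder of that block
print(slack(2500, 1024))        # 572: three blocks (3072 bytes) hold 2500 bytes
```

Summing `slack` over every file in a folder would give roughly the gap between "Size" and "Size on disk" for that folder, under the assumptions stated above.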

MilesAhead:
wraith808 is right.  The file system has a cluster size that's set when the partition is formatted.  Here's some basic NTFS info on cluster size:
http://www.softwaretipsandtricks.com/windowsxp/articles/252/1/NTFS-Cluster-size

Since the last cluster allocated is rarely perfectly filled, on average you waste 1/2 the cluster size per file on that partition.  It's a trade-off. If you use the minimum cluster size you save space, but you lose performance since you will have to expend more resources tracking a greater number of clusters for the storage space used.
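The half-a-cluster rule of thumb can be turned into a rough estimate. The sketch below is only an illustration: the function names are hypothetical, the rule assumes file sizes are spread evenly relative to the cluster size (folders full of tiny files waste more), and the 32 KiB cluster size in the second example is a guess, since the thread never states the drive's actual cluster size.

```python
def expected_slack(file_count: int, cluster_size: int) -> float:
    """Rule-of-thumb waste: about half a cluster per file."""
    return file_count * cluster_size / 2


def implied_file_count(observed_slack: int, cluster_size: int) -> int:
    """Invert the rule: how many files would explain the observed waste?"""
    return round(observed_slack / (cluster_size / 2))


# 1,000 files on 4 KiB clusters waste about 2 MB in total.
print(expected_slack(1000, 4096))

# Figures from the first post: size on disk minus size.
observed = 1_041_301_504 - 631_544_356
print(observed)

# With a hypothetical 32 KiB cluster, the rule implies about 25,000 files.
print(implied_file_count(observed, 32 * 1024))
```

A larger cluster size means fewer clusters to track, but the estimated waste grows in direct proportion, which is the trade-off described above.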

Deozaan:
Is this something defragging can help with, or is it that multiple files cannot occupy parts of the same block?

Also of note, though I doubt this makes a difference, the USB drive in question is formatted as FAT32.

Carol Haynes:
For example, I have several PortableApps on a 2GiB USB drive and I wanted to see how much space they took up.
-Deozaan (April 07, 2010, 01:26 PM)
--- End quote ---

What format is the USB drive in (FAT, FAT32, etc.)? If it isn't in one of the FAT formats, try copying the files to a hard disk, reformatting the USB drive as FAT32, and seeing whether the size on disk is different when you copy them back.

It does seem like a very large discrepancy even given the way block sizes are used.