GMail's strange perception of EMail size

I’ve just downloaded my whole GMail account via POP3. Each message is stored as-is, including headers and base64-encoded attachments, in a separate file.

du -ms on the folder with all emails tells me:

2117

While GMail tells me:

You are currently using 2079 MB
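One thing worth ruling out before comparing the two numbers: by default du reports the disk space allocated to the files (whole filesystem blocks), not the sum of their byte lengths, so a folder of many small files can look bigger than its contents. Below is a minimal sketch for measuring both, assuming a POSIX system where `st_blocks` counts 512-byte units. The helper names `folder_sizes` and `to_mib` are made up for illustration.

```python
import os


def folder_sizes(path):
    """Return (apparent_bytes, allocated_bytes) for all files under path."""
    apparent = 0
    allocated = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            # Apparent size: the number of bytes in the file itself.
            apparent += st.st_size
            # Allocated size: assumes st_blocks is in 512-byte units (POSIX
            # systems such as Linux). Falls back to the apparent size where
            # the attribute does not exist (e.g. Windows).
            blocks = getattr(st, "st_blocks", None)
            allocated += blocks * 512 if blocks is not None else st.st_size
    return apparent, allocated


def to_mib(n_bytes):
    """Convert a byte count to MiB (1024 * 1024 bytes)."""
    return n_bytes / (1024 * 1024)
```

Calling `folder_sizes("mail")`, where `"mail"` stands in for whatever the download folder is called, and passing both results through `to_mib` would show how much of the du figure is block rounding rather than message content.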

Why this difference, I asked myself. Is there a difference in the way the e-mail is stored? Actually, I store my e-mail very inefficiently. E-mail is 7-bit encoded (every character carries 7 bits), whereas my filesystem, like virtually every other, stores each character in 8 bits. Let’s calculate the actual size of the email on my disk:

>>> 2117 / 8 * 7
1848

It gets even more absurd considering that most of that space is used by attachments, which are encoded in Base64 and carry only 6 bits per character. At least 50% of these 2 GB are attachments, so averaged over everything a character carries at most about 6.5 bits, thus:

>>> 2117 / 8 * 6.5
1716
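The two results above come out slightly low because `2117 / 8` is truncated to 264 by integer division before the multiplication. Here is a small sketch of the same estimate in floating point. The function names are my own, and the 50% attachment share is the rough guess from above, not a measured value.

```python
def payload_mib(stored_mib, bits_per_char):
    """Scale a size stored at 8 bits per character down to bits_per_char."""
    return stored_mib * bits_per_char / 8.0


def mixed_estimate(stored_mib, attachment_fraction):
    """Estimate with attachments at 6 bits/char and everything else at 7."""
    bits = attachment_fraction * 6 + (1 - attachment_fraction) * 7
    return payload_mib(stored_mib, bits)


print(payload_mib(2117, 7))       # 1852.375
print(mixed_estimate(2117, 0.5))  # 1720.0625
```

The conclusion does not change: both estimates are still well below the 2079 MB GMail reports.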

Why am I using 2079MB according to GMail?

GMail wouldn’t actually need 2079 MB for my emails: they probably compress all attachments and old mail, so they won’t even come near that figure. It would therefore seem logical for them to report the real size of all emails, which should match either the 1848 MB or the 2117 MB, but it matches neither.

Anyone got a good guess?

Disclaimer: This isn’t in any way meant to be anti-gmail — I love gmail, everyone loves gmail! I’m just curious.

2 thoughts on “GMail's strange perception of EMail size”

  1. It appears that there is no problem. Computers report file sizes differently depending on whether 1 MB = 1000 kB or 1 MB = 1024 kB. This is the same reason some hard drives appear to be smaller after hooking them up to your computer than what is listed on the box (sometimes you’ll get lucky and get one slightly larger). Google it; I remember a tech guru explaining this years ago.

  2. The difference shouldn’t be that great:

    Measured in MB (1000 kB) instead of MiB (1024 KiB), the raw size would be 2219 MB, which is obviously a bigger number. Now, as each email only uses 7 bits per character, the ‘real’ size of all my email is 2219 / 8 * 7 = 1939, which is less than the 2079 MB reported by GMail. It gets even worse when I add in that attachments are encoded at 6 bits per character.

    The MB/MiB issue does make a difference, but not that big a one, as you have seen. There’s still something else :-).
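For anyone who wants to redo the unit conversion from the comments, here is a minimal sketch. It assumes du's figure is in MiB (1024 * 1024 bytes); the helper name `mib_to_mb` is made up. Which unit GMail's "MB" actually means is not something the numbers here can settle.

```python
MIB = 1024 * 1024  # bytes per mebibyte
MB = 1000 * 1000   # bytes per decimal megabyte


def mib_to_mb(mib):
    """Convert a size in MiB to decimal MB."""
    return mib * MIB / MB


raw_mb = mib_to_mb(2117)       # du figure expressed in decimal MB
seven_bit_mb = raw_mb * 7 / 8  # same size counted at 7 bits per character

print(round(raw_mb, 1))        # 2219.8
print(round(seven_bit_mb, 1))  # 1942.4
```

The 7-bit figure lands a little above the 1939 quoted in the comment because nothing is truncated along the way, but it is still below the 2079 MB GMail shows.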
