Maildir mailbox format
From Syscore
What is it?
Maildir is a format for storing email messages. The traditional storage method is the Berkeley mailbox format, which stores all messages in a given folder concatenated together in a single file. The maildir format instead uses a directory structure to manage message storage. Any information about a message that changes is stored in the name of its file and in the file system.
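To make the contrast concrete, here is a minimal sketch that stores the same messages both ways using the `mailbox` module from Python's standard library. It is only an illustration, not the software we run; the function name `store_both_ways` and the file names are made up for this example.

```python
import mailbox
import os
import tempfile


def store_both_ways(messages, root):
    """Store the same messages as one mbox file and as a maildir folder.

    Hypothetical helper for illustration; `root` is any writable directory.
    """
    mbox_path = os.path.join(root, "inbox.mbox")
    maildir_path = os.path.join(root, "inbox")

    # Berkeley mailbox format: every message is appended to one file.
    box = mailbox.mbox(mbox_path)
    for text in messages:
        box.add(text)
    box.close()

    # Maildir: a directory tree in which each message gets its own file.
    folder = mailbox.Maildir(maildir_path, create=True)
    for text in messages:
        folder.add(text)
    folder.close()

    return mbox_path, maildir_path


if __name__ == "__main__":
    sample = [
        "From: a@example.com\nSubject: one\n\nfirst message\n",
        "From: b@example.com\nSubject: two\n\nsecond message\n",
    ]
    with tempfile.TemporaryDirectory() as root:
        mbox_path, maildir_path = store_both_ways(sample, root)
        print("mbox is a single file:", os.path.isfile(mbox_path))
        print("maildir subdirectories:", sorted(os.listdir(maildir_path)))
        print("files in new/:", len(os.listdir(os.path.join(maildir_path, "new"))))
```

With two messages, the mbox side is still one file, while the maildir side holds two files, one per message.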
Why did we switch?
The Berkeley mailbox format has been a standard on Unix since the birth of Unix, and it was the best method of storing email for its time. However, as the size of a typical message and the volume of mail received have grown, its performance has suffered. Another factor in our decision was our use of AFS for mail storage. After examining multiple formats, we decided that maildir was the best option. Its "one message per file" design makes every mail operation atomic, that is, a single file system operation. As a result, no locking is involved, whether you are reading a message, moving it to a different folder, or deleting it completely. Each of these operations amounts to moving or renaming a file. Because the operations are atomic, we can spread mail reading across multiple servers without complications. Since we converted to this format, our server performance and mail delivery times have greatly improved.
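The "no locking" point can be sketched in a few lines of Python. This is an illustration under stated assumptions, not our production code: the helpers `make_folder`, `move_message`, and `delete_message` are hypothetical names, and the sketch assumes both folders live on the same POSIX file system, where a rename is a single atomic step.

```python
import os
import tempfile


def make_folder(path):
    """Create a maildir-style folder with cur/, new/ and tmp/ (hypothetical helper)."""
    for sub in ("cur", "new", "tmp"):
        os.makedirs(os.path.join(path, sub), exist_ok=True)
    return path


def move_message(src_folder, dst_folder, filename):
    """Move a message between folders with one rename; no lock file is taken."""
    src = os.path.join(src_folder, "cur", filename)
    dst = os.path.join(dst_folder, "cur", filename)
    os.rename(src, dst)  # assumed atomic: same file system, POSIX semantics
    return dst


def delete_message(folder, filename):
    """Delete a message by removing its file; other messages are untouched."""
    os.remove(os.path.join(folder, "cur", filename))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        inbox = make_folder(os.path.join(root, "inbox"))
        saved = make_folder(os.path.join(root, "saved"))
        name = "1075824124.18236_0.example:2,S"  # made-up message file name
        with open(os.path.join(inbox, "cur", name), "wb") as f:
            f.write(b"Subject: hello\n\nbody\n")
        print("moved to", move_message(inbox, saved, name))
        delete_message(saved, name)
```

By contrast, moving a message out of a single-file mailbox means rewriting that shared file, which is why that format needs locking.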
I looked at my Mail/ directory in a shell, what is all this?
linux2[10]% ls -ld inbox
drwx------ 5 105056 research 2048 Feb 3 11:02 inbox
linux2[11]% ls -l inbox
total 6
drwx------ 2 105056 research 2048 Feb 3 11:02 cur
drwx------ 2 105056 research 2048 Feb 3 11:02 new
drwx------ 2 105056 research 2048 Feb 3 11:02 tmp
linux2[12]% ls -l inbox/new/
total 2
-rw------- 1 105056 research 1399 Feb 3 11:02 1075824124.18236_0.mx6in.umbc.edu
linux2[14]% pine
Pine finished -- Closed folder "INBOX". Kept single message.
linux2[15]% ls -l inbox/cur/
total 2
-rw------- 1 105056 research 1399 Feb 3 11:02 1075824124.18236_0.mx6in.umbc.edu,U1075824336:2,S
Each email folder (including your inbox) is a directory. Under that directory there are three subdirectories: "cur", "new", and "tmp". The "cur" directory holds all messages that your mail client has seen, whether or not you have actually read them. The "new" directory is where all new messages are delivered. Finally, the "tmp" directory is used by the mail delivery system during delivery. As for the messages themselves, each one is stored as its own file instead of as a small piece of a larger file. Once written into the "new" directory, this file should never be edited. As you can see, after pine has read the message, it is moved into the "cur" directory and renamed with status information appended to the end of the name. Note also that the size and date of the message did not change.
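The life cycle described above (written in "tmp", delivered to "new", then moved to "cur" with status appended) can be sketched as follows. This is a simplified illustration, not the c-client or procmail code we run. The helper names are hypothetical, and the `:2,S` suffix follows the common maildir convention in which the letters after `:2,` are status flags and `S` means seen. The extra `,U1075824336` part in the listing above is left uninterpreted and simply stays in the base name.

```python
import os
import tempfile


def make_folder(path):
    """Create cur/, new/ and tmp/ under path (hypothetical helper)."""
    for sub in ("cur", "new", "tmp"):
        os.makedirs(os.path.join(path, sub), exist_ok=True)
    return path


def deliver(folder, unique_name, data):
    """Write the message in tmp/, then rename it into new/ in one step."""
    tmp_path = os.path.join(folder, "tmp", unique_name)
    with open(tmp_path, "wb") as f:
        f.write(data)
    new_path = os.path.join(folder, "new", unique_name)
    os.rename(tmp_path, new_path)
    return new_path


def mark_seen(folder, unique_name):
    """Move a message from new/ to cur/, appending status to its name.

    Only the name changes; the file contents are never edited.
    """
    src = os.path.join(folder, "new", unique_name)
    dst = os.path.join(folder, "cur", unique_name + ":2,S")
    os.rename(src, dst)
    return dst


def parse_flags(filename):
    """Split a file name into (base name, set of flag letters after ':2,')."""
    base, _, flags = filename.partition(":2,")
    return base, set(flags)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        inbox = make_folder(os.path.join(root, "inbox"))
        body = b"Subject: test\n\nhello\n"
        deliver(inbox, "1075824124.18236_0.example", body)  # made-up name
        seen_path = mark_seen(inbox, "1075824124.18236_0.example")
        print(os.path.basename(seen_path), os.path.getsize(seen_path) == len(body))
        print(parse_flags("1075824124.18236_0.mx6in.umbc.edu,U1075824336:2,S"))
```

Because status lives in the file name, changing it never touches the message data, which matches the unchanged size and date in the listing.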
We do NOT recommend manipulating your mailbox by any means other than an IMAP connection or a supported version of pine.
What software are you running for this? I want more information!
We are using an enhanced maildir driver for c-client, along with some simple modifications to procmail. See: Maildir patches
