Excellent Adventures in Email Archiving, Part 2: MBOX to Thunderbird
In Part 1 of this entry, I recounted a tale of personal email archiving that took us from Microsoft's PST format to the open MBOX. Here, we'll take a look at how MBOX can play nice with a piece of open source, cross-platform freeware such as Mozilla Thunderbird.
Thunderbird can be downloaded and run on Linux, Windows, and OS X systems. In my particular test, I wanted simply to see whether or not I could take variously-sourced PST email archives, convert them to the open MBOX, and then have these archives work on a Mac and a Windows system.
Once Thunderbird is downloaded, a default email account needs to be set up. Thunderbird supports both POP and IMAP connections to active email servers and is relatively easy to configure. In fact, the program can be used much like Outlook was employed in Part 1 to harvest data straight from email servers for archiving purposes. For one of my live personal accounts, I have actually configured Thunderbird (using IMAP) to work dynamically with my service provider in this way. In the illustration that follows, however, the assumption is that we are mainly trying to archive static email data that Outlook has already harvested into PST from closed accounts. In a business and enterprise world still dominated by Outlook, it is not uncommon for a PST archive to be the only copy of one's old work emails that can be made when switching jobs. That is, if you even have access to this data at all.
Thunderbird, much like its close Mozilla cousin, Firefox, supports downloadable Add-ons that broaden the program's functionality. In order to both import and export MBOX format in Thunderbird, the quite handy ImportExportTools Add-on should be downloaded and installed.
Next, the stage needs to be set up for MBOX import. This is accomplished, by account, within Thunderbird's local folders. In this example, I'm working with an old Yahoo Mail archive that went through a similar PST > MBOX conversion process as the UConn archive described in Part 1...
From here, converted MBOX files may be imported into the newly created local folder by right-clicking it, opening the ImportExportTools context menu, and making selections in the dialog boxes that follow...
And finally, there we have an imported MBOX archive living inside a Thunderbird local folder...
One of the nice things about the way Thunderbird handles MBOX is that, unlike imported PST files in Outlook for Mac, imported MBOX files can subsequently be exported out again from the program.
As a result, Thunderbird doesn't act as a black box. Since the program itself can be installed on multiple platforms, it can function more like an email archive and/or data conduit tool where MBOX files can flow into and out of the program from one OS to another as needed.
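Because an exported MBOX is just a file, it can also be sanity-checked outside of Thunderbird before it moves to another machine. What follows is a minimal sketch using the mailbox module from Python's standard library; the function name summarize_mbox and the file name in the usage note are my own placeholders, not anything that Thunderbird or ImportExportTools produces.

```python
import mailbox


def summarize_mbox(path):
    """Return (message_count, subjects) for the MBOX file at `path`.

    `summarize_mbox` is a hypothetical helper name used for illustration.
    """
    # create=False makes a missing file raise an error instead of
    # silently creating a new, empty mailbox.
    box = mailbox.mbox(path, create=False)
    try:
        # Iterating over the mailbox yields one message object per email.
        subjects = [str(msg.get("Subject", "(no subject)")) for msg in box]
    finally:
        box.close()
    return len(subjects), subjects
```

Calling summarize_mbox("yahoo-archive.mbox") (an assumed file name) returns the number of messages along with their subject lines, which can be compared against what Thunderbird shows for the same folder.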
A few general notes on MBOX and email archiving in closing. While we have seen how MBOX can store entire email folders, it also stores email attachments in their original MIME format. With regard to such attachments, Chris Prom notes in his excellent 2011 DPC Technology Watch Report, Preserving Email, that "action will likely need to be taken to migrate them, if they are to remain accessible in the future." As a result, MBOX is viewed by some within digital archiving circles as a bit of a half measure towards truly robust preservation.
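To get a feel for which attachment formats are sitting inside an archive, and so which ones might eventually need migrating, the MIME parts of each message can be walked programmatically. The sketch below again leans on Python's standard mailbox module; list_attachments is a hypothetical helper name, and it treats any MIME part that carries a filename as an attachment, which is a simplifying assumption.

```python
import mailbox


def list_attachments(path):
    """Return (subject, filename, content_type) for each attachment found.

    Assumption: a MIME part counts as an attachment if it has a filename.
    """
    found = []
    box = mailbox.mbox(path, create=False)
    try:
        for msg in box:
            subject = str(msg.get("Subject", "(no subject)"))
            # walk() visits the message itself and every nested MIME part.
            for part in msg.walk():
                filename = part.get_filename()
                if filename:
                    found.append((subject, filename, part.get_content_type()))
    finally:
        box.close()
    return found
```

The content types that come back (application/pdf, image/jpeg, and so on) give a rough inventory of the formats that would need monitoring over time.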
Possible solutions to this issue include the use of XML. One such example is the implementation of the Email Account Schema by such organizations as the Smithsonian Institution Archives. As Prom notes on this score, "Attachments can either be encoded in the xml file itself or written in their original binary formats to externally referenced locations. The latter feature is particularly useful because the preservation of the attachments may require additional effort, including monitoring for format obsolescence and the development of future migration actions."
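The idea of writing attachments out to externally referenced locations can be illustrated with a rough sketch. To be clear, this does not implement the Email Account Schema: the element and attribute names below (messages, message, attachment, href, content_type) are invented for illustration, as is the function name externalize_attachments.

```python
import mailbox
import os
import xml.etree.ElementTree as ET


def externalize_attachments(mbox_path, out_dir):
    """Save attachments under `out_dir` and return an XML string that
    references them. The XML layout is made up for this example and is
    NOT the Email Account Schema."""
    os.makedirs(out_dir, exist_ok=True)
    root = ET.Element("messages")
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for index, msg in enumerate(box):
            msg_el = ET.SubElement(
                root, "message", subject=str(msg.get("Subject", ""))
            )
            for part_no, part in enumerate(msg.walk()):
                filename = part.get_filename()
                # decode=True undoes base64/quoted-printable encoding;
                # it returns None for multipart containers.
                payload = part.get_payload(decode=True)
                if not filename or payload is None:
                    continue
                # Prefix with message and part numbers to avoid name clashes.
                safe_name = f"{index}_{part_no}_{os.path.basename(filename)}"
                target = os.path.join(out_dir, safe_name)
                with open(target, "wb") as fh:
                    fh.write(payload)
                ET.SubElement(
                    msg_el,
                    "attachment",
                    href=target,
                    content_type=part.get_content_type(),
                )
    finally:
        box.close()
    return ET.tostring(root, encoding="unicode")
```

The point of the separation is the one Prom makes: once the binaries live as ordinary files next to the XML, they can be checked for format obsolescence and migrated without touching the messages themselves.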
Meanwhile, MBOX-related preservation tools continue to be developed and funded. Of recent note, Stanford University Libraries' ePADD software package was awarded a $685,000 National Leadership Grant by the Institute of Museum and Library Services (IMLS) this past summer.