Saturday, 14 January 2012

Email Archive; How do you do yours?

If you are anything like me, you have an ever-growing pile of email. It sits on a server that offers IMAP4 access, and you might even check it using a client like Thunderbird, but for the most part you just use the webmail interface. Since Roundcube and Gmail, the email client has rather had its day... or has it?

Why not install and fire up a copy of Thunderbird? (It is free.) Well, I'll tell you why you should: you can use the client to make a backup of all the messages that you have received and sent. (If Google goes bust for trying to bind G+ into your searches, you will be thankful to have a backup! Remember: nothing is forever.)
If you got a new computer in the last six months you will have oodles of disk space, so why not use some of it for something useful?

Remember that IMAP just "looks" at the messages on the server. You have to copy them into "Local Folders". Only then can you safely go online and delete any of your old messages. Please make sure that you have a local copy (heck, with the price of consumer disk space, make two!) before you delete any messages.

If this blog were a film and you had been reading the subtitles, you would know that I'm a fan of fire-and-forget automation using scripting. So how do I do it?

If you are not using a computer that has UNIX or Linux installed then this may be as far as you need to go today. (If "~/backups/" is gibberish to you then you need to find a good UNIX / Linux terminal tutorial.)

In 2005 I used http://freshmeat.net/projects/imapsync (now called http://freecode.com/projects/imapsync). That was a great project, but now it is a great product, so it is no longer free (though https://github.com/imapsync/imapsync seems free). So you look for something else and find mbsync, http://isync.sourceforge.net/mbsync.html. It is maybe a little harder to understand at first, so here is how I use it (with a cut-n-paste example config):


Install mbsync:

aptitude install isync

(if you are on Debian/Ubuntu)

The whole setup is cron + wrapper_script + mbsync (I used to use imapsync, as I mentioned).

For this example we are going to back up two Gmail accounts into ~/backups/gmail/

Configure

Get certificates

We want an encrypted connection (remember that, since 2008, SSL/TLS will only protect you against casual network snooping!), so we will have to collect the remote server's certificates:


mkdir -p ~/backups/gmail/
openssl s_client -connect imap.gmail.com:993 -showcerts </dev/null > ~/backups/gmail/certs.pem
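
The command above saves everything that s_client prints, not just the certificates. If you would rather keep only the PEM blocks, a filter along these lines should do it. This is only a sketch: extract_pem is a made-up helper name, not something that ships with openssl or mbsync.

```shell
# extract_pem: read text on stdin and print only the PEM certificate blocks.
# (Hypothetical helper; the name is an assumption made for this example.)
extract_pem() {
    sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p'
}

# Possible use (needs network access, so it is left commented out here):
#   openssl s_client -connect imap.gmail.com:993 -showcerts </dev/null 2>/dev/null \
#       | extract_pem > ~/backups/gmail/certs.pem
```

The sed range prints each line from a BEGIN marker through the matching END marker and drops the handshake chatter around them.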

Then we have to write a config file called .mbsyncrc (the name mbsync looks for by default) and put it in the root of our home directory:

~/.mbsyncrc

#example
# (mbsync ends a section at the first empty line, so keep each section together)

# where do you want to keep the messages?
MaildirStore local
Path ~/backups/gmail/

# I think of this as the details for one remote account
IMAPStore just.a.label
Host imap.gmail.com
User test-example@gmail.com
Pass notVERYsecure
UseIMAPS yes
CertificateFile /usr/share/purple/ca-certs/thoughtcrime_CA.pem
CertificateFile ~/backups/gmail/certs.pem

# You can have the details for as many accounts as you like
IMAPStore work.email
Host imap.gmail.com
User test.work@gmail.com
Pass 4l50notVERYsecure
UseIMAPS yes
CertificateFile /usr/share/purple/ca-certs/Thawte_Premium_Server_CA.pem
CertificateFile ~/backups/gmail/certs.pem

# Now the "backup instructions"
Channel my.email
Master ":just.a.label:[Gmail]/All Mail"
Slave :local:test-example
Sync PullNew
Create Slave
SyncState *

# The "Channel" name is what you pass to mbsync to tell it which Channel to sync
# Master tells it which IMAPStore to look at (in this case the one called "just.a.label")
# Slave is where to put it (this _can_ be a remote IMAP server! Cool for migrations)
# The last three are more example settings. Check the man page for more.

# A Channel has exactly one Master and one Slave (a second Master line just
# replaces the first), so to pull from several accounts in one run, give each
# account its own Channel and tie them together with a Group:

Channel work
Master ":work.email:[Gmail]/All Mail"
Slave :local:test-work
Sync PullNew
Create Slave
SyncState *

Group both my.email work

# End of mbsync config file
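
Since the config above holds passwords in plain text, it is worth making sure nobody else on the machine can read it. A minimal sketch, assuming a POSIX shell; secure_config is a hypothetical helper name and you pass it the path of your own config file:

```shell
# secure_config FILE: make FILE readable and writable by its owner only.
# (Hypothetical helper; it is just a thin wrapper around chmod.)
secure_config() {
    chmod 600 "$1"
}

# Possible use: secure_config /path/to/your/mbsync/config
```

Mode 600 means read and write for the owner and nothing for group or others.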



Add a crontab entry

58 */8 * * * echo 'yes' | mbsync -q both 2>/dev/null
# I know I should listen for errors, but it warns about certs

The "echo 'yes'" hacks us past an SSL warning. There is probably a better way to solve this problem, but this worked for me.
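
If throwing the errors away bothers you, one option is to append them to a log file instead of /dev/null. The following is a sketch under assumptions, not what the crontab entry above does: run_logged is a hypothetical helper, and the log path in the usage comment is made up.

```shell
# run_logged LOGFILE COMMAND [ARGS...]
# Run COMMAND, appending a timestamp, its stdout/stderr and its exit status
# to LOGFILE. Returns the exit status of COMMAND. stdin is passed through,
# so the "echo 'yes' |" trick still works in front of it.
run_logged() {
    logfile=$1
    shift
    {
        printf '== %s: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*"
        "$@"
        rc=$?
        printf '== exit status %s\n' "$rc"
    } >>"$logfile" 2>&1
    return "$rc"
}

# Possible use from a wrapper script called by cron:
#   echo 'yes' | run_logged "$HOME/backups/gmail/mbsync.log" mbsync -q both
```

The certificate warnings still end up in the log, but so does anything that actually went wrong, and cron stays quiet either way.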

But what of the wrapper_script that you mentioned?

Well, by calling a wrapper script (in my case Perl) it is trivial to change the path based on the date, using a substitution along these lines:

s{^Path ~/backups/gmail/}{"Path ~/backups/gmail/" . (1900 + (localtime)[5]) . "/"}e

That way each year gets a separate backup. I do end up with some duplication while I delete the old messages on the server, but duplicates are better than lost data (and see fdupes).
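
For anyone who would rather not touch Perl, the same idea can be sketched in plain shell with sed. This is a swapped-in illustration, not the wrapper described above: with_year_path and both file names in the usage comment are hypothetical.

```shell
# with_year_path YEAR: read an mbsync config on stdin and write it to stdout
# with YEAR appended to the "Path ~/backups/gmail/" line.
# (Hypothetical helper; the Path value matches the example config.)
with_year_path() {
    sed "s|^Path ~/backups/gmail/\$|Path ~/backups/gmail/$1/|"
}

# Possible use in a wrapper (file names are assumptions):
#   year=$(date +%Y)
#   mkdir -p "$HOME/backups/gmail/$year"
#   with_year_path "$year" < "$HOME/mbsync.template" > "$HOME/mbsync.$year"
#   echo 'yes' | mbsync -c "$HOME/mbsync.$year" -q both
```

Using | as the sed delimiter avoids having to escape every slash in the path, and mbsync's -c option points it at the generated config instead of the default one.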

Recovery

There isn't much point in having a backup if you can't use it. I used to just use grep to search, but I found it faster to index ~/backups/gmail using Thunderbird and then search for messages within that, which brings me full circle back to Thunderbird. Thank you Mozilla. Keep up the good work.
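
For the record, the grep approach can be as small as the sketch below. search_maildir is a hypothetical helper name, and it only matches text that is stored literally in the message files, so base64 or quoted-printable encoded bodies will be missed.

```shell
# search_maildir PATTERN DIR: list the message files under DIR that contain
# PATTERN (fixed string, case-insensitive).
# (Hypothetical helper; it is a thin wrapper around grep.)
search_maildir() {
    grep -rilF -- "$1" "$2"
}

# Possible use: search_maildir 'invoice' ~/backups/gmail
```

Each Maildir message is one file, so the list of file names that comes back is also the list of matching messages.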

[update] I've moved to balsa as a client and that can read the Maildir++ format without having to add an IMAP4 server. I tried claws but they no longer support Maildir.
