Wednesday, 19 June 2013

Speeding up Exim

Hit a strange "feature" in exim today:

# time exim -bt $(whoami)
logcheck@dull.tl.dr
    <-- root@alexx.net
  router = virtual_local_mailbox, transport = virtual_user

real    0m0.047s
user    0m0.016s
sys     0m0.010s


47 milliseconds. Not too shabby.

# time exim -bt abuse@hotmail.com
abuse@hotmail.com
  router = dnslookup, transport = remote_smtp
  host mx4.hotmail.com [65.54.188.72]  MX=5
  host mx4.hotmail.com [65.54.188.110] MX=5
  host mx4.hotmail.com [65.54.188.94]  MX=5
  host mx4.hotmail.com [65.55.37.72]   MX=5
  host mx4.hotmail.com [65.55.37.120]  MX=5
  host mx4.hotmail.com [65.55.92.152]  MX=5
  host mx4.hotmail.com [65.55.92.168]  MX=5
  host mx4.hotmail.com [65.55.92.184]  MX=5
  host mx4.hotmail.com [65.55.37.104]  MX=5
  host mx4.hotmail.com [65.55.92.136]  MX=5
  host mx4.hotmail.com [65.55.37.88]   MX=5
  host mx4.hotmail.com [65.54.188.126] MX=5

real    0m45.128s
user    0m0.010s
sys     0m0.016s
 

45 SECONDS to do an address test (-bt)? I hear DNS alarm bells in my head. So off
to debug mode we go:

exim -v -d -bt abuse@gmail.com 

(you could inject a test message if you want to do it for real:
exim -d yourself@gmail.com  
To: yourself@gmail.com  
This is a test
.
)
 
exim whips through to
--------> dnslookup router <--------
and then slows to a sticky crawl as it does:
2a00:1450:400c:c03::1a in "0.0.0.0 : 127.0.0.0/8"? no (end of list)
DNS lookup of gmail-smtp-in.l.google.com (A) succeeded
173.194.66.27 in "0.0.0.0 : 127.0.0.0/8"? no (end of list)
DNS lookup of alt1.gmail-smtp-in.l.google.com (AAAA) succeeded
2a00:1450:4001:c02::1b in "0.0.0.0 : 127.0.0.0/8"? no (end of list)
DNS lookup of alt1.gmail-smtp-in.l.google.com (A) succeeded
173.194.70.27 in "0.0.0.0 : 127.0.0.0/8"? no (end of list)
DNS lookup of alt2.gmail-smtp-in.l.google.com (AAAA) succeeded
2a00:1450:4008:c01::1a in "0.0.0.0 : 127.0.0.0/8"? no (end of list)
DNS lookup of alt2.gmail-smtp-in.l.google.com (A) succeeded
173.194.69.27 in "0.0.0.0 : 127.0.0.0/8"? no (end of list)
DNS lookup of alt3.gmail-smtp-in.l.google.com (AAAA) succeeded
2a00:1450:4010:c04::1a in "0.0.0.0 : 127.0.0.0/8"? no (end of list)
DNS lookup of alt3.gmail-smtp-in.l.google.com (A) succeeded
173.194.71.27 in "0.0.0.0 : 127.0.0.0/8"? no (end of list)
DNS lookup of alt4.gmail-smtp-in.l.google.com (AAAA) succeeded
2607:f8b0:400e:c03::1b in "0.0.0.0 : 127.0.0.0/8"? no (end of list)
DNS lookup of alt4.gmail-smtp-in.l.google.com (A) succeeded
74.125.25.27 in "0.0.0.0 : 127.0.0.0/8"? no (end of list)
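
To see where the time actually goes, it helps to put a clock on the debug output. What follows is only a minimal sketch, not anything that ships with exim: stamp_lines is a helper name made up for this post, and it assumes a date that understands +%s (GNU date on CentOS does). Each line gets prefixed with the seconds elapsed since the helper started, so a slow lookup shows up as a jump between two consecutive lines.

```shell
# stamp_lines: hypothetical helper, prefixes each stdin line with elapsed seconds.
# The ( ) body runs in a subshell, so its variables do not leak into the caller.
stamp_lines() (
    start=$(date +%s)
    while IFS= read -r line; do
        now=$(date +%s)
        printf '%4ds %s\n' "$((now - start))" "$line"
    done
)

# Usage (exim writes its debug output to stderr, hence the 2>&1):
#   exim -d -bt abuse@gmail.com 2>&1 | stamp_lines
```

One date call per line is wasteful, but for a few hundred lines of debug output it does not matter, and it keeps the helper free of bash-only features.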

Now just as bcrypt deliberately slows things down, this could be considered an anti-spam feature, but I want my mail servers to deliver mail rather than to slow things down.

So let's look at that part of my exim config:

dnslookup:
  driver = dnslookup
  domains = ! +local_domains
  transport = remote_smtp
  ignore_target_hosts = 0.0.0.0 : 127.0.0.0/8
  no_more

So is it getting stuck on ignore_target_hosts (which is important), or just on the DNS lookups?
[ Why not test ignore_target_hosts on the first MX host, try to deliver, and only if we fall back to the next host, _then_ do the next ignore_target_hosts test? Because exim calls its ROUTERS (in order) and _then_ calls the triggered TRANSPORT. This means that exim wants to resolve and check _all_ hosts in the ROUTER /before/ any transport runs. ]
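
For a feel of how little work the ignore_target_hosts test is compared with a DNS round trip, here is a rough imitation of the check that the debug log prints for each address. This is a sketch under assumptions, not exim's code: is_ignored_target is a made-up name, it only handles IPv4, and it hard-codes the list "0.0.0.0 : 127.0.0.0/8" from the router above.

```shell
# is_ignored_target: hypothetical helper that mimics, for IPv4 only, the list
# "0.0.0.0 : 127.0.0.0/8" from the dnslookup router. Plain string matching.
is_ignored_target() {
    case "$1" in
        0.0.0.0|127.*.*.*) return 0 ;;   # unspecified address or loopback /8
        *)                 return 1 ;;
    esac
}

# Print a line in the style of the exim debug output for a few addresses.
for ip in 173.194.66.27 127.0.0.1 0.0.0.0; do
    if is_ignored_target "$ip"; then
        echo "$ip in \"0.0.0.0 : 127.0.0.0/8\"? yes"
    else
        echo "$ip in \"0.0.0.0 : 127.0.0.0/8\"? no"
    fi
done
```

A comparison like this costs next to nothing, which is a hint (not proof) that the lookups in front of it are the expensive part.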

Let's REM (comment) that line out... and it _still_ takes 55 seconds! So just

DNS lookup of gmail.com (MX) succeeded
DNS lookup of gmail-smtp-in.l.google.com (AAAA) succeeded
DNS lookup of gmail-smtp-in.l.google.com (A) succeeded
DNS lookup of alt1.gmail-smtp-in.l.google.com (AAAA) succeeded
DNS lookup of alt1.gmail-smtp-in.l.google.com (A) succeeded
DNS lookup of alt2.gmail-smtp-in.l.google.com (AAAA) succeeded
DNS lookup of alt2.gmail-smtp-in.l.google.com (A) succeeded
DNS lookup of alt3.gmail-smtp-in.l.google.com (AAAA) succeeded
DNS lookup of alt3.gmail-smtp-in.l.google.com (A) succeeded
DNS lookup of alt4.gmail-smtp-in.l.google.com (AAAA) succeeded
DNS lookup of alt4.gmail-smtp-in.l.google.com (A) succeeded

is the bottleneck. So how do we sniff out the name server/network?

As I've mentioned before, tcpdump will show the DNS traffic:

tcpdump -lvi any "udp port 53" 2>/dev/null

So if exim -bt is slow for !+local_domains : !+relay_domains (i.e. remote domains that need DNS), but runs at its usual speed for
exim -bt $(whoami)
then you have a DNS problem (or feature, depending upon your exim queue).
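
Once DNS is the suspect, the next question is which nameserver is dragging its feet. The sketch below asks each resolver in turn and prints how long it took. The function names are made up for this post, and it assumes dig (from bind-utils on CentOS) is installed; +noall +stats, +time and +tries are standard dig options.

```shell
# list_nameservers: print the nameserver addresses from a resolv.conf-style
# file (defaults to /etc/resolv.conf).
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# time_nameserver: ask one nameserver for a domain's MX records and print
# dig's "Query time" line, or its timeout message. +time=2 +tries=1 makes a
# dead resolver give up after about two seconds instead of stalling.
time_nameserver() {
    printf '%s: ' "$1"
    dig +noall +stats +time=2 +tries=1 "@$1" "$2" MX 2>&1 |
        grep -E 'Query time|timed out' || echo 'no usable answer'
}

# Usage (needs dig and network access):
#   for ns in $(list_nameservers); do time_nameserver "$ns" gmail.com; done
```

A resolver that answers in a few milliseconds is fine; one that reports a timeout is the kind of server that would make every lookup in the exim run wait.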

I also installed unbound on port 43 and added nameserver 127.0.0.1:43 to /etc/resolv.conf (as the only nameserver) and that took exim back up to full speed. I'm not naming names, but XXX.XXX.80.26 should not be handed out via DHCP for NetworkManager if it isn't going to do any resolving.

SpiderOak 5.0.1 on CentOS 6.4 x86_64

SpiderOak is the best fit for backing up my life. First off they do security properly (unlike the laughable "most popular" that I would "drop" in a heartbeat). Secondly they seem like nice people (I could go on...).

I wanted to add some of the files on my personal server that happened to be running CentOS 6.4 (the 64bit version). So I downloaded the RPM from the SpiderOak site and was hit with some cryptic Python error[1]. (I only have ssh access to that server, so I was only trying from the command line.) I suspected that my version of Python was missing something, so I tried to fix that. In the end I removed SpiderOak x86_64 and installed the 32bit version... and it worked![0]

Not sure why the 64bit version failed, but as long as it is secure I'm happy, (and when it "failed" it did so securely which is the most important thing.)


[0] well there were some errors:
Synchronizing with server (this could take a while)...
Error setting attribute: Setting attribute metadata::custom-icon not supported
Error setting attribute: Setting attribute metadata::custom-icon not supported
Error setting attribute: Setting attribute metadata::custom-icon not supported
Error setting attribute: Setting attribute metadata::custom-icon not supported


but after Ctrl+c
SpiderOak --include-dir=/var/lib/mysql
seemed to work
and 
SpiderOak -v --batchmode
did the job (the -v is useful to give you confidence as to what SpiderOak is doing. i.e. it is for you; SpiderOak will be fine without it.) 

[1]alexx@www ~$ SpiderOak --setup=-
Traceback (most recent call last):
  File "<string>", line 6, in <module>
  File "__main__.py", line 128, in <module>
  File "__main__SpiderOak__.py", line 12, in <module>
  File "ssl.py", line 60, in <module>
ImportError: libkrb5.so.3: cannot open shared object file: No such file or directory
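
The last line of that traceback names the library that could not be loaded, so one thing worth checking is whether the dynamic loader knows about libkrb5.so.3 at all, and for which architecture. This is only a sketch of that check, not a confirmed diagnosis of the 64bit failure: lib_listed is a made-up helper, and the guess that the library normally comes from the krb5-libs package on CentOS is an assumption to verify with yum.

```shell
# lib_listed: hypothetical helper. Reads an `ldconfig -p` style listing on
# stdin and prints the entries for the named library; fails if there are none.
# The trailing space stops libfoo.so.3 from also matching libfoo.so.30.
lib_listed() {
    grep -F -- "$1 "
}

# Usage on the affected machine:
#   /sbin/ldconfig -p | lib_listed libkrb5.so.3
# On an x86_64 system an entry tagged (libc6,x86-64) is the 64bit library,
# and one tagged just (libc6) is the 32bit one.
```

If only one architecture is listed, that would at least be consistent with one build of SpiderOak starting while the other does not.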

Tuesday, 18 June 2013

selective rsync cracked --precurse-parents

We all know that rsync is one of the elite Unix programs. It has no equal, and it is so well written and so powerful; why would anyone even try to replace it?

So what is my problem?

I want to back up /var/lib/mysql/ and /etc/pki/ and I want to do it recursively so that I recreate the actual path (none of that incestuous relative stuff for me!)

What I /think/ I'm after is:

rsync --precurse-parents -maPAX \
--filter='+ /var/lib/mysql/**' \
--filter='+ /var/www/sites/*.org/**' \
--filter='+ /var/www/sites/notice.*/**' \
--filter='- /**' \
--filter='- *' \
--rsync-path='sudo rsync' 'rsync@server:/' /var/backup/server

Where the (imaginary) --precurse-parents would be like --prune-empty-dirs, but would include the parent dirs /var and /var/lib because of /var/lib/mysql, while excluding everything else that matches /var/* and /var/lib/*.

It is something that I've fought with for over a decade. I've written perl scripts to solve the problem. I've written bash scripts. I've even been crazy enough to read the documentation, (man rsync), but it wasn't until today that I understood.

About 83% of the way through the man page is:

       Note  that,  when  using  the  --recursive  (-r)  option (which is implied by -a), every subcomponent of every path is visited from the top down, so
       include/exclude patterns get applied recursively to each subcomponent’s full name (e.g. to  include  "/foo/bar/baz"  the  subcomponents  "/foo"  and
       "/foo/bar" must not be excluded).  The exclude patterns actually short-circuit the directory traversal stage when rsync finds the files to send.  If
       a pattern excludes a particular parent directory, it can render a deeper include pattern ineffectual because rsync  did  not  descend  through  that
       excluded section of the hierarchy.  This is particularly important when using a trailing ’*’ rule.  For instance, this won’t work:

              + /some/path/this-file-will-not-be-found
              + /file-is-included
              - *

       This  fails  because  the  parent  directory "some" is excluded by the ’*’ rule, so rsync never visits any of the files in the "some" or "some/path"
       directories.  One solution is to ask for all directories in the hierarchy to be included by using a single rule: "+ */" (put it somewhere before the
       "-  *" rule), and perhaps use the --prune-empty-dirs option.  Another solution is to add specific include rules for all the parent dirs that need to
       be visited.  For instance, this set of rules works fine:

              + /some/
              + /some/path/
              + /some/path/this-file-is-found
              + /file-also-included
              - *

And that solved the problem for me:

rsync -maPAX \
--filter='- *.swp' \
--filter='- .git/' \
--filter='+ /var/' \
--filter='+ /var/lib/' \
--filter='+ /var/lib/mysql**' \
--filter='+ /var/www/' \
--filter='+ /var/www/sites/' \
--filter='+ /var/www/sites/*.org/' \
--filter='+ /var/www/sites/*.org/**' \
--filter='+ /var/www/sites/notice.*/' \
--filter='+ /var/www/sites/notice.*/**' \
--filter='- /var/www/sites/*' \
--filter='- /var/www/*' \
--filter='- /var/*/*' \
--filter='- /var/*' \
--filter='- /**' \
--filter='- /*' \
--prune-empty-dirs \
--rsync-path='sudo rsync' 'rsync@server:/' /var/backup/server

I think of this as, "include /var/ {so that rsync can see /var/www}"
"include /var/www/sites/*.org/ {include all of the .org sites}"
"include /var/www/sites/*.org/** {and the files+dirs of those .org sites}"

The mysql line includes the desired dir and everything in it, but would also match /var/lib/mysql_archive_do_NOT_backup, so it is a little more risky.
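
Writing out every parent by hand is exactly the chore that a --precurse-parents would remove, so here is a sketch of a helper that generates the rules instead. parent_filters is a name made up for this post, not an rsync feature. It assumes absolute directory paths and an rsync of version 2.6.7 or later, because it uses the dir/*** pattern, which matches the directory and everything inside it. That also sidesteps the mysql_archive problem, since /var/lib/mysql/*** cannot match a sibling directory.

```shell
# parent_filters: hypothetical helper. For each absolute directory given,
# print rsync filter rules that include every parent directory above it,
# then the directory itself with all of its contents, then exclude the rest.
parent_filters() (
    for path in "$@"; do
        path=${path%/}                  # drop a trailing slash, if any
        prefix=
        rest=${path#/}
        # Walk the path one component at a time: /var/, /var/lib/, ...
        while [ "$rest" != "${rest#*/}" ]; do
            prefix="$prefix/${rest%%/*}"
            printf '+ %s/\n' "$prefix"
            rest=${rest#*/}
        done
        printf '+ %s/***\n' "$path"     # the directory and everything in it
    done
    printf '%s\n' '- *'                 # everything not included above
)

# Usage: write the rules to a file and hand it to rsync as a merge filter:
#   parent_filters /var/lib/mysql /etc/pki > /tmp/backup.filter
#   rsync -maPAX --filter='merge /tmp/backup.filter' \
#     --rsync-path='sudo rsync' 'rsync@server:/' /var/backup/server
```

For /var/lib/mysql and /etc/pki this prints "+ /var/", "+ /var/lib/", "+ /var/lib/mysql/***", "+ /etc/", "+ /etc/pki/***" and finally "- *", one rule per line. Wildcards in the last component, such as /var/www/sites/*.org, pass straight through to rsync.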
 
So each time rsync has to choose, it goes through the whole filter from the top down and includes/excludes things that it finds, and if it hasn't included /var/www then /var/www/sites is _never_ going to match. The usual advice is to try the following:
rsync -maPAX \
--filter='+ */' \
--filter='+ /var/www/sites/*.org/' \
--filter='+ /var/www/sites/*.org/**' \
--filter='- /var/www/*' \
--filter='- /var/*/*' \
--filter='- /var/*' \
--filter='- /**' \
'rsync@remote:/' ~rsync/backup/
but I think that the implications of the first filter line ('+ */', which includes every directory) are the hardest to comprehend.


Also,
rsync -mnavvPAX from to
is really helpful (the -nvv does a dry-run and gives additional info).

With a do-what-I-mean flag, all of this would then be:
 
rsync -dwim --filter='+ /var/www/sites/*.org/**'  server /var/backup/server/
 
I'm sure there is still a better way to get rsync to precurse-parents, as it were, but I'm happy with this solution (until some kind person adds a comment suggesting an even easier or quicker way to do this).

[dwim = Do What I Mean; not a real rsync flag]
 


