Wednesday, 19 June 2013

Speeding up Exim

Hit a strange "feature" in exim today:

# time exim -bt $(whoami)
logcheck@dull.tl.dr
    <-- root@alexx.net
  router = virtual_local_mailbox, transport = virtual_user

real    0m0.047s
user    0m0.016s
sys     0m0.010s


47 milliseconds. Not too shabby.

# time exim -bt abuse@hotmail.com
abuse@hotmail.com
  router = dnslookup, transport = remote_smtp
  host mx4.hotmail.com [65.54.188.72]  MX=5
  host mx4.hotmail.com [65.54.188.110] MX=5
  host mx4.hotmail.com [65.54.188.94]  MX=5
  host mx4.hotmail.com [65.55.37.72]   MX=5
  host mx4.hotmail.com [65.55.37.120]  MX=5
  host mx4.hotmail.com [65.55.92.152]  MX=5
  host mx4.hotmail.com [65.55.92.168]  MX=5
  host mx4.hotmail.com [65.55.92.184]  MX=5
  host mx4.hotmail.com [65.55.37.104]  MX=5
  host mx4.hotmail.com [65.55.92.136]  MX=5
  host mx4.hotmail.com [65.55.37.88]   MX=5
  host mx4.hotmail.com [65.54.188.126] MX=5

real    0m45.128s
user    0m0.010s
sys     0m0.016s
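
A quick way to read those numbers: user + sys is the CPU time exim actually used, and whatever is left of real is time spent waiting on something else. A minimal sketch of that arithmetic (wait_time is a made-up helper name, not part of exim):

```shell
# wait_time REAL USER SYS: print the seconds spent waiting rather than computing.
# All three arguments are plain seconds, as printed by `time` (0m45.128s -> 45.128).
wait_time() {
    awk -v real="$1" -v user="$2" -v sys="$3" \
        'BEGIN { printf "%.3f\n", real - user - sys }'
}

wait_time 45.128 0.010 0.016   # remote address test from above
wait_time 0.047 0.016 0.010    # local address test from above
```

The first call prints 45.102 and the second 0.021, so practically the whole remote test is spent waiting, not working.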
 

45 SECONDS? To do an address test (-bt)? I hear DNS alarm bells in my head. So off to debug mode we go:

exim -v -d -bt abuse@gmail.com 

(you could inject a test message if you want to do it for real:
exim -d yourself@gmail.com  
To: yourself@gmail.com  
This is a test
.
)
 
exim whips through to
--------> dnslookup router <--------
and then slows to a crawl as it does
2a00:1450:400c:c03::1a in "0.0.0.0 : 127.0.0.0/8"? no (end of list)
DNS lookup of gmail-smtp-in.l.google.com (A) succeeded
173.194.66.27 in "0.0.0.0 : 127.0.0.0/8"? no (end of list)
DNS lookup of alt1.gmail-smtp-in.l.google.com (AAAA) succeeded
2a00:1450:4001:c02::1b in "0.0.0.0 : 127.0.0.0/8"? no (end of list)
DNS lookup of alt1.gmail-smtp-in.l.google.com (A) succeeded
173.194.70.27 in "0.0.0.0 : 127.0.0.0/8"? no (end of list)
DNS lookup of alt2.gmail-smtp-in.l.google.com (AAAA) succeeded
2a00:1450:4008:c01::1a in "0.0.0.0 : 127.0.0.0/8"? no (end of list)
DNS lookup of alt2.gmail-smtp-in.l.google.com (A) succeeded
173.194.69.27 in "0.0.0.0 : 127.0.0.0/8"? no (end of list)
DNS lookup of alt3.gmail-smtp-in.l.google.com (AAAA) succeeded
2a00:1450:4010:c04::1a in "0.0.0.0 : 127.0.0.0/8"? no (end of list)
DNS lookup of alt3.gmail-smtp-in.l.google.com (A) succeeded
173.194.71.27 in "0.0.0.0 : 127.0.0.0/8"? no (end of list)
DNS lookup of alt4.gmail-smtp-in.l.google.com (AAAA) succeeded
2607:f8b0:400e:c03::1b in "0.0.0.0 : 127.0.0.0/8"? no (end of list)
DNS lookup of alt4.gmail-smtp-in.l.google.com (A) succeeded
74.125.25.27 in "0.0.0.0 : 127.0.0.0/8"? no (end of list)
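
To see exactly which lookup stalls, it can help to put a timestamp on every debug line. A rough sketch, assuming a date command that supports +%s (GNU and BSD both do); stamp_lines is a made-up helper, not an exim feature:

```shell
# stamp_lines: prefix each line read from stdin with the seconds elapsed
# since the function started, so long pauses between lines stand out.
stamp_lines() {
    start=$(date +%s)
    while IFS= read -r line; do
        now=$(date +%s)
        printf '%4ds %s\n' "$((now - start))" "$line"
    done
}

# Usage (stderr is merged into stdout so every line gets stamped):
#   exim -d -bt abuse@gmail.com 2>&1 | stamp_lines
```

A jump in the first column between two "DNS lookup of ..." lines points at the query that is hanging.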

Now, just as bcrypt deliberately slows things down, this could be considered an anti-spam feature, but I want my mail servers to deliver mail rather than slow things down.

So let's look at that part of my exim config:

dnslookup:
  driver = dnslookup
  domains = ! +local_domains
  transport = remote_smtp
  ignore_target_hosts = 0.0.0.0 : 127.0.0.0/8
  no_more

So is it getting stuck on ignore_target_hosts (which is important), or just on the DNS lookups?
[ Why not test ignore_target_hosts on the first MX host, try to deliver, and only if we fall back to the next host _then_ do the next ignore_target_hosts test? Because exim calls its ROUTERS (in order) and _then_ calls the selected TRANSPORT. This means that exim wants to check _all_ hosts in the ROUTER /before/ any transport runs. ]

Let's comment that line out... and it _still_ takes 55 seconds! So just

DNS lookup of gmail.com (MX) succeeded
DNS lookup of gmail-smtp-in.l.google.com (AAAA) succeeded
DNS lookup of gmail-smtp-in.l.google.com (A) succeeded
DNS lookup of alt1.gmail-smtp-in.l.google.com (AAAA) succeeded
DNS lookup of alt1.gmail-smtp-in.l.google.com (A) succeeded
DNS lookup of alt2.gmail-smtp-in.l.google.com (AAAA) succeeded
DNS lookup of alt2.gmail-smtp-in.l.google.com (A) succeeded
DNS lookup of alt3.gmail-smtp-in.l.google.com (AAAA) succeeded
DNS lookup of alt3.gmail-smtp-in.l.google.com (A) succeeded
DNS lookup of alt4.gmail-smtp-in.l.google.com (AAAA) succeeded
DNS lookup of alt4.gmail-smtp-in.l.google.com (A) succeeded

is the bottleneck. So how do we sniff out the name server/network?

As I've mentioned over here:

tcpdump -lvi any "udp port 53" 2>/dev/null

So if exim -bt is slow for !+local_domains : !+relay_domains (i.e. remote domains that need a DNS lookup) but runs at its usual speed for
exim -bt $(whoami)
then you have a DNS problem (or feature, depending upon your exim queue).
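
To find out which resolver is at fault, one option is to query each nameserver from /etc/resolv.conf directly. A sketch, assuming dig is installed; list_nameservers and check_nameservers are hypothetical helper names:

```shell
# list_nameservers [FILE]: print the address from every "nameserver" line.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# check_nameservers NAME [FILE]: ask each listed resolver for NAME's MX records,
# one attempt with a 5 second timeout, and show how long each one took.
check_nameservers() {
    for ns in $(list_nameservers "$2"); do
        echo "== $ns"
        dig "@$ns" "$1" MX +tries=1 +time=5 | grep -E 'Query time|timed out'
    done
}

# Usage:
#   check_nameservers gmail.com
```

A resolver that only ever reports a timeout is the one to get rid of.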

I also installed unbound on port 43 and added nameserver 127.0.0.1:43 to /etc/resolv.conf (as the only nameserver) and that took exim back up to full speed. I'm not naming names, but XXX.XXX.80.26 should not be handed out via DHCP to NetworkManager if it isn't going to do any resolving.
