MavEtJu's Distorted View of the World - 2007-11
Database driven black/whitelist daemons
Posted on 2007-11-29 09:00:00

About two years ago we re-implemented greylisting, sender verification and RBL blacklisting on our MTAs. As a result, we had a very steep drop in unsolicited email, but also some very interesting mail bounces from applications which didn't set a proper envelope-from address. It taught us that we needed two things: a whitelist service and a blocked-email reporting service.

The blocked-email reporting service was easy: just parse the Postfix log files once a day and check all the NOQUEUE entries. Then match the temporarily failed deliveries against the successful deliveries, and whatever is left over is a failed delivery.

The whitelist service is not too difficult either, thanks to the check_policy_service feature, which gives the black/whitelist daemon the sending MTA and the envelope-from and envelope-to email addresses. A lookup against the black/whitelist database and all goes fine. Strictly speaking it is not one lookup but a set of seven lookups.
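As a rough illustration of the policy lookup described above, here is a minimal Python sketch of how a daemon behind check_policy_service might parse one request and answer it. It does not attempt to reproduce the seven lookups or the actual daemon: the ENTRIES table, the key shapes tried in decide() and all addresses are hypothetical. Only the request format (name=value lines ended by an empty line), the attribute names client_address, sender and recipient, and the action= reply come from the Postfix policy delegation protocol.

```python
# Hypothetical sketch: answer one Postfix policy delegation request from an
# in-memory black/whitelist. Not the daemon described in this post.

# Hypothetical entries: (client_address, sender, recipient) -> action.
# None acts as a wildcard for that field.
ENTRIES = {
    ("192.0.2.10", None, None): "OK",
    (None, "noreply@example.org", None): "OK",
    (None, "spammer@example.net", None): "REJECT listed sender",
}


def parse_request(text):
    """Turn the name=value lines of one policy request into a dict."""
    attrs = {}
    for line in text.splitlines():
        if not line:
            break  # an empty line terminates the request
        name, _, value = line.partition("=")
        attrs[name] = value
    return attrs


def decide(attrs, entries=ENTRIES):
    """Return the action for a request, or DUNNO when nothing matches."""
    client = attrs.get("client_address")
    sender = attrs.get("sender", "").lower()
    recipient = attrs.get("recipient", "").lower()
    # Illustrative key shapes only; the real set of lookups is not known here.
    for key in (
        (client, sender, recipient),
        (client, None, None),
        (None, sender, None),
        (None, None, recipient),
    ):
        action = entries.get(key)
        if action is not None:
            return action
    return "DUNNO"


def respond(text):
    """Build the reply that is sent back to Postfix for one request."""
    return "action=%s\n\n" % decide(parse_request(text))
```

Answering DUNNO when nothing matches leaves the decision to the remaining Postfix restrictions, which seems to fit a list that almost never matches.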
Recently we were getting strange errors in the blocked-email reporting service, talking about "Internal server configuration errors" which were happening on our primary MTA. Not that the email got blocked by it; it was simply delivered to our backup MTA. But still, it was something which didn't make sense.

We found out very soon that it was caused by a slow black/whitelist daemon. Not really a slow black/whitelist daemon, more a slow database. And not really a slow database, just a busy database. A busy database? All it is doing is euhmmm... let's see... 70 queries per second?!?!?!

A quick look at the Cacti graphs showed me that over the last three weeks the mail servers went from an average of two messages per second to an average of six messages per second (message delivery attempts, that is), and since every message needs seven database queries, the load kind of shot through the roof and the black/whitelist daemon we had couldn't handle it fast enough.

Time for a redesign. We have about 1500 entries in the black/whitelist database, and a hit-rate of about zero (most of the rejecting of email is done via sender verification and via DNS-RBLs). With such a low hit-rate, it shouldn't be a bad thing to locally cache a copy of it and refresh it every 15 minutes. That will save about 5400 database queries during that time period.

Three hours later, after some rewriting of the black/whitelist daemon, the database server is happy again with the two queries per 15 minutes it gets now. And the mail? It kept flowing...

The things I take for granted
Posted on 2007-11-28 23:00:00

Yesterday I had a chat with a friend about computer networks, hardware upgrades and system monitoring, and I found out that over the last couple of years I had created a very robust and detailed network and systems monitoring setup, and that it has made my life a lot easier than it could have been.
For example, in Nagios we monitor nearly all aspects of our FreeBSD based servers: not only the standard memory, CPU and disk space, but also the answer from the DNS server on it and the presence of crond, snmpd, inetd, sshd and syslogd. Not only do we monitor whether all required processes are running, but also whether their PID files are there and whether the processes in these PID files exist. And we monitor the status of the RAID cards, the status of the ethernet cards and where the default gateway points to. And the uptime of the server and the offset of the NTP synced time of the server.

With regard to network devices (routers, switches), we monitor the uptime of the device (these things reboot faster than Nagios can detect), the status of all ports (duplex, speed, operational status), the temperature and the status of the power supplies. And the status of the OSPF neighbours and BGP neighbours, plus a list of expected networks in the routing table.

Network link devices (antennas, fibre convertors, laser heads) which support some form of remote management are checked the same way: ethernet link status, radio link status, uptime. Anything which will display possible problems with it.

For our PABXs we monitor the status of the PRIs and the status of the IAX and SIP destinations.

Call it overdone, call it too much time wasted on monitoring... But when I replace a server or a device on the network, I would like to know without too much hassle whether everything is back in order once I turn it on: when my monitoring program says everything is fine, I know everything went fine.

tftpd -W option
Posted on 2007-11-26 10:00:00

Commit of the day for tftpd(8):

Add the -W options, which acts the same as -w but will generate unique names based on the submitted filename, a strftime(3) format string and a two digit sequence number.
By default the strftime(3) format string is %Y%m%d (YYYYMMDD), but this can be changed by the -F option.

What does this mean? That you don't have to worry about overwriting your precious previously saved router configuration files:

[/tftpboot] root@tftp>ls -al
-rw-r--r-- 1 nobody wheel 44048 Jun 22 08:52 hs2-bd8806.20070622.00
-rw-r--r-- 1 nobody wheel 45973 Jul 21 17:24 hs2-bd8806.20070721.00
-rw-r--r-- 1 nobody wheel 49140 Oct 4 21:49 hs2-bd8806.20071004.00
-rw-r--r-- 1 nobody wheel 49176 Oct 4 21:53 hs2-bd8806.20071004.01
-rw-r--r-- 1 nobody wheel 49177 Oct 4 21:54 hs2-bd8806.20071004.02
This will be available in FreeBSD >7.0, >6.3 and >5.5.
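The naming scheme described above (submitted filename, a strftime(3) date string, and a two digit sequence number) can be sketched in a few lines of Python. This is only an illustration of the idea, not the tftpd(8) code, which is written in C; the function name unique_name and its parameters are made up for this example.

```python
import os
import time


def unique_name(directory, filename, fmt="%Y%m%d", when=None):
    """Return the first unused 'filename.<date>.<NN>' name in directory.

    Hypothetical helper: fmt plays the role of the -F format string and
    when is an optional Unix timestamp (defaults to the current time).
    """
    stamp = time.strftime(fmt, time.localtime(when))
    for seq in range(100):  # two digits: 00 .. 99
        candidate = "%s.%s.%02d" % (filename, stamp, seq)
        if not os.path.exists(os.path.join(directory, candidate)):
            return candidate
    raise FileExistsError(
        "all 100 sequence numbers in use for %s.%s" % (filename, stamp)
    )
```

With the default format, a third upload of hs2-bd8806 on the same day would get the suffix .02, as in the listing above. A real server would also have to create the file atomically (for example with O_CREAT|O_EXCL) so that two uploads arriving at the same moment cannot pick the same name; this sketch only checks for existence.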
Equinix AusNOG2007 present
Posted on 2007-11-16 20:00:00

Today I visited the second day of the AusNOG2007 conference (I missed the first day because of an unfortunate illness which caused me to sleep for about 48 hours in three days) and received a present from Equinix: a notebook (a paper one, not a computerized one) with, on the first page, all the Essential Telephone Numbers they could think of. What was missing was the phone number of the Equinix NOC...