
Providing Campus Mail Services with a Linux Cluster



Presentation Transcript


  1. Providing Campus Mail Services with a Linux Cluster
     Giles Malet, University of Waterloo, gdmalet@uwaterloo.ca

  2. Overview
     • What our department does
     • Our mail problems
     • Our proposed solutions
     • What we have done so far
     • Problems we’re aware of
     • What to do next
     • 25 slides… but feel free to interrupt!
     OUCC 2004

  3. Introductions
     • Information Systems & Technology (IST)
     • Provide services & expertise to campus
     • Project members
     • Dawn Keenan - sendmail, MRTG
     • Giles Malet - project lead, software
     • Rob Schmidt - ClamAV, J-chkmail
     • Jeff Voskamp - LDAP, systems stuff
     • plus assistance from others…

  4. Why do this?
     • More and more spam and viruses
     • More demand on IST for solution
     • Want to centralize the problem (xhier)
     • Want something everyone can use
     • Old system overwhelmed
     • See project charter

  5. Services Desired
     • Robust user@uwaterloo.ca mail server
     • Virus scanning with immediate rejection
     • Refuse executables etc. by file extension
     • Spam identification so people can filter
     • Also DNS blacklists
     • People can opt out of (only) spam processing
     • LDAP needed internally – perhaps allow user lookups?
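The DNS blacklist item on this slide can be illustrated with a minimal sketch. A DNSBL is conventionally queried by reversing the octets of the connecting IPv4 address and appending the list's zone; any answer means the address is listed. The zone `dnsbl.example.org` and both function names are placeholders chosen for illustration, not anything the slides specify.

```python
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    octets = ip.split(".")
    if len(octets) != 4 or not all(o.isdigit() and 0 <= int(o) <= 255 for o in octets):
        raise ValueError(f"not an IPv4 address: {ip!r}")
    # 192.0.2.10 in zone dnsbl.example.org becomes 10.2.0.192.dnsbl.example.org
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the zone answers for the address (needs network access)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        # No such name, or any other lookup failure, is treated as "not listed".
        return False


print(dnsbl_query_name("192.0.2.10", "dnsbl.example.org"))
```

Treating lookup failures as "not listed" is a deliberate choice in this sketch: a blacklist outage then lets mail through rather than blocking it.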

  6. What we considered
     • Already using Solaris and Red Hat Linux…
     • “Cluster” problem – needed scalability
     • OpenMosix
     • OpenKnoppix
     • Linux Cluster Project
     • Linux HA
     • Red Hat Enterprise
     • …and so on.
     • But what does “cluster” mean? Vague.

  7. Decisions
     • Start simple, try more involved setup if load is too high
     • Keep detailed statistics so we know what’s changing (more later)
     • Ask for (some) input from campus
     • Do our own load balancing (else Cisco)
     • 4 cheap systems or 3 “good” systems?
     • Spread load, reduce impact of failure

  8. Hardware Purchased
     • 4 Dell servers
     • 1 with mirrored SCSI disks, 3.2 GHz CPU
     • 3 with single IDE disk, 3.0 GHz CPU
     • 1 GB memory
     • 1x100 + 1x1000 Mbps Ethernet
     • Rack-mounted, serial consoles (Annex, Cyclades)

  9. Hardware Configuration

  10. Hardware – ‘head’ server
     • Most powerful, most robust
     • Runs LDAP, MySQL, web servers, incoming mail
     • Mirrored disk, NFS shared to slaves
     • all cluster data in one place (mail queues)
     • Only machine that is backed up
     • Firewall / load balancing

  11. Hardware - slaves
     • 3 identical machines, run all services
     • Only software difference is IP configuration (fix with DHCP)
     • Increasing the number provides more CPU, less exposure to software failure
     • Local disk only stores O/S
     • logs copied up to cerberus overnight
     • Firewall: only incoming connection is ssh from maintenance server
     • No user accounts: ssh as root from head server

  12. Software Details
     • Will try open source first, spend money second.
     • Looked at AFS etc., went to NFS (simple)
     • Use things we know, plus some experimentation
     • Emphasis was to get this going quickly; will fine-tune it later.

  13. Sendmail
     • We know sendmail, and it works
     • Wanted “stock” system, with no more phlookup, thus LDAP
     • Something flexible: “milter” interface allows add-ons; can direct TCP connections from campus sendmails back to cluster

  14. Clam Anti-Virus
     • Open source
     • Auto-updating from remote server
     • allows submission of ‘fingerprints’
     • 3 components (freshclam, milter, clamd)
     • Some stability problems with latter two
     • Too many threads: 375 threads × 8 MB of stack each = 3 GB
     • Deeply nested messages are problematic
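The thread arithmetic on this slide is easy to check with a small sketch. The 375 threads and 8 MB per-thread stack come from the slide; the function names and the 2 MB alternative stack size are illustrative assumptions only.

```python
def thread_stack_mb(threads: int, stack_mb_per_thread: int) -> int:
    """Address space reserved for thread stacks, in megabytes."""
    return threads * stack_mb_per_thread


def max_threads(budget_mb: int, stack_mb_per_thread: int) -> int:
    """How many thread stacks of the given size fit in a memory budget."""
    return budget_mb // stack_mb_per_thread


# Figures from the slide: 375 threads at 8 MB of stack each.
print(thread_stack_mb(375, 8))   # 3000 MB, the slide's "3 gigs"

# Hypothetical: the same 3000 MB budget with a smaller 2 MB stack per thread.
print(max_threads(3000, 2))      # 1500
```

The point of the calculation is that stack reservations alone, before any scanning buffers, can consume the whole address-space budget.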

  15. J-chkmail
     • Disallow incoming mail based on contents (regex) and extension
     • Also a milter interface
     • Lots of ongoing development (integrate virus scanning etc.)
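The idea of rejecting mail by attachment extension or by a content regex can be sketched with Python's standard `email` package. This is not J-chkmail's implementation or configuration format; the extension list, the pattern, and the function name `rejection_reason` are all assumptions made for illustration.

```python
import os
import re
from email.message import EmailMessage, Message
from typing import Optional

# Illustrative values only; a real deployment would choose its own lists.
BLOCKED_EXTENSIONS = {".exe", ".scr", ".pif", ".bat", ".com"}
BLOCKED_BODY = re.compile(r"hypothetical-worm-signature", re.IGNORECASE)


def rejection_reason(msg: Message) -> Optional[str]:
    """Return a reason to refuse the message, or None to accept it."""
    for part in msg.walk():
        filename = part.get_filename()
        if filename:
            ext = os.path.splitext(filename.lower())[1]
            if ext in BLOCKED_EXTENSIONS:
                return f"attachment type {ext} not accepted"
        if part.get_content_maintype() == "text":
            payload = part.get_payload(decode=True) or b""
            # latin-1 maps every byte to a character, so decoding cannot fail.
            if BLOCKED_BODY.search(payload.decode("latin-1")):
                return "message content matches a blocked pattern"
    return None


# Build a small test message with an executable attachment.
demo = EmailMessage()
demo["Subject"] = "Invoice"
demo.set_content("See attached.")
demo.add_attachment(b"MZ", maintype="application",
                    subtype="octet-stream", filename="invoice.EXE")
print(rejection_reason(demo))
```

In a milter, a non-empty reason would translate into refusing the message during the SMTP conversation, which matches the "immediate rejection" goal stated earlier in the deck.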

  16. SpamAssassin
     • Only marks spam – up to you to filter
     • Configurable preferences
     • must be on host initiating scan, thus problems with MX’d machines
     • Uses MySQL internally
     • Not foolproof: lose mail on false positive
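Because SpamAssassin only marks messages, the filtering step is left to the recipient. A minimal sketch of that step follows, assuming the scanner adds its usual `X-Spam-Flag: YES` header to messages it considers spam; the folder names and function names are hypothetical.

```python
from email import message_from_string


def is_marked_spam(raw_message: str) -> bool:
    """True if the message carries an X-Spam-Flag header set to YES."""
    msg = message_from_string(raw_message)
    return msg.get("X-Spam-Flag", "").strip().upper() == "YES"


def choose_folder(raw_message: str, inbox: str = "INBOX", spam: str = "Junk") -> str:
    """Pick a delivery folder; marked mail is filed, never deleted."""
    return spam if is_marked_spam(raw_message) else inbox


print(choose_folder("X-Spam-Flag: YES\nSubject: offer\n\nbody"))
print(choose_folder("Subject: meeting\n\nbody"))
```

Filing marked mail into a folder rather than discarding it is one way to soften the false-positive risk the slide mentions, since a wrongly marked message can still be recovered.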

  17. OpenLDAP
     • Sendmail understands LDAP
     • It is fast! (2 hours versus 5 minutes)
     • Used only for mail address lookups, thus rebuild every few hours
     • “Hidden” users have minimal details
     • Starting to need LDAP for other systems, and ADS is tricky (Oracle Calendar)

  18. IPTables firewall & routing
     • Route incoming connections to available hosts – DNAT (load balancing)
     • SNAT outgoing mail connections
     • Firewall the rest – reduce patching
     • nodewatch does auto-updating
     • runs on head server, talks multicast
     • simple polling of available servers
     • written in-house (C program + shell scripts)
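The combination of polling for available servers and regenerating DNAT rules can be sketched as follows. This is not nodewatch, which the slide describes as an in-house C program using multicast; it is an illustration that probes each slave with a plain TCP connection and emits rule text using the `statistic` match found in current iptables releases. The addresses, function names, and the choice of the `nth` mode are all assumptions. The sketch only builds rule strings and does not run iptables.

```python
import socket
from typing import Callable, List


def is_alive(host: str, port: int = 25, timeout: float = 2.0) -> bool:
    """Probe a backend by opening (and immediately closing) a TCP connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthy_backends(candidates: List[str],
                     probe: Callable[[str], bool] = is_alive) -> List[str]:
    """Keep only the backends that answer the probe."""
    return [host for host in candidates if probe(host)]


def dnat_rules(public_ip: str, backends: List[str], port: int = 25) -> List[str]:
    """Build iptables commands that spread new connections across backends.

    Rule i matches every k-th remaining connection, where k counts the
    backends not yet covered; the last rule catches everything left over.
    """
    rules = []
    remaining = len(backends)
    for backend in backends:
        rule = f"iptables -t nat -A PREROUTING -d {public_ip} -p tcp --dport {port} "
        if remaining > 1:
            rule += f"-m statistic --mode nth --every {remaining} --packet 0 "
        rule += f"-j DNAT --to-destination {backend}:{port}"
        rules.append(rule)
        remaining -= 1
    return rules


# Hypothetical addresses; the probe is stubbed so the example needs no network.
slaves = ["10.0.0.2", "10.0.0.3", "10.0.0.4"]
alive = healthy_backends(slaves, probe=lambda host: host != "10.0.0.3")
for line in dnat_rules("192.0.2.1", alive):
    print(line)
```

Regenerating the whole rule set from the list of live hosts keeps the logic simple: a dead slave just drops out of the next rule set, which fits the deck's "start simple" decision.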

  19. Statistics
     • Important to know what “normal” is
     • Heavy use of MRTG and friends
     • See graphs: http://mailservices.uwaterloo.ca
     • Who’s using it? connections.txt

  20. Gotchas
     • 3 GB is not enough virtual memory
     • 8 MB of stack per thread
     • Set ulimits: memory, number of processes
     • Logging is main load on disk – separate from mail spools
     • System will get a lot of unwanted attention
     • Dictionary attacks on sendmail – rate limit
     • LDAP scans – limit to campus, limit number of results, CPU per request
     • Firewall heavily, and hide the slaves
     • How to test without losing mail?
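The limits this slide recommends setting can be inspected from a process with Python's standard `resource` module (Unix only). The sketch below only reads the current stack, address-space, and process-count limits; which values to set is a site decision the slides do not give, and the dictionary keys and function names are illustrative.

```python
import resource

# Limits named on the slide: memory and number of processes, plus the
# stack limit behind the per-thread stack figure.
LIMITS = {
    "stack": resource.RLIMIT_STACK,
    "address_space": resource.RLIMIT_AS,
    "processes": resource.RLIMIT_NPROC,
}


def current_limits() -> dict:
    """Return {name: (soft, hard)} for the calling process."""
    return {name: resource.getrlimit(res) for name, res in LIMITS.items()}


def format_limit(value: int) -> str:
    """Render a limit, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


for name, (soft, hard) in current_limits().items():
    print(f"{name}: soft={format_limit(soft)} hard={format_limit(hard)}")
```

On Linux with glibc, the default stack size for new threads typically follows the soft stack limit, which is why lowering that limit for a heavily threaded daemon reduces the address space each thread reserves.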

  21. More gotchas…
     • What if a slave machine dies?
     • others can handle the load
     • What if the head machine dies?
     • Lose NFS, MySQL, LDAP
     • Could rebuild in a few hours from backups
     • Backup MX gets to do the work
     • Need a similar system somewhere, to share
     • Need better way to distribute configs
     • It helps if a single netmask covers all hosts
     • Duplicate scanning (next slide)

  22. Duplicate scanning
     • Machines tux and ist MX’d to cluster
     • mail to user@ist goes cluster -> ist -> cluster -> tux when a .forward on ist points to tux.
     • Also, mail to user@cluster gets forwarded to destination, which also scans.
     • Currently it’s not worth the effort to prevent this
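The duplicate-scanning path on this slide can be traced with a toy model. Hosts whose MX points at the cluster receive mail through it, so every forward between two such hosts passes through the scanners again. The tables below encode only the slide's example; the data structures and function name are illustrative assumptions.

```python
from typing import Dict, List, Tuple

# From the slide: tux and ist are MX'd to the cluster,
# and user@ist has a .forward pointing to tux.
MX_VIA_CLUSTER = {"ist", "tux"}
FORWARDS: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("user", "ist"): ("user", "tux"),
}


def trace(user: str, host: str) -> List[str]:
    """Return the sequence of machines a message visits before final delivery."""
    hops: List[str] = []
    seen = set()
    while True:
        if (user, host) in seen:
            raise ValueError("forwarding loop detected")
        seen.add((user, host))
        if host in MX_VIA_CLUSTER:
            hops.append("cluster")      # scanned here on the way in
        hops.append(host)
        nxt = FORWARDS.get((user, host))
        if nxt is None:
            return hops
        user, host = nxt


path = trace("user", "ist")
print(" -> ".join(path))                    # cluster -> ist -> cluster -> tux
print("scans:", path.count("cluster"))      # scans: 2
```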

  23. Undeliverable postmaster mail
     • Sendmail aborts when postmaster mail is undeliverable, queues grow and grow
     • Mail containing virus from off-campus goes to user@machine-1
     • tries to forward to user@machine-2 but is blocked, so
     • tries to bounce, but bogus From: header
     • tries to send to postmaster@machine-1, which is forwarded to machine-2…

  24. Where we’re going
     • 3 system cluster for development
     • try new ideas: RH cluster, others
     • new sendmail, scanners etc.
     • Disaster recovery – head or slave dies
     • Centralised LDAP server, but need to deal with MS Active Directory
     • Document all this, and how to use it
     • hand it over to Production Support

  25. Winding down
     • Spam / virus problems are getting worse, so we’ll be busy for a while.
     • Contact us if you want more info, or to exchange ideas or give advice
     • Slides will be made available
