
Python at Elemental Security EuroPython - June 29, 2005 Guido van Rossum Elemental Security, Inc.


Presentation Transcript


  1. Python at Elemental Security • EuroPython - June 29, 2005 • Guido van Rossum • Elemental Security, Inc. • guido@elementalsecurity.com • guido@python.org

  2. Elemental Security, Inc. • Enterprise security software • product: Elemental Compliance System (ECS) • express, monitor and enforce security policies for any computer connecting to the network (cross-platform) • scored 9.3 in recent InfoWorld Test Center • Startup (no longer in stealth mode!) • C round just closed; 11M led by Lehman Brothers • Using lots of Python (and Java!) • We're always hiring! • See http://www.elementalsecurity.com • Now a real website :-)

  3. ECS Application Structure • One Central Server • Java, J2EE (Tomcat), some Python, Oracle • front-end: rich web UI (JavaScript + XML-RPC) • back-end: agent connector (HTTP+SSL) • Many Agents • Python and C • runs on Windows, Solaris, Linux, ... • main components: • scheduler • server connector • policy engine – I'll get back to this later • packet filter – nearly the only part written in C

  4. Why Does Elemental Use Python? A. Because I'm There :-) B. Python is the best tool for the job • small footprint • runs everywhere (or almost runs :-) • access to platform-specific APIs (e.g. registry) • much of what we do is "script-like" • gather various configuration information about the host • check specific policy rules • this is so important we have a custom language for it! • application changes frequently • we continually learn to understand the problem better • quickly refactor code as needed

  5. ElementClass – a Simpler XML API • Use cases: • exchange data with central server • policies, reports, etc. • persist structured data within agent • policies, schedule, etc. • tool to manage policy definitions (Tkinter UI) • XML an obvious choice • Want better mapping between Python & XML • example: • XML: <schedule start="1" offset="100" /> • Py: sch.start+sch.offset #not int(sch.getattr("start"))

  6. ElementClass – Example Input

          <group name="PSF">
            <employee name="Guido" age="49" />
            <employee name="Tim" age="99" />
            <employee name="Ben" age="17" />
            <employee name="Dan" age="15" />
          </group>

  7. ElementClass – Example Code

          from xmlparse import ElementClass, String, Integer

          class Employee(ElementClass):
              __element__ = "employee"
              __attributes__ = {"name": String, "age": Integer}

          class Group(ElementClass):
              __element__ = "group"
              __attributes__ = {"name": String}
              __children__ = {Employee: "employees[]"}

          group = Group.__parseFile__(filename)
          minors = [e for e in group.employees if e.age < 18]
          group.employees = minors
          f = open(filename, "w"); group.__render__(f); f.close()

  8. ElementClass – Example Output

          <group name="PSF">
            <employee age="17" name="Ben" />
            <employee age="15" name="Dan" />
          </group>
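The mapping shown in slides 5 to 8 could be approximated as follows. This is a minimal sketch built on the standard-library `xml.etree.ElementTree`, written for current Python 3; it is not Elemental's `xmlparse` module. The `parse_string`, `from_node`, `to_node` and `render` names, the plain `str`/`int` converters, and the class-attribute layout are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


class ElementClass:
    """Hypothetical base class: maps one XML element type to a Python object."""
    element = None      # XML tag name
    attributes = {}     # attribute name -> converter (e.g. str, int)
    children = {}       # child class -> name of the Python list attribute

    @classmethod
    def from_node(cls, node):
        obj = cls()
        # All attributes are optional: a missing one becomes None.
        for name, convert in cls.attributes.items():
            raw = node.get(name)
            setattr(obj, name, None if raw is None else convert(raw))
        # Each child class collects its matching subelements into a list.
        for child_cls, list_name in cls.children.items():
            items = [child_cls.from_node(n) for n in node.findall(child_cls.element)]
            setattr(obj, list_name, items)
        return obj

    @classmethod
    def parse_string(cls, text):
        return cls.from_node(ET.fromstring(text))

    def to_node(self):
        node = ET.Element(self.element)
        for name in sorted(self.attributes):   # original attribute order is not kept
            value = getattr(self, name, None)
            if value is not None:
                node.set(name, str(value))
        for list_name in self.children.values():
            for child in getattr(self, list_name):
                node.append(child.to_node())
        return node

    def render(self):
        return ET.tostring(self.to_node(), encoding="unicode")


class Employee(ElementClass):
    element = "employee"
    attributes = {"name": str, "age": int}


class Group(ElementClass):
    element = "group"
    attributes = {"name": str}
    children = {Employee: "employees"}


SOURCE = """<group name="PSF">
  <employee name="Guido" age="49" />
  <employee name="Tim" age="99" />
  <employee name="Ben" age="17" />
  <employee name="Dan" age="15" />
</group>"""

group = Group.parse_string(SOURCE)
group.employees = [e for e in group.employees if e.age < 18]  # typed access: e.age is an int
minor_names = [e.name for e in group.employees]
rendered = group.render()
```

The point of the design is visible in the list comprehension: `e.age < 18` compares integers directly, with the string-to-integer conversion declared once per class rather than repeated at every use.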

  9. ElementClass – Limitations, Features • No namespace support • attribute names must be Python identifiers • (except '-' mapped to '_') • Can have CDATA or subelements but not both • Subelement choices for #occurrences: • zero or once: Python attribute is None or object • any number: Python attribute is a list, may be empty • Ordering of attributes and subelements is lost • except for relative ordering of similar elements • All attributes and elements are optional • Optionally, can ignore unrecognized attrs/elements

  10. ElementClass – What's Next? • Improve the API a bit? • use lists of tuples instead of dicts for metadata • this allows specifying attribute/subelement ordering • decide what to do with Unicode values • convert to str if ASCII only, or not? • add more attribute data types? • currently String, Integer, Boolean, Timestamp • add Float; what else? enumerations? • add required attributes, subelements? (which API?) • tidy up output (fewer line breaks) • Document it • Contribute it to the PSF in time for Python 2.5! • ESI lawyers to look at PSF Contribution Agreement

  11. Really Hammering The Server • Server scalability requirement: support 4000 agents • Available: a few dozen test machines • How to do server load testing? • Solution 1: run 50 agents on one test machine • test machines overloaded • test machines look too similar • can't quite reach scalability requirement • Solution 2: run 500 synthetic agents on one box • skips work that doesn't affect what the server sees • started out as a private hack, adopted very quickly • full potential not yet reached (next: 20K agents!) • can easily inject additional test data into server

  12. The Approach • Share as much code as possible with real agent • fortunately, most agent code is in library modules • N agent objects, K worker threads (K ≤ N) • 1 scheduler thread • real-time event queue managed using heapq module • main loop sleeps until next event ready • beware: event queue may be updated while sleeping! • distributes events to workers via Queue.Queue • worker main loop:

          while True:
              callable, args = workQueue.get()
              callable(*args)

      • callable is typically a bound method of an agent object
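The scheduler-plus-workers design on this slide can be sketched as below. This is an illustration under stated assumptions, not the actual test harness: the `Scheduler` and `FakeAgent` classes and their method names are hypothetical, and the code targets current Python 3, where the `Queue` module is named `queue`. A `threading.Condition` is one way to handle the "event queue may be updated while sleeping" caveat, since `schedule()` can wake the sleeping scheduler thread.

```python
import heapq
import itertools
import queue
import threading
import time


class Scheduler:
    """Hypothetical scheduler: one thread feeds timed events to a worker pool."""

    def __init__(self, num_workers):
        self.events = []                 # heap of (due_time, seq, func, args)
        self.seq = itertools.count()     # tie-breaker so functions are never compared
        self.cond = threading.Condition()
        self.work_queue = queue.Queue()
        self.threads = [threading.Thread(target=self.worker, daemon=True)
                        for _ in range(num_workers)]
        self.threads.append(threading.Thread(target=self.run, daemon=True))

    def start(self):
        for t in self.threads:
            t.start()

    def schedule(self, delay, func, *args):
        with self.cond:
            due = time.monotonic() + delay
            heapq.heappush(self.events, (due, next(self.seq), func, args))
            self.cond.notify()           # the earliest due time may have changed

    def run(self):
        with self.cond:
            while True:
                if not self.events:
                    self.cond.wait()     # nothing scheduled: sleep until notified
                    continue
                due = self.events[0][0]
                now = time.monotonic()
                if due > now:
                    self.cond.wait(due - now)   # may be woken early by schedule()
                    continue
                _, _, func, args = heapq.heappop(self.events)
                self.work_queue.put((func, args))

    def worker(self):
        while True:
            func, args = self.work_queue.get()
            func(*args)


class FakeAgent:
    """Hypothetical synthetic agent; records what it would send to the server."""

    def __init__(self, name, log):
        self.name, self.log = name, log

    def report(self, what):
        self.log.append((self.name, what))


log = []
sched = Scheduler(num_workers=2)         # K = 2 workers serving N = 5 agents
sched.start()
agents = [FakeAgent("agent%d" % i, log) for i in range(5)]
for i, agent in enumerate(agents):
    sched.schedule(0.01 * i, agent.report, "heartbeat")   # bound method as the callable

# Wait (up to 5 s) for all events to be handled before inspecting the log.
deadline = time.monotonic() + 5
while len(log) < len(agents) and time.monotonic() < deadline:
    time.sleep(0.01)
```

Because each event carries a bound method such as `agent.report`, the workers need no knowledge of agents at all; they only call whatever arrives on the queue.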

  13. The Outcome • Works really well despite its simplicity • didn't have to use asynchronous I/O • Randomized synthetic data sent to server • example: simulate all agents being "nmapped" • Probably bounded by number of threads • can't have too many agents per thread • Inexplicable slow memory leak (not M2Crypto!)

  14. A Policy Implementation Language • ECS is all about policy compliance • each host has a policy compliance score: 0-100% • composed of individual (Boolean) policy rule scores • some (not all) policy rules can also be enforced • So what's a policy rule? Examples: • all passwords must be at least 6 characters • ftpd should be disabled • all email must go through server X • Elemental has a library of 1000+ policy rules • user selects some and deploys to group of hosts • agent gets rule list, executes rules, uploads results • repeat on user-selected schedule (30 min – 7 days)
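As a rough illustration of a score "composed of individual (Boolean) policy rule scores", the sketch below takes the percentage of rules that passed. The equal weighting, the handling of an empty rule list, and the rule names (in particular `telnetd_disabled`, which does not appear in the slides) are assumptions; the slides do not say how ECS actually combines rule results.

```python
def compliance_score(results):
    """Percentage of rules that passed; `results` maps rule name -> bool."""
    if not results:
        return 100.0   # assumption: a host with no rules counts as compliant
    return 100.0 * sum(results.values()) / len(results)


# Hypothetical results uploaded by one agent after executing its rule list.
results = {
    "password_min_length_6": True,
    "ftpd_disabled": False,
    "mail_via_server_x": True,
    "telnetd_disabled": True,    # invented rule, for illustration only
}
score = compliance_score(results)
```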

  15. How To Implement Policy Rules • Requirements: • Cost to add another rule must be low • Some rules are relatively complex programming tasks • Rule authors are security experts, not programmers • Some possibilities: • shell scripts (Titan) • Perl, Python, etc. • XML • custom language

  16. Why Write Another Language • Need a library of policy-checking methods, e.g.: • assert that a file has a specific mode, owner, group • assert that a registry entry has a specific value • parse a configuration file using "name = value" syntax and then check a specific name/value pair • Ideal: constraint-based (declarative) language • execution order doesn't matter • compiler can check for conflicts between rules • Python would be fine if I were writing all the rules • still fairly low-level; risk of using the wrong approach • Compromise: nearly-declarative language • resembles Python except where it doesn't
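The kind of policy-checking library described on this slide might look like the following in plain Python. It is a minimal sketch: the function names, the comment syntax accepted by the parser, and the sample configuration text are assumptions, not part of the ECS agent library.

```python
import os
import stat
import tempfile


def parse_name_value(text):
    """Parse 'name = value' lines; blank lines and '#' comments are skipped."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


def check_setting(text, name, expected):
    """True if the configuration text sets `name` to exactly `expected`."""
    return parse_name_value(text).get(name) == expected


def check_file_mode(path, expected_mode):
    """True if the file's permission bits equal expected_mode (e.g. 0o600)."""
    return stat.S_IMODE(os.stat(path).st_mode) == expected_mode


# Hypothetical configuration text, for illustration only.
config = "# sample\nPermitRootLogin = no\nPort = 22\n"
root_login_disabled = check_setting(config, "PermitRootLogin", "no")

# Check the mode of a scratch file (POSIX permission semantics assumed).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    path = tmp.name
os.chmod(path, 0o600)
mode_ok = check_file_mode(path, 0o600)
os.remove(path)
```

Even in this small form, the rule author has to choose how to read the file, which comment syntax to accept, and how to compare values, which is the "fairly low-level" risk the slide points at.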

  17. How Fuel Differs From Python

          func has_localhost(host: Host, group: str): bool:
              for ip in host.gethostgroup(group):
                  if substr(ip, 0, 4) == "127.":
                      return true
              return false

      • Declarations required; all code is type-checked • interfaces used for library code written in Python • Single-assignment language with immutable values • let var [: type] = expr • Argument defaults computed dynamically • Many Python features left out (e.g. slicing!) • Container types: immutable set and struct • Fuel is not Turing-complete!
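For comparison, here is one plausible Python rendering of the `has_localhost` example. It is a sketch, not the output of the Fuel translator: the `Host` class is a hypothetical stand-in for the agent's host interface, and `substr(s, start, length)` is assumed to take a start index and a length, which the slides do not confirm.

```python
class Host:
    """Hypothetical stand-in for the host interface used by rules."""

    def __init__(self, groups):
        self._groups = groups            # group name -> list of IP address strings

    def gethostgroup(self, group):
        return self._groups.get(group, [])


def substr(s, start, length):
    # Assumed semantics: `length` characters beginning at index `start`.
    return s[start:start + length]


def has_localhost(host: Host, group: str) -> bool:
    for ip in host.gethostgroup(group):
        if substr(ip, 0, 4) == "127.":
            return True
    return False


# Sample data invented for illustration.
host = Host({"trusted": ["10.0.0.5", "127.0.0.1"], "dmz": ["192.168.1.1"]})
trusted_has_localhost = has_localhost(host, "trusted")
dmz_has_localhost = has_localhost(host, "dmz")
```

The Python version relies on slicing inside `substr`, a feature the slide lists as deliberately left out of Fuel itself.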

  18. Implementing Fuel • Process grammar with pgen • eventually reimplemented pgen in Python • Use tokenize.py for tokenization • Implemented pgen parsing automaton • as-we-go parse tree reduction • Use visitor pattern to translate to Python source • Parse tree node classes have grammar in docstrings • Run-time library in Python • defines some mutable object types
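Reusing Python's tokenizer works because Fuel's surface syntax is close to Python's: the tokenizer only splits text into names, operators, strings and indentation, and does not care that `func`, `true` and `false` are not Python keywords. The sketch below runs the standard `tokenize` module (current Python 3) over the Fuel example from the previous slide; the indentation of `FUEL_SOURCE` is reconstructed, and how the real implementation consumed the token stream is not shown in the slides.

```python
import io
import tokenize

# Fuel example from the slides; indentation reconstructed for illustration.
FUEL_SOURCE = '''\
func has_localhost(host: Host, group: str): bool:
    for ip in host.gethostgroup(group):
        if substr(ip, 0, 4) == "127.":
            return true
    return false
'''

# Collect (token type name, token text) pairs, as a parser front end might.
tokens = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(FUEL_SOURCE).readline)
]

for kind, text in tokens[:6]:
    print(kind, repr(text))
```

The INDENT and DEDENT tokens come for free, which is most of what an indentation-sensitive language needs from a tokenizer before the pgen-style parsing automaton takes over.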

  19. Challenges in Writing Fuel • Not enough users yet to know we're doing it right • yes, we should open-source it! • Main challenge is to keep the language expressive without compromising its declarative nature • Fuel 2.0 will tweak the design quite a bit • host.runscript("userdel", "-r", acct.name) • admission of defeat – but unavoidable sometimes • Source code organization • linkage between source & hierarchical menu of rules • metadata repeated in source & XML • same rule implemented differently per platform

  20. How We Use Fuel • ~1400 policy rules implemented in Fuel • Written by about 4 people part-time over 1 year • Rules cover Solaris, Linux, Windows (2k+), ... • Rules cover all areas of security: • accounts, network, filesystem, system, hardware, software, packet filter, trust, authentication, logging

  21. Question Time
