Applications: Network Monitoring
Theodore Johnson
AT&T Labs – Research
johnsont@research.att.com
Contributors: Chuck Cranor, Vladislav Shkapenyuk, Oliver Spatscheck
What is the challenge in network monitoring?
• Very high data rates.
  • Optical links: gigabit/sec and higher (to OC192), millions of packets/sec.
• Complex queries.
  • Extract dynamic substreams (flows, TCP sessions, etc.).
  • Many sequence numbers, types of sequencing.
  • Simulate network protocols (IP defragmentation, etc.).
• Multiple data sources.
  • SNMP, Netflow, BGP, packet sniffers, router tables, etc.
• Many layered protocols: multimedia, VPN, etc.
• Dozens to hundreds of heterogeneous deployments.
• Every bad thing happens (most interesting part).
• Overcome a prejudice that database technology is too slow and rigid for network monitoring.
• Years-long lead time for core deployments.
Gigascope
• Gigascope is a lightweight and flexible stream database specialized for network monitoring.
• Current deployments:
  • Research’s link to Internet (100 Mbit/sec).
  • Large on-line game site (Gigabit Ethernet).
• Ongoing deployment:
  • AT&T backbone internet routers (OC48).
• Developed in close cooperation with network analysts.
PROTOCOL GAMEPROTOCOL (UDP) {
    ullong gp_header gp_header (snap_len 134);
    bool gp_is_ack_request gp_is_ack_request (snap_len 134);
    bool gp_is_ack_response gp_is_ack_response (snap_len 134);
    uint gp_ack_id gp_ack_id (snap_len 134);
    uint gp_sequence_number gp_sequence_number (snap_len 134);
}

select timestamp, sourceIP, destIP, source_port, dest_port, len, total_length, gp_header
from GAMEPROTOCOL
where sample_hash[50, sourceIP, destIP] and protocol=17 and offset=0

select tb, lowIP, highIP, protocol, SUM(len), COUNT(*)
from IPV4
where sample_rand[75]
group by protocol, time/60 as tb, UMIN(sourceIP, destIP) as lowIP, UMAX(sourceIP, destIP) as highIP
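To make the grouping in the second query above concrete, here is a minimal Python sketch of the same aggregation: one-minute time buckets, with the two endpoint addresses ordered so that both directions of a conversation fall into one group. It is only an illustration, not how Gigascope evaluates the query. It leaves out the sample_rand[75] clause, it assumes UMIN and UMAX are unsigned minimum and maximum, and the packet representation (dicts with the keys time, protocol, sourceIP, destIP, len, with addresses as integers) and the function name are hypothetical.

```python
from collections import defaultdict


def aggregate_per_minute(packets):
    """Sum bytes and count packets per (minute, lowIP, highIP, protocol).

    `packets` is an iterable of dicts with the hypothetical keys
    time (seconds), protocol, sourceIP, destIP (integers) and len.
    """
    groups = defaultdict(lambda: [0, 0])  # key -> [sum of len, packet count]
    for p in packets:
        tb = p["time"] // 60                      # time/60 as tb
        low = min(p["sourceIP"], p["destIP"])     # assumed meaning of UMIN
        high = max(p["sourceIP"], p["destIP"])    # assumed meaning of UMAX
        entry = groups[(tb, low, high, p["protocol"])]
        entry[0] += p["len"]                      # SUM(len)
        entry[1] += 1                             # COUNT(*)
    return {key: tuple(value) for key, value in groups.items()}
```

Because the key uses the ordered address pair, a request from A to B and the reply from B to A in the same minute contribute to the same output row.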
select tb*60, MIN(timestamp), MAX(timestamp), destIP, dest_port, hostheader, count(*)
from O.TCP
where ipversion=4 and offset=0 and protocol=6 and str_match_start[TCP_data,'GET']
group by time/60 as tb, str_extract_regex(TCP_data,'[Hh][Oo][Ss][Tt]:[0-9A-Za-z\\.: ]*') as hostheader, destIP, dest_port

select tb, localip, peerid, asid, count(*), sum(len)
from I.DataProtocol
where ipversion=4 and (dest_port=25 or source_port=25)
group by time/60 as tb, destIP as localip, getlpmid(sourceIP, 'peerid.tbl') as peerid, getlpmid(sourceIP, 'asnid.tbl') as asid
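The getlpmid calls in the second query map a source address to an identifier, presumably by longest-prefix match against a table file. The sketch below shows that kind of lookup in Python under stated assumptions: the table is given in memory as (prefix, id) pairs, since the format of peerid.tbl and asnid.tbl is not shown in the slides, and the function names are hypothetical. A linear scan is used for clarity; a production lookup would more likely use a trie.

```python
import ipaddress


def build_lpm_table(entries):
    """Turn (prefix string, id) pairs into a list sorted longest prefix first."""
    table = [(ipaddress.ip_network(prefix), ident) for prefix, ident in entries]
    table.sort(key=lambda item: item[0].prefixlen, reverse=True)
    return table


def lpm_lookup(table, ip, default=None):
    """Return the id of the most specific prefix containing `ip`, else `default`."""
    addr = ipaddress.ip_address(ip)
    for network, ident in table:
        if addr in network:   # first hit is the longest match, given the sort order
            return ident
    return default
```

With a table holding 10.0.0.0/8 and 10.1.0.0/16, an address in 10.1.0.0/16 resolves to the id of the /16, and any other address in 10.0.0.0/8 resolves to the id of the /8.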
Second App: Intrusion Detection
• Current technology (e.g. Snort):
  • Low-speed only, cumbersome specification language.
  • Can’t trigger alarms/recording from aggregates (e.g. recent Ping attack on root DNS servers).
• Needed: fast trigger processing.
  • Post alarm to network manager (not challenging).
  • Run query Q on the next 500 packets (challenging).
  • Run query Q on the previous 500 to next 500 packets (very challenging).
• Potential solutions:
  • Push trigger processing to lowest query processing levels.
  • Snap backwards on input buffers to run in-the-past queries.
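The "previous 500 to next 500 packets" requirement can be pictured as a bounded history buffer that is snapshotted when a trigger fires. The following Python sketch is one possible reading of "snap backwards on input buffers", not a description of Gigascope's design. The class name, method names, and the choice to include the triggering packet in the window are assumptions, and triggers that arrive while a capture is still being filled are simply ignored.

```python
from collections import deque


class TriggeredRecorder:
    """Keep the last `before` packets; on a trigger, also collect the next `after`."""

    def __init__(self, before=500, after=500):
        self.history = deque(maxlen=before)  # oldest packets fall off automatically
        self.after = after
        self.remaining = 0
        self.current = None                  # capture in progress, if any
        self.captures = []                   # completed windows, ready for query Q

    def _finish_if_done(self):
        if self.remaining == 0:
            self.captures.append(self.current)
            self.current = None

    def on_packet(self, pkt, triggered=False):
        if self.current is not None:
            # A capture is active: this packet belongs to the "next N" part.
            self.current.append(pkt)
            self.remaining -= 1
            self._finish_if_done()
        elif triggered:
            # Snap backwards: copy the buffered history, then the trigger packet.
            self.current = list(self.history) + [pkt]
            self.remaining = self.after
            self._finish_if_done()
        self.history.append(pkt)
```

Each entry in `captures` is a list that a query could then be run over; the cost that makes this "very challenging" at line rate is holding and copying the history buffer, which this sketch does not try to optimize.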
Summary
• Very high data rates.
  • Fast and lightweight architecture.
  • Early data reduction is critical.
• Complex queries.
  • Language constructs to capture substreams (e.g., netflow, TCP session, gaming session).
  • Extensive support for user-defined functions (getlpmid).
  • Support user-defined operators (e.g., IP defragmentation).
  • Currently working on a view mechanism.
• Multiple data sources.
  • The current naming scheme is quite limited.
• Collaboration across disciplines.
  • Database, Networking, Systems, Software, Algorithms.
• Accept that you can’t do everything.
  • But you can do most of the tedious work.