Scalable Kernel Performance for Internet Servers under Realistic Loads. Gaurav Banga et al. Western Research Lab: Research Report 1998/06 (Proceedings of the 1998 USENIX Annual Technical Conference). Computer Architecture Lab, CS Dept., KAIST. 2000/11/ Kim, Sung-Wan.
Contents
• Introduction
• Problems of select() & ufalloc() in event-driven servers
• Scalable select() & ufalloc()
• Experimental evaluation
• Performance of a live system
• Conclusions
Introduction
• Event-driven servers
  • A single thread manages all connections
  • Lower context-switching & synchronization overhead
  • Faster than a thread-per-connection or pre-forked system
  • But they perform poorly under real conditions
    • The bottlenecks are select() & ufalloc()
• select()
  • Event notification for non-blocking (asynchronous) I/O: reports which descriptors are ready
• ufalloc()
  • Allocates a new file descriptor for a process
Problems in select() & ufalloc()
• WAN environments
  • Longer round-trip times and more packet loss than LAN environments
  • Many open connections
• select()
  • Call paths
    • select() -> do_scan() -> selscan() -> soo_select()
    • select_wakeup() -> do_scan() -> selscan() -> soo_select()
  • soo_select()
    • Checks whether the selected condition is true for a socket
  • Linear search over all open sockets
• ufalloc()
  • Single bitmap (must find the lowest free descriptor number)
  • Too costly
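The ufalloc() cost noted above can be sketched as follows. This is a minimal illustration, not the Digital UNIX code: the names `fd_map`, `ufalloc_linear`, `MAX_FDS`, and the `probes` counter are hypothetical, and the only assumption taken from the slide is a single bitmap searched for the lowest free descriptor number.

```c
#include <assert.h>
#include <limits.h>

#define MAX_FDS 4096  /* hypothetical per-process descriptor limit */

/* Hypothetical single-level map: bit fd is set when descriptor fd is in use. */
struct fd_map {
    unsigned char used[MAX_FDS / CHAR_BIT];
};

/*
 * Return the lowest free descriptor and mark it used, or -1 if none is free.
 * *probes is incremented once per descriptor examined, so the caller can
 * observe that the work grows with the number of descriptors already open.
 */
static int ufalloc_linear(struct fd_map *m, int *probes)
{
    assert(m != 0 && probes != 0);
    for (int fd = 0; fd < MAX_FDS; fd++) {
        unsigned char bit = (unsigned char)(1u << (fd % CHAR_BIT));
        (*probes)++;
        if (!(m->used[fd / CHAR_BIT] & bit)) {
            m->used[fd / CHAR_BIT] |= bit;
            return fd;
        }
    }
    return -1;
}
```

With 1000 descriptors already open, the next allocation in this sketch examines 1001 bits before it succeeds, which is the linear behavior the slide calls too costly.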
Environment
• Server
  • AlphaStation 500 (400 MHz), 192 MB of main memory
  • Digital UNIX 4.0B
  • Squid 1.1.11, NetCache 3.1.2c-OSF
• Client
  • AlphaStation 500 (333 MHz)
  • Digital UNIX 3.2C
  • S-Client
• Network
  • 100 Mbps FDDI
• Profiling
  • DCPI
Scalable select() & ufalloc()
• select()
  • Per-thread READY, INTERESTED, and HINTS sets
  • sowakeup()
    • Records a hint in the HINTS set of each thread, in the processes referencing the socket, whose INTERESTED set contains the socket
  • Set updates on each call, where C(...) checks the current state of the given descriptors:
    INTERESTED_new = SELECTING ∪ INTERESTED_old
    READY_new = C(INTERESTED_new ∩ (¬INTERESTED_old ∪ READY_old ∪ HINTS))
    READY_to_user = SELECTING ∩ READY_new
• ufalloc()
  • 2-level bitmap
    Level 0 map: 1 0
    Level 1 map: 1 1 1 1 0 0 1 1
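The set updates and the two-level bitmap on this slide can be sketched in C as below. This is an illustrative model under stated assumptions, not the kernel implementation: descriptor sets are modeled as 64-bit masks, `check_ready()` stands in for the per-socket check with the true readiness passed in as `actual`, clearing HINTS after each call is an assumption, and a Level 0 bit is taken to mean "this Level 1 word is full", as the figure values suggest. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t fdmask;  /* hypothetical: bit i stands for descriptor i */

struct sel_state {
    fdmask interested;  /* INTERESTED_old */
    fdmask ready;       /* READY_old */
    fdmask hints;       /* HINTS, set by a sowakeup()-like path */
};

/* Stand-in for C(...): only descriptors in to_check are examined. */
static fdmask check_ready(fdmask to_check, fdmask actual)
{
    return to_check & actual;
}

/* Applies the three set equations; returns READY_to_user. */
static fdmask scalable_select(struct sel_state *st, fdmask selecting, fdmask actual)
{
    fdmask interested_new = selecting | st->interested;
    fdmask to_check = interested_new & (~st->interested | st->ready | st->hints);
    fdmask ready_new = check_ready(to_check, actual);

    st->interested = interested_new;
    st->ready = ready_new;
    st->hints = 0;  /* assumption: hints are consumed by the call */
    return selecting & ready_new;
}

#define NWORDS 64  /* 64 words of 64 bits: up to 4096 descriptors */

struct fd_bitmap {
    uint64_t level0;          /* bit w set: level1[w] has no free bit */
    uint64_t level1[NWORDS];  /* bit b of word w set: descriptor w*64+b in use */
};

/* Index of the lowest clear bit in x, or -1 if every bit is set. */
static int lowest_zero(uint64_t x)
{
    for (int i = 0; i < 64; i++)
        if (!((x >> i) & 1))
            return i;
    return -1;
}

/* Allocate the lowest free descriptor by inspecting at most two words. */
static int ufalloc2(struct fd_bitmap *m)
{
    int w = lowest_zero(m->level0);
    if (w < 0)
        return -1;
    int b = lowest_zero(m->level1[w]);  /* non-negative: word w is not full */
    m->level1[w] |= (uint64_t)1 << b;
    if (m->level1[w] == UINT64_MAX)
        m->level0 |= (uint64_t)1 << w;
    return w * 64 + b;
}

/* Release a descriptor; its word can no longer be full. */
static void uffree2(struct fd_bitmap *m, int fd)
{
    assert(fd >= 0 && fd < NWORDS * 64);
    m->level1[fd / 64] &= ~((uint64_t)1 << (fd % 64));
    m->level0 &= ~((uint64_t)1 << (fd / 64));
}
```

In this model a descriptor that was already in INTERESTED and was not ready last time is re-examined only when a hint arrives for it, so the per-call work depends on recent activity rather than on the total number of open connections. Likewise `ufalloc2()` touches one Level 0 word and one Level 1 word instead of scanning the whole map.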
Experimental Evaluation: Scalability with respect to connection rate (with 750 infinitely slow connections)
Experimental Evaluation: Scalability with respect to connection rate
Experimental Evaluation: Scalability with respect to connection count
Performance of a live system
• Server
  • A Web proxy system at DEC
  • AlphaStation 500 (500 MHz), 512 MB of RAM
  • System run for an entire day
• Proxy
  • Squid
  • NetCache
Performance of a live system: NetCache with caching disabled
Performance of a live system: NetCache with caching disabled
Conclusions
• WAN delays
  • Keep many connections open at once
• Linear scaling in select() & ufalloc()
  • Leads to excessive kernel CPU computation
• Scalable versions
  • Improve the performance of Web servers and proxies
for (i = 0; i < maxfd; i++) {
    /* Check each open socket for a handler. */
    if (fd_table[i].read_handler) {
        if (fd_table[i].stall_until <= squid_curtime) {
            nfds++;
            FD_SET(i, &readfds);
        }
    }
    if (fd_table[i].write_handler) {
        nfds++;
        FD_SET(i, &writefds);
    }
}
select(maxfd, &readfds, &writefds, …, …);