Introduction to Writing Apache Modules Badi Kumar S Technical Yahoo!
Outline • The Apache Project • The three question paradigm: What? Why? How? • The Apache Server Architecture • The Module Architecture • Apache Server Life Cycle • A comparative study • Introduction to the Apache API • Apache Module API: Success stories
The Apache Project • Timeline • 1995: Started as an enhancement to the NCSA server • 1997: Ported to Windows • Currently the most popular web server on the market • Project • Adopted the open source methodology • Used as the code base for many commercial server products • Features to Burn • Technical • Fast & efficient, portable, stable & reliable, extensible, easy to administer, well supported • Philosophical • It won’t go away! • Makes you part of the community
What are Apache Modules? • Definitions • A paradigm to make Apache work the way you want • A way to add the features that you think are missing in Apache • A tool to step into the handling of the HTTP protocol in order to customize how Apache processes requests • In short, a high-performance Web programming paradigm
Why? • Options that a Web Programmer now has: • CGI • Server API • Server Side Includes • Script Co-processing • Embedded Interpreters • Client Side Scripting • Trade-offs:
How? • The Apache Server Architecture • Apache Module Architecture • The Apache Life Cycle • A comparative study
Apache Server Life Cycle • [Figure: server startup and configuration, followed by module initialization; each child process then goes through child init, the request loop, and child exit.]
Apache Life Cycle (contd…) • Request Loop • URI Translation: What? • Access Control: Where? • Authentication: Who? • Authorization • MIME type checking • Response phase • Logging • Cleanup
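The request loop above can be pictured as an ordered list of phase handlers that each return a status. The following is a minimal, self-contained sketch of that idea and not Apache's actual dispatcher: the names `toy_request`, `run_request`, `PHASE_OK` and `PHASE_DECLINED` are hypothetical, and only three phases plus logging are modeled. It assumes a handler either accepts the request, declines it, or returns an HTTP-style error code that stops the remaining phases, while logging always runs.

```c
#include <stddef.h>
#include <string.h>

/* Status codes modeled on Apache's conventions; the values are local
 * to this sketch. */
#define PHASE_OK        0
#define PHASE_DECLINED (-1)

/* Hypothetical, stripped-down request: only what the sketch needs. */
typedef struct {
    const char *uri;
    int         phases_run;  /* how many phase handlers were invoked */
    int         logged;      /* set by the logging phase */
} toy_request;

typedef int (*phase_fn)(toy_request *r);

/* URI translation: nothing to do in the sketch, just count the call. */
static int translate(toy_request *r)
{
    r->phases_run++;
    return PHASE_OK;
}

/* Access control: refuse anything under /private with a 403. */
static int check_access(toy_request *r)
{
    r->phases_run++;
    return strncmp(r->uri, "/private", 8) == 0 ? 403 : PHASE_OK;
}

/* Response phase: a real handler would send content here. */
static int respond(toy_request *r)
{
    r->phases_run++;
    return PHASE_OK;
}

/* Logging phase: runs for every request, successful or not. */
static int log_request(toy_request *r)
{
    r->logged = 1;
    return PHASE_OK;
}

/* Run the phases in order; stop at the first error status, then log. */
int run_request(toy_request *r)
{
    static const phase_fn phases[] = { translate, check_access, respond };
    int status = PHASE_OK;
    size_t i;

    for (i = 0; i < sizeof phases / sizeof phases[0]; i++) {
        status = phases[i](r);
        if (status != PHASE_OK && status != PHASE_DECLINED)
            break;  /* an error short-circuits the later phases */
    }
    log_request(r);
    return status;
}
```

A request for `/private/x` stops at the access-control phase and never reaches the response phase, yet it is still logged.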
Comparative Example • Purpose: To display the current date and time • Alternatives: Apache SSI, PHP, CGI, Apache Module
• Apache SSI
<!--#config timefmt="%a, %d %B %Y" -->
<!--#echo var="DATE_LOCAL" -->
• PHP
<?php echo date('D, d F Y'); ?>
• CGI
#!/bin/sh
echo "Content-Type: text/plain"
echo ""
/bin/date
exit 0
Comparative Study (contd…) • Apache Module:
#include <httpd.h>
#include <http_protocol.h>
#include <http_config.h>
#include <http_log.h>
#include <time.h>

/* Content handler: sends a small HTML page with the current time. */
static int current_time_handler(request_rec *r)
{
    time_t t;
    const char *date;

    /* Set the MIME type and send the response headers. */
    r->content_type = "text/html";
    ap_send_http_header(r);

    ap_rvputs(r, "<html><head><title>example</title></head><body>\n",
              "Hello, world from DSO!<p>\n", NULL);

    /* ctime() returns a static string that ends with a newline. */
    t = time(NULL);
    date = ctime(&t);
    ap_rvputs(r, "<p>The current time is <b>", date, "</b></p>\n", NULL);
    ap_rputs("</body></html>\n", r);

    return OK;
}
Display Current Time (contd…)
/* Map the handler name used in the configuration to the C function. */
static const handler_rec current_time_handlers[] = {
    {"current-time", current_time_handler},
    {NULL}
};

module MODULE_VAR_EXPORT current_time_module = {
    STANDARD_MODULE_STUFF,
    NULL,                   /* initializer */
    NULL,                   /* create per-directory config structure */
    NULL,                   /* merge per-directory config structures */
    NULL,                   /* create per-server config structure */
    NULL,                   /* merge per-server config structures */
    NULL,                   /* command table */
    current_time_handlers,  /* handlers */
    NULL,                   /* translate_handler */
    NULL,                   /* check_user_id */
    NULL,                   /* check auth */
    NULL,                   /* check access */
    NULL,                   /* type_checker */
    NULL,                   /* pre-run fixups */
    NULL,                   /* logger */
    NULL,                   /* header parser */
    NULL,                   /* child_init */
    NULL,                   /* child_exit */
    NULL                    /* post read-request */
};
Why bother to write in C? Why not Perl or PHP? • Performance & Power • Portability • Comparison: Using http_load (150 fetches, 1023 max parallel connections, 30 seconds)
Options for a module writer • Statically link the module into the server binary • Configure your module as a Dynamic Shared Object (DSO) • Advantage: No need to recompile the whole server every time your module changes • Use the APache eXtenSion (apxs) tool to do the job of compiling, installing and configuring for you!
Intro to Apache Module API • Data Structures • Module Record • Request Record • Server Record • Connection Record • Memory Management and Resource Pools • The Array & Table API • API for Processing Requests • Server Core Routines • Advanced Features
The module record
/* Bracketed numbers give the order in which the request phases run. */
module MODULE_VAR_EXPORT current_time_module = {
    STANDARD_MODULE_STUFF,
    NULL,                   /* module initializer */
    NULL,                   /* per-directory config creator */
    NULL,                   /* dir config merger */
    NULL,                   /* server config creator */
    NULL,                   /* server config merger */
    NULL,                   /* command table */
    current_time_handlers,  /* [9] content handlers */
    NULL,                   /* [2] URI-to-filename translation */
    NULL,                   /* [5] check/validate user_id */
    NULL,                   /* [6] check user_id is valid *here* */
    NULL,                   /* [4] check access by host address */
    NULL,                   /* [7] MIME type checker/setter */
    NULL,                   /* [8] fixups */
    NULL,                   /* [10] logger */
    NULL,                   /* [3] header parser */
    NULL,                   /* process initialization */
    NULL,                   /* process exit/cleanup */
    NULL                    /* [1] post read_request handling */
};
Data Structures (contd…) • Request Record • Contains information about the current request, links to the next and previous requests, and fields used only by the core routines • Information about the current request: • Protocol, hostname, request time, method (GET, POST…), MIME header environments, and which object is being requested • Other configuration information that may change with the .htaccess file • A linked list of the configuration directives in the .htaccess file • Server Record • Contains information about the server and its operation • A separate server_rec for each virtual host • Information useful to module writers is intermixed with information that only the core routines use • Read-only
Memory Mgmt & Resource Pools • Need for memory management in C Programs • Pools & their lifetime • Global Pool • Request Pool • Connection Pool • Sub-pool • Memory-related Routines provided by Apache • Memory & String Allocation Routines • Sub-pool Management • Querying about pools
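The point of pools is that a module allocates freely during a request and never frees piece by piece: everything is released together when the pool's lifetime ends. The sketch below illustrates only that idea in plain C. It is not Apache's allocator, sub-pools are left out, and the names `toy_pool`, `toy_palloc`, `toy_pstrdup` and `toy_destroy_pool` are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Header placed in front of every allocation so the pool can find it
 * again. The extra union members only force a generous alignment. */
typedef union toy_block {
    union toy_block *next;
    long double      align_ld;
    long long        align_ll;
} toy_block;

typedef struct {
    toy_block *blocks;  /* singly linked list of live allocations */
} toy_pool;

/* Create an empty pool; returns NULL if out of memory. */
toy_pool *toy_make_pool(void)
{
    return calloc(1, sizeof(toy_pool));
}

/* Allocate n bytes that live until the pool is destroyed. */
void *toy_palloc(toy_pool *p, size_t n)
{
    toy_block *b = malloc(sizeof *b + n);
    if (b == NULL)
        return NULL;
    b->next = p->blocks;
    p->blocks = b;
    return b + 1;  /* usable memory starts right after the header */
}

/* Copy a string into the pool. */
char *toy_pstrdup(toy_pool *p, const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = toy_palloc(p, len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Release every allocation and the pool itself in one call.
 * Returns how many allocations were released. */
size_t toy_destroy_pool(toy_pool *p)
{
    size_t released = 0;
    toy_block *b = p->blocks;
    while (b != NULL) {
        toy_block *next = b->next;
        free(b);
        released++;
        b = next;
    }
    free(p);
    return released;
}
```

In a module, the equivalent of `toy_destroy_pool` is never called by the handler; the server tears down the request pool once the request is finished.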
The Array & Table API • The HTTP protocol is filled with lists, hence an Array API. • The Array API lets you create lists of arbitrary type & length. • The Table API is built on top of the Array API. • It is an API to create and maintain lookup tables. • Examples:
/* Make an array with room for 5 elements */
array_header *arr = ap_make_array(p, 5, sizeof(char *));
/* Add an element to the array */
*(char **)ap_push_array(arr) = ap_pstrdup(p, "text/html");
/* Create a table */
table *my_table = ap_make_table(p, 25);
/* Get and set table values */
void ap_table_set(table *t, const char *key, const char *val);
const char *ap_table_get(const table *t, const char *key);
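To make the lookup-table idea concrete, here is a small stand-alone sketch of a table whose keys are compared without regard to case, as HTTP header names are. It is a toy with a fixed capacity and is not the Apache implementation: `toy_table`, `toy_table_set` and `toy_table_get` are hypothetical names, and unlike a pool-backed table this one stores the caller's pointers without copying, so the strings must outlive the table.

```c
#include <ctype.h>
#include <stddef.h>

#define TOY_TABLE_MAX 16

typedef struct {
    const char *key;
    const char *val;
} toy_entry;

/* A table is just an array of key/value entries plus a count. */
typedef struct {
    toy_entry elts[TOY_TABLE_MAX];
    size_t    nelts;
} toy_table;

/* Case-insensitive string equality. */
static int key_equal(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;  /* equal only if both strings ended together */
}

/* Return the value stored under key, or NULL if the key is absent. */
const char *toy_table_get(const toy_table *t, const char *key)
{
    size_t i;
    for (i = 0; i < t->nelts; i++) {
        if (key_equal(t->elts[i].key, key))
            return t->elts[i].val;
    }
    return NULL;
}

/* Store val under key, replacing any existing entry for that key.
 * Returns 0 on success, -1 if the table is full. */
int toy_table_set(toy_table *t, const char *key, const char *val)
{
    size_t i;
    for (i = 0; i < t->nelts; i++) {
        if (key_equal(t->elts[i].key, key)) {
            t->elts[i].val = val;
            return 0;
        }
    }
    if (t->nelts == TOY_TABLE_MAX)
        return -1;
    t->elts[t->nelts].key = key;
    t->elts[t->nelts].val = val;
    t->nelts++;
    return 0;
}
```

Setting "Content-Type" and then "CONTENT-TYPE" leaves a single entry holding the second value, which is the behavior a header table needs.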
Processing Requests • Tasks involved in processing requests • Getting information about the transaction • Getting information about the server • Sending data to the client • Sending files to the client • Reading the request body • Example:
/* Reading the request body */
int ap_setup_client_block(request_rec *r, int read_policy);
int ap_should_client_block(request_rec *r);
long ap_get_client_block(request_rec *r, char *buffer, int bufsize);
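The body-reading calls are normally used in a loop: keep asking for the next block until no more data arrives. The sketch below shows that loop shape in plain C without the Apache headers. Everything in it is a hypothetical stand-in: `toy_body` replaces the client connection, `toy_get_client_block` imitates a call that fills a buffer and reports how many bytes it copied, and `toy_read_body` is the caller's loop. It assumes a positive buffer size and silently drops data that does not fit in the output buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the data still to be read from a client. */
typedef struct {
    const char *data;
    size_t      remaining;
} toy_body;

/* Copy up to bufsize bytes into buffer and return the number copied;
 * returns 0 once the body is exhausted. bufsize must be positive. */
long toy_get_client_block(toy_body *b, char *buffer, int bufsize)
{
    size_t want = (size_t)bufsize;
    size_t n = b->remaining < want ? b->remaining : want;
    memcpy(buffer, b->data, n);
    b->data += n;
    b->remaining -= n;
    return (long)n;
}

/* Drain the whole body in small chunks, keeping at most cap - 1 bytes
 * in out and NUL-terminating it. Returns the number of bytes kept. */
size_t toy_read_body(toy_body *b, char *out, size_t cap)
{
    char chunk[8];  /* deliberately tiny so the loop runs several times */
    size_t used = 0;
    long n;

    if (cap == 0)
        return 0;
    while ((n = toy_get_client_block(b, chunk, (int)sizeof chunk)) > 0) {
        size_t room = cap - 1 - used;
        size_t take = (size_t)n < room ? (size_t)n : room;
        memcpy(out + used, chunk, take);
        used += take;
    }
    out[used] = '\0';
    return used;
}
```

The loop keeps reading even after the output buffer is full, so the body is always consumed completely before the handler moves on.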
Advanced Features • Implementing configuration directives in C • Customizing the configuration process • String and URI manipulation • File and directory management • Time and date functions • Message digest algorithm functions • UID and GID information routines • Data mutex locking • Launching subprocesses
Success Stories • http://www.imdb.com/ • On-campus housing renewal system • Scripting languages: PHP, ePerl, Embperl… • mod_perl: • A dynamic map server • The world’s largest commodities trading system: Lind-Waldock & Co. uses mod_perl to generate live quotes • Document management system • Summary: It’s hard for a high-performance website to survive without Apache modules (provided it is running on Apache)
References • “Writing Apache Modules in Perl and C”, Lincoln Stein and Doug MacEachern • http://www.modperl.com • The Apache Modeling Portal: http://apache.hpi.uni-potsdam.de/