Chapter 8: Customizing the Apache Configuration Process

This chapter covers an important but complex aspect of the Apache Perl API, the process of controlling and customizing the Apache configuration process itself. Using the techniques shown in this chapter, you will be able to define new configuration file directives that provide runtime configuration information to your modules. You will also be able to take over all or part of the Apache configuration process and write Perl code to dynamically configure the server at startup time.
Simple Configuration with the PerlSetVar Directive

The Apache Perl API provides a simple mechanism for passing information from configuration files to Perl modules using the PerlSetVar directive. As we've seen, the directive takes two arguments, the name of a variable and its value:

    PerlSetVar FoodForThought apples

Because Perl is such a whiz at parsing text, it's trivial to pass an array, or even a hash, in this way. For example, here's one way (out of a great many) to pass an array:

    # in configuration file
    PerlSetVar FoodForThought apples:oranges:kiwis:mangos

    # in Perl module
    @foodForThought = split ":", $r->dir_config('FoodForThought');

And here's a way to pass an associative array:

    # in configuration file
    PerlSetVar FoodForThought apples=>23,kiwis=>12

    # in Perl module
    %foodForThought = split /\s*(?:=>|,)\s*/, $r->dir_config('FoodForThought');

Notice that the pattern match allows whitespace to come before or after the comma or arrow operators, just as Perl does. By modifying the pattern match appropriately, you can pass more complex configuration information. The only trick is to remember to put double quotes around the configuration value if it contains whitespace, and not to allow your text editor to wrap it to another line. You can use backslash as a continuation character if you find long lines a pain to read:

    PerlSetVar FoodForThought "apples => 23,\
                               kiwis => 12,\
                               rutabagas => 0"

If you have a really complex configuration, then you are probably better off using a separate configuration file and pointing to it using a single PerlSetVar directive. The server_root_relative() method is useful for specifying configuration files that are relative to the server root:

    # in server configuration file
    PerlSetVar FoodConfig conf/food.conf

    # in Perl module
    $conf_file = $r->server_root_relative($r->dir_config('FoodConfig'));

Despite the simplicity of this approach, there are times when you may prefer to create your own "first class" configuration directives. This becomes particularly desirable when you have many different directives, when the exact syntax of the directives is important and you want Apache to check the syntax at startup time, or when you are planning to distribute your module and want it to appear polished. There is also a performance penalty associated with parsing PerlSetVar configuration at request time, which is avoided using first class configuration directives because they are parsed once at server startup time.
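The split-based parsing used in this section can be tried outside the server. The following is a minimal sketch, assuming two literal strings stand in for the values that $r->dir_config() would return under mod_perl; the variable names are illustrative only.

```perl
use strict;
use warnings;

# Stand-ins for the strings that $r->dir_config() would return under mod_perl.
my $array_value = 'apples:oranges:kiwis:mangos';
my $hash_value  = 'apples => 23, kiwis => 12, rutabagas => 0';

# Colon-separated list -> array.
my @food_list = split /:/, $array_value;

# "key => value" pairs -> hash; whitespace around "=>" and "," is ignored.
my %food_count = split /\s*(?:=>|,)\s*/, $hash_value;

print "list: @food_list\n";
print "$_ = $food_count{$_}\n" for sort keys %food_count;
```

Because the same pattern splits on both the arrow and the comma, the result is a flat list of alternating keys and values, which is exactly what a hash assignment expects.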
The Apache Configuration Directive API

Apache provides an API for defining configuration directives. You provide the directive's name, syntax, and a string briefly summarizing the directive's intended usage. You may also limit the applicability of the directive to certain parts of the configuration files. Apache parses the directive and passes the parsed structure to your module for processing. Your module will then use this information to set up global variables or do whatever initialization it needs to.

Defining new configuration directives is not as simple as using other parts of the Perl API. This is because configuration directives are defined in a compiled C structure that cannot be built dynamically at run time. To work within this restriction, mod_perl takes a roundabout route, which the following worked example illustrates.
As you may recall, Apache::PassThru used a single PerlSetVar variable named PerlPassThru, which in turn contained a series of local=>remote URI pairs stored in one long string. Although this strategy is adequate, it's not particularly elegant. Our goal here will be to create a new first class configuration directive named PerlPassThru. PerlPassThru will take two arguments, a local URI and a remote URI to map it to. To map several local URIs to remote servers, you'll be able to repeat the directive. Because it makes no sense for the directive to appear in directory sections or .htaccess files, PerlPassThru will be limited to the main parts of the httpd.conf, srm.conf and access.conf files, as well as to <VirtualHost> sections.

First we'll need something to start with, so we use h2xs to create a skeletal module directory:
    % h2xs -Af -n Apache::PassThru
    Writing Apache/PassThru/PassThru.pm
    Writing Apache/PassThru/PassThru.xs
    Writing Apache/PassThru/Makefile.PL
    Writing Apache/PassThru/test.pl
    Writing Apache/PassThru/Changes
    Writing Apache/PassThru/MANIFEST

The -A and -f command-line switches turn off the generation of autoloader stubs and the C header file conversion steps, respectively. -n gives the module a name. We'll be editing the files Makefile.PL and PassThru.pm. PassThru.xs will be overwritten when we go to make the module, so there's no need to worry about it.

The next step is to edit the Makefile.PL script to add the declaration of the PerlPassThru directive and to arrange for Apache::ExtUtils' command_table() function to be executed at the appropriate moment. Listing 8.1 shows a suitable version of the file. We've made multiple modifications to the Makefile.PL originally produced by h2xs. First, we've placed a package declaration at the top, putting the whole script in the Apache::PassThru namespace. Then, after the original line use ExtUtils::MakeMaker, we load two mod_perl-specific modules, Apache::ExtUtils, which defines the command_table() function, and Apache::src, a small utility class that can be used to find the location of the Apache header files. These will be needed during the make.
    package Apache::PassThru;
    # File: Apache/PassThru/Makefile.PL

    use ExtUtils::MakeMaker;

    use Apache::ExtUtils qw(command_table);
    use Apache::src ();

    my @directives = (
        { name         => 'PerlPassThru',
          errmsg       => 'a local path and a remote URI to pass through to',
          args_how     => 'TAKE2',
          req_override => 'RSRC_CONF'
        }
    );

    command_table(\@directives);

    WriteMakefile(
        'NAME'            => __PACKAGE__,
        'VERSION_FROM'    => 'PassThru.pm',
        'INC'             => Apache::src->new->inc,
        'INSTALLSITEARCH' => '/home/httpd/lib/perl',
        'INSTALLSITELIB'  => '/home/httpd/lib/perl',
    );
    __END__
Next comes the place where we define the new configuration directives themselves. We create a list named @directives. Each element of the list is an anonymous hash containing one or more of the keys name, errmsg, args_how and req_override (we'll see later how to implement the most common type of directive using a succinct anonymous array form). name corresponds to the name of the directive, "PerlPassThru" in this case, and errmsg corresponds to a short usage message that will be displayed in the event of a configuration syntax error. args_how tells Apache how to parse the directive's arguments. In this case we specify TAKE2, which tells Apache that the directive takes two (and only two) arguments. We'll go over the complete list of parsing options later, and also show you a shortcut for specifying parsing options using Perl prototypes.

The last key, req_override, tells Apache what configuration file contexts the directive is allowed in. In this case we specify the most restrictive context, RSRC_CONF, which limits the directive to appearing in the main part of the configuration files or in virtual host sections. Notice that RSRC_CONF is an ordinary string, not a bareword function call!

Having defined our configuration directive array, we pass a reference to it to the command_table() function. When run, this routine writes out a file named PassThru.xs to the current directory. command_table() uses the package information returned by the Perl caller() function to figure out the name of the file to write. This is why it was important to include a package declaration at the top of the script.

The last part of Makefile.PL is a call to WriteMakefile(), a routine provided by ExtUtils::MakeMaker and automatically placed in Makefile.PL by h2xs. However, we've modified the autogenerated call in three slight but important ways.

The next step is to modify PassThru.pm to accommodate the new configuration directive. We start with the stock file from Listing 7.10, and add the following lines to the top of the file:
    use Apache::ModuleConfig ();
    use DynaLoader ();
    use vars qw($VERSION);

    $VERSION = '1.00';

    if($ENV{MOD_PERL}) {
        no strict;
        @ISA = qw(DynaLoader);
        __PACKAGE__->bootstrap($VERSION);
    }
This brings in code for fetching and modifying the current configuration settings, and loads the DynaLoader module, which provides the bootstrap() routine for loading shared library code. We test the MOD_PERL environment variable so that bootstrap() is called only when the module is loaded under mod_perl.

Next, we add the following configuration processing callback routine to the file:
    sub PerlPassThru ($$$$) {
        my($cfg, $parms, $local, $remote) = @_;
        $cfg->{PassThru}{$local} = $remote;
    }
The callback (known for short as the "directive handler") is a subroutine that will be called each time Apache processes a PerlPassThru directive. It is responsible for stashing the information into a configuration record where it can be retrieved later by the handler() subroutine. The name of the subroutine must exactly match the name of the configuration directive, capitalization included. It should also have a prototype that correctly matches the syntax of the configuration directive.

All configuration callbacks are called with at least two scalar arguments ($$). The first argument, $cfg, is the per-directory or per-server object where the configuration data will be stashed. As we will explain shortly, this object can be recovered later during startup or request time. The second argument, $parms, is an Apache::CmdParms object from which various other configuration information can be retrieved.

Depending on the syntax of the directive, callbacks will be passed other parameters as well, corresponding to the arguments of the configuration directive that the callback is responsible for. In the case of PerlPassThru(), which is a TAKE2 directive, we expect two additional arguments, so the complete function prototype is ($$$$).

The body of the subroutine is trivial. For all intents and purposes the configuration object is a hash reference in which you can store arbitrary key/value pairs. The convention is to choose a key with the same name as the configuration directive. In this case we use an anonymous hash to store the current local and remote URIs into the configuration object at a key named PassThru. This allows us to have multiple mappings while guaranteeing that each local URI is unique.

The handler() subroutine needs a slight modification as well. We remove the line
    my %mappings = split /\s*(?:,|=>)\s*/, $r->dir_config('PerlPassThru');

and substitute the following:

    my %mappings = ();
    if(my $cfg = Apache::ModuleConfig->get($r)) {
        %mappings = %{ $cfg->{PassThru} } if $cfg->{PassThru};
    }
We call the Apache::ModuleConfig class method get() to retrieve the configuration object corresponding to the current request. We then fetch the value of the configuration object's PassThru key. If the key is present, we dereference it and store it into %mappings.

The last step is to arrange for Apache::PassThru to be loaded at server startup time. The easiest way to do this is to load the module with a PerlModule directive:
    PerlModule Apache::PassThru

The only trick to this is that you must be careful that the PerlModule directive is called before any PerlPassThru directives appear. Otherwise Apache won't recognize the new directive and will abort with a configuration file syntax error. The other caveat is that PerlModule only works to bootstrap configuration directives in mod_perl versions 1.17 and higher. If you are using an earlier version, use this configuration section instead:

    <Perl>
    use Apache::PassThru ();
    </Perl>

<Perl> sections are described in more detail towards the end of this chapter.

Now change the old Apache::PassThru configuration to use the first-class PerlPassThru directive:

    PerlModule Apache::PassThru
    PerlTransHandler Apache::PassThru

    PerlPassThru /CPAN   http://www.perl.com/CPAN
    PerlPassThru /search http://www.altavista.com

After restarting the server, you should now be able to test the Apache::PassThru handler to confirm that it correctly proxies the /CPAN and /search URIs.

If your server has the mod_info module configured, you should be able to view the entry for the Apache::PassThru module. It should look something like this:
    Module Name: Apache::PassThru
    Content handlers: none
    Configuration Phase Participation: Create Directory Config, Create Server Config
    Request Phase Participation: none
    Module Directives:
        PerlPassThru - a local path and a remote URI to pass through to
    Current Configuration:
    httpd.conf
        PerlPassThru /CPAN http://www.perl.com/CPAN
        PerlPassThru /search http://www.altavista.com

Now try changing the syntax of the PerlPassThru directive. Create a directive that has too many arguments, or one that has too few. Try putting the directive inside a <Directory> section or .htaccess file. Any attempt to violate the syntax restrictions we specified in Makefile.PL with the args_how and req_override keys should cause a syntax error at server startup time.
    package Apache::PassThru;
    # file: Apache/PassThru.pm
    use strict;
    use vars qw($VERSION);
    use Apache::Constants qw(:common);
    use Apache::ModuleConfig ();
    use DynaLoader ();

    $VERSION = '1.00';

    if($ENV{MOD_PERL}) {
        no strict;
        @ISA = qw(DynaLoader);
        __PACKAGE__->bootstrap($VERSION);
    }

    sub handler {
        my $r = shift;
        return DECLINED if $r->proxyreq;
        my $uri = $r->uri;
        my %mappings = ();

        if(my $cfg = Apache::ModuleConfig->get($r)) {
            %mappings = %{ $cfg->{PassThru} } if $cfg->{PassThru};
        }

        foreach my $src (keys %mappings) {
            next unless $uri =~ s/^$src/$mappings{$src}/;
            $r->proxyreq(1);
            $r->uri($uri);
            $r->filename("proxy:$uri");
            $r->handler('proxy-server');
            return OK;
        }
        return DECLINED;
    }

    sub PerlPassThru ($$$$) {
        my($cfg, $parms, $local, $remote) = @_;
        unless ($remote =~ /^http:/) {
            die "Argument `$remote' is not a URL\n";
        }
        $cfg->{PassThru}{$local} = $remote;
    }

    1;
    __END__
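To see how the directive handler and the URI rewriting cooperate without starting a server, the following stand-alone sketch may help. It assumes a plain hash reference in place of the real configuration object, passes undef for $parms, and introduces a hypothetical helper, map_uri(), that imitates only the substitution step of handler(). It also quotes the local path with \Q...\E, which the listing does not do.

```perl
use strict;
use warnings;

# Same body as the PerlPassThru directive handler; $parms is unused here.
sub PerlPassThru ($$$$) {
    my($cfg, $parms, $local, $remote) = @_;
    die "Argument `$remote' is not a URL\n" unless $remote =~ /^http:/;
    $cfg->{PassThru}{$local} = $remote;
}

# Hypothetical helper: apply the first matching mapping, as handler() does.
# Unlike the listing, the local path is quoted so it is matched literally.
sub map_uri {
    my($cfg, $uri) = @_;
    my %mappings = $cfg->{PassThru} ? %{ $cfg->{PassThru} } : ();
    foreach my $src (sort keys %mappings) {
        return $uri if $uri =~ s/^\Q$src\E/$mappings{$src}/;
    }
    return;   # no mapping: the real handler would return DECLINED
}

# A plain hash reference stands in for the configuration object.
my $cfg = {};
PerlPassThru($cfg, undef, '/CPAN',   'http://www.perl.com/CPAN');
PerlPassThru($cfg, undef, '/search', 'http://www.altavista.com');

my $mapped   = map_uri($cfg, '/CPAN/README.html');
my $unmapped = map_uri($cfg, '/index.html');

print "mapped: $mapped\n";
print "unmapped: ", (defined $unmapped ? $unmapped : '(declined)'), "\n";
```

Each call to PerlPassThru() here corresponds to one PerlPassThru line in the configuration file, which is why repeating the directive accumulates mappings rather than overwriting them.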
Designing Configuration Directives

We'll now look in more detail at how you can precisely control the behavior of configuration directives.

As you recall, a module's configuration directives are declared in an array of hashes passed to the command_table() function. Each hash contains the required keys name and errmsg. In addition, there may be any of four optional keys: func, args_how, req_override and cmd_data. For example, this code fragment defines two configuration directives named TrafficCopSpeedLimit and TrafficCopRightOfWay:

    @directives = (
        { name         => 'TrafficCopSpeedLimit',
          errmsg       => 'an integer specifying the maximum allowable kilobytes per second',
          func         => 'right_of_way',
          args_how     => 'TAKE1',
          req_override => 'OR_ALL',
        },
        { name         => 'TrafficCopRightOfWay',
          errmsg       => 'list of domains that can go as fast as they want',
          args_how     => 'ITERATE',
          req_override => 'OR_ALL',
          cmd_data     => '[A-Z_]+',
        },
    );
    command_table(\@directives);

The required name key points to the name of the directive. It should have exactly the same spelling and capitalization as the directive you want to implement (Apache doesn't actually care about the capitalization of directives, but Perl does when it goes to call your configuration processing callbacks). Alternatively, you can use the optional func key to specify a subroutine with a different name than the configuration directive.

The mandatory errmsg key should be a short usage statement that summarizes the arguments that the directive takes.

The optional args_how key tells Apache how to parse the directive. There are 11 (!) possibilities corresponding to different numbers of mandatory and optional arguments. Because the number of arguments passed to the Perl callback function for processing depends on the value of args_how, the callback function must know in advance how many arguments to expect.

The optional cmd_data key can be used to pass arbitrary information to the directive handler. The handler can retrieve this information by calling the $parms object's info() method. In our example, we use this information to pass a pattern match expression to the callback. This is how it might be used:

    sub TrafficCopRightOfWay ($$@) {
        my($cfg, $parms, $domain) = @_;
        my $pat = $parms->info;
        unless ($domain =~ /^$pat$/i) {
            die "Invalid domain: $domain\n";
        }
        $cfg->{RightOfWay}{$domain}++;
    }

req_override, another optional key, is used to restrict the directive so that it can only legally appear in certain sections of the configuration files.
Specifying Configuration Directive Syntax

Most configuration-processing callbacks will declare function prototypes that describe how they are intended to be called. Although in the current implementation Perl does not check callbacks' prototypes at runtime, they serve a very useful function nevertheless. The command_table() function can use callback prototypes to choose the correct syntax for the directive on its own. If no args_how key is present in the definition of the directive, command_table() will pull in the .pm file containing the callback definitions and attempt to autogenerate the args_how field on its own, using the Perl prototype() builtin function. By specifying the correct prototype, you can forget about args_how entirely and let command_table() take care of choosing the correct directive parsing method for you.

If both an args_how and a function prototype are provided, command_table() will use the value of args_how in case of a disagreement. If neither an args_how nor a function prototype is present, command_table() will choose a value of TAKE123, which is a relatively permissive parsing rule.

Apache supports a total of 11 different directive parsing methods. This section lists their symbolic constants and the Perl prototypes to use if you wish to take advantage of configuration definition shortcuts.
DECLINE_CMD is a constant that must be explicitly imported from Apache::Constants. It is used in the rare circumstance in which a module redeclares another module's directive in order to override it. The directive handler can then return DECLINE_CMD when it wishes the directive to fall through to the original module's handler.
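The prototype lookup described in this section can be imitated with Perl's prototype() builtin. The sketch below is not the Apache::ExtUtils implementation: guess_args_how() is a hypothetical helper, and its table contains only the four prototype/args_how pairs that appear in this chapter's examples. Treating an unrecognized prototype like a missing one is this sketch's own choice.

```perl
use strict;
use warnings;
no warnings 'illegalproto';   # "$$@;@" places characters after "@"

# Only the prototype/args_how pairs used in this chapter's examples.
my %args_how_for = (
    '$$$'   => 'TAKE1',
    '$$$$'  => 'TAKE2',
    '$$@'   => 'ITERATE',
    '$$@;@' => 'ITERATE2',
);

# Hypothetical helper, not part of Apache::ExtUtils: guess args_how
# for a callback from its declared prototype.
sub guess_args_how {
    my($callback) = @_;
    my $proto = prototype($callback);         # undef when no prototype is declared
    return 'TAKE123' unless defined $proto;   # the fallback described in the text
    return $args_how_for{$proto} || 'TAKE123';
}

# Empty callbacks that exist only to carry prototypes.
sub PerlPassThru ($$$$) { }
sub AddType ($$@;@)     { }
sub NoPrototype         { }

for my $name (qw(PerlPassThru AddType NoPrototype)) {
    my $code = \&{"main::$name"};
    print "$name: ", guess_args_how($code), "\n";
}
```

An explicit args_how in the directive definition would take precedence over whatever such a lookup returns, as the text notes.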
Restricting Configuration Directive Usage

In addition to specifying the syntax of your custom configuration directives, you can establish limits on how they can be used by specifying the req_override key in the data passed to command_table(). This option controls which parts of the configuration files the directives can appear in, something that is called the directive's "context" in the Apache manual pages. This key should point to a bitmap formed by combining the values of several C-language constants:

    'req_override' => 'RSRC_CONF | ACCESS_CONF'

As in the case of args_how, the value of the req_override key is actually not evaluated. It is simply a string that is written into the .xs file and eventually passed to the C compiler. This means that any errors in the string you provide for req_override will not be caught until the compilation phase.
Directive Definition Shortcuts

We've already seen one way to simplify your configuration directives by allowing command_table() to deduce the correct args_how from the callback's function prototype. One other shortcut is available to you as well.

If you pass command_table() a list of array references rather than hash references, then it will take the first item in each array ref to be the name of the configuration directive, and the second item to be the error/usage message. req_override will default to OR_ALL (allow this directive anywhere), and args_how will be derived from the callback prototype, if present, TAKE123 if not.

By taking advantage of this shortcut, we can rewrite the list of configuration directives at the beginning of this section more succinctly:

    @directives = (
        [ 'TrafficCopSpeedLimit',
          'an integer specifying the maximum allowable bytes per second',
        ],
        [ 'TrafficCopRightOfWay',
          'list of domains that can go as fast as they want',
        ],
    );
    command_table(\@directives);
You can also mix and match the two configuration styles. The @directives list can contain a mixture of array references and hash references.
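The defaults applied to the array-reference form can be pictured as a small normalization step. The sketch below is illustrative only: expand_directive() is a hypothetical helper, not a function provided by Apache::ExtUtils, and it applies just the documented defaults without attempting the prototype lookup.

```perl
use strict;
use warnings;

# Hypothetical helper: expand the array-ref shortcut into the full hash form.
# The real command_table() would try the callback prototype before falling
# back to TAKE123; this sketch applies only the documented defaults.
sub expand_directive {
    my($spec) = @_;
    return +{ %$spec } if ref $spec eq 'HASH';   # already in long form
    my($name, $errmsg) = @$spec;
    return +{
        name         => $name,
        errmsg       => $errmsg,
        args_how     => 'TAKE123',
        req_override => 'OR_ALL',
    };
}

# One directive in each style.
my @directives = (
    [ 'TrafficCopSpeedLimit',
      'an integer specifying the maximum allowable bytes per second' ],
    { name         => 'TrafficCopRightOfWay',
      errmsg       => 'list of domains that can go as fast as they want',
      args_how     => 'ITERATE',
      req_override => 'OR_ALL' },
);

my @expanded = map { expand_directive($_) } @directives;
printf "%-22s %-8s %s\n", @{$_}{qw(name args_how req_override)} for @expanded;
```

The hash form remains the one to use whenever a directive needs a restricted context or a cmd_data value, since the short form has no slot for either.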
Configuration Creation and Merging

Digging deeper, the process of module configuration is more complex than you'd expect because Apache recognizes multiple levels of configuration directives. There are global directives such as those contained within the main httpd.conf file, per-server directives specific to virtual hosts contained within <VirtualHost> sections, and per-directory configuration directives contained within <Directory> sections and .htaccess files.

To understand why this issue is important, consider this series of directives:

    TrafficCopSpeedLimit 55

    <Location /I-95>
      TrafficCopRightOfWay .mil .gov
      TrafficCopSpeedLimit 65
    </Location>

    <Location /I-95/exit-13>
      TrafficCopSpeedLimit 30
    </Location>

When processing URLs in /I-95/exit-13 there's a potential source of conflict because the TrafficCopSpeedLimit directive appears in several places. Intuitively, the more specific directive should take precedence over the one in its parent directory, but what about TrafficCopRightOfWay? Should /I-95/exit-13 inherit the value of TrafficCopRightOfWay or ignore it?

On top of this, there is the issue of per-server and per-directory configuration information. Some directives, such as ServerName, clearly apply to the server as a whole and have no reason to change on a per-directory basis. Other directives, such as Options, apply to individual directories or URIs. Per-server and per-directory configuration information should be handled separately from each other.

To handle these issues, modules may declare as many as four subroutines to handle configuration issues: SERVER_CREATE(), DIR_CREATE(), SERVER_MERGE() and DIR_MERGE().

The SERVER_CREATE() and DIR_CREATE() routines are responsible for creating per-server and per-directory configuration records. If present, they are invoked before Apache has processed any of the module's configuration directives in order to create a default per-server or per-directory configuration record. Provided that at least one of the module's configuration directives appears in the main part of the configuration file, SERVER_CREATE() will be called once for the main server host, and once for each virtual host. Similarly, DIR_CREATE() will be called once for each directory section (including <Location> and .htaccess files) in which at least one of the module's configuration directives appears.
As Apache parses and processes the module's custom directives, it invokes the directive callbacks to add information to the per-server and/or per-directory configuration records. Since the vast majority of modules act at a per-directory level, Apache passes the per-directory record to the callbacks as the first argument. This is the $cfg argument that the directive callbacks receive.

Later in the configuration process, one or both of the SERVER_MERGE() and DIR_MERGE() subroutines may be called. These routines are responsible for merging a parent per-server or per-directory configuration record with a configuration that is lower in the hierarchy. For example, merging will be required when one or more of a module's configuration directives appear in both a <Location /images> section and a <Location /images/PNG> section. In this case, DIR_CREATE() will be called to create default configuration records for each of the /images and /images/PNG directories, and the configuration directives' callbacks will be called to set up the appropriate fields in these newly-created configurations. After this, the DIR_MERGE() subroutine is called once to merge the two configuration records together. The merged configuration now becomes the per-directory configuration for /images/PNG.

This merging process is repeated as many times as needed. If a directory or virtual host section contains none of a particular module's configuration directives, then the configuration handlers are skipped and the configuration for the closest ancestor of the directory is used instead.

In addition to being called at server startup time, the DIR_CREATE() function may be invoked again at request time, for example whenever Apache processes a .htaccess file. The DIR_MERGE() functions are always invoked at request time in order to merge the current directory's configuration with its parents.
When C modules implement configuration directive handlers they must, at the very least, define a per-directory or per-server constructor for their configuration data. However, if a Perl module does not implement a constructor, mod_perl uses a default constructor that creates a hash reference blessed into the current package's class. Later Apache calls your module's directive callbacks to fill in this empty hash, which is, as usual, passed in as the first argument.

Neither C nor Perl modules are required to implement merging routines. If they do not, merging simply does not happen and Apache uses the most specific configuration record. In the example at the top of this section, the configuration record for the URI location /I-95/exit-13 would contain the current value of TrafficCopSpeedLimit, but no specific value for TrafficCopRightOfWay.

Depending on your module's configuration system, you may wish to implement one or more of the configuration creation and merging methods. The method names use the all upper-case naming convention as they are never called by other user code; instead they are invoked by mod_perl from the C level.
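To make the effect of merging concrete, here is a minimal stand-alone sketch built on the TrafficCop example at the top of this section. It assumes that DIR_CREATE() is called as a class method and that DIR_MERGE() receives the parent record followed by the more specific record and returns the merged result; the package name and the hash keys are hypothetical, and the calls that mod_perl would make from the C level are simulated by hand.

```perl
package TrafficCop::Sketch;   # hypothetical package name for this illustration
use strict;
use warnings;

# Constructor: a default per-directory record, a blessed hash as the text describes.
sub DIR_CREATE {
    my $class = shift;
    return bless {}, $class;
}

# Merge: assumed to receive the parent record and the more specific record,
# and to return a new record in which the more specific values win.
sub DIR_MERGE {
    my($parent, $current) = @_;
    my %merged = (%$parent, %$current);
    return bless \%merged, ref $parent;
}

# Simulate the two <Location> sections from the example.
my $i95 = __PACKAGE__->DIR_CREATE;
$i95->{SpeedLimit} = 65;
$i95->{RightOfWay} = { '.mil' => 1, '.gov' => 1 };

my $exit13 = __PACKAGE__->DIR_CREATE;
$exit13->{SpeedLimit} = 30;

my $merged = DIR_MERGE($i95, $exit13);
print "speed limit: $merged->{SpeedLimit}\n";
print "right of way: ", join(' ', sort keys %{ $merged->{RightOfWay} }), "\n";
```

With this merge in place, /I-95/exit-13 keeps its own speed limit but inherits the right-of-way list from /I-95. Without a merge routine, only the record in $exit13 would apply, and it has no RightOfWay entry at all.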
The Apache::CmdParms and Apache::ModuleConfig Classes

The configuration mechanism uses two auxiliary classes, Apache::CmdParms and Apache::ModuleConfig, to pass information between Apache and your module.

Apache::ModuleConfig is the simpler of the two. It provides just a single method, get(), which retrieves a module's current configuration information. The return value is the object created by the module's DIR_CREATE() or SERVER_CREATE() methods.

The get() method is called with the current request object or server object and an optional additional argument indicating which module to retrieve the configuration from. In the typical case, you'll omit this additional argument to indicate that you want to fetch the configuration information for the current module. For example, we saw this in the Apache::PassThru handler() routine:

    my $cfg = Apache::ModuleConfig->get($r);

Had we used a SERVER_CREATE() method, the configuration data would be obtained using the request server object:

    my $cfg = Apache::ModuleConfig->get($r->server);

As a convenience, the per-directory configuration object for the current module is always the first argument passed to any configuration processing callback routine. Directive processing callbacks that need to operate on server-specific configuration data should ignore this hash and fetch the configuration data themselves using a technique we will discuss shortly.

It is also possible for one module to peek at another module's configuration data by naming its package as the second argument to get():

    my $friends_cfg = Apache::ModuleConfig->get($r, 'Apache::TrafficCop');

You can now read and write the other module's configuration information!

Apache::CmdParms is a helpful class that Apache uses to pass a variety of configuration information to modules. An Apache::CmdParms object is the second argument passed to directive handler routines.

The various methods available from Apache::CmdParms are listed fully in the next chapter. The two you are most likely to use in your modules are server() and path().

server() returns the Apache::Server object corresponding to the current configuration. From this object you can retrieve the virtual host's name, its configured port, the document root, and other core configuration information. For example, this code retrieves the administrator's name from within a configuration callback and adds it to the module's configuration table:

    sub TrafficCopActiveSergeant ($$$) {
        my($cfg, $parms, $arg) = @_;
        $cfg->{Sergeant} = $arg;
        my $chief_of_police = $parms->server->server_admin;
        $cfg->{ChiefOfPolice} = $chief_of_police;
    }

Another place where the server() method is vital is when directive processing callbacks need to set server-specific configuration information. In this case, the per-directory configuration passed as the first callback argument can be ignored, and the per-server configuration fetched by calling the Apache::ModuleConfig get() method with the server object as its argument. Here's an example:

    sub TrafficCopDispatcher ($$$) {
        my($cfg, $parms, $arg) = @_;
        my $scfg = Apache::ModuleConfig->get($parms->server);
        $scfg->{Dispatcher} = $arg;
    }

If the configuration-processing routine is being called to process a container directive such as <Location> or <Directory>, the Apache::CmdParms path() method will return the directive's argument. Depending on the context this might be a URI, a directory path, a virtual host address, or a filename pattern.

See Chapter 9 for the details on other methods that Apache::ModuleConfig and Apache::CmdParms make available.
Reimplementing mod_mime in Perl

As a full example of creating custom configuration directives, we're going to reimplement the standard mod_mime module in Perl. It has a total of seven different directives, each with a different argument syntax. In addition to showing you how to handle a moderately complex configuration setup, this example will show you in detail what goes on behind the scenes as mod_mime associates a content handler with each URI request.

This module replaces the standard mod_mime module. You do not have to remove mod_mime from the standard compiled-in modules in order to test this module. However, if you wish to remove mod_mime anyway in order to convince yourself that the replacement actually works, the easiest way to do this is to compile mod_mime as a dynamically loaded module and then comment out the lines in httpd.conf that load it. In either case, install Apache::MIME as the default MIME checking phase handler by putting this line in perl.conf or one of the other configuration files:

    PerlTypeHandler Apache::MIME

Like the previous example, the configuration information is contained in two files. Makefile.PL (Listing 8.3) describes the directives, and Apache/MIME.pm (Listing 8.4) defines the callbacks for processing the directives at runtime. In order to reimplement mod_mime, we need to reimplement a total of seven directives, including SetHandler, AddHandler, AddType and AddEncoding.

Makefile.PL defines the seven directives using the anonymous hash method. All but one of the directives are set to use the OR_FILEINFO context, which allows the directives to appear anywhere in the main configuration files, and in .htaccess files as well provided that AllowOverride FileInfo is also set. The exception, TypesConfig, is the directive that indicates where the default table of MIME types is to be found. It only makes sense to process this directive during server startup, so its context is given as RSRC_CONF, limiting the directive to the body of any of the *.conf files. We don't specify the args_how key for the directives, instead allowing command_table() to figure out the syntax for us by looking at the function prototypes in MIME.pm.

Running perl Makefile.PL will now create a .xs file, which will be compiled into a loadable object file during make.

Turning to Listing 8.4, we start by bringing in the DynaLoader and Apache::ModuleConfig modules as we did in the overview example at the beginning of this section:
    package Apache::MIME;
    # File: Apache/MIME.pm
    use strict;
    use vars qw($VERSION @ISA);
    use LWP::MediaTypes qw(read_media_types guess_media_type add_type add_encoding);
    use DynaLoader ();
    use Apache ();
    use Apache::ModuleConfig ();
    use Apache::Constants qw(:common DIR_MAGIC_TYPE DECLINE_CMD);

    @ISA = qw(DynaLoader);
    $VERSION = '0.01';

    if($ENV{MOD_PERL}) {
        no strict;
        @ISA = qw(DynaLoader);
        __PACKAGE__->bootstrap($VERSION);
    }
We also bring in Apache, Apache::Constants and an LWP library called LWP::MediaTypes. The Apache and Apache::Constants
libraries will be used within the handler() subroutine, while the LWP library provides utilities for guessing MIME
types, languages and encodings from file extensions. As before, Apache::MIME needs to call bootstrap immediately after loading other modules in order to bring in its compiled
.xs half. Notice that we have to explicitly import the Let's skip over handler() for the moment and look at the seven configuration callbacks, TypesConfig(), AddType(), AddEncoding() and so on.
    sub TypesConfig ($$$) {
        my($cfg, $parms, $file) = @_;
        my $types_config = Apache->server_root_relative($file);
        read_media_types($types_config);
        #to co-exist with mod_mime.c
        return DECLINE_CMD if Apache->module("mod_mime.c");
    }

TypesConfig() has a function prototype of ``$$$'', indicating a directive syntax of TAKE1. It will be called with the name of the file holding the MIME types table as its third argument. The callback retrieves the file name, turns it into a server-relative path, and stores the path into a lexical variable. The callback then calls the LWP function read_media_types() to parse the file and add the MIME types found there to an internal table maintained by LWP::MediaTypes. When the LWP::MediaTypes function guess_media_type() is called subsequently, this table will be consulted. Note that there is no need, in this case, to store the configuration information into the $cfg hash, because LWP::MediaTypes keeps the table itself.

Another important detail is that the TypesConfig handler will return DECLINE_CMD if the mod_mime module is present, so that mod_mime.c gets a chance to process the directive as well.
    sub AddType ($$@;@) {
        my($cfg, $parms, $type, $ext) = @_;
        add_type($type, $ext);
    }

The AddType() directive callback is even shorter. Its function prototype is ``$$@;@'', indicating an ITERATE2 syntax. This means that if the AddType directive looks like this:

    AddType application/x-chicken-feed .corn .barley .oats

the function will be called three times. Each time the callback is invoked its third argument will be ``application/x-chicken-feed'', and the fourth argument will be successively set to ``.corn'', ``.barley'' and ``.oats''. The function recovers the third and fourth parameters and passes them to the LWP::MediaTypes function add_type(). This simply adds the file type and extension to LWP's internal table.
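To make the ITERATE2 calling convention concrete, here is a minimal sketch in plain Perl that runs outside Apache. The dispatch_iterate2() helper and the %type_for_ext table are hypothetical stand-ins for Apache's directive dispatcher and for LWP's internal table; only the shape of the callback's argument list is taken from the description above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the table that LWP::MediaTypes maintains.
my %type_for_ext;

# Same argument layout as the AddType() callback: the configuration
# hash, the parameters object, the MIME type, and ONE extension.
sub add_type_callback {
    my ($cfg, $parms, $type, $ext) = @_;
    $ext =~ s/^\.//;                 # store extensions without the dot
    $type_for_ext{$ext} = $type;
}

# Hypothetical helper imitating ITERATE2 dispatch: the first directive
# argument is repeated, and the callback is invoked once for each of
# the remaining arguments.
sub dispatch_iterate2 {
    my ($callback, $first, @rest) = @_;
    $callback->(undef, undef, $first, $_) for @rest;
    return scalar @rest;             # number of invocations
}

my $calls = dispatch_iterate2(\&add_type_callback,
                              'application/x-chicken-feed',
                              qw(.corn .barley .oats));

print "callback invoked $calls times\n";
print "$_ => $type_for_ext{$_}\n" for sort keys %type_for_ext;
```

The point of the sketch is only that the callback never sees the whole list of extensions at once; it sees one (type, extension) pair per call.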
    sub AddEncoding ($$@;@) {
        my($cfg, $parms, $enc, $ext) = @_;
        add_encoding($enc, $ext);
    }

AddEncoding() is similar to AddType(), but uses the LWP::MediaTypes add_encoding() function to associate a series of file extensions with a MIME encoding.

More interesting are the SetHandler() and AddHandler() callbacks:

    sub SetHandler ($$$) {
        my($cfg, $parms, $handler) = @_;
        $cfg->{'handler'} = $handler;
    }

    sub AddHandler ($$@;@) {
        my($cfg, $parms, $handler, $ext) = @_;
        $cfg->{'handlers'}->{$ext} = $handler;
    }

The job of the SetHandler directive is to force requests for the specified path to be passed to the indicated content handler, no questions asked. AddHandler(), in contrast, adds a series of file extensions to the table consulted by the MIME type checker when it attempts to choose the proper content handler for the request. In both cases, the configuration information is needed again at request time, so we have to keep it in long term storage within the $cfg hash.

SetHandler() is again a ``TAKE1'' type of callback. It recovers the content handler name from its third argument and stores it in the $cfg hash under the key handler. AddHandler() is an ITERATE2 callback that records each file extension, together with its handler name, in a hash kept under the key handlers.
    sub ForceType ($$$) {
        my($cfg, $parms, $type) = @_;
        $cfg->{'type'} = $type;
    }

The ForceType directive is used to force all documents in a path to be a particular MIME type, regardless of their file extensions. It's often used within a <Directory> section to force all documents contained within the directory to be a particular MIME type, and is helpful for dealing with legacy documents that don't have informative file extensions. The ForceType() callback uses TAKE1 syntax in which the required argument is a MIME type. The callback recovers the MIME type and stores it in the $cfg hash under the key type.
    sub AddLanguage ($$@;@) {
        my($cfg, $parms, $language, $ext) = @_;
        $ext =~ s/^\.//;
        $cfg->{'language_types'}->{$ext} = lc $language;
    }

The last directive handler, AddLanguage(), implements the AddLanguage directive, in which a series of file extensions are associated with a language code (e.g. ``fr'' for French, ``en'' for English). It is an ITERATE2 callback and works just like AddHandler(), except that the dot is stripped off the file extension before storing it into the language_types hash of $cfg, and the language code is converted to lowercase.

Now we turn our attention to the handler() subroutine itself. This code will be called at request time during the MIME type checking phase. It has four responsibilities:
    sub handler {
        my $r = shift;

        if(-d $r->finfo) {
            $r->content_type(DIR_MAGIC_TYPE);
            return OK;
        }

handler() begins by shifting the Apache request object off the subroutine stack. The subroutine now does a series of checks on the requested document. First, it checks whether the requested file is a directory. If so, it sets the content type to DIR_MAGIC_TYPE, the special type that marks the request as a directory, and returns OK immediately.

        my($type, @encoding) = guess_media_type($r->filename);
        $r->content_type($type) if $type;
        unshift @encoding, $r->content_encoding if $r->content_encoding;
        $r->content_encoding(join ", ", @encoding) if @encoding;

If the file is not a directory, then we try to guess its MIME type and encoding. We call on the LWP::MediaTypes function guess_media_type() to do the work, passing it the filename and receiving a MIME type and list of encodings in return. Although unusual, it is theoretically possible for a file to have multiple encodings and LWP::MediaTypes allows this. The returned type is immediately used to set the MIME type of the requested document by calling the request object's content_type() method. Likewise, the list of encodings is added to the request using content_encoding() after joining them together into a comma-delimited string. The only subtlety here is that we honor any previously-defined encoding for the requested document by adding it to the list of encodings returned by guess_media_type(). This is in case the handler for a previous phase happened to add some content encoding.
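The encoding merge described above can be tried outside the server. The following is a minimal sketch in plain Perl; merge_encodings() is a hypothetical helper, and the encoding names passed to it are example values rather than anything returned by LWP::MediaTypes.

```perl
use strict;
use warnings;

# Hypothetical helper. $previous is whatever an earlier phase stored as
# the content encoding (possibly undef); @guessed is the list of
# encodings guessed from the filename.
sub merge_encodings {
    my ($previous, @guessed) = @_;

    # Honor the earlier value by placing it at the front of the list.
    unshift @guessed, $previous if defined $previous && length $previous;

    # Return a comma-delimited string, or undef if there is nothing to set.
    return @guessed ? join(", ", @guessed) : undef;
}

my $merged = merge_encodings('x-gzip', 'x-compress');
my $none   = merge_encodings(undef);

print defined $merged ? "$merged\n" : "(no encoding)\n";
print defined $none   ? "$none\n"   : "(no encoding)\n";
```

Returning undef when the list is empty mirrors the "if @encoding" guard in the handler, which leaves the request's content encoding untouched when there is nothing to add.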
Now comes some processing that depends on the values in the configuration hash, so we recover the $cfg hash by calling the Apache::ModuleConfig get() method:

        my $cfg = Apache::ModuleConfig->get($r);

The next task is to parse out the requested file's extensions and use them to set the file's MIME type and/or language.

        for my $ext (LWP::MediaTypes::file_exts($r->filename)) {
            if(my $type = $cfg->{'language_types'}->{$ext}) {
                my $ltypes = $r->content_languages;
                push @$ltypes, $type;
                $r->content_languages($ltypes);
            }

Using the LWP::MediaTypes function file_exts(), we split out all the extensions in the requested document's filename and loop through them. This allows a file named ``travel.html.fr'' to be recognized and dealt with appropriately. We first check whether the extension matches one of the extensions in the configuration object's language_types key. If so, we use the extension to set the language code for the document. Although it is somewhat unusual, the HTTP specification allows a document to specify multiple languages in its Content-Language field, so we go to some lengths to merge multiple language codes into one long list which we then set with the request object's content_languages() method.

            if(my $type = $cfg->{'handlers'}->{$ext} and !$r->proxyreq) {
                $r->handler($type);
            }

        }

While still in the loop, we deal with the content handler for the request. We check whether the extension is among the ones defined in the configuration variable's handlers hash. If so, we call the request object's handler() method to set the content handler to the indicated value. The only catch is that if the current transaction is a proxy request, we do not want to alter the content handler, because another module may have set the content handler during the URI translation phase.
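The extension loop can be exercised without Apache or LWP. The sketch below is plain Perl under stated assumptions: split_exts() is a simplified stand-in for LWP::MediaTypes::file_exts() (the real function may order its results differently or handle edge cases in other ways), and the %cfg tables and classify() are hypothetical, shaped like the language_types and handlers entries of the configuration hash.

```perl
use strict;
use warnings;

# Simplified stand-in for LWP::MediaTypes::file_exts().
sub split_exts {
    my ($path) = @_;
    (my $base = $path) =~ s{.*/}{};    # strip the directory part
    my @parts = split /\./, $base;
    shift @parts;                      # discard the name before the first dot
    return @parts;
}

# Hypothetical tables, keyed by extension without the leading dot.
my %cfg = (
    language_types => { fr => 'fr', en => 'en' },
    handlers       => { cgi => 'cgi-script' },
);

# Walk every extension, collecting language codes and remembering the
# last content handler that matched.
sub classify {
    my ($path) = @_;
    my (@languages, $handler);
    for my $ext (split_exts($path)) {
        if (my $lang = $cfg{language_types}{$ext}) {
            push @languages, $lang;
        }
        if (my $h = $cfg{handlers}{$ext}) {
            $handler = $h;
        }
    }
    return (\@languages, $handler);
}

my ($langs, $handler) = classify('/docs/travel.html.fr');
print "languages: @$langs\n";
print "handler: ", (defined $handler ? $handler : '(unchanged)'), "\n";
```

Because every extension is examined, a name such as ``form.cgi.en'' would pick up both a language and a handler in a single pass.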
        $r->content_type($cfg->{'type'}) if $cfg->{'type'};
        $r->handler($cfg->{'handler'}) if $cfg->{'handler'};

After looping through the file extensions, we handle the ForceType and SetHandler directives, which have the effect of overriding file extensions. If the configuration key type is non-empty, we use it to force the MIME type to the specified value. Likewise, if handler is non-empty, we again call handler(), replacing whatever content handler was there before.

        return OK;
    }

At the end of handler() we return OK to tell Apache that the MIME type checking phase has been handled successfully.

Although this module was presented mainly as an exercise, with minimal work it can be used to improve on mod_mime. For example, you might have noticed that the standard mod_mime has no ForceEncoding or ForceLanguage directives that allow you to override the file extension mappings in the way that you can with ForceType. This is easy enough to fix in Apache::MIME by adding the appropriate directive definitions and callbacks.
    package Apache::MIME;
    # File: Makefile.PL

    use ExtUtils::MakeMaker;
    # See lib/ExtUtils/MakeMaker.pm for details of how to influence
    # the contents of the Makefile that is written.
    use Apache::src ();
    use Apache::ExtUtils qw(command_table);

    my @directives = (
        { name         => 'SetHandler',
          errmsg       => 'a handler name',
          req_override => 'OR_FILEINFO' },
        { name         => 'AddHandler',
          errmsg       => 'a handler name followed by one or more file extensions',
          req_override => 'OR_FILEINFO' },
        { name         => 'ForceType',
          errmsg       => 'a MIME type',
          req_override => 'OR_FILEINFO' },
        { name         => 'AddType',
          errmsg       => 'a mime type followed by one or more file extensions',
          req_override => 'OR_FILEINFO' },
        { name         => 'AddLanguage',
          errmsg       => 'a language (e.g., fr), followed by one or more file extensions',
          req_override => 'OR_FILEINFO' },
        { name         => 'AddEncoding',
          errmsg       => 'an encoding (e.g., gzip), followed by one or more file extensions',
          req_override => 'OR_FILEINFO' },
        { name         => 'TypesConfig',
          errmsg       => 'the MIME types config file',
          req_override => 'RSRC_CONF' },
    );

    command_table \@directives;

    WriteMakefile(
        'NAME'         => __PACKAGE__,
        'VERSION_FROM' => 'MIME.pm',
        'INC'          => Apache::src->new->inc,
    );
    __END__
    package Apache::MIME;
    # File: Apache/MIME.pm

    use strict;
    use vars qw($VERSION @ISA);
    use LWP::MediaTypes qw(read_media_types guess_media_type
                           add_type add_encoding);
    use DynaLoader ();
    use Apache ();
    use Apache::ModuleConfig ();
    use Apache::Constants qw(:common DIR_MAGIC_TYPE DECLINE_CMD);

    @ISA = qw(DynaLoader);

    $VERSION = '0.01';

    if($ENV{MOD_PERL}) {
        no strict;
        @ISA = qw(DynaLoader);
        __PACKAGE__->bootstrap($VERSION);
    }

    sub handler {
        my $r = shift;

        if(-d $r->finfo) {
            $r->content_type(DIR_MAGIC_TYPE);
            return OK;
        }

        my($type, @encoding) = guess_media_type($r->filename);
        $r->content_type($type) if $type;
        unshift @encoding, $r->content_encoding if $r->content_encoding;
        $r->content_encoding(join ", ", @encoding) if @encoding;

        my $cfg = Apache::ModuleConfig->get($r);

        for my $ext (LWP::MediaTypes::file_exts($r->filename)) {
            if(my $type = $cfg->{'language_types'}->{$ext}) {
                my $ltypes = $r->content_languages;
                push @$ltypes, $type;
                $r->content_languages($ltypes);
            }

            if(my $type = $cfg->{'handlers'}->{$ext} and !$r->proxyreq) {
                $r->handler($type);
            }

        }

        $r->content_type($cfg->{'type'}) if $cfg->{'type'};
        $r->handler($cfg->{'handler'}) if $cfg->{'handler'};

        return OK;
    }

    sub TypesConfig ($$$) {
        my($cfg, $parms, $file) = @_;
        my $types_config = Apache->server_root_relative($file);
        read_media_types($types_config);
        #to co-exist with mod_mime.c
        return DECLINE_CMD if Apache->module("mod_mime.c");
    }

    sub AddType ($$@;@) {
        my($cfg, $parms, $type, $ext) = @_;
        add_type($type, $ext);
    }

    sub AddEncoding ($$@;@) {
        my($cfg, $parms, $enc, $ext) = @_;
        add_encoding($enc, $ext);
    }

    sub SetHandler ($$$) {
        my($cfg, $parms, $handler) = @_;
        $cfg->{'handler'} = $handler;
    }

    sub AddHandler ($$@;@) {
        my($cfg, $parms, $handler, $ext) = @_;
        $cfg->{'handlers'}->{$ext} = $handler;
    }

    sub ForceType ($$$) {
        my($cfg, $parms, $type) = @_;
        $cfg->{'type'} = $type;
    }

    sub AddLanguage ($$@;@) {
        my($cfg, $parms, $language, $ext) = @_;
        $ext =~ s/^\.//;
        $cfg->{'language_types'}->{$ext} = lc $language;
    }

    1;
    __END__
Configuring Apache with Perl

We've just seen how you can configure Perl modules using the Apache configuration mechanism. Now we turn it around to show you how to configure Apache from within Perl. Instead of configuring Apache by hand-editing a set of configuration files, the Perl API allows you to write a set of Perl statements to dynamically configure Apache at run time. This gives you a great deal of flexibility. For example, you can create complex configurations involving hundreds of virtual hosts without manually typing hundreds of <VirtualHost> sections into httpd.conf. Or you can write a master configuration file that will work without modification on any machine in a ``server farm.'' You could even look up configuration information at run time from a relational database.

The key to Perl-based server configuration is the <Perl> directive. Unlike the other directives defined by mod_perl, this directive is paired to a corresponding </Perl> directive, forming a Perl section. When Apache hits a Perl section during startup time, it passes everything within the section to mod_perl. mod_perl, in turn, compiles the contents of the section by evaluating it inside the Apache::ReadConfig package. After compilation is finished, mod_perl walks the Apache::ReadConfig symbol table looking for global variables with the same names as Apache's configuration directives. The values of those globals are then fed into Apache's normal configuration mechanism as if they'd been typed directly into the configuration file. The upshot of all this is that instead of setting the account under which the server runs with the User directive:

    User www

you can write this:

    <Perl>
    $User = 'www';
    </Perl>

This doesn't look like much of a win until you consider that you can set this global using any arbitrary Perl expression, as for example:

    <Perl>
    my $hostname = `hostname`;
    $User = 'www'    if $hostname =~ /^papa-bear/;
    $User = 'httpd'  if $hostname =~ /^momma-bear/;
    $User = 'nobody' if $hostname =~ /^goldilocks/;
    </Perl>

The Perl global that you set must match the spelling of the corresponding Apache directive. Globals that do not match known Apache directives are silently ignored. Capitalization is not currently significant.

In addition to single-valued directives such as User, Group and ServerRoot, you can use <Perl> sections to set multivalued directives such as DirectoryIndex and AddType. You can also configure multipart sections such as <Directory> and <VirtualHost>. Depending on the directive, the Perl global you need to set may be a scalar, an array or a hash. To figure out which type of Perl variable to use, follow these rules:
<Perl> sections are available only when mod_perl has been built with the PERL_SECTIONS configuration variable set (Appendix B). They are evaluated in the order in which they appear in httpd.conf, srm.conf and access.conf. This allows you to use later <Perl> sections to override values declared in earlier parts of the configuration files.
Debugging <Perl> Sections

If there is a syntax error in the Perl code causing it to fail during compilation, Apache will report the problem and the server will not start.

One way to catch Perl syntax errors ahead of time is to structure your <Perl> sections like this:

    <Perl>
    #!perl

    #... code here ...

    __END__
    </Perl>

You can now directly syntax check the configuration file using the Perl interpreter's -cx switches. -c makes Perl perform a syntax check, and -x tells the interpreter to ignore all junk prior to the #!perl line. Perl stops reading at the __END__ token, so everything after it is ignored as well.

    % perl -cx httpd.conf
    httpd.conf syntax OK

If the Apache configuration generated from your Perl code produces a syntax error, an error message will be sent to the server error log, but the server will still start. In general, it is always a good idea to look at the error log after starting the server to make sure startup went smoothly. If you have not picked up this good habit already, we strongly recommend you do so when working with <Perl> configuration sections.
Another helpful trick is to build mod_perl with the

Another tool that is occasionally useful is the Apache::PerlSections module. It defines two public routines named dump() and store(). dump() dumps out the current contents of the <Perl> section as a pretty-printed string. store() does the same, but writes the contents to the file of your choice. Both methods are useful for making sure that the configuration you are getting is what you expect.

Apache::PerlSections requires the Perl Devel::Symdump and Data::Dumper modules, both available on CPAN. Here is a simple example of its use:

    <Perl>
    #!perl
    use Apache::PerlSections();
    $User = 'nobody';
    $VirtualHost{'192.168.2.5:80'} = {
        ServerName   => 'www.fishfries.org',
        DocumentRoot => '/home/httpd/fishfries/htdocs',
        ErrorLog     => '/home/httpd/fishfries/logs/error.log',
        TransferLog  => '/home/httpd/fishfries/logs/access.log',
        ServerAdmin  => 'webmaster@fishfries.org',
    };
    print STDERR Apache::PerlSections->dump();
    __END__
    </Perl>

This will cause the following to appear on the command line at server startup time:

    package Apache::ReadConfig;
    #scalars:

    $User = 'nobody';

    #arrays:

    #hashes:

    %VirtualHost = (
      '192.168.2.5:80' => {
        'ServerAdmin' => 'webmaster@fishfries.org',
        'ServerName' => 'www.fishfries.org',
        'DocumentRoot' => '/home/httpd/fishfries/htdocs',
        'ErrorLog' => '/home/httpd/fishfries/logs/error.log',
        'TransferLog' => '/home/httpd/fishfries/logs/access.log'
      }
    );

    1;
    __END__

The output from dump() and store() can be stored to a file and reloaded with a require statement. This allows you to create your configuration in a modular fashion:

    <Perl>
    require "standard_configuration.pl";
    require "virtual_hosts.pl";
    require "access_control.pl";
    </Perl>

More information about Apache::PerlSections can be found in Appendix A.
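The dump-and-reload idea can be tried without Apache. The sketch below is plain Perl that uses Data::Dumper directly; it does not call Apache::PerlSections, and the Sketch::Reload package name and the %vhosts variable are hypothetical. It only illustrates how a hash shaped like %VirtualHost can be turned into Perl source text and read back.

```perl
use strict;
use warnings;
use Data::Dumper;

# Configuration data in the same shape as the %VirtualHost example.
my %vhosts = (
    '192.168.2.5:80' => {
        ServerName   => 'www.fishfries.org',
        DocumentRoot => '/home/httpd/fishfries/htdocs',
    },
);

# A leading '*' in the name asks Data::Dumper to emit
# "%VirtualHost = (...)" rather than "$VirtualHost = {...}".
$Data::Dumper::Sortkeys = 1;
my $text = Data::Dumper->Dump([\%vhosts], ['*VirtualHost']);
print $text;

# Reload the dumped text inside a scratch package, much as a
# require'd file would populate its own package variables.
my %reloaded;
{
    package Sketch::Reload;
    our %VirtualHost;
    eval $text;
    die $@ if $@;
    %reloaded = %VirtualHost;
}

print "reloaded ServerName: $reloaded{'192.168.2.5:80'}{ServerName}\n";
```

In a real setup the text would be written to a file and pulled in with require, as described above; the string eval here just keeps the sketch self-contained.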
Simple Dynamic Configuration

If the Perl configuration syntax seems a bit complex for your needs, there is a simple alternative. The special variables $PerlConfig and @PerlConfig are treated as raw Apache configuration data. Their values are fed directly to the Apache configuration engine and treated just as if they were static configuration data.

Examples:

    <Perl>
    $PerlConfig  = "User $ENV{USER}\n";
    $PerlConfig .= "ServerAdmin $ENV{USER}\@$hostname\n";
    </Perl>

    <Perl>
    for my $name (qw(one red two blue)) {
        # build a new variable: the loop variable aliases a literal list
        my $host = "$name.fish.net";
        push @PerlConfig, <<EOF;

    Listen $host

    <VirtualHost $host>

    ServerAdmin webmaster\@$host
    ServerName $host
    # ... more config here ...
    </VirtualHost>

    EOF
    }
    </Perl>
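Because these variables hold nothing more than configuration text, the text can be previewed with an ordinary Perl script before it goes anywhere near the server. The following is a minimal sketch under that assumption; vhost_block() and the @config array are hypothetical names, and nothing here touches @PerlConfig itself.

```perl
use strict;
use warnings;

# Hypothetical helper that renders one virtual host block as text.
sub vhost_block {
    my ($host) = @_;
    return <<EOF;
Listen $host
<VirtualHost $host>
  ServerAdmin webmaster\@$host
  ServerName $host
</VirtualHost>
EOF
}

# Build one block per host name, the same way the loop above would
# push blocks onto the configuration array.
my @config = map { vhost_block("$_.fish.net") } qw(one red two blue);

print @config;
```

Reading the printed text is often the quickest way to spot an unescaped @ or a missing newline in a here-document.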
One more utility method is available, Apache->httpd_conf, which simply pushes each argument into the @PerlConfig array:

    Apache->httpd_conf(
        "User $ENV{USER}",
        "ServerAdmin $ENV{USER}\@$hostname",
    );
A Real Life Example

For a complete example of an Apache configuration constructed with <Perl> sections, we'll look at Doug's setup. As a freelance contractor, Doug must often configure his development server in a brand new environment. Rather than creating a customized server configuration file each time, Doug uses a generic configuration that can be brought up anywhere simply by running:

    % httpd -f $HOME/httpd.conf

This one step automatically creates the server and document roots if they don't exist, as well as the log and configuration directories. It also detects the user that it is being run as, and configures the User and Group directives to match.

Listing 8.5 shows a slightly simplified version of Doug's httpd.conf. It contains only two hard-coded Apache directives:

    # file: httpd.conf
    PerlPassEnv HOME
    Port 9008

There's a PerlPassEnv directive with the value of ``HOME'', required in order to make the value of this environment variable visible to the code contained within the <Perl> section, and there's a Port directive set to Doug's favorite port number.

The rest of the configuration file is entirely written in Perl:
    <Perl>
    #!perl

    $ServerRoot = "$ENV{HOME}/www";

The <Perl> section begins by choosing a path for the server root. Doug likes to have his test environment set up under his home directory in ~/www, so the variable $ServerRoot is set to $ENV{HOME}/www.

    unless (-d "$ServerRoot/logs") {
        for my $dir ("", qw(logs conf htdocs perl)) {
            mkdir "$ServerRoot/$dir", 0755;
        }
        require File::Copy;
        File::Copy::cp($0, "$ServerRoot/conf");
    }

Next, the code detects whether the server root has been properly initialized, and if not, creates the requisite directories and subdirectories. It looks to see whether $ServerRoot/logs exists and is a directory. If not, the code proceeds to create the directories, calling mkdir() repeatedly to create first the server root and subsequently logs, conf, htdocs and perl subdirectories beneath it. The code then copies the generic httpd.conf file that is currently running into the newly-created conf subdirectory, using the File::Copy module's cp() routine. Somewhat magically, mod_perl arranges for the Perl global variable $0 to hold the path of the configuration file that is currently being processed.

    if(-e "$ServerRoot/startup.pl") {
        $PerlRequire = "startup.pl";
    }
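Next, the code checks whether there is a startup.pl present in the configuration directory. If this is the first time the server is being run, the file won't be present, but there may well be one there later. If the file exists, the code sets the $PerlRequire global so that startup.pl is loaded.

    $User  = getpwuid($>) || $>;
    $Group = getgrgid($)) || $);

    $ServerAdmin = $User;

The code sets the User, Group, and ServerAdmin directives next. The user and group are taken from the Perl magic variables $> and $), which hold the effective user and group IDs of the current process. getpwuid() and getgrgid() turn these IDs into names, and the numeric IDs are used as a fallback if no name is found.

    $ServerName = `hostname`;
    $DocumentRoot = "$ServerRoot/htdocs";

    my $types = "$ServerRoot/conf/mime.types";
    $TypesConfig = -e $types ? $types : "/dev/null";

The server name is set to the current host's name by setting the $ServerName global to the output of the hostname command. The document root becomes $ServerRoot/htdocs, and TypesConfig points at conf/mime.types if that file exists, or at /dev/null otherwise.

    push @Alias,
      ["/perl"  => "$ServerRoot/perl"],
      ["/icons" => "$ServerRoot/icons"];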
Next, the <Perl> section declares some directory aliases. The URI /perl is aliased to $ServerRoot/perl, and /icons is aliased to $ServerRoot/icons. Notice how the @Alias global holds a list of anonymous arrays, each one supplying the two arguments of a single Alias directive.

    my $servers = 3;

    for my $s (qw(MinSpareServers MaxSpareServers StartServers MaxClients)) {
        $$s = $servers;
    }

Following this the code sets the various parameters controlling Apache's preforking. The server doesn't need to handle much load, since it's just Doug's development server, so each of these directives is set to the low value of 3. The loop assigns to the globals by name, through symbolic references.

    for my $l (qw(LockFile ErrorLog TransferLog PidFile ScoreBoardFile)) {
        $$l = "logs/$l";

        #clean out the logs
        local *FH;
        open FH, ">$ServerRoot/$$l";
        close FH;
    }

We use a similar trick to configure the LockFile, ErrorLog, TransferLog and other logfile-related directives. A few additional lines of code truncate the various log files to zero length if they already exist. Doug likes to start with a clean slate every time he reconfigures and restarts a server.
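The symbolic-reference trick used in the two loops above only works where strict references are switched off, as they are by default in a <Perl> section that does not say use strict. The following minimal sketch shows the same assignment in a strict script; the Sketch::ReadConfig package is a hypothetical stand-in for Apache::ReadConfig, and nothing here talks to Apache.

```perl
use strict;
use warnings;

package Sketch::ReadConfig;    # hypothetical stand-in for Apache::ReadConfig

# Declare the globals that the loop will fill in by name.
our ($MinSpareServers, $MaxSpareServers, $StartServers, $MaxClients);

{
    # Symbolic references are forbidden under "use strict", so the
    # restriction is lifted for this block only.
    no strict 'refs';
    for my $name (qw(MinSpareServers MaxSpareServers StartServers MaxClients)) {
        # Treat the string as the name of a package variable.
        ${"Sketch::ReadConfig::$name"} = 3;
    }
}

print "MinSpareServers = $MinSpareServers\n";
print "MaxClients      = $MaxClients\n";
```

Limiting no strict 'refs' to the smallest possible block keeps the convenience of setting many directives in one loop without giving up strict checking elsewhere.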
    my @mod_perl_cfg = qw{
        SetHandler perl-script
        Options    +ExecCGI
    };

    $Location{"/perl-status"} = {
        @mod_perl_cfg,
        PerlHandler => "Apache::Status",
    };

    $Location{"/perl"} = {
        @mod_perl_cfg,
        PerlHandler => "Apache::Registry",
    };

The remainder of the configuration file sets up some directories for running and debugging Perl API modules. We create a lexical variable named @mod_perl_cfg that holds the directives common to both sections, then interpolate it into the anonymous hashes assigned to $Location{"/perl-status"} and $Location{"/perl"}.

    use Apache::PerlSections ();
    Apache::PerlSections->store("$ServerRoot/ServerConfig.pm");

The very last thing that the <Perl> section does is to write out the current configuration into the file $ServerRoot/ServerConfig.pm. This snapshots the current configuration in a form that Doug can review and edit, if necessary. Only the configuration variables set within the <Perl> section are captured in the snapshot. The PerlPassEnv and Port directives, which are outside the section, are not captured and will have to be added manually. This technique makes possible the following interesting trick:
% httpd -C "PerlModule ServerConfig"
The -C switch tells httpd to process the directive
PerlModule, which in turn loads the module file ServerConfig.pm. Provided that Perl's
    # file: httpd.conf
    PerlPassEnv HOME
    Port 9008

    <Perl>
    #!perl

    $ServerRoot = "$ENV{HOME}/www";

    unless (-d "$ServerRoot/logs") {
        for my $dir ("", qw(logs conf htdocs perl)) {
            mkdir "$ServerRoot/$dir", 0755;
        }
        require File::Copy;
        File::Copy::cp($0, "$ServerRoot/conf");
    }

    if(-e "$ServerRoot/startup.pl") {
        $PerlRequire = "startup.pl";
    }

    $User  = getpwuid($>) || $>;
    $Group = getgrgid($)) || $);

    $ServerAdmin = $User;

    $ServerName = `hostname`;
    $DocumentRoot = "$ServerRoot/htdocs";

    my $types = "$ServerRoot/conf/mime.types";
    $TypesConfig = -e $types ? $types : "/dev/null";

    push @Alias,
      ["/perl"  => "$ServerRoot/perl"],
      ["/icons" => "$ServerRoot/icons"];

    my $servers = 3;

    for my $s (qw(MinSpareServers MaxSpareServers StartServers MaxClients)) {
        $$s = $servers;
    }

    for my $l (qw(LockFile ErrorLog TransferLog PidFile ScoreBoardFile)) {
        $$l = "logs/$l";

        #clean out the logs
        local *FH;
        open FH, ">$ServerRoot/$$l";
        close FH;
    }

    my @mod_perl_cfg = qw{
        SetHandler perl-script
        Options    +ExecCGI
    };

    $Location{"/perl-status"} = {
        @mod_perl_cfg,
        PerlHandler => "Apache::Status",
    };

    $Location{"/perl"} = {
        @mod_perl_cfg,
        PerlHandler => "Apache::Registry",
    };

    use Apache::PerlSections ();
    Apache::PerlSections->store("$ServerRoot/ServerConfig.pm");

    __END__
    </Perl>
Documenting Configuration Files

When mod_perl is configured with the server, configuration files can be documented with POD. There are only a handful of POD directives that mod_perl recognizes, but enough so you can mix POD with actual server configuration. The recognized directives are as follows:
=pod
=head1 NAME
httpd.conf - The main server configuration file
=head2 Standard Module Configuration
=over 4
=item mod_status
=over to apache
#Apache will process directives in this section
<Location /server-status>
SetHandler server-status
...
</Location>
=back to pod
=item ...
...
=back
=cut
__END__
The server will not try to process anything here

We've now covered the entire Apache module API, at least as far as Perl is concerned. The next chapter presents a complete reference guide to the Perl API, organized by topic. This is followed in Chapter 10 by a reference guide to the C language API, which fills in the details that C programmers need to know about.