Apache Configuration for DataONE Services
=========================================

This document describes the configuration directives that must be enabled to
ensure Apache correctly processes the REST URLs used by the DataONE service
interfaces.

Parameters in question:

:`AllowEncodedSlashes`_: ``(Off)|On``

  The AllowEncodedSlashes directive allows URLs which contain encoded path
  separators (``%2F`` for ``/`` and additionally ``%5C`` for ``\`` on systems
  that use it as a path separator) to be used. Normally such URLs are refused
  with a 404 (Not found) error.

:`AcceptPathInfo`_: ``Off|On|(Default)``

  This directive controls whether requests that contain trailing pathname
  information that follows an actual filename (or non-existent file in an
  existing directory) will be accepted or rejected.

Both of these must be set to *On* for Member Node and Coordinating Node
services to ensure that URLs containing identifiers as a path element (e.g. for
:func:`MN_crud.get`) are not rejected or mishandled by the Apache web server.

These parameters **must** be in effect for the section of the web server
configuration handling DataONE service requests.


Examples
--------

The following examples show how Apache responds under different
configurations. The version of Apache being examined was::

  Apache/2.2.14 (Unix) DAV/2 mod_ssl/2.2.14 OpenSSL/0.9.8l PHP/5.3.1 mod_perl/2.0.4 Perl/v5.10.1

A simple Perl CGI script was installed in the web server root content folder,
which was ExecCGI enabled. The script::

  $ cat htdocs/test.cgi
  #!/usr/bin/perl
  print "Content-type: text/html\n\n";
  foreach $key (keys %ENV) {
    print "$key --> $ENV{$key}\n";
  }

Only relevant output from the script is provided in the examples below.

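
The request URLs used in the examples embed an identifier as a single,
percent-encoded path segment. The following is a minimal sketch of how such a
URL could be built in Python. The helper names ``encode_pid`` and
``build_request_url`` are hypothetical and are not part of any DataONE library;
the base URL is the test script location used in this document.

.. code-block:: python

   from urllib.parse import quote


   def encode_pid(pid):
       """Percent-encode an identifier so it fits in one URL path segment."""
       # quote() leaves "/" unencoded by default; safe="" forces it to %2F.
       return quote(pid, safe="")


   def build_request_url(base_url, pid):
       """Append the encoded identifier to base_url (hypothetical helper)."""
       return base_url.rstrip("/") + "/" + encode_pid(pid)


   if __name__ == "__main__":
       print(build_request_url("http://localhost/test.cgi", "bogus/stuff"))
       print(build_request_url("http://localhost/test.cgi", "bogus/stuff?var=value"))

For the two identifiers shown, this yields the request URLs ending in
``bogus%2Fstuff`` and ``bogus%2Fstuff%3Fvar%3Dvalue`` respectively.
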

----

:AllowEncodedSlashes: Off
:AcceptPathInfo: Off
:Request: http://localhost/test.cgi/bogus%2Fstuff
:PID Equivalent: "bogus/stuff"
:Error Message: ``[Mon Dec 13 15:45:00 2010] [info] [client ::1] found %2f (encoded '/') in URI (decoded='/test.cgi/bogus/stuff'), returning 404``
:Response: Default 404

----

:AllowEncodedSlashes: On
:AcceptPathInfo: Off
:Request: http://localhost/test.cgi/bogus%2Fstuff
:PID Equivalent: "bogus/stuff"
:Error Message: ``[Mon Dec 13 15:46:08 2010] [error] [client ::1] AcceptPathInfo off disallows user's path: /Applications/XAMPP/xamppfiles/htdocs/test.cgi``
:Response: Default 404

----

:AllowEncodedSlashes: Off
:AcceptPathInfo: On
:Request: http://localhost/test.cgi/bogus%2Fstuff
:PID Equivalent: "bogus/stuff"
:Error Message: ``[Mon Dec 13 15:46:48 2010] [info] [client ::1] found %2f (encoded '/') in URI (decoded='/test.cgi/bogus/stuff'), returning 404``
:Response: Default 404

----

:AllowEncodedSlashes: On
:AcceptPathInfo: On
:Request: http://localhost/test.cgi/bogus%2Fstuff
:PID Equivalent: "bogus/stuff"
:Error Message: None
:Response:
  ::

    SCRIPT_NAME --> /test.cgi
    SERVER_NAME --> localhost
    SERVER_ADMIN --> you@example.com
    PATH_INFO --> /bogus/stuff
    REQUEST_METHOD --> GET
    HTTP_ACCEPT --> */*
    SCRIPT_FILENAME --> /Applications/XAMPP/xamppfiles/htdocs/test.cgi
    VERSIONER_PERL_PREFER_32_BIT --> no
    SERVER_SOFTWARE --> Apache/2.2.14 (Unix) DAV/2 mod_ssl/2.2.14 OpenSSL/0.9.8l PHP/5.3.1 mod_perl/2.0.4 Perl/v5.10.1
    QUERY_STRING -->
    REMOTE_PORT --> 50155
    HTTP_USER_AGENT --> curl/7.19.7 (universal-apple-darwin10.0) libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
    SERVER_SIGNATURE -->
    SERVER_PORT --> 80
    REMOTE_ADDR --> ::1
    SERVER_PROTOCOL --> HTTP/1.1
    PATH --> /usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/opt/local/bin:/usr/local/git/bin
    REQUEST_URI --> /test.cgi/bogus%2Fstuff
    GATEWAY_INTERFACE --> CGI/1.1
    SERVER_ADDR --> ::1
    DOCUMENT_ROOT --> /Applications/XAMPP/xamppfiles/htdocs
    PATH_TRANSLATED --> /Applications/XAMPP/xamppfiles/htdocs/bogus/stuff
    HTTP_HOST --> localhost
    VERSIONER_PERL_VERSION --> 5.10.0
    UNIQUE_ID --> TQaGaEprSyIAAFOcw20AAAAB

----

:AllowEncodedSlashes: On
:AcceptPathInfo: On
:Request: http://localhost/test.cgi/bogus%2Fstuff%3Fvar%3Dvalue
:PID Equivalent: "bogus/stuff?var=value"
:Error Message: None
:Response:
  ::

    SCRIPT_NAME --> /test.cgi
    SERVER_NAME --> localhost
    SERVER_ADMIN --> you@example.com
    PATH_INFO --> /bogus/stuff?var=value
    REQUEST_METHOD --> GET
    HTTP_ACCEPT --> */*
    SCRIPT_FILENAME --> /Applications/XAMPP/xamppfiles/htdocs/test.cgi
    VERSIONER_PERL_PREFER_32_BIT --> no
    SERVER_SOFTWARE --> Apache/2.2.14 (Unix) DAV/2 mod_ssl/2.2.14 OpenSSL/0.9.8l PHP/5.3.1 mod_perl/2.0.4 Perl/v5.10.1
    QUERY_STRING -->
    REMOTE_PORT --> 64650
    HTTP_USER_AGENT --> curl/7.19.7 (universal-apple-darwin10.0) libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
    SERVER_SIGNATURE -->
    SERVER_PORT --> 80
    REMOTE_ADDR --> ::1
    SERVER_PROTOCOL --> HTTP/1.1
    PATH --> /usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/opt/local/bin:/usr/local/git/bin
    REQUEST_URI --> /test.cgi/bogus%2Fstuff%3Fvar%3Dvalue
    GATEWAY_INTERFACE --> CGI/1.1
    SERVER_ADDR --> ::1
    DOCUMENT_ROOT --> /Applications/XAMPP/xamppfiles/htdocs
    PATH_TRANSLATED --> /Applications/XAMPP/xamppfiles/htdocs/bogus/stuff?var=value
    HTTP_HOST --> localhost
    VERSIONER_PERL_VERSION --> 5.10.0
    UNIQUE_ID --> TQaK80prSyIAAFOexIUAAAAD

----

:AllowEncodedSlashes: On
:AcceptPathInfo: On
:Request: http://localhost/test.cgi/bogus%2Fstuff%3Fvar%3Dvalue?var2=value2
:PID Equivalent: "bogus/stuff?var=value" with query string at the end
:Error Message: None
:Response:
  ::

    SCRIPT_NAME --> /test.cgi
    SERVER_NAME --> localhost
    SERVER_ADMIN --> you@example.com
    PATH_INFO --> /bogus/stuff?var=value
    REQUEST_METHOD --> GET
    HTTP_ACCEPT --> */*
    SCRIPT_FILENAME --> /Applications/XAMPP/xamppfiles/htdocs/test.cgi
    VERSIONER_PERL_PREFER_32_BIT --> no
    SERVER_SOFTWARE --> Apache/2.2.14 (Unix) DAV/2 mod_ssl/2.2.14 OpenSSL/0.9.8l PHP/5.3.1 mod_perl/2.0.4 Perl/v5.10.1
    QUERY_STRING --> var2=value2
    REMOTE_PORT --> 49339
    HTTP_USER_AGENT --> curl/7.19.7 (universal-apple-darwin10.0) libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
    SERVER_SIGNATURE -->
    SERVER_PORT --> 80
    REMOTE_ADDR --> ::1
    SERVER_PROTOCOL --> HTTP/1.1
    PATH --> /usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/opt/local/bin:/usr/local/git/bin
    REQUEST_URI --> /test.cgi/bogus%2Fstuff%3Fvar%3Dvalue?var2=value2
    GATEWAY_INTERFACE --> CGI/1.1
    SERVER_ADDR --> ::1
    DOCUMENT_ROOT --> /Applications/XAMPP/xamppfiles/htdocs
    PATH_TRANSLATED --> /Applications/XAMPP/xamppfiles/htdocs/bogus/stuff?var=value
    HTTP_HOST --> localhost
    VERSIONER_PERL_VERSION --> 5.10.0
    UNIQUE_ID --> TQaLPEprSyIAAFOdxIcAAAAC

----

:AllowEncodedSlashes: On
:AcceptPathInfo: On
:Request: http://localhost/test.cgi/bogus%2Fstuff%3Fvar=value?var2=value2
:PID Equivalent: "bogus/stuff?var=value" with query string at the end
:Error Message: None
:Response:
  ::

    SCRIPT_NAME --> /test.cgi
    SERVER_NAME --> localhost
    SERVER_ADMIN --> you@example.com
    PATH_INFO --> /bogus/stuff?var=value
    REQUEST_METHOD --> GET
    HTTP_ACCEPT --> */*
    SCRIPT_FILENAME --> /Applications/XAMPP/xamppfiles/htdocs/test.cgi
    VERSIONER_PERL_PREFER_32_BIT --> no
    SERVER_SOFTWARE --> Apache/2.2.14 (Unix) DAV/2 mod_ssl/2.2.14 OpenSSL/0.9.8l PHP/5.3.1 mod_perl/2.0.4 Perl/v5.10.1
    QUERY_STRING --> var2=value2
    REMOTE_PORT --> 59889
    HTTP_USER_AGENT --> curl/7.19.7 (universal-apple-darwin10.0) libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
    SERVER_SIGNATURE -->
    SERVER_PORT --> 80
    REMOTE_ADDR --> ::1
    SERVER_PROTOCOL --> HTTP/1.1
    PATH --> /usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/opt/local/bin:/usr/local/git/bin
    REQUEST_URI --> /test.cgi/bogus%2Fstuff%3Fvar=value?var2=value2
    GATEWAY_INTERFACE --> CGI/1.1
    SERVER_ADDR --> ::1
    DOCUMENT_ROOT --> /Applications/XAMPP/xamppfiles/htdocs
    PATH_TRANSLATED --> /Applications/XAMPP/xamppfiles/htdocs/bogus/stuff?var=value
    HTTP_HOST --> localhost
    VERSIONER_PERL_VERSION --> 5.10.0
    UNIQUE_ID --> TQaNjkprSyIAAFOfxYgAAAAE

----

:AllowEncodedSlashes: On
:AcceptPathInfo: On
:Request: http://localhost/test.cgi/bogus%2Fstuff/something/else
:PID Equivalent: "bogus/stuff" with additional path at the end
:Error Message: None
:Response:
  ::

    SCRIPT_NAME --> /test.cgi
    SERVER_NAME --> localhost
    SERVER_ADMIN --> you@example.com
    PATH_INFO --> /bogus/stuff/something/else
    REQUEST_METHOD --> GET
    HTTP_ACCEPT --> */*
    SCRIPT_FILENAME --> /Applications/XAMPP/xamppfiles/htdocs/test.cgi
    VERSIONER_PERL_PREFER_32_BIT --> no
    SERVER_SOFTWARE --> Apache/2.2.14 (Unix) DAV/2 mod_ssl/2.2.14 OpenSSL/0.9.8l PHP/5.3.1 mod_perl/2.0.4 Perl/v5.10.1
    QUERY_STRING -->
    REMOTE_PORT --> 57774
    HTTP_USER_AGENT --> curl/7.19.7 (universal-apple-darwin10.0) libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
    SERVER_SIGNATURE -->
    SERVER_PORT --> 80
    REMOTE_ADDR --> ::1
    SERVER_PROTOCOL --> HTTP/1.1
    PATH --> /usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/opt/local/bin:/usr/local/git/bin
    REQUEST_URI --> /test.cgi/bogus%2Fstuff/something/else
    GATEWAY_INTERFACE --> CGI/1.1
    SERVER_ADDR --> ::1
    DOCUMENT_ROOT --> /Applications/XAMPP/xamppfiles/htdocs
    PATH_TRANSLATED --> /Applications/XAMPP/xamppfiles/htdocs/bogus/stuff/something/else
    HTTP_HOST --> localhost
    VERSIONER_PERL_VERSION --> 5.10.0
    UNIQUE_ID --> TQaQiEprSyIAAFOixfMAAAAF


Configuration
-------------

As of Apache 2.2.14, there are some bugs that affect the AllowEncodedSlashes
setting.

`Bug 46830`_: If "AllowEncodedSlashes On" is set in the global context, it is not
inherited by virtual hosts.
You must explicitly set "AllowEncodedSlashes On" in every ``<VirtualHost>``
container.

The documentation for how the different configuration sections are merged
(http://httpd.apache.org/docs/2.2/sections.html) says "Sections inside
``<VirtualHost>`` sections are applied after the corresponding sections outside
the virtual host definition. This allows virtual hosts to override the main
server configuration."

Virtual hosts are used in many default Apache configurations. In Ubuntu, the
default VirtualHost container is set up in
``/etc/apache2/sites-available/default``.

`Bug 35256`_: %2F will be decoded in PATH_INFO (the AllowEncodedSlashes
documentation says no decoding will be done).

The consequence of this bug is that only the last path segment of a URL can
contain encoded slashes: once decoded, they cannot be distinguished in
PATH_INFO from the literal slashes that separate path segments.


Conclusions
-----------

1. *AllowEncodedSlashes* and *AcceptPathInfo* must be set to *On*.

2. Query parameters can be added to the end of the URL, provided the
   identifier embedded in the path is properly encoded.

3. Adding additional path elements beyond the encoded identifier segment
   requires additional processing, which entails custom parsing of the
   REQUEST_URI environment variable passed on by the web server.


.. _AllowEncodedSlashes: http://httpd.apache.org/docs/2.0/mod/core.html#AllowEncodedSlashes

.. _AcceptPathInfo: http://httpd.apache.org/docs/2.0/mod/core.html#AcceptPathInfo

.. _`Bug 35256`: https://issues.apache.org/bugzilla/show_bug.cgi?id=35256

.. _`Bug 46830`: https://issues.apache.org/bugzilla/show_bug.cgi?id=46830

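
The custom parsing of REQUEST_URI mentioned in the third conclusion could look
roughly like the sketch below. It is an illustration only, not the DataONE
implementation: the function name ``split_request_uri`` is hypothetical, and the
sketch assumes that the script name itself contains no percent-encoded
characters. The key point is that the raw path is split on literal slashes
*before* each segment is decoded, so an encoded ``%2F`` stays inside its
segment.

.. code-block:: python

   from urllib.parse import unquote


   def split_request_uri(request_uri, script_name):
       """Return (decoded path segments after script_name, raw query string).

       Hypothetical helper: request_uri and script_name correspond to the
       REQUEST_URI and SCRIPT_NAME CGI environment variables.
       """
       # The query string starts at the first literal "?"; an encoded "?"
       # (%3F) inside the identifier is left untouched by this split.
       raw_path, _, query = request_uri.partition("?")
       if not raw_path.startswith(script_name):
           raise ValueError("REQUEST_URI does not start with SCRIPT_NAME")
       extra = raw_path[len(script_name):].lstrip("/")
       # Split on literal slashes first, then percent-decode each segment.
       segments = [unquote(part) for part in extra.split("/")] if extra else []
       return segments, query


   if __name__ == "__main__":
       print(split_request_uri("/test.cgi/bogus%2Fstuff/something/else", "/test.cgi"))
       print(split_request_uri("/test.cgi/bogus%2Fstuff%3Fvar%3Dvalue?var2=value2", "/test.cgi"))

Applied to the REQUEST_URI values recorded in the examples, the first call
returns the segments ``bogus/stuff``, ``something`` and ``else`` with an empty
query string, and the second returns the single segment
``bogus/stuff?var=value`` with the query string ``var2=value2``.
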