On Thu, 24 Feb 2011 16:01:39 -0000, wrote:
Author: markt
Date: Thu Feb 24 16:01:38 2011
New Revision: 1074192
URL: http://svn.apache.org/viewvc?rev=1074192&view=rev
Log:
Add documentation for the Crawler Session Manager Valve.
Modified:
tomcat/trunk/webapps/docs/changelog.xml
tomcat/trunk/webapps/docs/config/valve.xml
Modified: tomcat/trunk/webapps/docs/changelog.xml
URL: http://svn.apache.org/viewvc/tomcat/trunk/webapps/docs/changelog.xml?rev=1074192&r1=1074191&r2=1074192&view=diff
==============================================================================
--- tomcat/trunk/webapps/docs/changelog.xml (original)
+++ tomcat/trunk/webapps/docs/changelog.xml Thu Feb 24 16:01:38 2011
@@ -130,6 +130,14 @@
<code>ServletContext.getResourcePaths()</code> includes static resources
packaged in JAR files in its output. (markt)
</fix>
+ <add>
+ Web crawlers can trigger the creation of many thousands of sessions as
+ they crawl a site which may result in significant memory consumption.
+ Thw new Crawler Session Manager Valve ensures that crawlers are
The new Crawler ...
regards
Felix
+ associated with a single session - just like normal users - regardless
+ of whether or not they provide a session token with their requests.
+ (markt)
+ </add>
</changelog>
</subsection>
<subsection name="Coyote">
Modified: tomcat/trunk/webapps/docs/config/valve.xml
URL: http://svn.apache.org/viewvc/tomcat/trunk/webapps/docs/config/valve.xml?rev=1074192&r1=1074191&r2=1074192&view=diff
==============================================================================
--- tomcat/trunk/webapps/docs/config/valve.xml (original)
+++ tomcat/trunk/webapps/docs/config/valve.xml Thu Feb 24 16:01:38 2011
@@ -880,6 +880,62 @@
</section>
+<section name="Crawler Session Manager Valve">
+
+ <subsection name="Introduction">
+
+ <p>Web crawlers can trigger the creation of many thousands of sessions as
+ they crawl a site which may result in significant memory consumption. This
+ Valve ensures that crawlers are associated with a single session - just like
+ normal users - regardless of whether or not they provide a session token
+ with their requests.</p>
+
+ <p>This Valve may be used at the <code>Engine</code>, <code>Host</code> or
+ <code>Context</code> level as required. Normally, this Valve would be used
+ at the <code>Engine</code> level.</p>
+
+ <p>If used in conjunction with Remote IP valve then the Remote IP valve
+ should be defined before this valve to ensure that the correct client IP
+ address is presented to this valve.</p>
+
+ </subsection>
+
+ <subsection name="Attributes">
+
+ <p>The <strong>Crawler Session Manager Valve</strong> supports the
+ following configuration attributes:</p>
+
+ <attributes>
+
+ <attribute name="className" required="true">
+ <p>Java class name of the implementation to use. This MUST be set to
+ <strong>org.apache.catalina.valves.CrawlerSessionManagerValve</strong>.
+ </p>
+ </attribute>
+
+ <attribute name="crawlerUserAgents" required="false">
+ <p>Regular expression (using <code>java.util.regex</code>) that the user
+ agent HTTP request header is matched against to determine if a request
+ is from a web crawler. If not set, the default of
+ <code>.*GoogleBot.*|.*bingbot.*|.*Yahoo! Slurp.*</code> is used.</p>
+ </attribute>
+
+ <attribute name="sessionInactiveInterval" required="false">
+ <p>The minimum time in seconds that the Crawler Session Manager Valve
+ should keep the mapping of client IP to session ID in memory without any
+ activity from the client. The client IP / session cache will be
+ periodically purged of mappings that have been inactive for longer than
+ this interval. If not specified the default value of <code>60</code>
+ will be used.</p>
+ </attribute>
+
+ </attributes>
+
+ </subsection>
+
+</section>
+
+
</body>
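
For reference, a minimal sketch of how the valve documented in the diff above might be declared in server.xml, going only by the attributes listed there. The Engine name and defaultHost values are placeholders assumed from a stock server.xml, not part of this commit, and the Remote IP valve line is only relevant if that valve is in use. The attribute values shown are the documented defaults, so both optional attributes could equally be left out.

```xml
<Engine name="Catalina" defaultHost="localhost">

  <!-- Optional: if the Remote IP valve is used, declare it first so the
       crawler valve is presented with the correct client IP address. -->
  <Valve className="org.apache.catalina.valves.RemoteIpValve" />

  <!-- Crawler Session Manager Valve at Engine level.
       crawlerUserAgents and sessionInactiveInterval are set to the
       defaults given in the documentation above. -->
  <Valve className="org.apache.catalina.valves.CrawlerSessionManagerValve"
         crawlerUserAgents=".*GoogleBot.*|.*bingbot.*|.*Yahoo! Slurp.*"
         sessionInactiveInterval="60" />

  <!-- Host elements omitted from this sketch -->

</Engine>
```

Since the expression uses java.util.regex, matching is presumably case-sensitive unless the pattern says otherwise, so a user agent string that spells a crawler name with different capitalisation would need its own alternative in crawlerUserAgents.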
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org