Thanks again for your comments. More replies below:
That's interesting. Can you share some details about how it works?
Sure. It is quite simple. Cassandra is effectively a multi-level
distributed hash-map, so it lends itself very well to storing session
attributes.
The session manager maintains two column families (like tables), one to
hold session meta-data such as the last access timestamp, etc. and one
column family to hold session attributes. Storing or reading a session
attribute is simply a matter of using the session ID as the row ID, the
session attribute name as the column name, and the session attribute
value as the column value.
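
A minimal sketch of the attribute layout described above, using nested
in-memory maps as a stand-in for the attribute column family. This is
not the session manager's code and makes no Cassandra calls; the class
and method names are hypothetical and only illustrate the
row/column/value mapping.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the attribute column family:
//   row key      = session ID
//   column name  = session attribute name
//   column value = session attribute value
class AttributeColumnFamily {
    private final Map<String, Map<String, Object>> rows = new ConcurrentHashMap<>();

    // Write a single attribute without touching the rest of the session.
    void write(String sessionId, String attributeName, Object value) {
        if (value == null) {
            delete(sessionId, attributeName); // a null value is treated as a delete
            return;
        }
        rows.computeIfAbsent(sessionId, id -> new ConcurrentHashMap<>())
            .put(attributeName, value);
    }

    // Read a single attribute; the other columns in the row are not loaded.
    Optional<Object> read(String sessionId, String attributeName) {
        Map<String, Object> row = rows.get(sessionId);
        return row == null ? Optional.empty() : Optional.ofNullable(row.get(attributeName));
    }

    void delete(String sessionId, String attributeName) {
        Map<String, Object> row = rows.get(sessionId);
        if (row != null) {
            row.remove(attributeName);
        }
    }
}
```

In this sketch each read or write addresses exactly one (session ID,
attribute name) pair, which is the property that lets attributes be
loaded independently.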
Session attributes are read and written independently, so the entire web
session does not have to be loaded into memory - only the session
attributes that are actually required to service a request are read.
This greatly reduces the memory footprint of the web applications that I
am developing for my employer.
I'd be concerned about how chatty that was.
Devil's advocate question: why store data in the session if it's not needed?
Good question. For large web applications, and particularly web-based
UIs with multiple user screens, you would have certain data in your
session for the various screens/pages. Not all pages need _all_ data in
your session, and since the session manager loads session attributes
only when the web app code asks for them, only the data that is required
for the current page is loaded from Cassandra.
For improved performance I have added a write-through and a write-back
cache, implemented as servlet filters. The cache is flushed or written
back once the current request has finished processing. I am sure there
is room for improvement here, as multiple concurrent requests for the
same session should be served using the same cache instance.
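
The write-back behaviour described above could look roughly like the
sketch below. It assumes one cache instance per request and leaves out
the servlet filter and the Cassandra access; `RequestAttributeCache`,
`loader` and `writer` are hypothetical names, not part of the session
manager.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Hypothetical per-request write-back cache for the attributes of one session.
class RequestAttributeCache {
    private final Function<String, Object> loader;    // reads one attribute from the backing store
    private final BiConsumer<String, Object> writer;  // writes one attribute to the backing store
    private final Map<String, Object> cached = new HashMap<>();
    private final Map<String, Object> dirty = new LinkedHashMap<>();

    RequestAttributeCache(Function<String, Object> loader, BiConsumer<String, Object> writer) {
        this.loader = loader;
        this.writer = writer;
    }

    // The first read of an attribute goes to the store; later reads are served locally.
    Object get(String name) {
        if (!cached.containsKey(name)) {
            cached.put(name, loader.apply(name)); // a miss (null) is cached as well
        }
        return cached.get(name);
    }

    // Writes are only recorded here; nothing reaches the store until flush().
    void put(String name, Object value) {
        cached.put(name, value);
        dirty.put(name, value);
    }

    // Called once the request has finished processing.
    void flush() {
        dirty.forEach(writer);
        dirty.clear();
    }
}
```

Repeated reads of the same attribute within a request cost one round
trip, and several writes to the same attribute collapse into a single
write at flush time.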
But... (more devil's advocating, sorry) while this should address the
chattiness* problem, doesn't it mean that your solution is invasive and
can't be really deployed without modifying an app?
The session manager works without this cache, but is slow. The cache is
a servlet filter configured in web.xml. The code of a web app won't have
to be changed, but you need to update your web.xml to use the session
manager effectively.
* is that even a word?
Yes it is, according to dictionary.com
The Manager does not maintain any references to Session instances at
all, allowing them to be garbage collected at any time. This makes
things very simple, as Cassandra holds all session state, and the
session managers in my Tomcat nodes only act as a cache in front of
Cassandra.
The nature of Cassandra and Tomcat's implementation of web sessions
go together extremely well. I am surprised that nothing like this exists
already. It is a square hole, square peg sort of scenario.
I'm not entirely sure I agree.
Cassandra trades off consistency for availability and partition
tolerance, whereas I'd suggest a session management solution would want
to trade partition tolerance for consistency and availability.
I'm also not sure that the comparison between column store and session
attribute map stands up beyond the initial/apparent similarity between
data type.
Cassandra is write-optimised and hits disk (on at least two nodes for
HA) for every write AFAIK.
Cassandra allows you to choose your consistency level. I use a quorum
write, which writes to (N/2)+1 replicas, where N is the replication
factor (the number of Cassandra nodes that hold a copy of each row). I
think this makes sense for web session data, and my current
implementation has this consistency level hard-coded. I think it would
probably make sense to allow this to be configured.
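
The quorum arithmetic mentioned above can be sketched as follows. To my
knowledge Cassandra derives the quorum from the replication factor (the
number of replicas of a row) rather than from the total number of nodes
in the ring, so the parameter is named accordingly. The class and method
names are hypothetical.

```java
// Hypothetical helper illustrating the (N/2)+1 quorum rule.
class Quorum {
    // Number of replicas that must acknowledge a quorum read or write.
    static int size(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replication factor must be at least 1");
        }
        return replicationFactor / 2 + 1; // integer division rounds down
    }

    // Number of replicas that can be unavailable while a quorum is still reachable.
    static int toleratedFailures(int replicationFactor) {
        return replicationFactor - size(replicationFactor);
    }
}
```

With three replicas this gives a quorum of two and tolerates one
unavailable replica. An even replication factor of four needs three
acknowledgements and still tolerates only one failure.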
I also have an implementation of the Map interface that stores the
values of each entry as a session attribute. The way many developers
write web applications is to have a "session bean" (a session attribute)
that contains a Map that maintains the actual session attributes. This
is OK if the entire session is persisted as a whole, but it won't
perform very well with the Cassandra session manager (or the Delta
Session Manager from what I understand). A developer can replace their
session bean's HashMap with the SessionMap utility, and the session
attributes will be treated as proper session attributes by the session
manager.
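
A rough sketch of what such a Map could look like. This is not the
actual SessionMap utility; `AttributeStore` is a hypothetical stand-in
for the session's getAttribute/setAttribute/removeAttribute methods so
that the example compiles without the servlet API, and the key prefix is
likewise an assumption.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the attribute methods of an HTTP session.
interface AttributeStore {
    Object getAttribute(String name);
    void setAttribute(String name, Object value);
    void removeAttribute(String name);
    Set<String> attributeNames();
}

// Simple in-memory store, used here only to exercise the map.
class InMemoryAttributeStore implements AttributeStore {
    private final Map<String, Object> attributes = new HashMap<>();
    public Object getAttribute(String name) { return attributes.get(name); }
    public void setAttribute(String name, Object value) { attributes.put(name, value); }
    public void removeAttribute(String name) { attributes.remove(name); }
    public Set<String> attributeNames() { return new HashSet<>(attributes.keySet()); }
}

// Map whose entries are stored as individual session attributes under a prefix.
class SessionMapSketch extends AbstractMap<String, Object> {
    private final AttributeStore session;
    private final String prefix;

    SessionMapSketch(AttributeStore session, String prefix) {
        this.session = session;
        this.prefix = prefix;
    }

    @Override
    public Object get(Object key) {
        return key instanceof String ? session.getAttribute(prefix + key) : null;
    }

    @Override
    public Object put(String key, Object value) {
        Object previous = session.getAttribute(prefix + key); // extra read to honour the Map contract
        session.setAttribute(prefix + key, value);
        return previous;
    }

    @Override
    public Object remove(Object key) {
        if (!(key instanceof String)) {
            return null;
        }
        Object previous = session.getAttribute(prefix + key);
        session.removeAttribute(prefix + key);
        return previous;
    }

    // Builds a snapshot; changes to the returned set do not write through.
    @Override
    public Set<Entry<String, Object>> entrySet() {
        Set<Entry<String, Object>> entries = new HashSet<>();
        for (String name : session.attributeNames()) {
            if (name.startsWith(prefix)) {
                entries.add(new SimpleImmutableEntry<String, Object>(
                        name.substring(prefix.length()), session.getAttribute(name)));
            }
        }
        return entries;
    }
}
```

Each put or get touches one attribute only, so a manager that persists
attributes individually never has to serialise the whole map.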
Is there not a way to do this internally, and therefore transparently to
the developer? Otherwise you're introducing more dependencies and
creating more of a framework than a pluggable manager.
I don't think there is a clean way of doing this without overriding the
default Map implementations of the JVM. But, I think storing session
data as individual session attributes rather than large object
hierarchies is good (but not common) programming practice. It allows the
session container/manager to manage read/write operations of the session
attributes separately. This practice should benefit not only my
Cassandra session manager but also the existing Delta manager.
1. Be relatively self-contained -- i.e. not require much in the way of
changes to existing classes
There are no changes to existing classes. My session manager implements
the existing org.apache.catalina.Manager interface.
Instead of the filter, could you use a Valve?
For the cache? The main reason why I use a filter is to be able to tie a
cache object to a thread-local variable for the period for which the
request is being processed. As soon as the response is streamed to the
client the cache is released. If Tomcat already contains some internal
reference to the current request then I won't need to use a filter in
this manner. I am not a fan of thread-local variables, so I'd very much
like to remove the dependency on having this filter in place.
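
The thread-local pattern described above, reduced to a sketch without
the servlet API. A real filter would do the equivalent of `run` around
its call to the rest of the filter chain; `RequestScope` and its methods
are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder that ties a cache to the thread handling the current request.
class RequestScope {
    private static final ThreadLocal<Map<String, Object>> CURRENT = new ThreadLocal<>();

    // Wraps the processing of one request, as a filter would wrap the filter chain.
    static void run(Runnable requestHandler) {
        CURRENT.set(new HashMap<>());
        try {
            requestHandler.run();
        } finally {
            // Always release, otherwise the cache leaks into the next
            // request served by the same pooled thread.
            CURRENT.remove();
        }
    }

    // Code running inside run() can reach the cache without it being passed around.
    static Map<String, Object> currentCache() {
        Map<String, Object> cache = CURRENT.get();
        if (cache == null) {
            throw new IllegalStateException("no request in progress on this thread");
        }
        return cache;
    }
}
```

The try/finally is the important part: container threads are pooled, so
a cache that is not removed would be visible to an unrelated request.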
Morten