| Sign In/My Account | View Cart |
Introducing Automatic Data Expiration
Pages: 1, 2, 3, 4
Another solution to the synchronization bottleneck in the original version of GlobalCache is to change the role of the background thread slightly. In this solution, the background thread makes a copy of the data structure instead of directly checking the main data structure, and then uses the copy to check for expirations. This is exactly what the Tomcat 4.0 codebase does for session keys. Here's the code from Tomcat (from the classes org.apache.catalina.session.StandardManager and org.apache.catalina.session.ManagerBase).
From org.apache.catalina.session.StandardManager
/**
* The background thread that checks for session timeouts and shutdown.
*/
public void run() {
// Loop until the termination semaphore is set
while (!threadDone) {
threadSleep();
processExpires();
}
}
private void processExpires() {
long timeNow = System.currentTimeMillis();
Session sessions[] = findSessions();
for (int i = 0; i < sessions.length; i++) {
StandardSession session = (StandardSession) sessions[i];
if (!session.isValid())
continue;
int maxInactiveInterval = session.getMaxInactiveInterval();
if (maxInactiveInterval < 0)
continue;
int timeIdle = // Truncate, do not round up
(int) ((timeNow - session.getLastAccessedTime()) / 1000L);
if (timeIdle >= maxInactiveInterval) {
try {
session.expire();
} catch (Throwable t) {
log(sm.getString("standardManager.expireException"), t);
}
}
}
From org.apache.catalina.session.ManagerBase
/**
* Return the set of active Sessions associated with this Manager.
* If this Manager has no active Sessions, a zero-length array is returned.
*/
public Session[] findSessions() {
Session results[] = null;
synchronized (sessions) {
results = new Session[sessions.size()];
results = (Session[]) sessions.values().toArray(results);
}
return (results);
}
/**
* The set of currently active Sessions for this Manager, keyed by
* session identifier.
*/
protected HashMap sessions = new HashMap();
This is almost identical to the first version of GlobalCache we wrote. There's only one slight difference -- the method findSessions makes a duplicate copy of the main data structure, copying the values from the HashMap into an array that the background thread then uses to expire objects. Tomcat's approach solves the synchronization bottleneck: by making a second copy of the array and iterating over it, the Tomcat codebase avoids concurrent modification exceptions without locking the cache for an extended period of time.
The Tomcat approach does have some efficiency problems: copying all of the values in a HashMap is not cheap, and Tomcat still iterates over all the sessions. Moreover, making a duplicate copy of the session keys also actually makes the data inconsistency problem slightly worse: it's possible for a request to come in and use a session at the same time that the background thread is expiring the session. The scenario from the point of view of the user looks like this:
This example illustrates a fairly general problem with expiring data. Most approaches force you to choose between a synchronization bottleneck or a data consistency problem. The ideal sequence of operations for the background thread should perform something like the following sequence of operations:
None of our solutions do this (yet), although Tomcat and the TreeSet version of GlobalCache are pretty nice.
In this article, we've discussed the types of data that need to be expired and gone over several different algorithms for expiring data. As a result of this discussion, we've come up with three candidate solutions:
ExpirableObject. This can be made to work, but often just results in brittle and complex code.TreeSet as the index. This made expiration fast and easy, and completely eliminated both the data consistency and synchronization bottlenecks. On the other hand, both adding objects and resetting object expiration times were much slower because of the cost of maintaining the index.Looking at our various implementations gives us set of requirements for a solution:
|
Related Reading
|
Looking at our attempted solutions also lets us realize that one of our initial assumptions has been relaxed: using a single background thread that wakes up occasionally implies that the data can stay valid for an "extra" minute or two. Data expiration is often a casual requirement: the important thing is that the data lives for a while and then is expired at approximately the right time.
In the second article in this series, we'll use these requirements to define and build a new data structure for data expiration.
William Grosso is a coauthor of Java Enterprise Best Practices.
Related Articles by William Grosso
Learning Command Objects and RMI
Seamlessly Caching Stubs for Improved Performance
Return to ONJava.com.