XML plays a vital role in integrating business-to-business applications. To parse XML files, these applications use either a Simple API for XML (SAX) or a Document Object Model (DOM) parser. Parsing in single-threaded applications is straightforward. However, it is quite complex and challenging in a multithreaded application, such as an application server, because the applications often create a dedicated thread to parse XML, serving many concurrently running threads with the parsed data. This article describes one implementation of parsing XML in concurrent applications.
Based on producer-consumer concurrent programming concepts, a dedicated thread acts as producer to parse the XML. A group of threads act as consumers. As the producer thread parses XML data, it stores the data in a shared data structure for the consumer threads to pick up for further processing. To maximize throughput and minimize memory usage, this design uses a special queue for the producer and the consumers to store and to retrieve parsed data, respectively.
The SmartQueue class provides the producer and consumer
threads with queuing functionality. The primary responsibility of the
SmartQueue is to bound the size of the queue to prevent
overflow and underflow. In other words, the SmartQueue
enforces a fixed-length queue policy to use resources efficiently. It
enforces this policy by holding and waking up the appropriate threads at the right
time. For instance, if there is no room to add data, the queue holds the
producer thread until a consumer thread removes an item from the queue.
The following code snippet from the SmartQueue shows the
implementation of this strategy:
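public synchronized void put(Object data) {
    // wait while the queue already holds its maximum of 2 items
    while (list.size() >= 2) {
        try {
            System.out.println("Waiting to put data");
            wait();
        } catch (InterruptedException ex) {
            // interruption is ignored here; the loop re-checks the condition
        }
    }
    list.add(data);
    // wake up consumer threads waiting for data
    notifyAll();
}

public synchronized Object take() {
    // wait until there is data to get;
    // come out if the end of file has been signaled
    while (list.size() <= 0 && !eof) {
        try {
            System.out.println("Waiting to consume data");
            wait();
        } catch (InterruptedException ex) {
            // interruption is ignored here; the loop re-checks the condition
        }
    }
    Object obj = null;
    if (list.size() > 0) {
        obj = list.remove(0);
    } else {
        // queue is empty and eof is set: null tells the caller there is no more data
        System.out.println("Woke up because end of document");
    }
    // wake up a producer thread waiting for room
    notifyAll();
    return obj;
}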
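The snippet above shows only put and take. The handler and consumer code later in this article also call end(), onEnd(), and isEmpty() on the queue, so the following is a minimal sketch of how a complete class along these lines might look. The class name SmartQueueSketch, the configurable capacity, and the choice to propagate InterruptedException to the caller are assumptions made for illustration, not the author's implementation.

```java
import java.util.LinkedList;

// Hypothetical sketch of a bounded, end-aware queue; not the article's actual class.
class SmartQueueSketch {
    private final LinkedList<Object> list = new LinkedList<>();
    private final int capacity; // assumption: the article hard-codes a length of 2
    private boolean eof = false;

    SmartQueueSketch(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be >= 1");
        }
        this.capacity = capacity;
    }

    // Blocks the producer while the queue is full.
    public synchronized void put(Object data) throws InterruptedException {
        while (list.size() >= capacity) {
            wait();
        }
        list.add(data);
        notifyAll(); // wake consumers waiting for data
    }

    // Blocks a consumer while the queue is empty and the producer is not done.
    // Returns null once the queue is drained and end() has been called.
    public synchronized Object take() throws InterruptedException {
        while (list.isEmpty() && !eof) {
            wait();
        }
        Object obj = list.isEmpty() ? null : list.removeFirst();
        notifyAll(); // wake a producer waiting for room
        return obj;
    }

    // Called by the producer after the last item has been put.
    public synchronized void end() {
        eof = true;
        notifyAll(); // release consumers blocked in take()
    }

    public synchronized boolean onEnd() {
        return eof;
    }

    public synchronized boolean isEmpty() {
        return list.isEmpty();
    }
}
```

Every method that reads or changes the list or the eof flag is synchronized on the same object, which is what makes the wait and notifyAll calls legal and keeps the state consistent across threads.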
This design uses the SAX API to parse the XML file for the following reasons:
The XMLParserHandler class is a SAX event handler that implements callback
methods to receive XML data from the parser. As the XMLParserHandler
receives data from the parser, it puts the data in a hashtable. At the end
of each document, the XMLParserHandler puts the hashtable in the
SmartQueue. The handler goes into a wait state if there is no
room in the SmartQueue. The call to the put method
completes once a consumer thread removes an item from the
SmartQueue. Upon completing the entire XML file, the
XMLParserHandler notifies the consumer threads to stop looking for
more documents.
Let's look at the callback methods that store data in the
SmartQueue and notify waiting consumer threads. The
startElement method instantiates a new hashtable for each document
in the XML file.
public void startElement(String namespaceURI, String localName,
                         String qName, Attributes atts)
        throws SAXException {
    System.out.println(
        " startElement local names............." +
        localName + " " + qName);
    if (qName.equalsIgnoreCase(elemmark)) {
        doc = new Hashtable();
    }
    elem = qName;
}
The endElement method is responsible
for adding the parsed data to the SmartQueue. As mentioned earlier,
the SmartQueue holds this thread until there is room to store the data.
public void endElement(String namespaceURI, String localName,
                       String qName)
        throws SAXException {
    String s = sbData.toString();
    System.out.println("element " + elem + " character " + s);
    // use short-circuit && so later checks run only when earlier ones pass
    if ((doc != null) && (s != null) && !(s.trim().equals("")))
        doc.put(elem, s);
    sbData = new StringBuffer();
    System.out.println(" endElement ending element............." + qName);
    if (qName.equalsIgnoreCase(elemmark)) {
        System.out.println(
            " endElement ending element............." + localName);
        // blocks until the SmartQueue has room for this document
        smartQueue.put(doc);
        doc = null;
    }
}
Finally, the endDocument callback method notifies the consumer
threads that the end of the XML file has been reached. This means that consumer threads do not have to wait
for more data before finishing their work.
public void endDocument() throws SAXException {
    smartQueue.end();
    System.out.println("End Document.............");
}
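The three callbacks above depend on a characters callback and on parser setup that the article does not show. The following sketch fills in those pieces under stated assumptions: the handler extends org.xml.sax.helpers.DefaultHandler, the parser comes from the JDK's SAXParserFactory, and finished records go into a plain list rather than the SmartQueue so that the example stays self-contained. The names RecordHandlerSketch, RecordHandlerDemo, recordElement, and the sample XML are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: builds one Hashtable per record element.
class RecordHandlerSketch extends DefaultHandler {
    private final String recordElement;                 // plays the role of elemmark
    private final List<Hashtable<String, String>> sink; // stand-in for the SmartQueue
    private final StringBuilder text = new StringBuilder();
    private Hashtable<String, String> doc;

    RecordHandlerSketch(String recordElement, List<Hashtable<String, String>> sink) {
        this.recordElement = recordElement;
        this.sink = sink;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (qName.equalsIgnoreCase(recordElement)) {
            doc = new Hashtable<>();
        }
        text.setLength(0); // drop whitespace seen before this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one element's text in several chunks.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String s = text.toString().trim();
        if (doc != null && !s.isEmpty()) {
            doc.put(qName, s);
        }
        text.setLength(0);
        if (qName.equalsIgnoreCase(recordElement)) {
            sink.add(doc); // the article's handler calls smartQueue.put(doc) here
            doc = null;
        }
    }
}

class RecordHandlerDemo {
    // Parses an in-memory XML string and returns one Hashtable per record element.
    static List<Hashtable<String, String>> parse(String xml, String recordElement)
            throws Exception {
        List<Hashtable<String, String>> out = new ArrayList<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                     new RecordHandlerSketch(recordElement, out));
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<orders><order><id>1</id><item>pen</item></order>"
                   + "<order><id>2</id><item>ink</item></order></orders>";
        System.out.println(parse(xml, "order"));
    }
}
```

Accumulating text in characters and reading it in endElement matters because SAX does not guarantee that an element's text arrives in a single call.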
Consumer threads remove items from the SmartQueue once the
producer thread puts items in the SmartQueue. Each consumer thread
will go into a wait state if the SmartQueue is empty. The
consumer threads run until the producer thread signals the end of document
processing and there are no more items in the SmartQueue.
Here is an example of a consumer thread implementation that keeps taking
data from the SmartQueue until the end of the document has been
signaled and no data remains.
public void run() {
    while (!queue.isEmpty() || !queue.onEnd()) {
        Hashtable val = (Hashtable) queue.take();
        if (val == null) {
            // another consumer took the last item and the end was signaled
            break;
        }
        System.out.println("Obtained by " + this.getName() + " " + val);
        // try {
        //     System.out.println("Simulate lengthy processing...........");
        //     Thread.sleep(2000);
        // }
        // catch(Exception ex){}
    }
}
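To see how one producer and several consumers fit together, the following sketch wires them up end to end. It is an illustration under assumptions rather than the article's test class: BoundedQueue is a compact hypothetical stand-in for the SmartQueue, the calling thread plays the role of the SAX handler by putting strings instead of hashtables, and the consumers stop as soon as take returns null.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

class ConsumerWiringSketch {
    // Compact hypothetical stand-in for the SmartQueue (capacity fixed at 2).
    static class BoundedQueue {
        private final LinkedList<Object> list = new LinkedList<>();
        private boolean eof;

        synchronized void put(Object o) throws InterruptedException {
            while (list.size() >= 2) {
                wait();
            }
            list.add(o);
            notifyAll();
        }

        synchronized Object take() throws InterruptedException {
            while (list.isEmpty() && !eof) {
                wait();
            }
            Object o = list.isEmpty() ? null : list.removeFirst();
            notifyAll();
            return o;
        }

        synchronized void end() {
            eof = true;
            notifyAll();
        }
    }

    // Runs one producer and the given number of consumers; returns everything consumed.
    static List<Object> run(int items, int consumers) throws InterruptedException {
        BoundedQueue queue = new BoundedQueue();
        List<Object> consumed = Collections.synchronizedList(new ArrayList<>());
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < consumers; i++) {
            Thread t = new Thread(() -> {
                try {
                    Object val;
                    // take() returns null only after end() and once the queue is drained
                    while ((val = queue.take()) != null) {
                        consumed.add(val);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "consumer-" + i);
            workers.add(t);
            t.start();
        }
        // The calling thread acts as the producer, standing in for the SAX handler.
        for (int i = 0; i < items; i++) {
            queue.put("record-" + i);
        }
        queue.end(); // corresponds to the endDocument callback
        for (Thread t : workers) {
            t.join();
        }
        return consumed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(10, 3).size() + " records consumed");
    }
}
```

Looping on the return value of take avoids the window between checking isEmpty and calling take, during which another consumer may remove the last item.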
This design provides the following benefits:
The SmartQueue implements a fixed-length queue policy to
use memory efficiently. By changing the implementation of its
take and put methods, you can enforce a different
policy. As mentioned earlier, the XMLParserHandler creates a hashtable of XML elements and values. However, this class can be customized to
build application-specific objects.
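As one illustration of a different policy, the sketch below bounds how long put and take may wait instead of blocking indefinitely. It swaps in a different technique from the article's hand-written wait and notifyAll: it delegates to the JDK's java.util.concurrent.ArrayBlockingQueue and its timed offer and poll methods. The class names TimedQueueSketch and TimedQueueDemo and the timeout parameter are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical alternative policy: fixed capacity with timed waits.
class TimedQueueSketch {
    private final BlockingQueue<Object> queue;
    private final long timeoutMillis;

    TimedQueueSketch(int capacity, long timeoutMillis) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.timeoutMillis = timeoutMillis;
    }

    // Returns false instead of blocking forever when no slot frees up in time.
    public boolean put(Object data) throws InterruptedException {
        return queue.offer(data, timeoutMillis, TimeUnit.MILLISECONDS);
    }

    // Returns null when nothing arrives within the timeout.
    public Object take() throws InterruptedException {
        return queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
    }
}

class TimedQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        TimedQueueSketch q = new TimedQueueSketch(2, 50);
        System.out.println(q.put("a")); // true
        System.out.println(q.put("b")); // true
        System.out.println(q.put("c")); // false: the queue is full and nothing is consuming
        System.out.println(q.take());   // a
    }
}
```

With this policy the producer has to decide what to do when put returns false, for example retrying or dropping the record, so it trades the simplicity of unconditional blocking for protection against a stalled consumer.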
The source.zip file contains a TestProducerConsumerForXML class that takes the XML file as a parameter and runs the application. Follow the instructions below to run the
application:
Run TestProducerConsumerForXML with
order.xml. For example:
c:\testarea>java -classpath c:\testarea prodcons.TestProducerConsumerForXML c:\testarea\prodcons\order.xml
This article has presented a method of parsing XML documents with concurrent programming. It has also explained the ideas behind the producer-consumer model, as well as thread coordination.
Prabu Arumugam is a software architect and senior Java developer at Forest Express, LLC.
Return to ONJava.com.