Managing a large software project is a challenge. Code is constantly being changed: bugs surface and are fixed, branches are created and merged — it's no small task to keep things coherent. It's even trickier if the developers are geographically distributed; if a development team keeps their code away from everyone else for a month or two, it's likely that when they check it in you'll enter integration hell, with painful merges, conflicting interfaces, and a bevy of dependency problems.
There's no silver bullet for this problem. However, one practice that may help keep things under control is continuous integration. That sounds nice, since the software has to be integrated at some point — but what does continuous integration mean practically, on a day-to-day basis?
UltraLog is a large Defense Advanced Research Projects Agency (DARPA) project. The purpose of the project is to "extend the open source COUGAAR cognitive agent architecture using a layered, integrated approach with technologies in robustness, security, stability, and scalability." More importantly for this article's purposes, the UltraLog project is written in Java, by developers from over a dozen companies distributed around the United States. We needed something to help avoid integration problems; we needed a status page. So we put together the "Dashboard."

Figure 1. The Dashboard
There's lots of stuff related to the UltraLog project that, while important in its own right, doesn't make it onto the Dashboard.
So, what does the Dashboard show?
That's an overview. But if you've put together an hourly build, you know that there are a lot of details involved. Let's look at some of the technical issues we encountered while hooking things together.
First of all, the whole process is driven by Ruby scripts and a Jakarta Ant build template. There's no room here to discuss those products, so I'll summarize by saying that Ruby is an excellent open source scripting language and that Ant is an excellent Java build tool.
The most important item on the Dashboard is compilation success/failure. If we can't compile the code, we can't do much else. So compilation status is displayed in color — a strategy that has added phrases like "[some project] is back in the green" to our lexicon. This allows someone to come to the Dashboard and see at a glance whose code is not integrating cleanly. It's an excellent motivational tool.
Technically, the Ant javac task takes care of the compilation. Once it's done and the report is written to an XML file, the Ruby script that drives the process parses that report, determines success or failure, and counts the number of deprecated methods. This allows a developer to see that while the code may have been compiled, there may be newer methods to use. It helps to reduce the integration load when new releases occur.
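As a rough illustration of that parsing step, here is a minimal sketch rather than the Dashboard's actual code. It assumes the report is an Ant XML build log in which a failed build carries an error attribute on the root build element, and that javac reports each deprecated call with a message containing "has been deprecated". The compile_summary helper and the sample log are made up for this example.

```ruby
require 'rexml/document'

# Hypothetical helper: summarize a compile from an Ant XML build log.
# Assumes a failed build is recorded as an "error" attribute on <build>.
def compile_summary(xml)
  doc = REXML::Document.new(xml)
  failed = !doc.root.attributes["error"].nil?
  deprecated = 0
  # Count javac messages that mention a deprecated API.
  doc.elements.each("build/target/task[@name='javac']/message") do |msg|
    deprecated += 1 if msg.text.to_s.include?("has been deprecated")
  end
  { "status" => (failed ? "red" : "green"), "deprecated" => deprecated }
end

# Made-up sample log, for illustration only.
sample = <<XML
<build time="3 seconds">
  <target name="compile">
    <task name="javac">
      <message priority="info">Compiling 2 source files</message>
      <message priority="warn">Foo.java:12: warning: getFoo() in Bar has been deprecated</message>
    </task>
  </target>
</build>
XML

p compile_summary(sample)
```

The "red" and "green" strings stand in for whatever the page template uses to color a project's row.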
Ikko is a little templating engine written in Ruby. It's fine for small projects that don't get heavy traffic. In the case of the Dashboard, the web page gets rebuilt once an hour, so template caching is not a requirement.
You can see examples of Ikko's simple operation on the Ikko home page. Here's an example of loading a file and plugging in a couple values:
#!/usr/local/bin/ruby
require 'ikko'
fm=Ikko::FragmentManager.new
fm.base_path="."
puts fm["people.html", {"name"=>"Fred", "age"=>"25"}]
Here's the HTML file for the above snippet:
<html>
<body>
<!--Fragment key="name"--> is <!--Fragment key="age"--> years old.
</body>
</html>
Specify a file name and a Ruby Hash object and you have templated HTML.
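To make the idea concrete, here is a small stand-in for the substitution step. It is not Ikko's implementation; fill_fragments is a hypothetical helper that assumes the marker syntax shown in the HTML file above and replaces each marker with the matching Hash value.

```ruby
# Hypothetical stand-in for the substitution step; this is not Ikko's code.
# Replaces each <!--Fragment key="..."--> marker with the matching Hash value,
# or with an empty string when the key is missing.
def fill_fragments(template, values)
  template.gsub(/<!--Fragment key="([^"]+)"-->/) { values.fetch($1, "") }
end

template = '<!--Fragment key="name"--> is <!--Fragment key="age"--> years old.'
puts fill_fragments(template, { "name" => "Fred", "age" => "25" })
```

Run against the template line above, this prints "Fred is 25 years old."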
JavaNCSS provides a command-line interface and an XML output format. We
used the Unix find utility to gather up a list of files:
$ find . -name "*.java" > files.txt
and then executed JavaNCSS with the XML flag:
$ javancss @files.txt -xml > report.xml
With the help of the Ruby REXML library, the result can be parsed in a line of Ruby:
ncss = (REXML::Document.new(File.new("report.xml"))).elements["ncss"].text
and then it's plugged into an HTML template for display in the final report page.
This process is representative of how the other items are handled. The Ruby script invokes an Ant target or a command-line tool, a report is generated, and the Ruby script parses the result and plugs it into the HTML page.
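The parse-and-plug-in half of that pattern might look something like the sketch below. The report content, the ncss_from and ncss_cell helpers, and the HTML markup are all invented for illustration. The XPath "//ncss" is used so that the lookup does not depend on the name of the report's root element, which is an assumption here.

```ruby
require 'rexml/document'

# Pull the NCSS total out of a JavaNCSS-style XML report.
# "//ncss" matches the first <ncss> element wherever it sits in the document.
def ncss_from(xml)
  node = REXML::Document.new(xml).elements["//ncss"]
  node ? node.text : nil
end

# Plug the value into an HTML snippet (hypothetical markup).
def ncss_cell(ncss)
  "<td>NCSS: #{ncss.nil? ? 'n/a' : ncss}</td>"
end

# Made-up report content, for illustration only.
report = "<javancss><ncss>1234</ncss></javancss>"
puts ncss_cell(ncss_from(report))
```

Returning nil when the element is missing lets the page show "n/a" instead of failing the whole hourly run over one bad report.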
Showing recent CVS history proved a bit tricky. In order to display commits
by branch, we had to use cvs-exp.pl, an open source Perl script
that wraps the output of the CVS log command. We further wrapped
that in a homegrown Ruby CGI script that allows the recent commit history to
be rendered to HTML.
Since we use CVS for revision tracking, there are numerous open source tools available to create reports. We use the StatCVS tool to generate charts and graphs of CVS history. Again, we use a small Ruby script to drive the report generation. Here's the line of code that runs StatCVS itself:
$ java -jar statcvs.jar -output-dir path/to/html/dir/ \
    project_name project_module/cvslog project_module
Since StatCVS comes in one .jar file, there are no dependencies to track. We run this report nightly, since it takes about 20 minutes to run on all our repositories.
PMD is a Java static analysis tool that checks for unused code, empty catch
blocks, and so forth. We run a subset of the standard PMD rules, and we've
also written a couple of custom rules to check for Thread
creation, Socket creation, and various other coding practices
that are not appropriate for this project. The documentation for the PMD Ant
task is straightforward, but one thing we found helpful was to always delete
the report file from the previous hour before generating a new one. That way,
if the code being checked goes from five errors to zero errors and no new file
is generated, the previous file won't linger around.
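The delete-before-generate idea can be sketched in a few lines of Ruby. This is only an illustration: the regenerate helper is hypothetical, and the block passed to it stands in for the real PMD run, which in our setup is an Ant task.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: remove last hour's report, run the generator, and
# report whether a fresh file appeared. The block stands in for the PMD run.
def regenerate(report_path)
  FileUtils.rm_f(report_path)   # no error if the file is already gone
  yield report_path
  File.exist?(report_path)
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "pmd_report.html")
  File.write(path, "stale report from the previous hour")
  # Simulate a run that finds nothing and therefore writes no file.
  fresh = regenerate(path) { |_p| }
  puts fresh ? "new report written" : "no report this hour; stale copy removed"
end
```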
The Dashboard Ruby script then parses the PMD HTML report and determines the number of errors by counting the table cells and dividing by the four cells in each row, as illustrated in this snippet:
# Each violation is one table row of four <td> cells.
cells = File.read("pmd_report.html").scan("<td ").size
ruleViolations = cells / 4
This result is then displayed on the front page and hyperlinked to the full report.
CPD is a Java duplicated-code checker that comes bundled with PMD. We run CPD to check for sequences of more than one hundred duplicate tokens — quite a few, considering that CPD discards whitespace, comments, and various uninteresting sequences like import and package statements. Since CPD has an Ant task, integrating it into the build was similar to integrating PMD.
Note that OnJava.com has published several articles on both PMD and CPD, so there's a lot of information out there on both tools.
JUnit is a popular Java unit testing tool. Some of the developers have begun to write JUnit tests for their code. To encourage this, we run those tests and post the results on the Dashboard. In order to standardize a bit, all tests are to be named by appending Test to the class name (e.g., FooTest), and placed in a separate, parallel directory tree. This lets the Ant task easily find the tests, and it keeps test code separate from the production code. After the tests are run and the results sent to an XML file via the JUnit Ant task's <formatter type="xml"/> element, the Ruby script parses out the number of tests passed/failed:
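require 'rexml/document'

# Add the JUnit totals found in the XML report to result.
def parseJunitFile(filename, result)
  doc = REXML::Document.new(File.new(filename))
  doc.elements.each("build/target/task/message[@priority='info']") do |info|
    # Summary lines look like "Tests run: 5, Failures: 1, Errors: 0, ..."
    if (info.text =~ /Tests run: /) != nil
      tmp = info.text.split
      result.testsTotal = result.testsTotal.to_i + tmp[2].to_i
      # Failures and errors both count as failed tests.
      result.testsFailed = result.testsFailed.to_i + tmp[4].to_i + tmp[6].to_i
    end
  end
end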
This allows the totals to be displayed neatly on the Dashboard.
Generating Javadocs is also a straightforward operation with Ant. It's a fairly time-consuming task, though, so we only run it every four hours. Note that Javadocs can consume a considerable amount of disk space; all of the Javadocs on the Dashboard together take up around 500 MB.
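One simple way to run different jobs at different frequencies from a single hourly script is to key off the hour of the day. The sketch below is an assumption about how that could be done, not a description of the Dashboard script; the job names and the 2 a.m. slot for the nightly run are invented.

```ruby
# Hypothetical scheduler: decide which jobs an hourly run should include.
# The job names and the 2 a.m. nightly slot are assumptions for illustration.
def jobs_for(hour)
  jobs = ["compile", "pmd", "cpd", "junit"]   # every hour
  jobs << "javadoc" if (hour % 4).zero?       # every four hours
  jobs << "statcvs" if hour == 2              # once a night
  jobs
end

puts jobs_for(Time.now.hour).join(", ")
```

Keeping the slow jobs out of most runs means the compile status, which is what people check first, is never held up by Javadoc generation.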
We've found it handy to build an hourly drop of the class files and source
files in case someone wants to browse or run the latest code without checking
it out and compiling it. Since the code has to be compiled anyway, creating
these .jars is a simple matter of using the Ant zip task:
<target name="srczip" if="tic.build.usesSrcZip">
  <delete file="${tic.build.srcZip}"/>
  <zip destfile="${tic.build.srcZip}" basedir="${tic.build.srcDirForZipFile}"
       includes="**/*.java"/>
</target>
<target name="jar" depends="compile">
  <jar jarfile="${tic.build.jarFile}" baseDir="${buildDir}"/>
  <signjar jar="${tic.build.jarFile}" keystore="/var/build/signingCA_keystore"
           alias="privileged" storepass="keystore"/>
</target>
Note the dependency in the jar target; there's no need to attempt
to jar things up if the compilation step fails.
What else could be added to the hourly build page? In some projects, a test coverage report (a report on the percentage of the code that the unit tests actually cover) has been found useful. Several tools exist to provide such a report — Clover comes to mind. Of course, such a report isn't very useful unless a decent number of unit tests have been written.
Folks who are familiar with the Jakarta open source build tool Maven may notice some similarities. It might be possible to use Maven to do some of the things the Dashboard does, but Maven was not very far along when we first began putting the Dashboard together. It might be worth revisiting Maven to see if that's possible now.
We've discussed some things that can make a large Java project hard to manage. We've looked at one large Java project, UltraLog, and how an hourly build status page helped keep things under control. We've also done a quick overview of some open source tools that you may find to be a useful part of your hourly build page. Give them a try!
Thanks to all of the folks who have donated their time and energy towards the various open source tools mentioned in this article.
Tom Copeland started programming on a TRS-80 Model III, but demand for that skill has waned and he now programs mostly in Java and Ruby.
Return to ONJava.com.
Copyright © 2009 O'Reilly Media, Inc.