Apple's QuickTime turns 12 this year. Its very extensible file format has contributed to this longevity, allowing QuickTime to migrate from a world of CD-ROMs, AppleTalk, and static content to today's massively-networked, streaming, interactive world. The format is so flexible that it was chosen as the basis of the MPEG-4 file format. More than one might expect, the philosophy and concepts of the file format are integral to working with QuickTime structures at runtime.
However, the QuickTime APIs do much to isolate developers from the nuts-and-bolts of the file format when doing the most common tasks. So we'll first examine the format with a simple pure-Java QuickTime file format parser, then use some QuickTime for Java code to generate different kinds of QuickTime files to illustrate the format's flexibility.
The details of the format are readily available in the 351-page Inside QuickTime: QuickTime File Format (PDF). They are also installed--for Mac OS X developers--in /Developer/Documentation/QuickTime/qtdevdocs/PDF/QTFileFormat.pdf by the Developer Tools installer.
The heart and soul of QuickTime is the concept of the "atom." The name should remind you of high-school chemistry, where an atom was the smallest unit of an element that retained the properties of the element. In QuickTime, an atom is the lowest level to which we can go and still be able to tell the difference between, say, an edit-list and a sprite. All atoms have a size and a type. Any other information they may contain depends on their type. This concept helps forwards-compatibility in the format--it's easy to skip over an unknown type because the size is right there.
There's a difference between "classic" atoms and newer "QT" atoms, but the latter is backwards-compatible with the former and both are commonly encountered in a single file. Let's focus on the commonalities. All atoms have a header of either 8 or 16 bytes, consisting of either two or three parts:
Size: a 4-byte unsigned integer giving the total length of the atom in bytes, header included.
Type: a 4-byte value, conventionally read as four ASCII characters such as moov, though any value is valid.
Extended size: if the size field is 1, then this field is present and interpreted as an 8-byte unsigned integer. This allows an atom to contain more than 4 GB of data.
The sample code contains a simple example in the EmptyMovie.mov
file, which is just an untitled movie created in QuickTime Player and saved
without modification. Open it in hexdump, od, or your
favorite hex editor (I'm fond of HexEdit for the Mac). If you dump the output as characters (e.g., hexdump -cv EmptyMovie.mov), the atom
types practically jump out at you:
\0 \0 \0 214 m o o v \0 \0 \0 l m v h d
\0 \0 \0 \0 272 @ Q 352 272 @ Q 372 \0 \0 002 X
\0 \0 \0 \0 \0 001 \0 \0 \0 377 \0 \0 \0 \0 \0 \0
\0 \0 \0 \0 \0 001 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
\0 \0 \0 \0 \0 001 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
\0 \0 \0 \0 @ \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
\0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
\0 \0 \0 001 \0 \0 \0 030 u d t a \0 \0 \0 \f
W L O C \0 4 \0 030 \0 \0 \0 \0
If we look at the byte values instead, and carefully count the sizes of the
atoms, we can see the structure of the movie. Figure 1 shows a graphic
representation. In case you're not comfortable reading hex, the file starts
with the size and type of the first atom, an 0x8c-long
moov, which matches the file size. It contains a
0x6c-long mvhd, which has a few non-null bytes. The
moov's other child is a udta of size
0x18, which itself contains a WLOC of size
0x0c.
Figure 1--graphic map of atoms in EmptyMovie.mov
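To make the header layout concrete, here is a minimal sketch in plain Java that decodes one atom header from a byte array. The class and method names are invented for this illustration and are not part of the sample code's parser; the input in main is simply the first eight bytes of EmptyMovie.mov as shown in the dump.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the article's sample code.
class AtomHeaderSketch {
    final long size;        // total atom size in bytes, header included
    final String type;      // four-character type, e.g. "moov"
    final int headerLength; // 8, or 16 when an extended size is present

    AtomHeaderSketch(long size, String type, int headerLength) {
        this.size = size;
        this.type = type;
        this.headerLength = headerLength;
    }

    static AtomHeaderSketch read(byte[] data, int offset) {
        // ByteBuffer reads big-endian by default, which matches QuickTime.
        ByteBuffer buf = ByteBuffer.wrap(data, offset, data.length - offset);
        // Mask so the 32-bit size is treated as unsigned.
        long size = buf.getInt() & 0xFFFFFFFFL;
        byte[] typeBytes = new byte[4];
        buf.get(typeBytes);
        String type = new String(typeBytes, StandardCharsets.ISO_8859_1);
        int headerLength = 8;
        if (size == 1) {
            // A size of 1 means the real size follows as a 64-bit value.
            size = buf.getLong();
            headerLength = 16;
        }
        return new AtomHeaderSketch(size, type, headerLength);
    }

    public static void main(String[] args) {
        byte[] start = {0x00, 0x00, 0x00, (byte) 0x8c, 'm', 'o', 'o', 'v'};
        AtomHeaderSketch header = read(start, 0);
        System.out.println(header.type + " (" + header.size + " bytes)");
    }
}
```

Run against those eight bytes, this prints moov (140 bytes), which agrees with the 0x8c size read off the dump.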
Little things to notice:
The moov and udta atoms contain other atoms, and
don't seem to do anything besides contain atoms. This is a key trait of
QuickTime atoms--they either contain data or other atoms, never both.
That's different from other tree-structured data formats like XML, where an
element can have both attributes and child elements.
See the 0x00000000 that's in the udta but follows
the WLOC? Depending on your mood, it's a bug or a feature. Apple
says that they write an extra 32 bits of zero after the last child of a
udta atom to maintain compatibility with a bug from way back in
QuickTime 1.0.
If your hex dump begins with 0000 8c00 6f6d 766f, then
you're running on Windows. QuickTime data structures are defined as "big-endian,"
meaning that the most-significant byte of a two-byte value comes first. PCs
running Windows use little-endian ordering, so the bytes appear backwards when
you look at 16-bit values.
The file does not open with a format signature, unlike the CAFEBABE
"magic number" that begins Java class files or the ID3
sequence that typically begins an ID3-tagged
MP3 file.
What does all this say anyway? The file-format docs define the contents of
each of the "leaf" atoms, so we look there to interpret the
mvhd and WLOC atoms. Since this is a minimal movie,
there's not much to see--the mvhd is a "movie header;" a
structure that defines some metadata values like creation time, preferred
volume, time-scale, et cetera. These defaults are saved into the file. The next
atom is user data, udta, a container for an arbitrarily long list
of metadata atoms. This is a good place to put your own data into the movie,
with whatever format suits you, so long as you choose an unused atom type and
don't use all-lower-case, which is reserved for Apple. Here, there is only one
piece of user data, the window location, WLOC. It contains two
16-bit unsigned ints for x and y, in this case
(0x34,0x18) or in decimal,
(52,24).
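As a small worked example of reading a leaf atom's payload, the sketch below decodes the four data bytes of the WLOC atom from the dump (00 34 00 18). The class name is hypothetical, and the little-endian call is included only to show what a reader with the wrong byte order would see.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration; not part of the article's sample code.
class WlocSketch {
    // Reads two unsigned 16-bit values (x, y) from a 4-byte WLOC payload.
    static int[] decode(byte[] payload, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.wrap(payload).order(order);
        int x = buf.getShort() & 0xFFFF; // mask to treat the short as unsigned
        int y = buf.getShort() & 0xFFFF;
        return new int[] {x, y};
    }

    public static void main(String[] args) {
        byte[] payload = {0x00, 0x34, 0x00, 0x18};
        int[] big = decode(payload, ByteOrder.BIG_ENDIAN);
        int[] little = decode(payload, ByteOrder.LITTLE_ENDIAN);
        System.out.println("(" + big[0] + "," + big[1] + ")");
        System.out.println("(" + little[0] + "," + little[1] + ")");
    }
}
```

The big-endian read gives (52,24); the little-endian read gives (13312,6144), which is why it matters that QuickTime fixes the byte order in the format itself.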
While QuickTime for Java generally isolates you from the grubby details of
the format, I've included a simple all-Java QuickTime file parser so we can
quickly see the structure of a movie file on any J2SE platform. Download the
accompanying source tarball and open it up. The parser source and a
pre-compiled .jar are in the atom-parse directory. An Ant build.xml file is included to help you build the code, if you're interested (do ant help to see the available targets), or you can just run it from the .jar with java
-classpath atomparse.jar com.mac.invalidname.qtatomparse.AtomParser.
The code starts with a basic ParsedAtom class, which represents
any atom found in the file. This is subclassed as
ParsedContainerAtom, containing an array of its children, and
ParsedLeafAtom, which is meant to be a parent for type-specific
subclasses that interpret particular atom types. A factory provides the parser
with the class for a given type--new classes can be added by editing its
properties file. Finally, AtomParser puts it all together,
recursively calling a parseAtoms method when it discovers a
container atom, and returning an array of children.
Here's the critical section for reading an atom's size, type, extended size,
and data, given raf (a RandomAccessFile),
off (current offset that we're reading; i.e., start of an atom), and
stopAt (where the parent atom or file ends).
while (off < stopAt) {
    raf.seek (off);

    // 1. first 32 bits are atom size
    // use BigInteger to convert bytes to long
    // (instead of signed int); the signum argument of 1
    // makes BigInteger treat the bytes as an unsigned magnitude
    int bytesRead = raf.read (atomSizeBuf, 0,
                              atomSizeBuf.length);
    if (bytesRead < atomSizeBuf.length)
        throw new IOException ("couldn't read atom length");
    BigInteger atomSizeBI = new BigInteger (1, atomSizeBuf);
    long atomSize = atomSizeBI.longValue();

    // this is kind of a hack to handle the udta problem
    // (see below) when the parent didn't have children,
    // meaning we've read 4 bytes of 0 and the parent atom
    // is already over
    if (raf.getFilePointer() == stopAt)
        break;

    // 2. next, the atom type
    bytesRead = raf.read (atomTypeBuf, 0,
                          atomTypeBuf.length);
    if (bytesRead != atomTypeBuf.length)
        throw new IOException ("Couldn't read atom type");
    String atomType = new String (atomTypeBuf);

    // 3. if atomSize was 1, then this is 64-bit ext size
    if (atomSize == 1) {
        bytesRead = raf.read (extendedAtomSizeBuf, 0,
                              extendedAtomSizeBuf.length);
        if (bytesRead != extendedAtomSizeBuf.length)
            throw new IOException (
                "Couldn't read extended atom size");
        BigInteger extendedSizeBI =
            new BigInteger (1, extendedAtomSizeBuf);
        atomSize = extendedSizeBI.longValue();
    }

    // if this atom size is negative, or extends past end
    // of file, it's extremely suspicious (i.e., we're not
    // really in a quicktime file)
    if ((atomSize < 0) ||
        ((off + atomSize) > raf.length()))
        throw new IOException (
            "atom has invalid size: " + atomSize);

    // 4. if a container atom, then parse the children
    ParsedAtom parsedAtom = null;
    if (ATOM_CONTAINER_TYPES.contains (atomType)) {
        // children run from current point to end of the atom
        ParsedAtom [] children =
            parseAtoms (raf, raf.getFilePointer(), off + atomSize);
        parsedAtom =
            new ParsedContainerAtom (atomSize, atomType, children);
    } else {
        parsedAtom =
            AtomFactory.getInstance().createAtomFor (
                atomSize, atomType, raf);
    }

    // add atom to the list
    parsedAtomList.add (parsedAtom);

    // now set offset to next atom (or end-of-file
    // in the special case where atomSize = 0, which
    // means the atom goes to EOF)
    if (atomSize == 0)
        off = raf.length();
    else
        off += atomSize;

    // if a 'udta' container atom, then jump ahead 4
    // to work around Apple's QT 1.0 workaround
    // (http://developer.apple.com/technotes/qt/qt_03.html )
    if (atomType.equals("udta"))
        off += 4;
} // while not at stopAt
A few caveats to this code. First, please excuse my abuse of the
BigInteger class to get longs from four-byte arrays,
but the alternative is a blinding amount of bit-shifting. Moreover, the reason
I use longs for atom sizes is that it usually avoids signing
problems (32-bit Java ints are signed, while the usual QuickTime
atom size is a 32-bit unsigned value). However, it will be wrong if
you happen to encounter an atom larger than 9,223,372,036,854,775,807 bytes
(i.e., a 64-bit integer with the top bit set). Just thought I'd mention that, in
case you just got back from the store with a 10 exabyte drive. Also,
my scheme for knowing what atoms are containers is to list known containers in
AtomParser. If I've missed one, the parser handles it fairly
gracefully, because we have the size of the atom and simply advance the offset
to the next atom (unfortunately, without parsing the children).
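For comparison, here is roughly what the bit-shifting alternative looks like once it is wrapped in a helper. This is only a sketch and not what the sample parser does; readUnsignedInt32 is a name made up for this example. Masking each byte with 0xFFL promotes it to a non-negative long before shifting, so sizes of 2 GB and up stay positive.

```java
// Hypothetical helper for illustration; not part of the article's sample code.
class UnsignedReadSketch {
    // Assembles four big-endian bytes into an unsigned 32-bit value held in a long.
    static long readUnsignedInt32(byte[] buf, int offset) {
        return ((buf[offset]     & 0xFFL) << 24)
             | ((buf[offset + 1] & 0xFFL) << 16)
             | ((buf[offset + 2] & 0xFFL) << 8)
             |  (buf[offset + 3] & 0xFFL);
    }

    public static void main(String[] args) {
        byte[] moovSize = {0x00, 0x00, 0x00, (byte) 0x8c};
        byte[] maxSize = {(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff};
        System.out.println(readUnsignedInt32(moovSize, 0));
        System.out.println(readUnsignedInt32(maxSize, 0));
    }
}
```

The two calls print 140 and 4294967295, the latter being the largest size a plain 32-bit size field can hold.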
Here's the output when we run the parser on EmptyMovie.mov:
moov (140 bytes) - 2 children
  mvhd (108 bytes)
  udta (24 bytes) - 1 child
    WLOC (12 bytes) (x,y) == (52,24)
So far, so boring. Let's try a more interesting bit of content. The movie tim-drm-ref.mov is a 45-second sound bite of Tim O'Reilly discussing digital rights management at the recent O'Reilly Mac OS X conference. The file is a reference to a 51 MB movie of the entire keynote panel, yet this file is a dainty 6 KB, since it consists entirely of metadata, including the references to the original movie on the O'Reilly web site.
This file's structure is a lot more involved:
moov (5957 bytes) - 4 children
  mvhd (108 bytes)
  trak (3951 bytes) - 4 children
    tkhd (92 bytes)
    edts (36 bytes) - 1 child
      elst (28 bytes) [1 edit]
    mdia (3803 bytes) - 3 children
      mdhd (32 bytes)
      hdlr (58 bytes) [mhlr/vide - Apple Video Media Handler]
      minf (3705 bytes) - 4 children
        vmhd (20 bytes)
        hdlr (55 bytes) [dhlr/url - Apple URL Data Handler]
        dinf (76 bytes) - 1 child
          dref (68 bytes)
        stbl (3546 bytes) - 6 children
          stsd (102 bytes)
          stts (24 bytes)
          stss (216 bytes)
          stsc (172 bytes)
          stsz (2248 bytes)
          stco (776 bytes)
    udta (12 bytes) - 0 children
  trak (1857 bytes) - 4 children
    tkhd (92 bytes)
    edts (36 bytes) - 1 child
      elst (28 bytes) [1 edit]
    mdia (1709 bytes) - 3 children
      mdhd (32 bytes)
      hdlr (58 bytes) [mhlr/soun - Apple Sound Media Handler]
      minf (1611 bytes) - 4 children
        smhd (16 bytes)
        hdlr (55 bytes) [dhlr/url - Apple URL Data Handler]
        dinf (76 bytes) - 1 child
          dref (68 bytes)
        stbl (1456 bytes) - 5 children
          stsd (132 bytes)
          stts (24 bytes)
          stsc (880 bytes)
          stsz (20 bytes)
          stco (392 bytes)
    udta (12 bytes) - 0 children
  udta (33 bytes) - 2 children
    WLOC (12 bytes) (x,y) == (83,93)
    SelO (9 bytes)
This file is far more typical of what we expect to see in a movie, or more
accurately, in a moov (go ahead, say it out loud:
moo-vee). In addition to the metadata-bearing mvhd movie
header and the udta user data, there are two trak
atoms, both with a deep, yet similar, structure. This movie consists of two
"tracks," one for video and one for audio. Tracks store metadata in
the tkhd track header (analogous to the mvhd we saw
earlier), an "edits" structure that indicates what parts of the
underlying media are used by the track, and a detailed "media"
structure.
The media structure has, again, a metadata header, a hdlr
handler atom that indicates which component should handle the media data, a
"data information" structure made up of dref data
references to say where the media data is (in this file, elsewhere on disk, on
the net, etc.), and finally, a tricky structure for locating and interpreting
media samples.
It's too much to try to understand what all of these atoms represent right away
if you're new to QuickTime, but it might be helpful to look at Apple's Introduction
to QuickTime tutorial, specifically the section on tracks
and media, and see how the contents map fairly directly onto the structure
presented in the preceding two paragraphs. Another point of interest is
Ridgeworks' QTatomizer,
a shareware product that represents the atom structure of a QuickTime movie as
a Swing JTree.
|
You might well wonder if all this stuff is really necessary. After all, MPEG-1 and MPEG-2 don't have a particular file format at all, and they seem pretty popular. What does all of this fanciness gain us?
Consider the power of storing media data by reference. Let's say you're writing an audio or video editor. Your user has selected a big segment of media from a file and wants to copy it from the source movie and paste it into a new one. Do you read all that data from disk? Media files are big, so that's going to take a while. Worse yet, if you can't store it all in memory, are you going to turn around and write it to a scratch movie? Great, copy-and-paste now requires copying hundreds of megabytes--even with fast hard drives, your user will be annoyed (and really unpleasant, if you fill the drive). Consider what QuickTime provides instead: the ability to refer to that source media and an edit list to say what parts of that source we want. The copy and paste is practically instantaneous--we just store pointers.
That's part of the thinking that led MPEG-4 to adopt the QuickTime file format. As Carsten Herpel, Guido Franceschini, and David Singer write in The MPEG-4 Book:
The MPEG committee sought a life-cycle format--one in which the files could be used when capturing media, editing it, and combining it; when serving the media as a file download or as a stream; and when exchanging partial or complete presentations. This need for a life-cycle format is not met in many simple file format designs. For example ... the design approach of MPEG-2, in which a stream is simply recorded to a file, makes editing hard. (pp. 253-4)
Beyond the issues of handling audio and video, consider the scope of MPEG-4, which, in its various permutations, can incorporate 2D and 3D graphics, compositing of captured video with rendered graphics, a Java API ("MPEG-J") for writing interactive applications to be delivered inside a movie or stream, etc. To support all of that, the format needs to be extremely extensible. With the ability to define new structures as new atom types, QuickTime fits the bill.
To learn more about MPEG-4, start at the MPEG-4 Industry Forum. Let's cut to the chase and let our parser take a look at some MPEG-4 content. Envivio, which makes MPEG-4 software, has a handy page of MPEG-4 samples from various sources. A few that I find amusing are the Philips television commercials. Here's what the 800K "CD-R Dinner" commercial looks like when we let our parser have a look at it:
ftyp (16 bytes)
skip (16 bytes)
mdat (2918834 bytes)
moov (46140 bytes) - 6 children
  mvhd (108 bytes)
  trak (469 bytes) - 3 children
    tkhd (92 bytes)
    mdia (337 bytes) - 3 children
      mdhd (32 bytes)
      minf (264 bytes) - 3 children
        dinf (36 bytes) - 1 child
          dref (28 bytes)
        stbl (208 bytes) - 6 children
          stts (24 bytes)
          stsd (84 bytes)
          stsz (20 bytes)
          stsc (28 bytes)
          stco (20 bytes)
          ctts (24 bytes)
        nmhd (12 bytes)
      hdlr (33 bytes) [/odsm - ]
    tref (32 bytes) - 1 child
      mpod (24 bytes)
  trak (449 bytes) - 2 children
    tkhd (92 bytes)
    mdia (349 bytes) - 3 children
      mdhd (32 bytes)
      minf (276 bytes) - 3 children
        dinf (36 bytes) - 1 child
          dref (28 bytes)
        stbl (220 bytes) - 6 children
          stts (24 bytes)
          stsd (96 bytes)
          stsz (20 bytes)
          stsc (28 bytes)
          stco (20 bytes)
          ctts (24 bytes)
        nmhd (12 bytes)
      hdlr (33 bytes) [/sdsm - ]
  trak (5855 bytes) - 2 children
    tkhd (92 bytes)
    mdia (5755 bytes) - 3 children
      mdhd (32 bytes)
      minf (5682 bytes) - 3 children
        dinf (36 bytes) - 1 child
          dref (28 bytes)
        stbl (5622 bytes) - 6 children
          stts (32 bytes)
          stsd (118 bytes)
          stsz (5200 bytes)
          stsc (172 bytes)
          stco (68 bytes)
          ctts (24 bytes)
        smhd (16 bytes)
      hdlr (33 bytes) [/soun - ]
  trak (39209 bytes) - 2 children
    tkhd (92 bytes)
    mdia (39109 bytes) - 3 children
      mdhd (32 bytes)
      minf (39036 bytes) - 3 children
        dinf (36 bytes) - 1 child
          dref (28 bytes)
        stbl (38972 bytes) - 8 children
          stts (5312 bytes)
          stsd (196 bytes)
          stsz (3628 bytes)
          stsc (544 bytes)
          stco (192 bytes)
          ctts (24 bytes)
          stss (108 bytes)
          uuid (28960 bytes)
        vmhd (20 bytes)
      hdlr (33 bytes) [/vide - ]
  iods (42 bytes)
skip (37 bytes)
Similar structure, but some significantly different contents. Here are some key differences worth noting:
There are top-level atoms other than moov. Some are
trivial (skip is a placeholder for free space in the file), but
mdat contains the raw media data for this movie. Our earlier
examples referred to media outside the movie file. This is the first time our
parser has seen a self-contained movie.
QuickTime files often have an atom called wide that's
used like the first skip in this file, right before an
mdat or other potentially huge atom. It's a placeholder in case
the atom grows large enough to require an extended size, which means it would
need another 8 bytes of header.
There are four tracks: the familiar video and sound tracks (with
vmhd video media header and smhd sound media header
atoms, and associated handlers of subtypes vide and
soun), and two new MPEG-4-only tracks that have nmhd
headers. The handlers have subtypes odsm and sdsm.
There's another MPEG-4-only atom, the "initial object descriptor" or
iods. These MPEG-4 extensions are not defined in the QuickTime
spec, but that's okay. We don't trip up parsing them because they're still
normal atoms with a type and size.
Now that we've toured the format and exposed ourselves to the parsing from
which QuickTime for Java isolates us (with calls like
Movie.fromFile()), we'll turn our attention to writing files. We
can write different kinds of QuickTime files, depending on our
particular needs for an application.
The following code assumes that you have downloaded and installed the QuickTime for Java SDK
on your Mac or Windows machine (apologies, as always, to developers using
operating systems not supported by QuickTime). Because we'll want to use
MPEG-4, please make sure you have QuickTime 6. Also, while the sample code
includes an Ant build.xml
file, you'll need to copy my.ant.properties.mac or
my.ant.properties.win to my.ant.properties and
possibly edit it so that its qtjavazip.file entry points to
QTJava.zip on your system. Curiously, while the QTJ classes are
found in your Java extensions directory when running an application, they need
to be put in the CLASSPATH explicitly for a compile. Equivalent
caveats apply if you're using make or your favorite IDE.
On the other hand, if you just want to run the code, running java
-classpath makemovies.jar com.mac.invalidname.makemovies.MovieMaker
should work fine, with one more caveat--you must use Java 1.3 on the Mac,
because Apple is eliminating the JDirect library used by QuickTime for Java in
its upcoming Java 1.4 implementation and generally advises against calling
Carbon code from their Java 1.4. (This issue is a moving target and the 1.4
implementation is NDA'd, but here's
the java-dev post announcing the policy and a follow-up
with more details.)
The sample MovieMaker class creates a Movie in
memory composed of references to another movie, saving variants of this movie
to disk. The movie is created with low-level
edits, meaning functions that work with segments of a movie defined by
starting time and duration. To keep things simple, our movie consists of
three five-second segments grabbed from the beginning, middle, and end of
another movie:
// figure out start points for 5-second segments at
// approximate beginning, middle, and end of movie
int scale = sourceMovie.getTimeScale();
int end = sourceMovie.getDuration();
int fiveSeconds = 5 * scale;
int[] startTimes = {0,                  // beginning
                    end / 2,            // middle
                    end - fiveSeconds}; // end

// insert 5-second segments from sourceMovie into
// refMovie
int fiveSecRefTime = 5 * refMovie.getTimeScale();
for (int i = 0; i < startTimes.length; i++) {
    sourceMovie.insertSegment (refMovie,
                               startTimes[i],
                               fiveSeconds,
                               i * fiveSecRefTime);
}
With that, we have a 15-second movie, which the demo app plays in a
QTCanvas. Now to save it to disk.
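QuickTime times are integer counts in a movie's own time scale (units per second), which is why the segment length is computed separately for the source and the destination movie. The plain-Java sketch below reproduces just that arithmetic with no QuickTime calls; the class name, the method name, and the sample values (a scale of 600 and a 100-second source) are assumptions chosen for illustration.

```java
// Hypothetical helper for illustration; it mirrors the arithmetic only.
class SegmentTimesSketch {
    // Returns one row per segment: {sourceStart, sourceDuration, destinationStart}.
    static int[][] plan(int sourceScale, int sourceDuration,
                        int destScale, int seconds) {
        int sourceLength = seconds * sourceScale; // segment length in source units
        int destLength = seconds * destScale;     // same length in destination units
        int[] starts = {0,                              // beginning
                        sourceDuration / 2,             // middle
                        sourceDuration - sourceLength}; // end
        int[][] rows = new int[starts.length][];
        for (int i = 0; i < starts.length; i++) {
            rows[i] = new int[] {starts[i], sourceLength, i * destLength};
        }
        return rows;
    }

    public static void main(String[] args) {
        // 100-second source at 600 units/second, destination at 1000 units/second
        for (int[] row : plan(600, 60000, 1000, 5)) {
            System.out.println(row[0] + " " + row[1] + " " + row[2]);
        }
    }
}
```

With those values the three rows are 0 3000 0, 30000 3000 5000, and 57000 3000 10000: the same five seconds is 3000 units in the source and 5000 in the destination.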
If you were just combing over the javadocs, you might be tempted to use the
convertToFile method in the Movie class. It's fairly
straightforward, just needing the file and some constants for file-type, Mac
file "creator," and a Mac ScriptManager. The downside here is that
the generated file has uncompressed audio, and video barely compressed with
Apple's "Video" codec. Still, take a look at it with our atom parser
and we've got a normal-looking self-contained movie:
moov (2732 bytes) - 3 children
  mvhd (108 bytes)
  trak (631 bytes) - 3 children
    tkhd (92 bytes)
    edts (36 bytes) - 1 child
      elst (28 bytes) [1 edit]
    mdia (495 bytes) - 3 children
      mdhd (32 bytes)
      hdlr (58 bytes) [mhlr/soun - Apple Sound Media Handler]
      minf (397 bytes) - 4 children
        smhd (16 bytes)
        hdlr (57 bytes) [dhlr/alis - Apple Alias Data Handler]
        dinf (36 bytes) - 1 child
          dref (28 bytes)
        stbl (280 bytes) - 5 children
          stsd (52 bytes)
          stts (24 bytes)
          stsc (40 bytes)
          stsz (20 bytes)
          stco (136 bytes)
  trak (1985 bytes) - 3 children
    tkhd (92 bytes)
    edts (36 bytes) - 1 child
      elst (28 bytes) [1 edit]
    mdia (1849 bytes) - 3 children
      mdhd (32 bytes)
      hdlr (58 bytes) [mhlr/vide - Apple Video Media Handler]
      minf (1751 bytes) - 4 children
        vmhd (20 bytes)
        hdlr (57 bytes) [dhlr/alis - Apple Alias Data Handler]
        dinf (36 bytes) - 1 child
          dref (28 bytes)
        stbl (1630 bytes) - 6 children
          stsd (102 bytes)
          stts (40 bytes)
          stss (56 bytes)
          stsc (364 bytes)
          stsz (824 bytes)
          stco (236 bytes)
free (16 bytes)
wide (8 bytes)
mdat (5059930 bytes)
While we're here, let's note another seemingly-useful-but-probably-not
method: createShortcutMovie in the QTFile class.
You'd be forgiven for thinking that this creates a movie that preserves our
references to the media in the original movie. Not even close--take a look at
it with the atom-parser:
moov (254 bytes) - 1 child
  mdra (246 bytes)
    dref (238 bytes)
In other words, a "shortcut" movie is something of a QuickTime analogue to a file alias or symbolic link.
So far, none of these methods have given us a way to specify that we'd like
to state (and possibly change) the encoding or format of the saved movie.
That's the realm of the MovieExporter, which writes a movie in a
particular format with our choice of audio and video codecs. The code isn't
hard to understand: get an exporter for a particular format, bring up a dialog
for the user to specify encoding and quality settings, and let the exporter get
to work.
What can be tricky is getting a MovieExporter. The
list of available exporters is variable, depending on the user's version and
what optional pieces of QuickTime they have installed. One technique is to call
the MovieExporter with an int constant:
MovieExporter me =
    new MovieExporter (StdQTConstants.kQTFileTypeMovie);
This creates an exporter to create typical QuickTime .movs.
You can also use the hex value 0x6d706734 to get an MPEG-4
exporter in QuickTime 6. In case you were wondering, that int is
the string mpg4 in ASCII. Passing short strings as 32-bit
ints is very common in the QuickTime API.
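The packing of four characters into an int is easy to check for yourself. The sketch below is plain Java with invented names (FourCharCodeSketch, toInt, fromInt); QuickTime for Java may well offer its own utility for this, but none is assumed here.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not a QuickTime for Java class.
class FourCharCodeSketch {
    // Packs a four-character ASCII code into a big-endian 32-bit int.
    static int toInt(String code) {
        byte[] b = code.getBytes(StandardCharsets.US_ASCII);
        if (b.length != 4) {
            throw new IllegalArgumentException("need exactly four characters");
        }
        return ((b[0] & 0xFF) << 24)
             | ((b[1] & 0xFF) << 16)
             | ((b[2] & 0xFF) << 8)
             |  (b[3] & 0xFF);
    }

    // Unpacks a 32-bit int back into its four ASCII characters.
    static String fromInt(int value) {
        byte[] b = {(byte) (value >>> 24), (byte) (value >>> 16),
                    (byte) (value >>> 8), (byte) value};
        return new String(b, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(toInt("mpg4")));
        System.out.println(fromInt(0x6d6f6f76));
    }
}
```

This prints 6d706734 and moov, confirming the hex value quoted above and showing that atom types use the same encoding.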
What if you want to offer the user the ability to export to a format that
might be a post-install add-on, or that might be included in a future version
of QuickTime? For this, the MovieExporter has a second
constructor, one that takes a ComponentIdentifier as its argument.
To find a suitable ComponentIdentifier, we can iterate through the
installed components, with ComponentIdentifier.find(), looking for
those that have type "spit," which is provided as the constant
StdQTConstants.movieExportType. The sample code produces a dialog
of the discovered choices, modestly validating those that are actually
appropriate for exporting our movie:
// build up a list of exporters and let user choose one
Vector compIdentifiers = new Vector();
ComponentIdentifier ci = null;
ComponentDescription cd =
    new ComponentDescription (StdQTConstants.movieExportType);
while ( (ci = ComponentIdentifier.find (ci, cd)) != null) {
    // check to see that the movie can be exported
    // with this component (this throws some obnoxious
    // exceptions, maybe a bit expensive?)
    try {
        MovieExporter exporter = new MovieExporter (ci);
        if (exporter.validate (movie, null))
            compIdentifiers.addElement (ci);
    } catch (StdQTException expE) {} // ow!
}
The sample code then takes the Vector of
ComponentIdentifiers and populates a JComboBox, which
goes into a user dialog, as seen in Figure 2. The sample code tries to export
all tracks, audio and video. Choosing an audio-only format like
"AIFF" will throw a QTException. Production code could
be more careful about what tracks to export, or what choices the user has.

Figure 2--the choice of MovieExporter
Once the user has chosen a MovieExporter, we call a method
named doUserDialog to let the user choose quality and other
format-specific options. If the user chooses the normal "QuickTime
Movie," the export dialog looks like Figure 3. You may notice that the
MPEG-4 exporter dialog is exceptionally verbose and carefully explains whether
or not your choices will create a standard MPEG-4 file readable by other
machines. Another quirk of the MPEG-4 exporter is that Windows users won't be
able to export audio. (I'm not sure if this is because of technical
limitations or issues licensing the AAC audio codec from Dolby.)

Figure 3--the user dialog for QuickTime Movie export
The export takes a long time, particularly with large movies, slow
computers, or certain codecs. To provide a good user experience, it's best to
provide a progress update. In QTJ, a MovieProgress implementation
can get callbacks from time-consuming operations. One thing that makes this a
little difficult, however, is that the javadocs say that as the operation
progresses, your implementation will receive the messages
movieProgressOpen, movieProgressUpdatePercent, and
movieProgressClose ... but those values from the native QuickTime
API don't seem to be defined in QTJ. Fortunately, their values turn out to be
pretty simple: 0, 1, and 2, respectively. In the sample code, I've extended a
Swing ProgressMonitor to update as the export continues, as seen
in Figure 4. Unfortunately, this only works on the Mac. On Windows, the
callbacks occur on the AWT-Windows thread (even though the export
was called from the main thread) and QuickTime seems to block the
AWT thread, so our attempts to update the ProgressMonitor never
get a chance to repaint. I haven't found a clever thread-scheduling or
SwingUtilities way around this. If you do, please put it in the
talkback!

Figure 4--the progress bar for MovieExporter
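Since the message constants are missing from QTJ, one option is to declare them yourself. The sketch below is plain Java and deliberately avoids the MovieProgress interface, whose exact signature is not shown in this article. The constant names follow the native ones, the describe method is hypothetical, and treating the percentage as a fraction between 0 and 1 is an assumption.

```java
// Hypothetical helper for illustration; not part of QuickTime for Java.
class ProgressMessagesSketch {
    // Values reported in the text above: 0, 1, and 2 respectively.
    static final int MOVIE_PROGRESS_OPEN = 0;
    static final int MOVIE_PROGRESS_UPDATE_PERCENT = 1;
    static final int MOVIE_PROGRESS_CLOSE = 2;

    // Turns a progress message into a status line.
    // Assumes percentDone is a fraction in the range 0..1.
    static String describe(int message, float percentDone) {
        switch (message) {
            case MOVIE_PROGRESS_OPEN:
                return "export started";
            case MOVIE_PROGRESS_UPDATE_PERCENT:
                return "export " + Math.round(percentDone * 100) + "% done";
            case MOVIE_PROGRESS_CLOSE:
                return "export finished";
            default:
                return "unknown message " + message;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(MOVIE_PROGRESS_OPEN, 0f));
        System.out.println(describe(MOVIE_PROGRESS_UPDATE_PERCENT, 0.5f));
        System.out.println(describe(MOVIE_PROGRESS_CLOSE, 1f));
    }
}
```

A real callback would feed the same three cases into something like a ProgressMonitor instead of returning strings.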
Let's say that you're happy with saving as a QuickTime movie. In fact, you
want to keep the original audio and video encoding, but you want to eliminate
references to external files, copying all of the media data into one movie that
can be sent to other machines without breaking. This process of eliminating
references is called "flattening." It takes a straightforward call
to Movie.flatten() with a list of usually-constant values:
movie.flatten (0,                                           // movieFlattenFlags
               flatFile,                                    // fileOut
               StdQTConstants.kMoviePlayer,                 // creator
               IOConstants.smSystemScript,                  // scriptTag
               StdQTConstants.createMovieFileDeleteCurFile, // createQTFileFlags
               StdQTConstants.movieInDataForkResID,         // resId
               flatFile.getName());                         // resName
This produces a typical-looking QuickTime movie, with a big
mdat atom, indicating the media is inside of the movie file:
wide (8 bytes)
mdat (2326820 bytes)
moov (3100 bytes) - 4 children
  mvhd (108 bytes)
  trak (2077 bytes) - 3 children
    tkhd (92 bytes)
    edts (36 bytes) - 1 child
      elst (28 bytes) [1 edit]
    mdia (1941 bytes) - 3 children
      mdhd (32 bytes)
      hdlr (58 bytes) [mhlr/vide - Apple Video Media Handler]
      minf (1843 bytes) - 4 children
        vmhd (20 bytes)
        hdlr (57 bytes) [dhlr/alis - Apple Alias Data Handler]
        dinf (36 bytes) - 1 child
          dref (28 bytes)
        stbl (1722 bytes) - 5 children
          stsd (102 bytes)
          stts (24 bytes)
          stsc (412 bytes)
          stsz (920 bytes)
          stco (256 bytes)
  trak (895 bytes) - 3 children
    tkhd (92 bytes)
    edts (60 bytes) - 1 child
      elst (52 bytes) [3 edits]
    mdia (735 bytes) - 3 children
      mdhd (32 bytes)
      hdlr (58 bytes) [mhlr/soun - Apple Sound Media Handler]
      minf (637 bytes) - 4 children
        smhd (16 bytes)
        hdlr (57 bytes) [dhlr/alis - Apple Alias Data Handler]
        dinf (36 bytes) - 1 child
          dref (28 bytes)
        stbl (520 bytes) - 5 children
          stsd (68 bytes)
          stts (24 bytes)
          stsc (256 bytes)
          stsz (20 bytes)
          stco (144 bytes)
  udta (12 bytes) - 0 children
In a moment of curiosity, I browsed the methods of the
AtomContainer class, which is used (infrequently) to pass around
QuickTime memory structures as particularly complex parameters or for other
really low-level tasks. I noted that it has a getBytes() method
(inherited from QTHandleRef), and that a Movie could
be coaxed into an AtomContainer representation.
So I'm like, "Huh, I could get the raw bytes of the Movie ... wonder what that looks like."
Dumping the byte array to disk is simple, and the first few bytes look awfully familiar:
0000 0e3c 6d6f 6f76 0000 006d 6d76 6864
0000 0000 ba6b 3f16 ba6b 3f70 0000 0258
0000 2328 0001 0000 00ff 0000 0000 0000
...
Yep, there's moov and a mvhd right there on the
first line. The memory structure is almost identical to the
file format. Almost? Yes, it's apparently the same except for one
byte: the size of the mvhd is wrong. On the Mac, it's
0x006d, when it should be 0x006c. On Windows, it's
0x016c. Accounting for endian differences between the platforms,
it's like 1 was added to the size in an endian-specific way.
The sample code dumps the movie's AtomContainer two ways, in
its raw form as atom.out and with this byte fixed as
atom-fixed.mov. Surprisingly, in my testing, this fixed version
consistently plays in QuickTime Player.
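The fix itself is not shown here, so the following is only a guess at its shape: a plain-Java sketch that overwrites the mvhd size field with 108 (0x6c), the size the parser reported for every mvhd earlier. It assumes, as in the dump above, that mvhd is the first child and so its size field sits at bytes 8 through 11, right after the 8-byte moov header. The class and method names are invented.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration; the sample code's real fix may differ.
class MvhdSizePatchSketch {
    static final int MVHD_SIZE_OFFSET = 8;     // right after the 8-byte moov header
    static final int EXPECTED_MVHD_SIZE = 108; // 0x6c, as seen in the parser output

    // Returns a copy of the movie bytes with the mvhd size field rewritten.
    static byte[] patch(byte[] movieBytes) {
        if (movieBytes.length < 16
                || movieBytes[12] != 'm' || movieBytes[13] != 'v'
                || movieBytes[14] != 'h' || movieBytes[15] != 'd') {
            throw new IllegalArgumentException("mvhd is not the first child atom");
        }
        byte[] fixed = movieBytes.clone();
        // Absolute put; ByteBuffer writes big-endian by default.
        ByteBuffer.wrap(fixed).putInt(MVHD_SIZE_OFFSET, EXPECTED_MVHD_SIZE);
        return fixed;
    }

    public static void main(String[] args) {
        byte[] macDump = {0x00, 0x00, 0x0e, 0x3c, 'm', 'o', 'o', 'v',
                          0x00, 0x00, 0x00, 0x6d, 'm', 'v', 'h', 'd'};
        byte[] fixed = patch(macDump);
        System.out.println(String.format("%02x%02x %02x%02x",
                fixed[8], fixed[9], fixed[10], fixed[11]));
    }
}
```

Applied to the first sixteen bytes of the Mac dump, this prints 0000 006c. Because it rewrites all four bytes of the field, the same call would also repair the 0x016c value seen on Windows.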
This may not be a recommended way to create a movie on disk that just keeps pointers to its source segments, but it should help tie things together. QuickTime's concepts of movies, tracks, and media, of atoms and their containment hierarchy, and of pointers to media data are not just a conceit of the file format, but a core concept of how movies are managed in memory and manipulated by code.
Now that you know how hairy those structures are, be glad that the API largely isolates you from them!
Chris Adamson is an author, editor, and developer specializing in iPhone and Mac.
Copyright © 2009 O'Reilly Media, Inc.