Jaguar User Guide
Matt Welsh
Jaguar v2.1 - Last modified 17 May 2000
This is the user guide for Jaguar v2.1, dated 17 May 2000. Jaguar is a system developed by Matt Welsh at the UC Berkeley Computer Science Division which supports high-performance Java communication and I/O. For more information, see the Jaguar Project web pages.
Before we discuss actually using Jaguar, it might help to get an idea of how the system works. Please read the papers and talks at the Jaguar project web site for background on Jaguar's motivation.
Jaguar consists of two compilers: a front-end compiler which converts Java bytecode to Jaguar bytecode, and a back-end compiler which converts Jaguar bytecode to machine code. The front-end compiler is itself written entirely in Java and is invoked with the command jaguarc. There are two back-ends available: one is a modification of OpenJIT, and the other is a modification to the GCJ Java front-end to GCC. OpenJIT-Jaguar acts like a regular JIT compiler, except that it understands Jaguar bytecode. GCJ-Jaguar compiles Jaguar bytecode to object files, which are then linked into an application binary, which is invoked just as one would any other program. The binary is also linked against a runtime library (libgcj) which provides all of the standard Java libraries, garbage collection, threads, and so forth.
jaguarc: The Jaguar front-end compiler
The real meat of how Jaguar works is embedded in jaguarc, the front-end compiler. jaguarc takes a Java classfile and converts it to Jaguar bytecode. Jaguar bytecode is identical to Java bytecode, with the exception that Jaguar bytecode has several new instructions not supported by Java. Currently, these instructions are
Clearly, we don't want to allow Java application code to use these instructions directly; Jaguar's approach is to translate Java bytecode to Jaguar bytecode at compile time, by applying a set of translation rules. These translation rules embody the privileged Jaguar driver code for a specific function, such as implementing Pre-Serialized Objects, or manipulation of VIA network buffers. In some sense, you can think of the Jaguar driver as a specialized "library" for a specific device or low-level system operation which one wishes to perform from Java; however, this driver code is inlined into the Java application at compile time.
Because Jaguar drivers are implemented through bytecode translation, a number of interesting properties arise. First, the resulting code is very efficient. Jaguar driver code is directly inlined with application code, and can be optimized (by the compiler) along with it. Also, the high cost of native methods is entirely avoided. Native methods incur a high overhead for both the method call itself as well as for copying data between Java and the native code context. This is particularly bad using JNI, found in most JVMs. The performance problems with native methods are documented in the papers on Jaguar.
Secondly, rather than hiding all of the driver code behind a method-call interface (as is done with native methods), we can translate an arbitrary set of Java bytecodes into Jaguar bytecode with inlined Jaguar driver code. For example, field accesses can be translated into specialized Jaguar driver code which implements fast object serialization; this is what is done with Pre-Serialized Objects. This is a powerful concept, as it allows Java applications to express system-level operations in a natural way (e.g., through field accesses, use of operators, or method calls as appropriate), yet the implementation of those operations can be extremely efficient. Again, the Jaguar papers contain much more detail about this.
jaguarc takes a Java .class file and produces either a Jaguar class file, which ends in the extension .jagc, or an annotated Java classfile, which is a copy of the original classfile with additional information for use by the back-end compiler. Jaguar classfiles are used by GCJ-Jaguar, while annotated classfiles are used by OpenJIT-Jaguar. The basic idea is that when you use a Jaguar-enabled JIT, the JVM only understands "standard" Java classfiles; using the annotations allows us to endow the classfile with additional information which the back-end compiler can use. The static Java compiler has no such limitation, since it is modified directly to support Jaguar classfiles.
By default jaguarc produces Jaguar classfiles; using the switch --annotate causes it to produce an annotated classfile instead. The rule of thumb is to use jaguarc --annotate whenever you are using OpenJIT-Jaguar, and just jaguarc when using GCJ-Jaguar. Running jaguarc Foo.class results in the Jaguar class file Foo.jagc. Running jaguarc --annotate Foo.class results in the Jaguar-annotated class file Foo.class, and a backup of the original classfile in Foo.class.bak.
The Jaguar back-end compiler
As mentioned above, there are two back-end compilers used by Jaguar. OpenJIT-Jaguar is a modification to the OpenJIT compiler from Fujitsu Labs and the Tokyo Institute of Technology. GCJ-Jaguar is a modification to GCJ, a static Java compiler based on GCC (formerly EGCS). To use Jaguar you will need to download and install one of these back-end compilers. Both can be used to compile and run standard Java applications as well.
The design of Jaguar is such that it is easy to modify an existing JVM, JIT, or static Java compiler to support Jaguar bytecodes. If there is interest, I should be able to easily modify Kaffe to support Jaguar. I would very much like to adapt Jaguar to Sun's or IBM's JVM/JIT solutions; however, this depends on getting source code for those systems.
How you use Jaguar depends on whether you are using OpenJIT-Jaguar or GCJ-Jaguar as your back end compiler. For most people I recommend using OpenJIT-Jaguar, but GCJ-Jaguar generates faster code (with the loss of some flexibility).
Here's how using OpenJIT-Jaguar works:
Using GCJ-Jaguar is very much the same as using GCJ by itself. It works like this:
Using jaguarc
jaguarc takes the following syntax:
jaguarc [options] classfile
Only one classfile may be specified on the command line at a time.
The options can be:
The rulefile is a file containing a list of Jaguar translation rules to apply to the bytecode being processed. Each line of the rulefile has the format
classname [ arguments ... ]
The classname is the fully-qualified Java classname of the rule to apply, while arguments are optional arguments passed to that rule at initialization time. The format and structure of the arguments is up to the rule itself.
The rulefile included in Jaguar v2.0 (in the file jaguar2/default-jaguar-rules) is just:
Jaguar.compiler.rules.PSOBufferRule
Jaguar.compiler.rules.PSORule
Jaguar.compiler.rules.ViaDescrRule
Jaguar.compiler.rules.ViaDBRule
Note that none of these rules have any additional arguments. Here, PSOBufferRule is a driver for Pre-Serialized Object buffers (also known as External Objects in the Jaguar papers); PSORule is a driver for PSO field accesses and method calls. ViaDescrRule is a driver for operations on JaguarVia queue descriptors, and ViaDBRule for JaguarVia doorbell operations.
Clearly, all of the rules listed in the rulefile must be accessible to jaguarc (that is, they must be on your CLASSPATH).
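To make the rulefile format concrete, here is a minimal sketch of how one line could be split into a rule classname and its optional arguments. This is not jaguarc's actual parser: the class RulefileSketch, the RuleEntry type, whitespace-separated arguments, and the skipping of blank lines are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RulefileSketch {
    /** One parsed rulefile line (hypothetical type, not part of Jaguar). */
    static final class RuleEntry {
        final String className;        // fully-qualified rule classname
        final List<String> arguments;  // optional arguments, possibly empty

        RuleEntry(String className, List<String> arguments) {
            this.className = className;
            this.arguments = arguments;
        }
    }

    /** First whitespace-separated token is the classname; the rest are arguments. */
    static List<RuleEntry> parse(List<String> lines) {
        List<RuleEntry> entries = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // assumption: blank lines are ignored
            }
            String[] tokens = trimmed.split("\\s+");
            entries.add(new RuleEntry(tokens[0],
                    Arrays.asList(tokens).subList(1, tokens.length)));
        }
        return entries;
    }

    public static void main(String[] args) {
        // The four rules from the default rulefile; none takes arguments.
        List<RuleEntry> rules = parse(Arrays.asList(
                "Jaguar.compiler.rules.PSOBufferRule",
                "Jaguar.compiler.rules.PSORule",
                "Jaguar.compiler.rules.ViaDescrRule",
                "Jaguar.compiler.rules.ViaDBRule"));
        for (RuleEntry rule : rules) {
            System.out.println(rule.className + " " + rule.arguments);
        }
    }
}
```

A real loader would then look up each classname on the CLASSPATH (for instance with Class.forName) and hand the arguments to the rule when initializing it; how Jaguar itself does this is not described in this guide.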
jaguarc takes the classfile specified on the command line and outputs either an annotated classfile (when using the --annotate option) or a Jaguar classfile (with the filename extension .jagc).
Running Applications using OpenJIT-Jaguar
First compile your Java sourcecode to .class files using javac. (I have not tested alternate bytecode compilers, such as Jikes, but they should work.) Next, annotate each class file using jaguarc --annotate. Note that you don't strictly need to run jaguarc on every classfile in your application; only those classfiles that you know will be using Jaguar drivers (such as the PSO libraries) need this treatment. For example you don't need to run jaguarc on any of the standard class libraries in your JDK installation.
Now all that should be required is to set the JAVA_COMPILER environment variable to Jaguar. You will know it is running if the following line is printed when you run a Java application:
Jaguar/OpenJIT-1.1.10(X86 JDK1.1.7B)
Compiling Applications using GCJ-Jaguar
To compile a Java application to a running binary using GCJ-Jaguar, here's what you do:
javac myclass.java
(You can also use:
gcj -C myclass.java
for this purpose; however, GCJ does not yet support inner classes when compiling source. I prefer to use the standard Sun javac as the bytecode it produces is well-understood by many compilers.)
jaguarc myclass.class
Note that you don't strictly need to run jaguarc on every classfile in your application; only those classfiles that you know will be using Jaguar drivers (such as the PSO libraries) need this treatment. For example you don't need to run jaguarc on any of the standard class libraries in your GCJ installation.
gcj -O6 -c -o myclass.o myclass.jagc
(Of course, you can also do:
gcj -O6 -c -o myclass.o myclass.class
if you don't wish to use any Jaguar drivers from myclass.)
gcj -O6 -shared -o mylibrary.so myclass1.o myclass2.o ...
You can even combine the last two steps into one:
gcj -O6 -shared -o mylibrary.so myclass1.jagc myclass2.jagc ...
gcj -O6 --main=MyApplication -o MyApplication MyApplication.o mylibrary.so
Here, MyApplication is the class which contains the method
public static void main(String args[]) { ... }
that is, the main method for your application. The above example assumes that MyApplication.o was in turn compiled from running gcj on MyApplication.jagc.
./MyApplication
that is, just run it like any other program!
Of course, it's not necessary that all of the above files be in the same directory or in the same Java package; I'm only showing it that way for simplicity's sake. Take a look at the Makefiles included in the Jaguar release for an example of using this for a multi-package, multi-directory project.
An additional option to gcj has been added for GCJ-Jaguar:
gcj -fjaguar-debug ...
will print out additional debugging information while compiling .class or .jagc files. Right now this displays the bytecode PC currently being processed during compilation. This can be useful if you are debugging a Jaguar driver that seems to be giving the compiler problems.
Note that gcj can also compile Java sourcecode directly, as in:
gcj -O6 -c -o myclass.o myclass.java # DON'T DO THIS!
While this is fine for classes you don't intend to use Jaguar drivers with, there is the additional caveat that GCJ does not (yet) support inner classes when compiling directly from source. It's usually better to compile down to bytecode first and then to machine code.
jag-dump
Another tool included with Jaguar is jag-dump, which dumps the contents of a Java (plain or annotated) or Jaguar classfile, showing you the bytecode for each method in the class. This is like a Jaguar-enhanced version of javap. This is useful to debug Jaguar translation rules (see below).
To use it, just type
jag-dump [ .class file | .jagc file ]
For example,
% jag-dump Hello.jagc
Hello.jagc is a Jaguar classfile
Method main ([Ljava/lang/String;)V:
[0] getstatic java/lang/System.out
[3] ldc 0x1
[5] invokevirtual java/io/PrintStream.println (Ljava/lang/String;)V
[8] return
Method <init> ()V:
[0] aload_0
[1] invokespecial java/lang/Object.<init> ()V
[4] return
If you are dealing with an annotated classfile, jag-dump will show the original bytecode as well as the "patched" Jaguar bytecode resulting from the application of the Jaguar translation rules.
Java Debugging with gdb
It is possible to debug Java programs compiled with GCJ using GDB. This is very useful, as it allows you to drill down and debug not only high-level Java code, but also debug into native methods, debug multithreaded applications, inspect machine registers, look at the machine code for the portion of the application you're debugging, and more.
To do this, you need a recent version of GDB 4.18 with support for Java and Linux threads; an RPM for Red Hat systems is available from the Jaguar download page. (Note that not all versions of GDB 4.18 include support for Java and Linux threads. It appears as though the GDB 4.18 included with Red Hat 6.1 will work, while previous versions may not.)
When debugging GCJ-compiled Java programs, you need to tell GDB to ignore the SIGPWR and SIGXCPU signals (which are used by the garbage collector). This can be done with the GDB commands:
handle SIGPWR nostop noprint
handle SIGXCPU nostop noprint
Alternately you can place these two lines in the file .gdbinit in the directory where you're running GDB.
Here is an example of debugging a simple test program (which uses multiple Java threads) in GDB:
$ javac TestT.java
$ gcj -g -O --main=TestT -o TestT TestT.class
$ gdb TestT
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb) handle SIGPWR nostop noprint
Signal        Stop      Print   Pass to program Description
SIGPWR        No        No      Yes             Power fail/restart
(gdb) handle SIGXCPU nostop noprint
Signal        Stop      Print   Pass to program Description
SIGXCPU       No        No      Yes             CPU time limit exceeded
(gdb) break TestT.main
Breakpoint 1 at 0x8049fa2: file TestT.java, line 64.
(gdb) run
Starting program: /disks/now/grad/mdw/src/ninja/test/mdw/TestT
[New Thread 16843 (manager thread)]
[New Thread 16835 (initial thread)]
[New Thread 16844]
[Switching to Thread 16844]
Breakpoint 1, TestT.main (args=@806cff0) at TestT.java:64
64          TestT a1 = new TestT(1,false);
(gdb) where
#0  TestT.main (args=@806cff0) at TestT.java:64
#1  0x4011033a in java::lang::FirstThread::run (this=@8064f90)
    at /home/cs/mdw/disks/enclave1/libgcj-991104/libjava/java/lang/natFirstThread.cc:52
#2  0x400ccdfa in java.lang.Thread.run_ (this=@8064f90)
    at /home/cs/mdw/disks/enclave1/libgcj-991104/libjava/java/lang/Thread.java:119
#3  0x4011554a in java::lang::Thread::run__ (obj=@8064f90)
    at /home/cs/mdw/disks/enclave1/libgcj-991104/libjava/java/lang/natThread.cc:286
#4  0x4012524a in really_start (x=@805fef0)
    at /home/cs/mdw/disks/enclave1/libgcj-991104/libjava/posix-threads.cc:316
#5  0x401d7ba6 in GC_start_routine (arg=@807ffe0)
    at /home/cs/mdw/disks/enclave1/libgcj-991104/boehm-gc/linux_threads.c:533
#6  0x401eece9 in pthread_start_thread (arg=@bf7ffe7c) at manager.c:204
Note that the stack trace includes both Java code and the native methods in the libgcj runtime library!
You can examine threads using the info threads and thread commands:
(gdb) info threads
* 3 Thread 16844  TestT.main (args=@806cff0) at TestT.java:64
  2 Thread 16835 (initial thread)  0x4022a1bb in __sigsuspend (set=0xbffff4f4)
    at ../sysdeps/unix/sysv/linux/sigsuspend.c:48
  1 Thread 16843 (manager thread)  0x402b37d0 in __poll (fds=0x808fef0, nfds=1, timeout=2000)
    at ../sysdeps/unix/sysv/linux/poll.c:45
(gdb) thread 2
[Switching to thread 2 (Thread 16835 (initial thread))]
#0  0x4022a1bb in __sigsuspend (set=0xbffff4f4)
    at ../sysdeps/unix/sysv/linux/sigsuspend.c:48
48      ../sysdeps/unix/sysv/linux/sigsuspend.c: No such file or directory.
Current language:  auto; currently c
(gdb)
External Objects and Pre-Serialized Objects (PSOs), described in the Jaguar papers, are the most generally useful feature provided by Jaguar. External Objects allow you to create and manipulate memory regions outside of the Java heap directly from your application; these memory regions could correspond to network buffers, memory-mapped files, or explicitly-managed application heaps. PSOs take this a step further and allow you to map Java objects onto such an external memory region; every field access to these objects automatically "serializes" the data such that it can be directly transmitted over the network, or saved to disk for later use.
The point of PSOs is that the standard Java object serialization is extremely slow, but many applications still want to be able to communicate Java objects over the network, or read and write them from disk files. Using PSOs you can avoid the overhead of object serialization altogether; see the Jaguar papers for more information on this. PSOs mapped onto a disk file give you a kind of "fast persistence" as well as a way to do very efficient disk I/O (since the data is read from or written to a memory-mapped disk file, rather than using the standard Java I/O libraries).
The test programs in jaguar2/classpath/Jaguar/PSO/test demonstrate the use of External Objects and PSOs. PSOTest creates a MemoryPSOBuffer -- an External Object which maps onto a region of memory outside of the Java heap -- and creates a PSO within it. FilePSOBuffer does the same thing, except that the External Object is mapped from a file specified on the command line.
The PSO itself is in the class MyObject.java in the test directory. To specify your own PSOs, you create a class which:
public static int getPSOSize();
The body of this method is ignored; any calls to this method are translated by Jaguar to return the size of the PSO in bytes.
Note that PSOAllocator is an interface which is implemented by subclasses of PSOBuffer. That is, MemoryPSOBuffer and FilePSOBuffer are two valid arguments for the constructor of Jaguar.PSO.PSO. The idea is that the PSOAllocator is responsible for mapping the PSO onto itself.
Any public fields of your class which are one of the Java primitive types (byte, int, long, etc.) or are references to subclasses of Jaguar.PSO.PSO will be represented in the "pre-serialized" format of the object. That is, an assignment of any value to that field writes the value into the External Object memory out of which the PSO is mapped; any read of the field's value will read from the External Object memory as well. Other fields will be treated as normal Java fields.
What this means is that you can do something like:
FilePSOBuffer fb = new FilePSOBuffer(filename);
MyObject mo = new MyObject(fb); // Map PSO onto file
System.out.println("The field value is: "+mo.someInt);
The value of mo.someInt will be read from the data contained in the file.
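For readers without a Jaguar installation, the effect of a pre-serialized field can be approximated in standard Java with explicit accessors over a buffer. The sketch below is only an analogy: Jaguar rewrites plain field accesses at the bytecode level, so no accessor methods are needed there. The class ExternalFieldSketch, its accessors, and the offset constant are hypothetical, and a direct java.nio.ByteBuffer stands in for an External Object.

```java
import java.nio.ByteBuffer;

public class ExternalFieldSketch {
    // Hypothetical byte offset of the "someInt" field within the region.
    private static final int SOME_INT_OFFSET = 0;

    // Stand-in for an External Object: a direct buffer, which standard JVMs
    // typically allocate outside the garbage-collected heap.
    private final ByteBuffer region;

    ExternalFieldSketch(ByteBuffer region) {
        this.region = region;
    }

    // Every read of the "field" goes to the external region...
    int getSomeInt() {
        return region.getInt(SOME_INT_OFFSET);
    }

    // ...and every write lands there too, so the region always holds the
    // object's current, already-serialized state.
    void setSomeInt(int value) {
        region.putInt(SOME_INT_OFFSET, value);
    }

    public static void main(String[] args) {
        ByteBuffer region = ByteBuffer.allocateDirect(16);

        ExternalFieldSketch writer = new ExternalFieldSketch(region);
        writer.setSomeInt(42);

        // A second object mapped onto the same region sees the same value,
        // just as a PSO mapped onto a file sees the data stored in the file.
        ExternalFieldSketch reader = new ExternalFieldSketch(region);
        System.out.println("The field value is: " + reader.getSomeInt());
        // prints "The field value is: 42"
    }
}
```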
The format of fields stored in PSOs is defined by the Jaguar driver
which implements PSOs (Jaguar.compiler.rules.PSORule).
Basically the contents of the PSO are laid out like a C struct,
although fields are not aligned to an integer multiple of their
size (that is, an int field isn't necessarily aligned to an
offset multiple of 4). Object references to other PSOs within the
same External Object are stored as two 32-bit words: the offset
into the External Object to the PSO, and its size (if it is a PSOArray;
otherwise -1 is stored). References to non-PSO objects aren't part of
the pre-serialized format of the object. References to PSOs not in the
same container are stored with an offset and size of -1, meaning that
those references can't be recovered from the pre-serialized form.
(However, such references are stored in the Java object itself.)
The PSOTest program dumps the output
of the External Object after it's done, so you can see the format for
yourself.
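The layout described above can be sketched with a plain java.nio.ByteBuffer. This is an illustration under stated assumptions, not the PSORule implementation: the example fields (flag, count, stamp, next), their declaration order, and the buffer's default big-endian byte order are all hypothetical, since the guide does not specify them. What it does take from the text is that fields are packed without alignment padding and that a PSO reference occupies two 32-bit words (offset, then size, with -1 for a non-array PSO).

```java
import java.nio.ByteBuffer;

public class PsoLayoutSketch {
    // Hypothetical PSO with fields: byte flag; int count; long stamp; PSO next.
    // Fields are packed back to back, with no alignment padding.
    static final int FLAG_OFFSET  = 0;                 // byte, 1 byte
    static final int COUNT_OFFSET = FLAG_OFFSET + 1;   // int at offset 1 (unaligned)
    static final int STAMP_OFFSET = COUNT_OFFSET + 4;  // long at offset 5 (unaligned)
    static final int NEXT_OFFSET  = STAMP_OFFSET + 8;  // reference at offset 13
    static final int SIZE         = NEXT_OFFSET + 8;   // two 32-bit words -> 21 bytes

    /** Stores a PSO reference as two 32-bit words: target offset, then size. */
    static void writeReference(ByteBuffer buf, int at, int targetOffset, int size) {
        buf.putInt(at, targetOffset);
        buf.putInt(at + 4, size);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(SIZE);
        buf.put(FLAG_OFFSET, (byte) 1);
        buf.putInt(COUNT_OFFSET, 7);
        buf.putLong(STAMP_OFFSET, 123L);

        // Reference to a non-array PSO at offset 64 of the same External
        // Object: the size word is -1 because the target is not a PSOArray.
        writeReference(buf, NEXT_OFFSET, 64, -1);

        // A reference to a PSO in a different container would be stored as
        // offset -1 and size -1, and could not be recovered from these bytes:
        // writeReference(buf, NEXT_OFFSET, -1, -1);

        System.out.println("PSO size in bytes: " + SIZE); // prints 21
        System.out.println("count = " + buf.getInt(COUNT_OFFSET));
    }
}
```

With 4-byte alignment the same fields would occupy more space (count would move to offset 4); the packed form is what makes the bytes directly usable as a wire or file format.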
The PSOArray class is a "generic" PSO with various methods allowing
you to treat it like an array. This is a nice way to get access to external
object memory without having to create PSOs; however, the types that you can
represent in this way are of course limited.
JaguarVia is a Jaguar interface to the Berkeley VIA communications
substrate, which obtains round-trip times of about 80 microseconds for
small messages, and over 488 megabits/sec of bandwidth for large messages
on the Myrinet system area network. This is basically identical to the
performance of Berkeley VIA as measured from C code.
To use JaguarVia effectively you should be familiar with the VIA
architecture as well as Berkeley's implementation of it on Myrinet.
See the
Berkeley VIA web pages
for more information about this. Note that you need to have Berkeley VIA
installed and working (from C) before you attempt to use it with JaguarVia.
JaguarVia is a Java implementation of the VIPL (VI Provider Library), the
quasi-standard API for applications that wish to communicate using a
VIA interface. If you have a C program written to use the VIPL, it is
pretty straightforward to port it to Java using JaguarVia.
JaguarVia obtains high performance in 3 ways:
The easiest way to see how JaguarVia works is to look at the test
programs in
jaguar2/classpath/Jaguar/JaguarVia/test.
ViaPingpong is a simple ping-pong benchmark (used to measure
round-trip latency), and
ViaWindow is a streaming packet benchmark (used to measure
bandwidth).
Unfortunately VIA is a fairly low-level API, so it's not particularly
easy to use at this level. Also, it does not provide any
flow control, reliable transmission, or error correction; it is up
to the application to implement these features on top of VIA.
Note that this is not a limitation of JaguarVia itself; it is the
philosophy of VIA that applications should implement their own
protocols on top of this bare-bones interface. See
www.viarch.org for background on
VIA itself.
I have a library which implements a simple reliable transmission
protocol over JaguarVia (implemented entirely in Java, of course), which
I plan to make available soon.
Bug me about it if you are
interested in using JaguarVia and would like to see it.
As mentioned above, it's the translation rules from Java bytecode to Jaguar
bytecode which are responsible for implementing Jaguar drivers. Jaguar
translation rules are subclasses of Jaguar.compiler.rules.Rule
which convert a sequence of Java bytecode to some new sequence of
Jaguar bytecode. Remember that Jaguar bytecode is the same as Java
bytecode, but with a few additional instructions.
You shouldn't have to concern yourself with how Jaguar translation rules
work unless you're a developer wishing to build a new Jaguar driver (say,
to interface to a new device or change the way External Objects work).
A good way to get a feel for how they work is to look at the code for
one of the simpler rules, such as ViaDBRule (which transforms
calls to methods on the VIA_Doorbell class into direct access to
the VIA doorbell register).
The most important structure used in
transformation rules is Jaguar.compiler.classfile.CodeTree,
which represents a linked list of Java or Jaguar bytecode instructions.
insertInsn() lets you insert an instruction, getInsn()
reads an instruction, and deleteInsn() deletes an instruction.
Jaguar.compiler.classfile.Insn is the class representing an
instruction, and it has several subclasses
(such as FieldRefInsn or JumpInsn).
Jaguar instructions are represented by the class JaguarInsn.
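To give a feel for the shape of a translation rule, here is a toy model of the idea. None of it is the real Jaguar API: ToyInsn and ToyCodeTree are hypothetical stand-ins for Insn and CodeTree, the index-based signatures of insertInsn(), getInsn(), and deleteInsn() are assumed, the method name VIA_Doorbell.ring is invented, and toy_device_store is a made-up placeholder for whatever Jaguar instruction a real rule would emit.

```java
import java.util.LinkedList;
import java.util.List;

public class RuleSketch {
    /** Toy instruction: an opcode name plus an optional operand string. */
    static final class ToyInsn {
        final String opcode;
        final String operand;

        ToyInsn(String opcode, String operand) {
            this.opcode = opcode;
            this.operand = operand;
        }

        @Override
        public String toString() {
            return operand.isEmpty() ? opcode : opcode + " " + operand;
        }
    }

    /** Toy linked list of instructions with insert/get/delete operations. */
    static final class ToyCodeTree {
        private final List<ToyInsn> insns = new LinkedList<>();

        int size() { return insns.size(); }
        ToyInsn getInsn(int index) { return insns.get(index); }
        void insertInsn(int index, ToyInsn insn) { insns.add(index, insn); }
        void deleteInsn(int index) { insns.remove(index); }
    }

    /**
     * Toy rule: replaces each call to a method on VIA_Doorbell with a single
     * placeholder instruction, and returns how many calls were rewritten.
     */
    static int applyRule(ToyCodeTree code) {
        int rewritten = 0;
        for (int i = 0; i < code.size(); i++) {
            ToyInsn insn = code.getInsn(i);
            if (insn.opcode.equals("invokevirtual")
                    && insn.operand.startsWith("VIA_Doorbell.")) {
                code.deleteInsn(i);
                code.insertInsn(i, new ToyInsn("toy_device_store", ""));
                rewritten++;
            }
        }
        return rewritten;
    }

    public static void main(String[] args) {
        ToyCodeTree code = new ToyCodeTree();
        code.insertInsn(0, new ToyInsn("aload_0", ""));
        code.insertInsn(1, new ToyInsn("invokevirtual", "VIA_Doorbell.ring ()V"));
        code.insertInsn(2, new ToyInsn("return", ""));

        int count = applyRule(code);
        System.out.println("rewritten calls: " + count);
        for (int i = 0; i < code.size(); i++) {
            System.out.println("[" + i + "] " + code.getInsn(i));
        }
    }
}
```

A real rule also has to keep the operand stack consistent and fix up branch targets when the instruction sequence changes length; the code for ViaDBRule in the release shows how this is actually handled.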
Soon I plan to write a "Jaguar Driver Developer's Guide" which will
describe all of this in more detail; for now I hope you can learn enough
by reading the code for some of the Jaguar drivers included in the release
to write your own.
Using OpenJIT-Jaguar, you should be able to make use of all of the
features supported by standard JVMs.
Using GCJ-Jaguar, however, you will have to work around some missing
features and bugs in that compiler.
GCJ-Jaguar is based on GCJ 2.95.2, which has several
important missing features:
Under Linux, GCJ only supports native threads, implemented using the
Linux kernel threads mechanism. As has been
pointed out by several groups, Linux kernel threads don't scale well and
have performance problems under some circumstances (for example, contended
locks are slow). The good news is that GCJ is entirely open source and
it should be possible to augment or replace the threading mechanism with
something better. Any volunteers?
Despite these shortcomings I think that GCJ is an excellent Java platform
for doing high-performance Java research.
I selected GCJ for this project for several reasons: First, it's open source
(very important for research projects); second, it does in fact support
native threads; and third, the code it produces is very efficient. It's
also a lot easier to work with a static compiler rather than a heavyweight
JVM and JIT compiler; there's a lot less complexity involved.
I am interested in helping people to use Jaguar, so please don't hesitate
to report any problems or bugs to me.
If you do have a bug report, please e-mail me
with a complete
description of the problem. Please send me the following information as well:
You can use GDB to debug the problem yourself; see the section on
debugging with GDB for details on that.
Thanks to the following people for their help and feedback:
For example, a PSOArray can be mapped onto a file as follows:
FilePSOBuffer fb = new FilePSOBuffer(filename);
PSOArray pa = new PSOArray(fb,fb.getSize()); // Map PSOArray onto file
for (int i = 0; i < fb.getSize(); i++) {
  byte b = pa.readByte(i);
}
This reads the contents of the file one byte at a time, through the
PSOArray mapped onto the FilePSOBuffer.
Going through native methods would require a high method-call overhead for
all 3 of these operations, as well as an expensive copy between C and Java
heap memory for network buffers. Also, JaguarVia is arguably "safer" than
use of the C-based libvia library, since it's implemented almost
entirely in Java, with Jaguar drivers providing the bare minimum
functionality for the 3 operations described above.
Note that all of these features, with the exception of
serialization, are supported by more recent versions of GCJ. Soon I plan to
release a new version of Jaguar which is up-to-date with a more recent GCJ
snapshot. In other words, this is a temporary situation.