ISMM 2007 START Conference Manager    

Accordion Arrays: Selective Compression of Unicode Arrays in Java

Craig Zilles

The 2007 International Symposium on Memory Management (ISMM 2007)
Montreal, Canada, 21-22 October, 2007


In this work, we present accordion arrays, a straight-forward and effective memory compression technique targeting Unicode-based character arrays. In many non-numeric Java programs, character arrays represent a significant fraction (30-40% on average) of the heap memory allocated. In many locales, most, but not all, of those arrays consist entirely of characters whose top bytes are zeros, and, hence, can be stored as byte arrays without loss of precision.

In order to get the almost factor of two compression rate for character vectors, two challenges must be overcome: 1) all code that reads and writes character vectors must dynamically determine which kind of array is being accessed and perform byte or character loads/stores as appropriate, and 2) compressed vectors must be dynamically inflated when an incompressible character is written. We demonstrate how these challenges can be overcome with minimal space and execution time overhead, resulting in an average speedup of 2% across our benchmark suite, with individual speedups as high as 8%.

START Conference Manager (V2.54.5)