

Chapter 17
The RichDoc Print Format

RichDoc contains a facility for printing documents into an intermediate format that can later be used for print preview or sent to a real printer. The intermediate files avoid repeating the print layout, which may take a long time for big documents.

The RichDoc Print format plays a role similar to that of the PostScript and PDF formats, and has a similar architecture. There are several reasons why we decided to create a new format rather than reuse an existing standard format. The main reason is that there is no cross-platform, patent-free, Java™-compatible library for writing and reading PostScript, PDF, or a similar standard format. Implementing such a library would be difficult, as the formats are very complex, and many of their features could not even be used by our framework. Another reason is that these formats do not have good support for Unicode, while the Java™ printing APIs fully support Unicode.

Moreover, our format has one extra feature: it supports incremental printing, which allows an existing print file to be updated by reprinting only those pages that have been modified since the last printing. This feature is useful for keeping the print files of large, frequently modified documents up to date.

The RichDoc Print format is merely a serialization of the commands issued through the interface of the java.awt.Graphics2D class. That is, the “printer” component implements the Graphics2D interface by serializing the commands and storing them in a binary file. The “playback” component reads the serialized commands and sends them to a supplied object compatible with Graphics2D, which may be either a real printer or a print-preview component.

17.1 The Overall Structure of a Print File

The print file is actually a ZIP file, consisting of page definition files, embedded bitmaps, document source title, and an index to support the incremental printing function. Each page definition file obeys a format described in Section 17.2.
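
Because the print file is an ordinary ZIP archive, its contents can be inspected with the standard java.util.zip classes. The following is a minimal sketch only: the chapter does not specify how the entries are named, so the names used here ("page-1", "index", and so on) are hypothetical, and the sample archive is built in memory purely for demonstration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class PrintFileEntries {

    /** Lists the entry names of a ZIP archive held in memory. */
    static List<String> listEntries(byte[] zipBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            for (ZipEntry e = in.getNextEntry(); e != null; e = in.getNextEntry()) {
                names.add(e.getName());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }

    /** Builds a tiny stand-in archive; the entry names are made up for this demo. */
    static byte[] buildSampleArchive() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buffer)) {
            for (String name : new String[] {"page-1", "page-2", "image-1", "index"}) {
                out.putNextEntry(new ZipEntry(name));
                out.write(0); // placeholder payload, not a real page definition
                out.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(listEntries(buildSampleArchive()));
    }
}
```

For a print file on disk, the same loop would work with a ZipInputStream wrapped around a FileInputStream.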

17.2 The Page Description Format

This section defines the format of a single file within the overall print file, which is a definition of objects on a single page. The page-definition format is binary, i.e. it is a stream of bytes. Therefore, we need to define data types that are used to define object properties, and how they are serialized into streams of bytes. The data types are defined in Section 17.2.1.

Table 17.1 defines the top-level structure of the page definition file. First, the description of the printing media is written, see Section 17.2.2. Then follows a list of printing instructions, as they were recorded by the printer component. Each instruction starts with a code byte, which defines the type of the instruction, see Section 17.2.3. The sequence of instructions is terminated by the END_OF_FILE instruction code.

Table 17.1 Overall Page Definition Structure

Field Description
pageFormat description of the printing media, see Table 17.8
byte #1 instruction code
instruction #1 instruction data, see Table 17.9
byte #2 instruction code, etc., until END_OF_FILE byte encountered

17.2.1 Data Types

The data types are listed in Table 17.2. All integer data types are signed and use the common two's complement encoding of negative numbers (e.g. the byte 0xFF represents the value –1, 0xFE the value –2, etc.). All integers are stored in big-endian order, that is, the more significant bytes are stored first.

Real numbers are encoded by first converting them to integers using java.lang.Double.doubleToLongBits() or java.lang.Float.floatToIntBits(), and then encoding them in big-endian order as integers. This encoding corresponds to the IEEE 754 standard for floating-point numbers.

A string value is encoded the same way as if it were saved using java.io.RandomAccessFile.writeUTF(). That is, the length of the encoded string in bytes is first written as a 16-bit integer, and then the string characters are appended, each encoded using Java's modified UTF-8 encoding.
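
The encodings described above match what java.io.DataOutputStream produces, since it implements the same java.io.DataOutput contract as RandomAccessFile. The sketch below is an illustration only, not the framework's own writer; the class and method names are made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class PrimitiveEncodingDemo {

    /** Encodes an int, a float, and a string the way DataOutput defines them. */
    static byte[] encode(int i, float f, String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(i);   // 4 bytes, two's complement, most significant byte first
            out.writeFloat(f); // Float.floatToIntBits(f), then written like an int
            out.writeUTF(s);   // 2-byte length in bytes, then modified UTF-8 data
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    /** Formats bytes as space-separated, two-digit hexadecimal values. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: ff ff ff fe 3f 80 00 00 00 02 48 69
        System.out.println(hex(encode(-2, 1.0f, "Hi")));
    }
}
```

In the output, ff ff ff fe is the int –2, 3f 80 00 00 is the float 1.0, and 00 02 48 69 is the string "Hi" preceded by its length.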

Table 17.2 Data Types

Name Description
byte 8-bit signed integer
double 64-bit real number
float 32-bit real number
short 16-bit signed integer
int 32-bit signed integer
long 64-bit signed integer
boolean 8-bit boolean
string UTF-8 encoded string, with leading short for the string length
shape serialization of java.awt.Shape, see Table 17.3
stroke serialization of java.awt.BasicStroke, see Table 17.4
transform serialization of java.awt.geom.AffineTransform, see Table 17.5
color serialization of java.awt.Color, see Table 17.6
font serialization of java.awt.Font, see Table 17.7

Besides primitive data types, there are also more complex types needed to encode some parameters of the Graphics2D interface. Their serializations are described in Tables 17.3 to 17.7.

Table 17.3 Serialization of java.awt.Shape

Field  Description
byte   winding rule: 0 – even-odd, 1 – non-zero
byte   type of segment #1: 0 – move-to, 1 – line-to, 2 – quad-to, 3 – cubic-to, 4 – close
       data for segment #1, depending on its type (see below)
byte   type of segment #2, etc., until -1 is encountered

Segment type      Field  Description
move-to, line-to  float  x endpoint coordinate
                  float  y endpoint coordinate
quad-to           float  x control point coordinate
                  float  y control point coordinate
                  float  x endpoint coordinate
                  float  y endpoint coordinate
cubic-to          float  x control point #1 coordinate
                  float  y control point #1 coordinate
                  float  x control point #2 coordinate
                  float  y control point #2 coordinate
                  float  x endpoint coordinate
                  float  y endpoint coordinate
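
The segment codes in Table 17.3 coincide with the constants of java.awt.geom.PathIterator (SEG_MOVETO = 0 through SEG_CLOSE = 4, WIND_EVEN_ODD = 0, WIND_NON_ZERO = 1), so a shape can be written by walking its path iterator. The sketch below is one possible reading of the table, not the framework's actual code; the class name is hypothetical, and it assumes that a close segment carries no data because the table lists none.

```java
import java.awt.Shape;
import java.awt.geom.Path2D;
import java.awt.geom.PathIterator;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class ShapeCodec {
    /** Number of floats following segment type codes 0 (move-to) through 4 (close). */
    private static final int[] FLOATS_PER_SEGMENT = {2, 2, 4, 6, 0};

    /** Writes a shape in the layout of Table 17.3. */
    static void writeShape(DataOutput out, Shape shape) throws IOException {
        PathIterator it = shape.getPathIterator(null);
        out.writeByte(it.getWindingRule());       // 0 = even-odd, 1 = non-zero
        float[] coords = new float[6];
        for (; !it.isDone(); it.next()) {
            int type = it.currentSegment(coords); // 0..4, same codes as the table
            out.writeByte(type);
            for (int i = 0; i < FLOATS_PER_SEGMENT[type]; i++) {
                out.writeFloat(coords[i]);
            }
        }
        out.writeByte(-1);                        // end-of-shape marker
    }

    /** Reads a shape written by writeShape. */
    static Path2D.Float readShape(DataInput in) throws IOException {
        Path2D.Float path = new Path2D.Float(in.readByte());
        float[] c = new float[6];
        for (int type = in.readByte(); type != -1; type = in.readByte()) {
            if (type < 0 || type >= FLOATS_PER_SEGMENT.length) {
                throw new IOException("Unknown segment type " + type);
            }
            for (int i = 0; i < FLOATS_PER_SEGMENT[type]; i++) {
                c[i] = in.readFloat();
            }
            switch (type) {
                case PathIterator.SEG_MOVETO:  path.moveTo(c[0], c[1]); break;
                case PathIterator.SEG_LINETO:  path.lineTo(c[0], c[1]); break;
                case PathIterator.SEG_QUADTO:  path.quadTo(c[0], c[1], c[2], c[3]); break;
                case PathIterator.SEG_CUBICTO: path.curveTo(c[0], c[1], c[2], c[3], c[4], c[5]); break;
                default:                       path.closePath(); break; // SEG_CLOSE
            }
        }
        return path;
    }

    /** Convenience wrapper: serializes a shape into a byte array. */
    static byte[] toBytes(Shape shape) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            writeShape(new DataOutputStream(buffer), shape);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    /** Convenience wrapper: deserializes a shape from a byte array. */
    static Path2D.Float fromBytes(byte[] bytes) {
        try {
            return readShape(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        Path2D.Float p = new Path2D.Float();
        p.moveTo(0f, 0f);
        p.lineTo(10f, 0f);
        p.closePath();
        System.out.println(toBytes(p).length + " bytes");
    }
}
```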

Table 17.4 Serialization of java.awt.BasicStroke

Field Description
float line-width
byte end-cap: 0 – butt, 1 – round, 2 – square
byte join-type: 0 – miter, 1 – round, 2 – bevel
float miter-limit
short dash length (number of dash array entries that follow)
float dash #1
float dash #2, etc.
float dash phase (only if dash length > 0)
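
The cap and join codes in Table 17.4 equal the BasicStroke constants (CAP_BUTT = 0, CAP_ROUND = 1, CAP_SQUARE = 2; JOIN_MITER = 0, JOIN_ROUND = 1, JOIN_BEVEL = 2). The sketch below shows one way to write and read a stroke in this layout. It assumes that "dash length" means the number of dash array entries and that a solid line is written with a dash length of 0; the class name is hypothetical.

```java
import java.awt.BasicStroke;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class StrokeCodec {

    /** Writes a stroke in the layout of Table 17.4. */
    static void writeStroke(DataOutput out, BasicStroke s) throws IOException {
        out.writeFloat(s.getLineWidth());
        out.writeByte(s.getEndCap());
        out.writeByte(s.getLineJoin());
        out.writeFloat(s.getMiterLimit());
        float[] dash = s.getDashArray();          // null for a solid line
        int count = dash == null ? 0 : dash.length;
        out.writeShort(count);
        for (int i = 0; i < count; i++) {
            out.writeFloat(dash[i]);
        }
        if (count > 0) {
            out.writeFloat(s.getDashPhase());     // present only for dashed strokes
        }
    }

    /** Reads a stroke written by writeStroke. */
    static BasicStroke readStroke(DataInput in) throws IOException {
        float width = in.readFloat();
        int cap = in.readByte();
        int join = in.readByte();
        float miterLimit = in.readFloat();
        int count = in.readShort();
        if (count == 0) {
            return new BasicStroke(width, cap, join, miterLimit);
        }
        float[] dash = new float[count];
        for (int i = 0; i < count; i++) {
            dash[i] = in.readFloat();
        }
        float phase = in.readFloat();
        return new BasicStroke(width, cap, join, miterLimit, dash, phase);
    }

    /** Convenience wrapper: serializes a stroke into a byte array. */
    static byte[] toBytes(BasicStroke s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            writeStroke(new DataOutputStream(buffer), s);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    /** Convenience wrapper: deserializes a stroke from a byte array. */
    static BasicStroke fromBytes(byte[] bytes) {
        try {
            return readStroke(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        BasicStroke dashed = new BasicStroke(2f, BasicStroke.CAP_ROUND,
                BasicStroke.JOIN_BEVEL, 10f, new float[] {4f, 2f}, 1f);
        System.out.println(fromBytes(toBytes(dashed)).equals(dashed));
    }
}
```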

Table 17.5 Serialization of java.awt.geom.AffineTransform

Field Description
double the scale-x value of the transform matrix
double the shear-y value of the transform matrix
double the shear-x value of the transform matrix
double the scale-y value of the transform matrix
double translate-x value of the transform matrix
double translate-y value of the transform matrix
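
The field order in Table 17.5 is the same order in which AffineTransform.getMatrix() fills its array (m00, m10, m01, m11, m02, m12), so the transform can be written and read as six consecutive doubles. This is a sketch under that observation, with a hypothetical class name.

```java
import java.awt.geom.AffineTransform;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class TransformCodec {

    /** Writes the six matrix entries in the order of Table 17.5. */
    static void writeTransform(DataOutput out, AffineTransform t) throws IOException {
        double[] m = new double[6];
        t.getMatrix(m); // scale-x, shear-y, shear-x, scale-y, translate-x, translate-y
        for (double value : m) {
            out.writeDouble(value);
        }
    }

    /** Reads a transform written by writeTransform. */
    static AffineTransform readTransform(DataInput in) throws IOException {
        double[] m = new double[6];
        for (int i = 0; i < m.length; i++) {
            m[i] = in.readDouble();
        }
        return new AffineTransform(m); // accepts the same flat order as getMatrix()
    }

    /** Convenience wrapper: serializes a transform into a byte array. */
    static byte[] toBytes(AffineTransform t) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            writeTransform(new DataOutputStream(buffer), t);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    /** Convenience wrapper: deserializes a transform from a byte array. */
    static AffineTransform fromBytes(byte[] bytes) {
        try {
            return readTransform(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        AffineTransform t = AffineTransform.getTranslateInstance(72, 36);
        System.out.println(fromBytes(toBytes(t)).equals(t));
    }
}
```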

Table 17.6 Serialization of java.awt.Color

Field Description
byte the alpha value (0 = transparent, 255 = opaque)
byte the red value
byte the green value
byte the blue value
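
Table 17.6 lists the components in the range 0 to 255, while the byte type of Table 17.2 is signed. The sketch below therefore assumes that a reader interprets each component as an unsigned byte; this is an assumption of the example, and the class name is hypothetical.

```java
import java.awt.Color;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class ColorCodec {

    /** Writes alpha, red, green, blue as one byte each (Table 17.6 order). */
    static void writeColor(DataOutput out, Color c) throws IOException {
        out.writeByte(c.getAlpha()); // writeByte keeps only the low 8 bits
        out.writeByte(c.getRed());
        out.writeByte(c.getGreen());
        out.writeByte(c.getBlue());
    }

    /** Reads a color, treating each component as unsigned (assumed). */
    static Color readColor(DataInput in) throws IOException {
        int alpha = in.readUnsignedByte();
        int red = in.readUnsignedByte();
        int green = in.readUnsignedByte();
        int blue = in.readUnsignedByte();
        return new Color(red, green, blue, alpha);
    }

    /** Convenience wrapper: serializes a color into a byte array. */
    static byte[] toBytes(Color c) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            writeColor(new DataOutputStream(buffer), c);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    /** Convenience wrapper: deserializes a color from a byte array. */
    static Color fromBytes(byte[] bytes) {
        try {
            return readColor(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        Color translucent = new Color(200, 100, 50, 128);
        System.out.println(fromBytes(toBytes(translucent)).equals(translucent));
    }
}
```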

Table 17.7 Serialization of java.awt.Font

Field Description
string font name
byte font style (0 – plain, 1 – bold, 2 – italic, 3 – bold italic)
float font size

17.2.2 Printing Media Definition

The printing media definition section of the file defines the size and orientation of the paper on which the material is printed, see Table 17.8. First, the orientation of the paper is written, then its total size, the imageable size (total size without margins), and the offset of the imageable area from the total area. See Figure 17.1 for the meaning of the fields in Table 17.8. Note that the fields describing the paper size refer to the portrait orientation of the paper. For instance, A4 paper always has a width of 210 mm and a height of 297 mm, regardless of the paper orientation.

Table 17.8 pageFormat Field Definition, see Figure 17.1

Field Description
byte paper orientation: 0 – landscape (Windows), 1 – portrait, 2 – reverse (Macintosh) landscape
double paper width
double paper height
double imageable x
double imageable y
double imageable width
double imageable height
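
The orientation codes in Table 17.8 equal the constants of java.awt.print.PageFormat (LANDSCAPE = 0, PORTRAIT = 1, REVERSE_LANDSCAPE = 2), and java.awt.print.Paper likewise stores its size in portrait terms. The sketch below maps the fields onto these two classes. It assumes that the imageable fields are also expressed in portrait paper coordinates and that values are passed through unchanged, whatever unit the writer used; the class name is hypothetical.

```java
import java.awt.print.PageFormat;
import java.awt.print.Paper;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class PageFormatCodec {

    /** Writes the pageFormat fields in the order of Table 17.8. */
    static void writePageFormat(DataOutput out, PageFormat pf) throws IOException {
        Paper paper = pf.getPaper(); // portrait-based size, independent of orientation
        out.writeByte(pf.getOrientation());
        out.writeDouble(paper.getWidth());
        out.writeDouble(paper.getHeight());
        out.writeDouble(paper.getImageableX());
        out.writeDouble(paper.getImageableY());
        out.writeDouble(paper.getImageableWidth());
        out.writeDouble(paper.getImageableHeight());
    }

    /** Reads the pageFormat fields into a java.awt.print.PageFormat. */
    static PageFormat readPageFormat(DataInput in) throws IOException {
        int orientation = in.readByte();
        double width = in.readDouble();
        double height = in.readDouble();
        double x = in.readDouble();
        double y = in.readDouble();
        double imageableWidth = in.readDouble();
        double imageableHeight = in.readDouble();

        Paper paper = new Paper();
        paper.setSize(width, height);
        paper.setImageableArea(x, y, imageableWidth, imageableHeight);
        PageFormat pf = new PageFormat();
        pf.setPaper(paper);
        pf.setOrientation(orientation); // rejects codes other than 0, 1, 2
        return pf;
    }

    /** Builds a sample landscape page, serializes it, and reads it back. */
    static PageFormat roundTripSample() {
        Paper paper = new Paper();
        paper.setSize(595, 842); // sample values only
        paper.setImageableArea(36, 36, 523, 770);
        PageFormat pf = new PageFormat();
        pf.setPaper(paper);
        pf.setOrientation(PageFormat.LANDSCAPE);
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            writePageFormat(new DataOutputStream(buffer), pf);
            return readPageFormat(
                    new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        PageFormat pf = roundTripSample();
        System.out.println(pf.getPaper().getWidth() + " x " + pf.getPaper().getHeight());
    }
}
```

Note that PageFormat.getWidth() and getHeight() return orientation-adjusted values, whereas the stored fields correspond to getPaper().getWidth() and getPaper().getHeight().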

[Figure: a sheet of paper shown in the three orientations (0 – landscape, 1 – portrait, 2 – reverse landscape), each with its x and y axes, and with the paper width, paper height, imageable area, imageable x, imageable y, imageable width, and imageable height marked.]

Figure 17.1 Meaning of pageFormat Fields

17.2.3 Printing Instructions

A printing instruction is a single call to the Graphics2D interface. To encode a printing instruction, we must encode its type and all its parameters. All supported printing instructions and their parameters are summarized in Table 17.9. Since the parameters closely correspond to the parameters of the Graphics2D methods, we comment on them only briefly. See the documentation of Graphics2D for more information.

Table 17.9 instruction Fields

Instruction Code             Field      Description
0 = DRAW_STRING_INT                     draws string at integer coordinates
                             string     the string
                             int        the x coordinate
                             int        the y coordinate
1 = DRAW_STRING_FLOAT                   draws string at real coordinates
                             string     the string
                             float      the x coordinate
                             float      the y coordinate
2 = FILL_SHAPE                          fills shape
                             shape      the shape to fill
3 = DRAW_SHAPE                          draws shape
                             shape      the shape to draw
4 = SET_STROKE                          sets the stroke
                             stroke     the stroke to set
5 = TRANSFORM                           appends transform to current transform
                             transform  transform to append
6 = SAVE_TRANSFORM                      saves current transform under given ID
                             short      transform ID
7 = RESTORE_TRANSFORM                   restores previously saved transform
                             short      transform ID
8 = CLIP                                reduces current clip area by given shape
                             shape      the shape to clip to
9 = SAVE_CLIP                           saves current clip under given ID
                             short      clip ID
10 = RESTORE_CLIP                       restores previously saved clip
                             short      clip ID
11 = RESET_CLIP                         resets the clip to the state before playback started
12 = SET_COLOR                          sets current color
                             color      the color to set
13 = SET_FONT                           sets the current font
                             font       the font to set
14 = SET_FONT_VARIANT                   sets the variant of the current font (to save space, if the font name is the same)
                             byte       font style
                             float      font size
15 = DRAW_TRANSFORMED_IMAGE             draws image with transformation
                             string     the image file name
                             transform  the transform to apply before drawing
16 = DRAW_IMAGE                         draws image
                             string     the image file name
17 = SAVE_SHAPE                         fills given shape, and saves it under given name
                             string     shape name
                             double     shape x-location
                             double     shape y-location
                             boolean    x-mirrored
                             shape      shape
18 = USE_SHAPE                          fills previously saved shape
                             string     shape name
                             double     shape x-location
                             double     shape y-location
                             boolean    x-mirrored
19 = SHAPE_SCALE                        writes the scale to be used for SAVE_SHAPE and USE_SHAPE
                             double     scale
20 = END_OF_FILE                        end of file mark
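
The playback loop implied by Tables 17.1 and 17.9 can be sketched as follows: read a code byte, decode the parameters of that instruction, forward the call to a Graphics2D object, and stop at END_OF_FILE. This is an illustration only, not the framework's playback component. It handles just five instruction codes, supports only move-to, line-to and close segments, reads color components as unsigned bytes (an assumption), and expects the input to be positioned just after the pageFormat fields. The class name and the hand-built sample stream are hypothetical.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.geom.AffineTransform;
import java.awt.geom.Path2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

final class PlaybackSketch {
    static final int FILL_SHAPE = 2, SAVE_TRANSFORM = 6, RESTORE_TRANSFORM = 7,
            SET_COLOR = 12, END_OF_FILE = 20;

    /** Replays instructions onto g2; the input must be positioned just after pageFormat. */
    static void play(DataInput in, Graphics2D g2) throws IOException {
        Map<Short, AffineTransform> saved = new HashMap<>();
        for (int code = in.readByte(); code != END_OF_FILE; code = in.readByte()) {
            switch (code) {
                case FILL_SHAPE:
                    g2.fill(readShape(in));
                    break;
                case SAVE_TRANSFORM:
                    saved.put(in.readShort(), g2.getTransform());
                    break;
                case RESTORE_TRANSFORM: {
                    AffineTransform t = saved.get(in.readShort());
                    if (t == null) throw new IOException("Transform ID was never saved");
                    g2.setTransform(t);
                    break;
                }
                case SET_COLOR: {
                    int a = in.readUnsignedByte(), r = in.readUnsignedByte();
                    int g = in.readUnsignedByte(), b = in.readUnsignedByte();
                    g2.setColor(new Color(r, g, b, a));
                    break;
                }
                default:
                    throw new IOException("Instruction code not handled in this sketch: " + code);
            }
        }
    }

    /** Reads a shape per Table 17.3; only move-to, line-to and close are handled here. */
    private static Path2D.Float readShape(DataInput in) throws IOException {
        Path2D.Float path = new Path2D.Float(in.readByte());
        for (int type = in.readByte(); type != -1; type = in.readByte()) {
            switch (type) {
                case 0: path.moveTo(in.readFloat(), in.readFloat()); break;
                case 1: path.lineTo(in.readFloat(), in.readFloat()); break;
                case 4: path.closePath(); break;
                default: throw new IOException("Segment type not handled in this sketch: " + type);
            }
        }
        return path;
    }

    /** Builds a tiny instruction stream by hand and replays it onto a 10x10 image. */
    static BufferedImage renderSample() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeByte(SET_COLOR);
            out.writeByte(255); out.writeByte(255); out.writeByte(0); out.writeByte(0); // opaque red
            out.writeByte(FILL_SHAPE);
            out.writeByte(1);                          // non-zero winding rule
            float[][] corners = {{2, 2}, {6, 2}, {6, 6}, {2, 6}};
            for (int i = 0; i < corners.length; i++) {
                out.writeByte(i == 0 ? 0 : 1);         // move-to first, then line-to
                out.writeFloat(corners[i][0]);
                out.writeFloat(corners[i][1]);
            }
            out.writeByte(4);                          // close
            out.writeByte(-1);                         // end of shape
            out.writeByte(END_OF_FILE);

            BufferedImage image = new BufferedImage(10, 10, BufferedImage.TYPE_INT_ARGB);
            Graphics2D g2 = image.createGraphics();
            play(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())), g2);
            g2.dispose();
            return image;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(renderSample().getRGB(3, 3)));
    }
}
```

Transforms saved during playback are taken from the target Graphics2D itself, so a later restore stays consistent with whatever base transform the playback context supplies.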

17.3 Notes on Printing to the RichDoc Print Format

This section mentions some limitations and other technical notes on using the RichDoc Print interface from a Java™ application.

First, note that some features of java.awt.Graphics2D may not be implemented. For example, painting with a general Paint object is not supported. Also note that, like the Java Serialization format, this format is fragile and subject to change without notice. You should use it only for intermediate storage of print files, not for long-term storage. No support is provided for maintaining compatibility between versions of this format. If you detect that a print file has a different format version than you expect, you should consider the file invalid and should not attempt to parse it.

Be aware that recorded printing commands may be played back in a different context than the one in which they were captured. It is therefore illegal to use commands that set the global state of the java.awt.Graphics2D object to an absolute value, such as calling setTransform() or setClip() with a newly constructed transform or clip. Use their relative equivalents instead, i.e. transform() and clip(). It is, however, legal to save the state of the object, e.g. by calling getTransform(), and to restore it later, e.g. by calling setTransform(). The printing component automatically converts these calls to SAVE_TRANSFORM and RESTORE_TRANSFORM instructions, respectively.
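
The pattern described above might look as follows in painting code. This is a sketch against a plain BufferedImage graphics object, not against the RichDoc printer component; the class name is hypothetical, and the assumption that getClip()/setClip() pairs are treated like the getTransform()/setTransform() pair (via SAVE_CLIP and RESTORE_CLIP) is inferred from Table 17.9 rather than stated in the text.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.Shape;
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;

final class RelativeStateDemo {

    /** Paints a clipped square using only relative state changes, then restores the saved state. */
    static void paint(Graphics2D g) {
        AffineTransform savedTransform = g.getTransform(); // saving the state is allowed
        Shape savedClip = g.getClip();

        g.transform(AffineTransform.getTranslateInstance(10, 10)); // relative, instead of setTransform()
        g.clip(new Rectangle2D.Double(0, 0, 10, 10));               // relative, instead of setClip()
        g.setColor(Color.BLUE);
        g.fill(new Rectangle2D.Double(-5, -5, 30, 30));

        g.setTransform(savedTransform); // restores a value obtained from getTransform()
        g.setClip(savedClip);           // restored after the transform, in the same user space
    }

    public static void main(String[] args) {
        BufferedImage image = new BufferedImage(40, 40, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = image.createGraphics();
        g.translate(5, 5); // stands in for whatever context the caller has set up
        paint(g);
        g.dispose();
        System.out.println(Integer.toHexString(image.getRGB(20, 20)));
    }
}
```

Because paint() never sets an absolute transform of its own, the square lands relative to the caller's context, and the caller's transform and clip are intact afterwards.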

