Several of the formats share a standardized way of encoding a list of text items: a single number at the front defining how many (or which) items are there, followed by up to 31 integer links to the text items, followed by the text items themselves. The low 12 bits of each link are an offset to the beginning of the text, counted in integers (4-byte increments); the next 6 bits are the number of characters in the item, and the upper bits usually encode the pixel width of that item in its normal font.
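The link layout described above can be sketched as follows. This is only an illustration, assuming 32-bit integer words; the function name `decode_text_link` is hypothetical, and the base from which the offset is measured is not stated here, so the sketch only separates the fields.

```python
def decode_text_link(link):
    """Split one text-list link word into its three fields.

    Assumption: links are 32-bit words, so the "upper bits" are bits 18..31.
    """
    link &= 0xFFFFFFFF                 # treat the value as an unsigned 32-bit word
    offset_words = link & 0xFFF        # low 12 bits: offset in 4-byte integers
    length = (link >> 12) & 0x3F       # next 6 bits: number of characters
    pixel_width = link >> 18           # upper bits: usually the pixel width
    return offset_words, length, pixel_width


# A link for a 5-character item, 9 integers in, 40 pixels wide.
example = (40 << 18) | (5 << 12) | 9
print(decode_text_link(example))
```

Multiplying the offset field by 4 gives the byte offset, since the text always begins on an integer boundary.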
Often a text list aggregates multiple single-line text items such as
checkboxes or pushbuttons. The aggregate can be displayed separately as
an editable list, permitting the user to change the names of the checkboxes.
The resource ID of the list is a multiple of 32, and the IDs of the collected
one-line items are each offset from that list ID by the line number.
Thus 64 is the list of language categories, and 65 is the first checkbox
in that list. These list/checkbox combinations can also be used to enable
or disable other data formats, by the matching low bits of their respective
IDs. Thus checkbox 65 enables radio button group ID 1089 and variable list
ID 2113, as well as L&N linkage table ID 32065.
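The ID arithmetic can be illustrated with a short sketch. The function name `split_item_id` is hypothetical; the only rule it relies on is that list IDs are multiples of 32 and item IDs add the line number. Exactly how many low bits are compared when one item enables another format is not stated here, so the sketch only shows that the example IDs share the same line number.

```python
def split_item_id(resource_id):
    """Return (list_id, line_number) for a one-line item's resource ID."""
    line_number = resource_id % 32          # offset from the list ID
    list_id = resource_id - line_number     # nearest lower multiple of 32
    return list_id, line_number


# Checkbox 65 and the IDs it is said to enable all carry line number 1.
for rid in (65, 1089, 2113, 32065):
    print(rid, split_item_id(rid))
```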
The second word of the format has 1s in the bits where elements in the corresponding item list represent variable names, and 0s where the elements are computed values. Only variables in the item list can be linked to slot labels in the Dot Connector format.
Each line of the group can have up to 31 elements, one per byte with
a byte count at the front of the line, for a total of exactly four integer
words (32 bytes) for each line. Only the low five bits of each byte are
significant; the other bits may be set to facilitate exporting, but are
ignored.
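Reading one line of the group might look like the sketch below. It takes the 32-byte line size from the text (one count byte plus up to 31 element bytes). The name `parse_group_line` is hypothetical, and masking the count byte to five bits as well is an assumption.

```python
def parse_group_line(line_bytes):
    """Return the significant 5-bit element values from one 32-byte line."""
    if len(line_bytes) != 32:
        raise ValueError("a group line is expected to be 32 bytes")
    count = line_bytes[0] & 0x1F            # assumption: count is masked like the elements
    # Only the low five bits of each element byte are significant.
    return [b & 0x1F for b in line_bytes[1:1 + count]]


# Three elements; the high bits are set as they might be for exporting.
line = bytes([3, 0x41, 0x22, 0xE5] + [0] * 28)
print(parse_group_line(line))
```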
The second word of the format is a row/column count, the number of actual rows and columns to be displayed; the row count is in the high half, and the column count is in the low half. Following this count word are tab stops for each column boundary (one more than the number of data columns), each giving the pixel position of the left edge of its column. These tab stops are calculated dynamically from the actual widths of the labels and data items.
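As a sketch of how the count word and tab stops could be read, assuming 32-bit words and a list that begins at the count word (the name `parse_table_header` is hypothetical):

```python
def parse_table_header(words):
    """Return (rows, cols, tab_stops) from a list starting at the count word."""
    count_word = words[0] & 0xFFFFFFFF
    rows = count_word >> 16                 # high half: row count
    cols = count_word & 0xFFFF              # low half: column count
    # One tab stop per column boundary: one more than the number of columns.
    tab_stops = list(words[1:1 + cols + 1])
    return rows, cols, tab_stops


words = [(2 << 16) | 3, 0, 40, 90, 150, 11, 12, 13, 14, 15, 16]
print(parse_table_header(words))
```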
Following the tab stops is one word for each data item, (rows * columns)
words. Each word is encoded as one of these types, selected by its high two or three bits:
| 00 | No data, or integer |
| 10 | Character |
| 01 | 4 Chars |
| 110 | Text link |
| 111 | Negative integer |
An integer 0 value is encoded as a character '0'. The low 18 bits of
a text link are understood in the same way as the links in a text list:
12 bits are an integer offset (to text at the end of the table), and six bits are
the text length. Four-character items allow short text entries without
allocating and managing variable-length text space.
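The tag bits of a data word can be tested as in the sketch below, assuming 32-bit words. The function name `classify_table_word` is hypothetical. The three-bit tags are checked first, because both begin with the two bits 11.

```python
def classify_table_word(word):
    """Name the kind of data item a table word holds, from its high bits."""
    word &= 0xFFFFFFFF                      # view the value as unsigned 32 bits
    top3 = word >> 29
    if top3 == 0b110:
        return "text link"
    if top3 == 0b111:
        return "negative integer"
    top2 = word >> 30
    if top2 == 0b10:
        return "character"
    if top2 == 0b01:
        return "4 chars"
    return "no data, or integer"


print(classify_table_word(0))               # no data, or integer
print(classify_table_word(-1))              # negative integer
print(classify_table_word(0x576F7264))      # 4 chars
```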
Following the tabs, there are two words for each row: the first word
is encoded with the L&N concept number, and the second word is bitwise
encoded with the checkmarks for that row, the least significant bit corresponding
to column 1 and language category 1.
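Testing a checkmark in the second word of a row is a single shift and mask. A minimal sketch, with the hypothetical name `is_checked` and columns numbered from 1 as in the text:

```python
def is_checked(check_word, column):
    """True if the checkmark bit for the given 1-based column is set."""
    if not 1 <= column <= 32:
        raise ValueError("column must be between 1 and 32")
    # The least significant bit corresponds to column 1.
    return bool((check_word >> (column - 1)) & 1)


row_checks = 0b101                          # columns 1 and 3 are checked
print([c for c in range(1, 6) if is_checked(row_checks, c)])
```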
The high half of the first word contains the 11-bit ID number of the node shape; the top 3 bits select the icon type, and the low 8 bits enumerate the various shapes defined for that type. The low four bits of the first word are the number of connection patterns in this group, and the remaining bits are used to record the active group and anchor dot while the user is forming a connection. The high half of the second word contains the (relative local) xy coordinate of the free end of the connection line being formed, and the low half is the ID of a variable that selects which pattern to use when there is more than one.
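Two of these fields can be separated without further assumptions, as sketched below. The names `split_shape_id` and `pattern_count` are hypothetical. Where the 11-bit ID sits inside the high half is not stated, so the sketch takes the ID as already extracted.

```python
def split_shape_id(shape_id):
    """Split an 11-bit node shape ID into (icon_type, shape_number)."""
    icon_type = (shape_id >> 8) & 0x7       # top 3 of the 11 bits
    shape_number = shape_id & 0xFF          # low 8 bits
    return icon_type, shape_number


def pattern_count(first_word):
    """Number of connection patterns: the low four bits of the first word."""
    return first_word & 0xF


print(split_shape_id(0b10100000011))
print(pattern_count(0x12340006))
```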
Each defined pattern takes six integer words: the high half of the first
word is the drag line ID, and the low four bits select one of its lines,
if there is more than one. The remaining bits of the first two words contain
formatting information. The last four words in each pattern enumerate,
in Item List order, which slot is connected to each item, four bits each,
for a total of 31 possible items. Zero in any item position means no
connection. There are at most eight slots in any node shape, the index
of which fits easily in four bits.
The values are stored in prefix Polish form, one integer per code; the operation is in the high byte, and the xy location of its popup in the displayed image is packed into the low 24 bits. The five data codes are followed by one or more words of actual data. Expression values can be arbitrarily complex, up to the size limit of resources; nested values are shown in depressed rectangles, with the operators in popup menu buttons.
| 0 | Null data |
| 1 | Integer data |
| 2 | Up to 4 chars of text data |
| 3 | Variable data |
| 4 | Formatting code data |
| 5 | String Length() function |
| 6 | Negative |
| 7 | Logical NOT |
| 8 | + |
| 9 | - |
| 10 | * |
| 11 | / |
| 12 | MOD |
| 13 | AND |
| 14 | OR |
| 15 | XOR |
| 16 | String Item() function |
| 17 | String ConCat() function |
| 18 | < |
| 19 | >= |
| 20 | <= |
| 21 | > |
| 22 | = |
| 23 | unequal |
| 24 | L&N number of tree node |
| 25 | parent of tree node |
| 26 | sibling of tree node |
| 27 | child of tree node |
| 28 | noun# of tree node |
| 29 | Bible reference (bk,ch,vs) |
| 30 | pronoun # of tree node |
| 31 | |
| 32 ... | |
| 62 | else |
| 63 | if ... then |
| 128+n | n-character string data |
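Splitting one code word of an expression into its operation and popup location can be sketched as follows, assuming 32-bit words. The names `decode_expr_word` and `string_data_length` are hypothetical. How x and y share the low 24 bits is not stated here, so the location is returned still packed.

```python
def decode_expr_word(word):
    """Return (opcode, packed_popup_xy) for one prefix-form code word."""
    word &= 0xFFFFFFFF
    opcode = word >> 24                     # operation in the high byte
    packed_xy = word & 0xFFFFFF             # popup location, left packed
    return opcode, packed_xy


def string_data_length(opcode):
    """For codes of the form 128+n, return n; otherwise None."""
    return opcode - 128 if opcode >= 128 else None


print(decode_expr_word((8 << 24) | 0x00A0B0))   # code 8 is "+"
print(string_data_length(128 + 5))
```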
The high half of the first word is the number of variables being set,
and the low half is the Item List ID containing
popups for possible values. Each variable line consists of a word containing
the variable list ID and line number (in its high
half) of the variable being set, followed by a word for the value to be
assigned, encoded similarly to table values.
The low 22 bits of the first word choose a variable to connect. The
same bits in the second word identify a selector variable, if there is
more than one line; the number of lines is in the upper byte of this second
word. Each additional word is one line of connection, with the drag line ID
in the low half, and its line number in the low four bits of the high half.
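A sketch of unpacking this layout, assuming 32-bit words and that the line count in the second word equals the number of additional words (the name `decode_connection` is hypothetical):

```python
def decode_connection(words):
    """Return (variable, selector, lines) from a connection record.

    Each entry of lines is a (drag_line_id, line_number) pair.
    """
    variable = words[0] & 0x3FFFFF          # low 22 bits of the first word
    selector = words[1] & 0x3FFFFF          # same bits of the second word
    line_count = (words[1] >> 24) & 0xFF    # upper byte of the second word
    lines = [
        (w & 0xFFFF, (w >> 16) & 0xF)       # low half, then low 4 bits of high half
        for w in words[2:2 + line_count]
    ]
    return variable, selector, lines


record = [0x12345, (2 << 24) | 0x54321, (3 << 16) | 700, (1 << 16) | 701]
print(decode_connection(record))
```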
0 This is a placeholder for a 12x16 glyph table starting in resource #10000 (it could grow as big as seven or eight 1K resources, depending on how many characters are defined and how wide they are). The table is in the form of a font table as used for normal text display: table position [0] is the 4-character font name, [1] gives the total height and ascent (from base line to top of cell) in 16-bit numbers, followed by [2] the character spacing and [3] the space width. Beginning in [4] is an index of offsets to the beginning of each character's glyph, indexed by ASCII code minus 28 (that is, [4] is space character 32, [5] is '!', and so on) to character 255 in [223], and the offset to the end of the table in [224]. Glyph width is determined by the difference between adjacent glyph offsets.
1 This is a placeholder for a panel displaying the pixels of a single glyph enlarged for editing. The resource contains the index of the selected glyph, and some positioning information; the actual pixels are in the table.
2 This is a kerning table, in case the language needs to overlap vowel and consonant glyphs (not yet implemented).
3 This is a text string in the glyph font, so the characters can be viewed in context.
4 These are 27 character sets (one named for each letter of the Roman alphabet, plus a special set of word breaks named "#"). Each set is a 224-bit bitmap, one bit for each character in the set.
5-7 These are three groups of morphological (character substitution) rules to be applied after translation. Because the translation rules are ASCII only, one of these (#6) defines a conversion from (Roman) ASCII to whatever character font is defined in the glyphs. The other two perform substitutions before (#5) and after (#7) conversion. The differences are superficial (which font the characters are displayed in); all rules work exactly the same, and are tested in strict numerical sequence exactly once on each generated text character. Each rule is stored as a sequence of characters that is the "context" for applying the rule, followed by a sequence of characters to replace the match with. Character set codes (1-27) can be used in the rules to refer to any character in the corresponding lettered set (#4 above).
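The glyph width rule from item 0 above (the difference between adjacent offsets, with the index position being the ASCII code minus 28) can be sketched as follows. The name `glyph_width` is hypothetical, and the font table is assumed to be available as a list of integers with positions [0] to [3] holding the header.

```python
def glyph_width(font_table, ch):
    """Width of a character's glyph: next glyph offset minus its own offset."""
    index = ord(ch) - 28                    # [4] is space (32), [5] is '!', ...
    if index < 4 or index + 1 >= len(font_table):
        raise ValueError("character is outside the glyph index")
    return font_table[index + 1] - font_table[index]


# Made-up table: four header entries, then offsets for ' ', '!', '"', and the next glyph.
table = [0, 0, 0, 0, 100, 104, 107, 115]
print(glyph_width(table, " "), glyph_width(table, "!"), glyph_width(table, '"'))
```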
Rev. 2007 August 15