Compiled Rule Code


This page describes the compiled form of each kind of grammar rule. Every compiled rule starts with a line number, typically zero where line numbers are not meaningful. This enables the debugger to open and highlight the appropriate rule (and sometimes a line within the rule) at breakpoints or while single-stepping.

The rules are compiled into "Code" resources (and indexed in "CodX" resources) using a pseudo-code defined in the Mnemonics table of ExecEng. When Compile_Logging=true the rule compiler lists the generated code in the system log as it compiles. See The BibleTrans Virtual Machine for an explanation of the virtual operations.

The first eight CodX resources number the lexical rules more or less sequentially from 1 to 7999 (1000 in each index resource), where 0.1 is in position 1, and so on up to 93.615, which is in position 7765. The starting numbers for each Louw&Nida domain are in resource "L&NX" #1. Subsequent CodX resources (as many as needed) index the named rules, where each rule has a single number linking it to a Code resource, followed by the name of the rule. References to the named rules point to the index word. Each index word contains a resource number and a byte offset into that resource. Rules fill up Code resources in the order compiled, which is arranged so that any rule that uses other rules is compiled after them.
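
The indexing arithmetic described above can be sketched in Python. This is only an illustration under stated assumptions: DOMAIN_START stands in for resource "L&NX" #1 and holds just the two values implied by the examples in the text (assuming position = domain start + item number), and the 16-bit/16-bit index word layout and all function names are hypothetical, not taken from BibleTrans.

```python
from typing import Tuple

RULES_PER_CODX = 1000  # from the text: 1000 lexical rules per index resource

# Hypothetical stand-in for resource "L&NX" #1. Only these two entries can be
# inferred from the examples in the text (0.1 -> 1, 93.615 -> 7765).
DOMAIN_START = {0: 0, 93: 7150}


def lexical_position(domain: int, item: int) -> int:
    """Sequential position of lexical rule domain.item (assumed: start + item)."""
    return DOMAIN_START[domain] + item


def codx_slot(position: int) -> Tuple[int, int]:
    """Split a position into (zero-based CodX resource index, entry within it)."""
    return divmod(position, RULES_PER_CODX)


def pack_index_word(resource: int, offset: int) -> int:
    """Assumed layout: resource number in the high 16 bits, byte offset in the low 16."""
    if not (0 <= resource < 1 << 16 and 0 <= offset < 1 << 16):
        raise ValueError("field out of range for the assumed 16/16 layout")
    return (resource << 16) | offset


def unpack_index_word(word: int) -> Tuple[int, int]:
    """Inverse of pack_index_word: returns (resource number, byte offset)."""
    return word >> 16, word & 0xFFFF
```

Under these assumptions, rule 93.615 lands at position 7765, which is entry 765 of the eighth index resource (zero-based index 7).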
 

Set Variable

Each line of a SetVar rule compiles to an evaluation of whatever goes into that variable, either a load from some named variable, or else a call to an evaluation rule like a table or conditional value. These are compiled (and executed) in the order they are in the grammar rule window.
 

Conditional Value

Each line of a Conditional Value rule compiles to code that evaluates the line's condition and jumps to the next condition line if it is false, followed by code that evaluates the return value when the test is true. The return value is stored in a special "if res" variable for use by the calling rule. Values can be made up from literal constants (numbers or strings) and variables, combined with built-in arithmetic and other operators.
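
The control flow of a compiled Conditional Value rule might be pictured roughly as follows. This Python sketch is not the BibleTrans pseudo-code; the function name, the representation of a line as a (condition, value) pair, and the null result when no line matches are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

Variables = Dict[str, Any]
# One source line: (condition, value), each evaluated against the variables.
Line = Tuple[Callable[[Variables], bool], Callable[[Variables], Any]]


def run_conditional_value(lines: List[Line], variables: Variables) -> None:
    """Try each line in order; the first true condition sets "if res"."""
    for condition, value in lines:
        if not condition(variables):
            continue  # test failed: jump to the next condition line
        variables["if res"] = value(variables)
        return
    # No condition matched. Assumed here: the result is left null.
    variables["if res"] = None


# Hypothetical rule choosing a grammatical number from a count variable.
rule = [
    (lambda v: v["Count"] == 1, lambda v: "singular"),
    (lambda v: v["Count"] == 2, lambda v: "dual"),
    (lambda v: True, lambda v: "plural"),
]
variables = {"Count": 2}
run_conditional_value(rule, variables)
print(variables["if res"])  # dual
```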
 

Syntax Line

At the front of a compiled syntax rule with multiple lines is a test of the selector variable to choose which line will be activated. Then for each line the code begins with any selected initialization (regardless of where it occurs in the source window line, unless it has already been done in the calling node shape rule), then the remaining items in the order they occur in that source line. These items can be other rules (which get called to do their thing), or tree variables (which similarly get called to let the appropriate lexical rule(s) operate), or other variables which simply generate their text values to the output. The lines are compiled in reverse order, but that has no effect on execution, since only one line is activated on any given call.
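
As a loose illustration of the selector test and item order, here is a Python sketch of a syntax rule with two lines. The data layout, the names run_syntax_rule and article, and the choice to emit nothing for a null variable are assumptions for this example only.

```python
from typing import Any, Callable, Dict, List, Union

Variables = Dict[str, Any]
Output = List[str]
# An item is either a variable name (its text is emitted) or a callable that
# stands in for another rule or for a tree variable's lexical rule.
Item = Union[str, Callable[[Variables, Output], None]]


def run_syntax_rule(lines: Dict[int, Dict[str, list]], selector: str,
                    variables: Variables, out: Output) -> None:
    """Activate exactly one line, chosen by the selector variable."""
    line = lines[variables[selector]]
    # Initialization runs first, wherever it appeared in the source line.
    for init in line.get("init", []):
        init(variables)
    # The remaining items run in source order.
    for item in line["items"]:
        if callable(item):
            item(variables, out)
        elif variables.get(item) is not None:  # assumed: null emits nothing
            out.append(str(variables[item]))


def article(variables: Variables, out: Output) -> None:
    """Hypothetical sub-rule called from a syntax line."""
    out.append("the")


lines = {
    1: {"items": [article, "Subject", "Verb"]},
    2: {"init": [lambda v: v.update(Particle="did")],
        "items": ["Particle", article, "Subject", "Verb"]},
}
variables = {"Order": 2, "Subject": "dog", "Verb": "bark"}
out: Output = []
run_syntax_rule(lines, "Order", variables, out)
print(" ".join(out))  # did the dog bark
```

Because only one line runs per call, the order in which the lines are stored has no bearing on the output, which matches the remark about reverse compilation order.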
 

Dot Connector

Tree node connection rules provide a general format for linking node shapes to syntax lines, and the subtrees to the items on each line. Like syntax lines, the connection groups are compiled in reverse order, with a selector test to choose one of them, except the special SetVar connection at the front (if present) is compiled at the front where it executes before choosing a connection group. If a tree list variable is one of the groups, then a reference to the current node is also stored in that variable, so it can be generated out of sequence, wherever the grammar places that variable in a syntax line.

The right side of each connection group is a list of the variables available to the selected syntax line, although it may not actually use all of them. Code is compiled for each subtree connection, to store a reference to it in the variable(s) it is linked to. Then the syntax line rule is called to arrange them as generated text.

The Thing connection rule also contains code to drive the pronoun generator, so that if the current noun # is associated with any of the defined pronouns, that pronoun is generated instead of everything else. This happens in the "Pron Gen" rule, which is built from the pronoun selector.
 

Lookup Table

Lookup Tables do not compile into separate rules, but use built-in table lookup operators to directly access the table value.
 

Lexical Rule

The lexical rule specification in its source rule window makes up only part of the compiled rule code. Some of the parameters can be variables not initialized in this category; these values are stored into the variables, wherever they are. The local category variables are just initialized with the values given them in the lexical rule, or else initialized as null. This prevents nested tree structures from interfering with each other's variables. Variables that are initialized in an L&N Connection table from this concept number are also given values here.

The node shape is also used to select an appropriate connection rule, if any, to call, and if there are variable connection rules that might be active, all of them are called (they simply do nothing and return if their variables are null).

Lexical rules do some of their operations the first time through a Thing or Proposition modified list (to analyze which subtrees are present), other operations only the second time (when text is to be generated), and some on both passes.

There are special compiled "lexical" rules for the driver nodes, 0.3 Thing, 0.4 Proposition, and 0.8 Root. The root is always the last "node" compiled but the first to run. Before it starts on the selected tree node, this rule initializes a copy of all variables to null, except that the selected node is set to the one highlighted in the tree window. Thing and Proposition rules initialize a local pass# variable so that the subtrees can be scanned and processed differently during the analysis and generation passes.
 

Built-In Rules

The translation engine makes use of several built-in (named) rules, similarly compiled when their needed information and rules are available. The built-in rules can be recognized in the rule lists or in the debugger window by the space in their names. At this time there are six:

(no shape) -- Stand-in for any node with no rule for its specified shape; it does nothing.

Show Verse -- Called within any discourse relation node shape rule and a few other times, to display the verse number when the current node has a verse number attached to it. It also stores the verse number into the pre-defined variable CurrentVerse#, where other rules can access it.

Do Sentence -- Called within the built-in lexical rule for 0.4 Proposition, so that when that node contains 0.313 Sentence Start, any previous sentence closing punctuation is generated, and the new sentence opening punctuation (including Show Verse, if any) is generated. The predefined variable IllocutionSeen contains two 2-bit subfields to capture and hold the illocution of the previous sentence and the current one; these two numbers index into the sentence punctuation table (if defined) for output to be displayed in their respective places. The NewParagraph and NewSentence variables are tested and cleared, then their respective outputs are generated. Finally, if capitalization is specified, the output stream is marked to capitalize the first word.

Do Tree List -- Called within a Syntax Line rule, this repeatedly looks at the tree list in predefined variable (parameter) "Line #"; if it's a tree node or list, the first item is removed from the list and the rest put back into Line #, then the Lexical rule for that first item is called. It exits when the list is empty.

Pron Updt -- Invoked by the " !! " tile of a Syntax Line to update the pronoun selection, or else from the built-in 0.3 Thing lexical rule. If the predefined "Pron #" variable has a pronoun number set from the designated pronoun selector Conditional Value, then the current node's noun number is stored into that pronoun variable. Subsequent calls to the pronoun selector that return this same noun number will invoke the Pronoun Generator (Pron Gen) instead of the Syntax Line specified for 0.3 Thing.

Pron Gen -- Called from somewhere in the built-in 0.3 Thing rule when a pronoun is selected for this noun. It first calls the designated pronoun selector Conditional Value to determine if this noun can be a pronoun, and returns without doing anything else if not. Otherwise the predefined "Pron #" variable is set to the specified pronoun number (for Pron Updt to use later). Some pronouns are designated in the grammar as unconditional; others can be temporarily suppressed by the PronSuppress variable; these are taken into account before looking to see if the designated pronoun matches the current noun. If not (and not unconditional) or it is suppressed, Pron Gen exits with the pronoun number, which tells the calling Thing to go ahead and generate the full noun phrase. Otherwise the specified Syntax Line or table lookup or Conditional Value is invoked and the resulting pronoun emitted, then it exits null (no additional output needed for this noun phrase).
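
Two of the built-in rules above lend themselves to a short Python sketch: the Do Tree List loop and the two 2-bit subfields of IllocutionSeen used by Do Sentence. Both are illustrations under assumptions. The bit positions (current illocution in bits 0-1, previous in bits 2-3), the illocution codes, the PUNCTUATION table, and every function name are hypothetical.

```python
from typing import Any, Callable, Dict, Tuple

Variables = Dict[str, Any]


def do_tree_list(variables: Variables,
                 call_lexical_rule: Callable[[Any], None]) -> None:
    """Consume the tree list in "Line #", one item per iteration."""
    while variables.get("Line #"):
        value = variables["Line #"]
        items = value if isinstance(value, list) else [value]  # lone node
        first, rest = items[0], items[1:]
        variables["Line #"] = rest  # put the remainder back before the call
        call_lexical_rule(first)


def split_illocution(seen: int) -> Tuple[int, int]:
    """Return (previous, current), assuming current is in bits 0-1 and
    previous in bits 2-3; the real bit layout is not given in the text."""
    return (seen >> 2) & 0b11, seen & 0b11


# Hypothetical punctuation table: illocution code -> (opening, closing).
PUNCTUATION = {0: ("", "."), 1: ("¿", "?"), 2: ("¡", "!"), 3: ("", ".")}


def sentence_boundary(seen: int) -> str:
    """Close the previous sentence, then open the current one."""
    previous, current = split_illocution(seen)
    return PUNCTUATION[previous][1] + " " + PUNCTUATION[current][0]
```

With these assumptions, a tree list of three nodes results in three lexical rule calls and leaves "Line #" empty, and the value 0b0110 yields a closing question mark followed by an opening exclamation mark.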

Running

When you choose the "Translate" menu item, BibleTrans first compiles all the rules needed for this subtree, then activates lexical rule 0.8 (root), giving it the selected tree node. After initializing all the variables, the lexical rule for the selected node is called. If it is a Proposition or higher, that rule will initialize its category variables, then scan the subtrees setting up variables as specified in the linkage table(s), then go back and choose the syntax line selected for that verb or relation, which generates the output text. Subordinate propositions, as well as embedded Things, are treated in exactly the same way, except they are called in the order specified in the higher-level syntax line. This happens recursively until the whole tree has been translated. Note that role slots not connected to syntax variables, or variables not appearing on the syntax line, will not be visited during the generation pass and will not produce any text.
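
The two passes and the note about unvisited role slots can be pictured with a small Python sketch. It is a simplification under assumptions: the Node class, the single shared linkage table, the "Head" item that emits a node's own word, and the role and variable names are all invented for this example.

```python
from typing import Dict, List, Optional


class Node:
    """A tree node: its own word, its role slots, and its syntax line."""

    def __init__(self, word: str, roles: Optional[Dict[str, "Node"]] = None,
                 syntax_line: Optional[List[str]] = None) -> None:
        self.word = word
        self.roles = roles or {}
        self.syntax_line = syntax_line or ["Head"]


def translate(node: Node, linkage: Dict[str, str], out: List[str]) -> None:
    # Fresh variables per call, so nested nodes cannot disturb each other.
    variables: Dict[str, Node] = {}
    # Analysis pass: store a reference to each subtree the linkage connects.
    for role, subtree in node.roles.items():
        if role in linkage:
            variables[linkage[role]] = subtree
    # Generation pass: follow the order of the syntax line, not of the tree.
    for item in node.syntax_line:
        if item == "Head":
            out.append(node.word)
        elif item in variables:
            translate(variables[item], linkage, out)


linkage = {"agent": "Subject", "patient": "Object", "instrument": "Tool"}
tree = Node("gives", {
    "patient": Node("book"),
    "agent": Node("teacher"),
    "instrument": Node("hand"),  # linked to Tool, which is not on the line
    "time": Node("today"),       # not in the linkage table at all
}, syntax_line=["Subject", "Head", "Object"])

out: List[str] = []
translate(tree, linkage, out)
print(" ".join(out))  # teacher gives book
```

The "instrument" and "time" subtrees are never visited in the generation pass, so they contribute no text, and the output order follows the syntax line rather than the order of the role slots.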

We have an extensive debugging facility for locating and fixing grammar errors...
 

Working Draft, 2012 October 2