BibleTrans Ontology

This document describes our choice of ontology, which is the vocabulary of concepts used in machine representation of natural language. Just using words in the natural language is an inadequate ontology because ordinary words have many different senses in different contexts -- and sometimes different senses in the same context (which leads to ambiguous meaning). Humans can make sense out of these different meanings (or laugh at the joke implied when multiple senses are intended), but computers are not so smart.

The BibleTrans translation engine is ontology-agnostic; the concepts are just numbers, and it does not care what the numbers mean or how they are organized. It just needs a set of translation rules, one for each concept, specifying how to generate text for that concept. Humans need to write these rules, so it helps if the ontology is as simple as possible, but no simpler. The problem is getting an ontology that is robust, complete, and consistent.
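As a rough illustration of what "ontology-agnostic" means here, the following Python sketch treats concepts as opaque integers and looks up one rule per concept. It is not the BibleTrans implementation: the concept numbers, the rule table, and the function name generate are all hypothetical, and real rules would be far richer than a fixed string.

```python
from typing import Callable, Dict, List

# A rule produces target-language text for exactly one concept.
Rule = Callable[[], str]

# Hypothetical concept numbers and rules, invented for illustration only.
# They are not actual L&N or ABP entries.
RULES: Dict[int, Rule] = {
    1001: lambda: "dog",
    1002: lambda: "runs",
}


def generate(concepts: List[int], rules: Dict[int, Rule]) -> str:
    """Generate text by applying one rule per concept number.

    The engine never inspects what a number means; it only requires
    that every concept it encounters has a rule.
    """
    missing = [c for c in concepts if c not in rules]
    if missing:
        raise KeyError(f"no translation rule for concepts: {missing}")
    return " ".join(rules[c]() for c in concepts)


if __name__ == "__main__":
    print(generate([1001, 1002], RULES))
```

The check for missing rules reflects the requirement stated above: the ontology can be anything, but the rule set must cover every concept in it.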

Louw & Nida

Every Greek lexicon lists, for each Greek word, all its different senses, but the Greek-English Lexicon by Semantic Domains by Johannes Louw and Eugene Nida (L&N) splits out the different senses and groups them semantically rather than alphabetically. Separate Greek words used synonymously are collected into the same lexical entry and assigned a single number, whereas the same Greek word may appear in several different semantic domains with different sense numbers, according to the meanings it has in context. The result is that there are 6,975 separately enumerated lexical concepts in this lexicon, each with exactly one meaning. This is ideal for computer usage.
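The many-to-many relationship between words and concepts can be sketched in Python as follows. This is only an illustration under stated assumptions: the entries, the placeholder words, and the function name senses_by_word are hypothetical and do not come from the L&N data.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical lexicon: concept number -> words used with that one sense.
# Placeholder words stand in for Greek lemmas; none of this is L&N data.
ENTRIES: Dict[int, List[str]] = {
    1: ["word_a", "word_b"],  # two synonyms share a single concept
    2: ["word_a"],            # word_a also has a second, distinct sense
    3: ["word_c"],
}


def senses_by_word(entries: Dict[int, List[str]]) -> Dict[str, List[int]]:
    """Invert the lexicon: for each word, list the concepts it can express."""
    index: Dict[str, List[int]] = defaultdict(list)
    for concept, words in entries.items():
        for word in words:
            index[word].append(concept)
    return dict(index)


if __name__ == "__main__":
    for word, concepts in senses_by_word(ENTRIES).items():
        print(word, concepts)
```

Each concept number carries exactly one meaning, while a word such as word_a maps to several numbers. Choosing among them depends on context, which is why the machine representation stores the number rather than the word.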

The American Bible Society has granted permission to distribute an electronic copy of L&N with BibleTrans. There is a small royalty associated with the distribution, so we need to keep records of who gets it.

However, the L&N lexicon is just that: a lexicon, a word book. Every one of those 6,975 lexical concepts is tied to specific Greek words in the text. Other parts of the meaning of the text are conveyed in verb tense and mood, noun case and number, and sometimes just by word order. A lexicon is not much help for encoding these parts of the meaning.

ABP

In August 1999, Tod Allman, Steve Beale, and Tom Pittman met at SIL in Dallas to work out a mutually satisfactory extension to the L&N concepts for dealing with discourse and structural semantics. Allman and Beale both had prototype translation engines similar in concept to BibleTrans, and there was sufficient overlap to make a unified ontology feasible. The result was what we called, for want of a better term, the Allman-Beale-Pittman format, or ABP for short. Allman and Beale have since gone on to do other things apparently unrelated to the ABP format, but it remains the best ontology for what BibleTrans is doing, and we continue to support it with only minor modifications.

[more TBA]