Large Language Model Communication Moonshot

'moonshot' or 'lofty goal' written in Blissymbolics

In 1962, U.S. President Kennedy made a famous speech in which he promoted the idea of putting a human on the moon by the end of the decade. The goal was very ambitious and, to some, highly unlikely to be met given the state of space technology and exploration at the time. However, progress was made over the following years and the goal was achieved in 1969 with the Apollo 11 Moon landing.

The effort was called a “moonshot”, a term that originally meant, literally, developing a rocket system capable of going to the moon, landing there, and then returning to Earth safely. The objective was monumental for the time, and the term gained a figurative meaning: any attempt to achieve a lofty and unlikely goal. The Bliss symbols at the top of this article show this difference. When asked to express “moonshot” using Bliss, the author came up with the literal “rocket landing on the moon” symbols on the left. A more experienced Bliss “speaker” offered “huge metaphor goal” as the meaning in Bliss.

Large Language Models (LLMs) are trained using massive amounts of textual data taken from books, articles, web content, and so on. All of this text is human generated and represents a corpus of linguistic expressions in a natural language. Natural languages come from speech, a vocalization of sequences of words; that is, most natural languages are based on phonetics. Written language, or orthography, is a character-graphic representation of the phonetics. As a consequence, LLMs are models of phonetic-based languages.

Blissymbolics is a meaning-graphic form of language. Charles Bliss, its creator, called this kind of writing “semantography” or “meaning-writing”. Bliss is not spoken and has no phonology. With respect to LLMs, it is an outlier: at best it is poorly represented within an LLM, and at worst it is not represented at all. Nonetheless, like written forms of natural languages, it is a productive linguistic system in that many thoughts and ideas can be expressed in a sentential manner based on the fundamental rules of Bliss symbol composition.

One of the latest goals of the Baby Bliss Bot project is to extend the capabilities of phonetic-based LLMs to handle meaning-based symbol systems like Blissymbolics.

Another aspect of the 1960s moonshot was the positive side effects of the technologies that were developed. They proved useful beyond the field of space exploration and produced a variety of beneficial outcomes for society. The hope is that stretching the capabilities of LLMs with a minority symbol system will have similar unforeseen but advantageous side effects. The research opens up new possibilities to adapt LLMs to other symbol-based communication systems, such as Picture Communication Symbols, SymbolStix, and Widgit Symbols, and to other minority languages. These systems usually have smaller amounts of data available, which presents a challenge for using them with LLMs. The project’s approach will provide a way to extend language models in these low-resource areas, making communication more accessible and supportive for a wider range of users.