Latex Maths

Wednesday, May 15, 2013

What is Montague grammar?

Montague grammar

Richard Montague developed his theory of grammar during the 1960s and early 1970s.  What is special about Montague grammar is that, whereas Chomsky's transformational grammar provides a formal account of the syntax of natural language, Montague grammar provides both a formal syntax and a formal semantics.

Basically, Montague uses categorial grammar to describe the syntax of a natural language.  He then establishes formal translation rules that map natural-language sentences to logical formulas in intensional logic.  Such formulas have a well-defined semantics based on possible worlds.

Let's turn to the semantics first, since it is the part most interesting and relevant to AGI...

Intensional logic (內涵邏輯)

Intensional logic originated as an attempt to resolve paradoxes that arise from confusing references with senses (指涉 和 涵義), or extensions with intensions (外延 和 內涵).  Its ideas have been studied by Frege (1892), Lewis (1914), Tarski (1933), Carnap (1947), Kripke (1963), and Montague (1970).

To say that:
       The Empire State Building is the tallest building in New York
is not the same as saying:
       The tallest building in New York is the tallest building in New York
even though we have merely substituted equals for equals: the first sentence states an informative, contingent fact, while the second is trivially true.  How come?

The problem is that "the tallest building in New York" is a symbol that has a reference (or extension), namely the object it refers to, and a sense (or intension), which is more subtle and depends on the context in which it is interpreted.
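
To make the contrast concrete, here is one way to state it in modal terms (an illustrative formalization, not from the original post): substituting co-referring terms inside a modal context can change the truth value.
     $ \Box \, (\text{the tallest building in NY} = \text{the tallest building in NY}) $   is true, but
     $ \Box \, (\text{the Empire State Building} = \text{the tallest building in NY}) $   is false,
because in some possible world (say, New York at a different time) the tallest building is a different building, even though both terms may pick out the same object in the actual world.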

In different possible worlds (such as different time periods), "the tallest building in New York" can refer to different objects.  While the extension of a symbol can be taken as the set of individuals it refers to, the intension of the symbol cannot be such a set; it has to be the function that maps each possible world to the individual the symbol refers to in that world.  As shown in the diagram:

[Diagram: the intension as a function, sending each possible world to the individual that the symbol denotes in that world]

In other words, the function associates to each possible world an individual in that world which the symbol refers to.  That is:
$ f: I \rightarrow U $
$ f: i \mapsto u $
where $I$ is the set of possible worlds (or interpretations), and $U$ is the set of individuals that figure in any of the worlds.
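
As a concrete sketch in the same Prolog style used later in this post, an intension can be written down as a relation from world names to individuals (the world names and facts here are made up purely for illustration):

    % intension of "the tallest building in New York":
    % a mapping from (made-up) possible worlds to individuals
    tallest_building_in_ny(world_1930, chrysler_building).
    tallest_building_in_ny(world_1960, empire_state_building).
    tallest_building_in_ny(world_2020, one_world_trade_center).

    % the extension at a given world is obtained by applying the intension:
    % ?- tallest_building_in_ny(world_1960, X).   gives X = empire_state_building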

The intensional logic (IL) described by Montague is a meta-language based on the $\lambda$-calculus that allows one to define various modal operators, so that it can subsume modal logic, temporal logic, deontic logic, epistemic logic, etc.
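
As a rough sketch of how such definitions look (my own illustration, not Montague's exact notation): if a proposition $p$ is treated as a function from possible worlds to truth values, the necessity operator can itself be written as a $\lambda$-term:
     $ \Box \; \equiv \; \lambda p . \; \forall w \in W . \; p(w) $
so that $\Box p$ holds just when $p$ is true in every possible world.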

Compound nouns

I discovered that intension functions may also help us define the semantics of compound nouns, such as:


cyber sex:  sex that happens on the internet
glass slippers:  slippers made of glass
couch potato:  a person who stays on the couch all the time

Some compound nouns are easy to interpret because they are simply the conjunction of their constituent atomic concepts.  For example, a "girl scout" is both a "girl" and a "scout".  However, a "girl school" is not a "girl"; it is a school for girls.  The meaning of such atypical compound nouns seems to require abductive interpretation (溯因解釋).  Abduction means searching for explanations of what is observed, which is the reverse of deduction, which reasons from what is known to new conclusions.
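
Here is a minimal, purely hypothetical Prolog sketch of the idea: to interpret a compound "Modifier Head", search a small set of candidate linking relations for one that plausibly connects the two concepts (all the predicate names and facts below are invented for illustration):

    % candidate linking relations
    candidate_relation(is_a).            % girl scout: a scout who is a girl
    candidate_relation(made_of).         % glass slippers: slippers made of glass
    candidate_relation(intended_for).    % girl school: a school for girls

    % background knowledge about which links are plausible
    plausible(is_a, girl, scout).
    plausible(made_of, glass, slippers).
    plausible(intended_for, girl, school).

    % interpret(+Modifier, +Head, -Relation): abduce a relation that
    % explains how the modifier attaches to the head
    interpret(Mod, Head, Rel) :-
        candidate_relation(Rel),
        plausible(Rel, Mod, Head).

    % ?- interpret(girl, school, R).   gives R = intended_for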

Now, with this insight from intensional logic, perhaps we can interpret compound nouns by treating each modifier as an intension: a function that maps possible worlds to individuals.

The advantage of such a treatment is that concept composition in general can be uniformly represented by intensional logic.
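
For instance, one possible way to spell this out (my own illustrative formalization, not from the original post) is to let the denotation of a compound at a world $w$ be the head's denotation restricted by an abduced relation to the modifier:
     $ \text{glass-slippers}(w) \; = \; \{\, x \in \text{slippers}(w) : \text{made-of}(x, \text{glass}) \text{ holds in } w \,\} $
Different compounds plug in different relations (made-of, intended-for, ...), but the composition scheme itself is uniform across possible worlds.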

What is a function?

When we say "there exists a function...", it can be deceptive, because the function can be any arbitrary mapping; for example, it could be a bizarre map that picks out Confucius in world$_1$, the number 7 in world$_2$, the Empire State Building in world$_3$, etc.  What we need is to operationalize the concept of a function.

In the logic-based paradigm, a function is implemented as a set of logic formulas that determine the elements of a predicate.  For example, the Prolog program:
    ancestor(X,Y) :- father(X,Y).
    ancestor(X,Y) :- father(X,Z), ancestor(Z,Y).
    % etc. (plus father/2 facts)
determines the elements of the "ancestor" relation.
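
For instance, adding a couple of made-up father/2 facts lets us query the relation (the names are purely illustrative):

    father(abraham, isaac).
    father(isaac, jacob).

    % ?- ancestor(abraham, jacob).   succeeds via the recursive clause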

So we have found an operational implementation of functions, and since logic programs are Turing universal, they also give us the broadest class of functions we can define.  One question is whether we can simplify the logical form while keeping its Turing universality.

Possible world semantics

Possible world semantics were developed by Carnap and Kripke, among others, but Leibniz was the first "logician" to use the term "possible world".

The relevance to intensional logic is that each intensional individual is a function from the set of possible worlds (W) to the set of possible individuals (D):
    $ f : W \rightarrow D $.
In other words, the set of all intensional individuals is $D^W$.
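
As a quick sanity check of the notation (with made-up numbers): if there are $|W| = 2$ possible worlds and $|D| = 3$ individuals, then there are
     $ |D^W| = |D|^{|W|} = 3^2 = 9 $
distinct intensional individuals, one for each way of choosing an individual in each world.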

Possible world semantics is a way to model modal logics (模態邏輯).  Modal logic is distinguished by the use of the modal operators $\Box$ "necessarily" and $\diamond$ "possibly".  To model a logic means to establish a correspondence between propositions (命題) in the logic and an underlying algebraic structure, called an interpretation, such that the truth of propositions in the logic can be determined (or evaluated) by looking at that structure.  In the case of modal logic, the model can be given by Kripke structures, i.e. possible worlds.

What is a Kripke structure?  It is a tuple $\left< W, R, V \right>$ consisting of a set of possible worlds $W$, a binary accessibility relation $R$ between worlds, and a valuation $V$ that assigns true or false to each proposition $p$ at each possible world $w$.

Modal truths are defined by this interpretation rule:
     $ w \models \Box \phi \quad \Leftrightarrow \quad \forall v \in W . \; (w \, R \, v \Rightarrow v \models \phi) $
which means that $\Box \phi$ holds at a world $w$ if $\phi$ is true in every world accessible from $w$; when every world is accessible from every other, this says simply that $\phi$ is true in all possible worlds.  There is a dual rule for $\diamond$: $\diamond \phi$ holds at $w$ if $\phi$ is true in some world accessible from $w$.

This is an example Kripke structure:

[Diagram: an example Kripke structure, with worlds $w_0, w_1, \ldots$ connected by accessibility arrows and labelled by the propositions $a, b$ that hold in them]

Here the $w$'s are possible worlds, with $w_0$ being a distinguished "current state" or "real world"; the arrows are accessibility ("reachable") relations; and the $a, b$'s are propositions true in those worlds.  This example may model a computational process, where we want to ensure, say, that "state 0 will never be reached after N steps".  Such a statement can be translated into modal logic and verified by a model checker.  This verification (model-checking) technique is part of what enables the design of highly complex modern microprocessors.
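
To tie the pieces together, here is a minimal sketch of a Kripke structure and the two modal operators, again in Prolog (the worlds, edges and propositions are made up for illustration):

    % possible worlds W
    world(w0).  world(w1).  world(w2).

    % accessibility relation R
    reach(w0, w1).
    reach(w0, w2).
    reach(w1, w2).

    % valuation V: which atomic propositions hold in which world
    holds(w1, a).
    holds(w2, a).
    holds(w2, b).

    % W |= box(P)      iff  P holds in every world accessible from W
    necessarily(W, P) :- world(W), \+ (reach(W, W2), \+ holds(W2, P)).

    % W |= diamond(P)  iff  P holds in some world accessible from W
    possibly(W, P) :- reach(W, W2), holds(W2, P).

For example, ?- necessarily(w0, a). succeeds because $a$ holds in both worlds reachable from $w_0$, while ?- necessarily(w0, b). fails and ?- possibly(w0, b). succeeds.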


Categorial grammar


{ For completeness' sake ... }

References:


[1] Paul Portner 2005.  What is Meaning? Fundamentals of Formal Semantics.  Blackwell Publishing.

[2] James McCawley 1993.  Everything that Linguists have Always Wanted to Know about Logic (But Were Ashamed to Ask), 2nd ed.  University of Chicago Press.

[3] 朱水林 (ed) 1992.  邏輯語義學研究 (Studies in Logical Semantics).  上海教育出版社.

[4] 鄒崇理 2000.  自然語言研究 (Natural Language Linguistics).  北京大學出版社.

[5] 陳道德 (ed) 2007.  二十世紀意義理論的發展與語言邏輯的興起 (20th Century Semantic Theory and the Rise of Logical Linguistics).  北京社會科學出版社.

[6] 方立 2000.  邏輯語義學 (Logical Semantics: An Introduction).  北京語言文化大學出版社.

[7] Johan van Benthem 2010.  Modal Logic for Open Minds.  CSLI Publications.

[8] Ganesh Gopalakrishnan 2006.  Computation Engineering: Applied Automata Theory and Logic.  Springer.