Overview of lectures:Overview

Lecture 1 (21.10.96)

The classical Problems

Two thousand and more years ago the Greeks asked the question whether one can construct with ruler and compass a square with the same volume as the unit circle, or whether one construct cube twice is voluminous as a given cube. The exact answer to these question (no) was given only in the 18th century.

The way the Greeks posed the question was algorithmical: find a straight line ruler and compass algorithm such that given a specified input the out put satisfies other specified properties.
The way the answer was proven is the leading theme of this course: The algorithmic description of the problem via ruler and compass was first replaced by an equivalent algebraic definition (Galois, Abel). Only then methods could be applied to show non-definability in this algebraic framework (Galois, Abel, Lindemann).

For our interest here the important message lies in the establishement of the equivalence of objects constructed from given objects via straight line ruler and compass algorithms and the definability in algebraic terms of the required object. In other words, two frameworks were proven equivalent: an algorithmic way of construction or recognizing objects and a descriptive way of defining objects.

Regular languages

A first example of an equivalence between algorithmic and descriptive ways of recognizing objects relevant to Computing Science is Kleene's theorem which states that a set of strings is recognizable by a finite automata iff it can be described by a regular expression. You should know this theorem from the course Automata and Formal Language Theory (xxxxx).

The formalism of regular expressions is still rather algorithmical. Another way of talking about strings consists in talking about the positions of a letter as a linear order, and to say that a certain letter occurs in the a certain position as a property of the position.

To say that a string is of the form $aa^*b^*$ we say that there is a position such that all the position before and including this position consists of a's and after that we have only b's. Alternatively, we can say that the word does not start with a b, and no b is followed by an a.
To say that a string is of the form $(ab)*$ we say that there is a set of positions containing the first, every other but not the last, which each contains an a, and all the others contain a b.
To say that a string is of the form $a^nb^n$ (which is not regular by the pumping lemma) we say that the positions in a string can be divided into two intervals of equal length (i.e. there is a bijection between the two intervals) such that the first conatins only a's and the second only b's.

The descriptions given above can be formalized in (1) First Order Logic (2) Monadic Second Order Logic (3) Second Order Logic. The B\"uchi-Trakhtenbrot theorem now states that a language (set of strings) is regular iff it can be described by a sentence of Monadic Second Order Logic.
Chapter 5 of the book [EF] gives a thorough discussion of the the B\"uchi-Trakhtenbrot theorem. We shall prove this theorem in our course. But first we have to learn about Second Order Logic.

Second Order Logic

First Order Logic (FOL) allows us to speak about elements of a structure. We can say that elements are represented as terms. We can say that they are in certain relations (atomic formulas). We can form boolean connections of formulas and we can quantify individual variables (there exists, for all). FOL is described in my course not of the course Logic for Computer Science 1 (xxxx) and in chapter 0 of [EF].

In Second Order Logic (SOL) we have, additionally, variables for sets of elements and variables for sets of tuples of elements (relation variables). This gives us new atomic formulas $X_{n,\alpha}(t_1, \ldots, t_n)$ where $t_1, \ldots, t_n$ are terms as in FOL. Again we allow boolean connections and quantification, but now not only for individual variables but also for relation variables. SOL is fully defined in chapter 2.1 of [EF]. A full definition of the syntax and semantics of SOL was given in this first lecture. We shall explore the expressive power of SOL in the second lecture.
$\Sigma_1$ is the class of SOL formulas which are of the form $\exists X_{n_1,1}, \ldots \exists X_{n_k, k} \phi$ where $\phi$ is a FOL formula. $\Sigma_1$ is also called the set of Existential SOL formulas.

Examples of SOL properties of graphs , with V the set of vertices and E a binary symmetric, non-reflexiv relation on V.

Connected graphs: Every two vertices are connected by a path.
Exercise: Write the SOL formula.
Cycle-free graphs: A graph without paths rejoining a vertex.
Exercise: Write the SOL formula.
A graph is three colourable (3COL): there exist three sets (colours) $X_1, X_2, X_3 \subseteq V$ such that they form a partition and no two vertices connected by E have the same colour.
Exercise: Write the SOL formula.
A graph is Hamiltonian (HAM): There exists an E-cycle containing each vertex exactly once.
Exercise: Write the SOL formula.
A graph is planar (PLAN): There exists a `drawing' of the graph in the Euclidean plane with non intersecting arc as edges between the vertices. However, this definition will not lead us to a description of planarity in SOL. Instead, we have to use Kuratowski's theorem, which states that a graph is planar iff neither $K_5$ nor $K_{3,3}$ can be embedded in the graph. Embedded here means vertices are mappaed on vertices and edges are mapped on non-intersecting paths.
Exercise: Write the SOL formula.

None of the above properties can be expressed by FOL-formulas. A method for showing non-definability in FOL is discussed in chapter 1 of [EF]. We shall treat this in the sequel of the course.

Complexity classes

The class of regular languages is the complexity class of strings recognizable with memoryless Turing machines. Coding other structures such as graphs as strings usually gives strings which are longer than the number of vertices as it has to take the edges into account. The complexity classes for which this inconvenience is not serious are

L: Deterministic logarithmic space Turing machines;
NL: Non-deterministic logarithmic space Turing machines;
P: Deterministic polynomial time Turing machines;
NP: Non-deterministic polynomial time Turing machines;
PH: The polynomial hierarchy.
PSpace: Deterministic polynomial space Turing machines;
NPSpace: Non-deterministic polynomial space Turing machines, but NPSpace = PSpace.

Fagin and Christen proved in 1974 that the classes of graphs in NP are exactly the classes of graphs definable by Existential SOL formulas. Meyer and Stockmeyer proved that the classes of graphs in PH are exactly the classes of graphs definable by SOL formulas. Immerman and Vardi gave characterizations of L, NL, P using fragments of SOL and the restriction that the vertices of the graphs are ordered. These results are described in chapter 6 of [EF]. Different characterizations were given by Graedel, and are presented in Padadimitriou's book on Complexity Theory. These are all examples of theorem relation complexity classes to definability classes of SOL, which we plan to prove in the course.

Next lecture:Lecture 2