\documentclass[12pt]{article}
% $Id: writeup.tex,v 1.22 2004/02/27 01:01:30 peter Exp $

\renewcommand{\baselinestretch}{1.5}

\newcommand{\definition}[1]{\textbf{#1}}
\newcommand{\file}[1]{\texttt{#1}}
\newcommand{\code}[1]{\textbf{#1}}

\title{Graph Classification via Graph Reductions: Moving from Existential Proofs to Constructive Algorithms}
\author{Peter Boothe\\
Computer Science Dept.\\
University of Oregon\\
{\tt peter@cs.uoregon.edu} \\
\url{http://soy.dyndns.org/~peter/projects/research/reduction/}}

\usepackage{algorithmic}
\usepackage{algorithm}
\usepackage{verbatim}
\usepackage{url}
\usepackage{amsmath}
\usepackage{amsthm}
\begin{document}
\maketitle

\begin{abstract}

We prove correctness and present an implementation of an algorithm that
determines whether or not a graph has a property represented by a system of
reductions.  Several different implementation methods are discussed and one
implementation method is provided, along with experimental timing results and a
user's manual.

\end{abstract}

\section{Introduction}

Much of computer science has concerned itself with problems of graph
classification.  Ideally, we want a program which takes in as its input a graph
and a description of the property desired, and spits out, in a reasonable time,
whether the graph has the property.  Unfortunately, testing for many
interesting properties is NP-Hard.  The road to resolving this problem seems
rocky and difficult, so many attempts have been made at finding out what we can
do efficiently.  One common approach is to restrict the property description
language.  If the description language is sufficiently restricted, perhaps it
would be impossible to express an NP-complete property.  It turns out that one
such restricted language that still retains quite a bit of descriptive power is
\definition{monadic second order logic}\cite{Reduction}.

Monadic second order logic is a generalization of first order logic that allows
quantification over set variables as well as individual entities, but not over
relations of arity greater than one.   A \definition{graph reduction system}
consists of a finite set of labeled graph pairs (\definition{reductions}) and a
finite set of irreducible graphs (\definition{reducts}).  A reduction system is
applied to a graph by replacing the left hand side of any reduction with the
right hand side, until no further reductions can be applied.  If a graph that
is isomorphic to one of the reducts results, then the reduction system accepts
the graph.  Otherwise it does not.

Graph reduction systems are important because one can use them to check
any property that is expressible in monadic second order logic\cite{Reduction},
which include such useful properties as treewidth, planarity, outerplanarity,
and many others.  Previously, every graph reduction system required a separate
implementation of all the relevant search algorithms and the requisite data
structures.  Not only was this process often tedious, but it was extremely
error-prone.  In this paper we provide a general algorithm for all reduction
systems, along with a discussion of several implementation strategies, and one
sample implementation.

\section{Background}

Graph reductions were first used by XXXREFERENCESXXX to XXXREFERENCESXXX.
Since then, graph reductions have proven useful in the field of graph grammars
and computational complexity.  After several reduction systems were devised a
surprising equivalence was discovered: For every property of a graph that can
be expressed in monadic second order logic, there exists a reduction system and
consisting of a finite set of reduction rules and a finite set of reducts.
Furthermore, for every reduction system there exists a linear time algorithm
that will determine whether a particular graph can be reduced to one of the
reducts via the reduction system.  Unfortunately both steps of this chain were
non-constructive.  There is currently no algorithmic way of deriving the
reduction system from the logic representation, and there was previously no way
of determining the algorithm for the reduction system.  So while the existence
of an efficient algorithm was never in doubt, the exact search rules had to be
derived via other means.

The reduction approach to determining graph properties generally implies an
algorithm for each reduction system, and algorithms using this approach have
been designed and implemented for many properties, including algorithms for
recognizing partial 3-trees\cite{Reduction}, and planar partial
3-trees\cite{p3t}.  These algorithms were specific to those properties,
however, and are not generalizable to an arbitrary reduction system.

In this paper we describe a general algorithm that will work for all reduction
systems, talks about different implementation strategies, and provides an
implementation that runs in $O(n)$ expected time with $O(n^2)$ as a worst case.

\section{Algorithm}
\subsection{Description}

Broadly speaking, the goal for Algorithm~\ref{alg:general} is to take each
vertex of ``small'' degree in the graph, and to try to place it in each place in
each reduction.  If this cannot be done, then this vertex can be safely ignored
unless its degree is changed.  If the vertex can be placed, then save all
possible placements.  If the placement completes a reduction, then perform the
reduction and update the storage structures as necessary.

\input{algorithm}

The runtime of Algorithm \ref{alg:general} is $O(n * \max(t_{set}(M),
t_{access}(M)))$ where $n$ is the size of the input graph and $M$ is the data
structure we use to store partial matches.  On every run through the loop we
deal with one vertex.  If that vertex completes a rule, then we end up reducing
the number of vertices in the to-be-searched graph by at least one, and add a
constant number of vertices back into our queue of vertices that need to be
searched ($Q$).  Therefore, the number of times any particular vertex could go
through the loop is bounded above by a constant, and we execute the main
\code{while} loop $O(n)$ times.  Each of the nested \code{for} loops is bounded
above by a constant, so the only things we have to worry about are the queue
operations, applying reduction rules, and the storing and retrieving of partial
matches.

The storage structure $Q$ can be implemented with a stack or queue and an extra
array of booleans, leading to $O(1)$ time for enqueueing, dequeuing, and
membership testing.  Similarly, for the application of a reduction rule, it
suffices to note that a given rule only constant number of vertices, and all of
the removed ones are of bounded degree, leading to a constant bound on runtime
for that operation as well.  Therefore, all we  really need to concern
ourselves with is $M$ and its associated operations.  There are a number of
strategies one can use to implement $M$, each with its own tradeoffs in time
and space.  They lead to $O(1)$ time with large space requirements, and $O(\lg
n)$ or $O(n)$ time with small space requirements.  Thus, the total runtime of
our algorithm will never be more than $O(n^2)$, and will probably always be
implemented using the $O(n \lg n)$ strategy.

\subsection{Space and Time Requirements of Various Implementations}

\subsubsection{Multidimensional lookup table}

Under the uniform cost measure we can allocate as much memory as we want and,
as long as we don't touch it, we don't have to pay the cost of
initialization\cite{RET}.  Using this, we can create an implementation that
requires $O(n^k)$ memory, $O(n)$ time.  This strategy is potentially very
expensive in space, but linear in time.  Using this, we would end up allocating
$O(n^k)$ memory, where $k$ is the sum of the number of reduction rules, the
maximum number of restricted components in each rule, and the maximum number of
border nodes of a restricted component.

While this implementation would use $O(n)$ time, it is not currently feasible
under the memory available on most modern machines, because the value of $k$
grows too fast.  As an example, for the class of partial 3-trees $k$ seems to
be more than 10, and most computers do not have $O(n^{11})$ space for
non-trivial values of $n$.

\subsubsection{Tree}
\label{tree-based}

Using a tree-based structure for $M$, we could use $O(n)$ memory, with $access$
and $set$ now costing $O(\lg n)$ per operation.  This ends up leading to a
runtime of $O(n \lg n)$ time.

\subsubsection{Hashtable}

If we use a hashtable, then accesses and sets are expected to be $O(1)$ time,
but in the worst case can grow to $O(n)$ time.  This leads to a worst case
analysis of $O(n^2)$ time, $O(n)$ memory but $\Theta(n)$ expected time.  If
hash collisions are solved through the use of a tree rather than a linked list,
the space and time requirements become identical to those in \ref{tree-based}

\subsection{Not-So-Hidden Constants}

There are, however, two inputs to the system, the rule set and the graph.
Looking at the for loops, we quickly notice that there is a combinatorial
explosion on the size of each rule.  Examining the rules in more depth, we
quickly derive a runtime of $O(r*p*s^d*t_{gi}(s)$, where $r$ is the number of
reduction rules, $p$ is the maximum number of subgraphs of restricted
components in a rule, $s$ is the maximum size of these subgraphs, $d$ is the
maximum degree of a restricted vertex, and $t_{gi}$ is the time it takes to
perform graph isomorphism on a graph of the given size. In this equation, we
will make the subgraph size our factor ($k$) for our parameterized complexity
measure, because we raise it to degree of at most $k$, and the best known graph
isomorphism solutions have a running time of $O(2^k)$ where $k$ is the size of
the graph being tested.  Also, on an intuitive level, graph reduction systems
have many rules, but it is only when the rules get big that the problem becomes
really hard.  Choosing this as our parameter allows us to do several things.
First of all, we can ignore $r$ and $p$, then we can note that the largest $s$
or $d$ can be is $k$.  Which makes the runtime of the algorithm
$O(2^k*k^k*n*(t_{access}(M) + t_{set}(M)))$.  Which, with appropriate
sacrifices, we can make as low as $O(n)$, but will probably be $O(n \lg n)$ in
most implementations.

\subsection{Proof of Correctness}

\input{proof}

\section{Implementation}
%\subsection{Challenges}
\subsection{Users' Guide}
\subsubsection{Running the Program}

To run the program you need to provide 2 things: the graph to be tested, and
the rule set to test it with.  To test
whether a graph, stored in a file \file{graph.in}, is outerplanar (the
reduction system for which is stored in \file{outerplanar.py}) then 
invoke the program like this:\begin{verbatim}
    % ./membership.py outerplanar < graph.in
\end{verbatim}

Notice that the program's inputs are provided in two ways - the property is
specified by the file it is stored in, and the graph is passed in along
standard in.

\subsubsection{Creating New Graphs to Test}

The format for graphs is a restricted and simplified form of the dot format used by the Graphviz suite of programs.

\subsubsection{Adding New Rule Sets}

This requires creating a new file whose name ends in \file{.py}, and putting in
definitions of the following three variables: \code{propertyName},
\code{reductions}, and \code{representatives}.  The first is simply a string,
but the second and third are a little more complicated.  Both
\code{representatives} and \code{reductions} at least partially consist of
instances of the \code{Graph} class which can be found in \file{graphs.py}.  

In the case of \code{representatives}, each of
these graphs should be a valid reduct graph, and the list should be complete for the given reduction system.  The situation for \code{reductions} is a little more complicated, however.  That variable should consist of a list of instances of the \code{Reduction} class, which is included in the file \file{reduction.py}.

When a graph system is written down the reductions are pairs of labeled graphs,
one of which is the left hand side, and the other of which is the right.
Besides this labeling, which indicates the exact transformation function, some
vertices are restricted, and others are not.  Thus, to fully specify a
reduction two labeled graphs are required, the first of which has indications
as to which vertices are restricted and which are not.  The \code{Reduction}
class therefore requires 4 arguments to create a new instance: two labeled
graphs, a list of the restricted vertices, and an optional name.


\section{Conclusion}

The algorithm is correct, and the implementation works.  I hope others find
them useful.

\section{Further Work}

The $f(k)$ could probably be reduced significantly in complexity via
implementing various branch and bound schemes.

The program might be significantly faster if implemented in a compiled
language.

More rule sets could be implemented.

Automatic discovery of reduction rules from a MSOL description remains an open
problem.


\bibliographystyle{plain}
\bibliography{drp}

% \appendix
% \input{code}

\end{document}
