Alenka (8bf1493a) · Commits · Martin Jonáš / DTEDI

Chapters/Chapter02.tex

+116 −106

Original line number	Diff line number	Diff line
		@@ -114,7 +114,7 @@ Tseitin transformation can be used to transform every formula to a
		equisatisfiable formula in \cnf with only linear increase of its
		size~\cite{Tse68}.

		\subsection{DPLL}
		\subsection{Davis--Putnam--Logemann--Loveland algorithm}

		Historically, the first procedure to solve \sat without explicitly
		computing the truth table of the formula was proposed by Davis and
		@@ -129,16 +129,18 @@ exponentially even for simple formulas. To alleviate this problem, the
		refinement of \dppr algorithm was introduced in 1962 by Davis, Putnam,
		Logemann and Loveland~\cite{DPLL62}.

		The Davis--Putnam--Logemann--Loveland algorithm (\dpll) iterativelly
		tries to build a satisfying assignment by deciding values of the
		propositional variables and it backtracks if any of the input clauses
		becomes false in the current assignment. A key procedure guiding the
		\dpll search is the \emph{unit clause rule}, which is based on the
		observation that if a formula contains a clause $C$ that is unit in
		the current assignment, the only way to build a satisfying assignment
		is to add the sole undefined literal of $C$ to $M$. The iterated
		application of the unit clause rule is called \emph{unit propagation}
		or \emph{Boolean constraint propagation} (\bcp)~\cite{ZM88}. The \dpll
		The Davis--Putnam--Logemann--Loveland \quotegraffito{If you don't know
		where you are going any road can take you there.}{Alice in
		Wonderland} algorithm (\dpll) iterativelly tries to build a
		satisfying assignment by deciding values of the propositional
		variables and it backtracks if any of the input clauses becomes false
		in the current assignment. A key procedure guiding the \dpll search is
		the \emph{unit clause rule}, which is based on the observation that if
		a formula contains a clause $C$ that is unit in the current
		assignment, the only way to build a satisfying assignment is to add
		the sole undefined literal of $C$ to $M$. The iterated application of
		the unit clause rule is called \emph{unit propagation} or
		\emph{Boolean constraint propagation} (\bcp)~\cite{ZM88}. The \dpll
		search consists of decision and propagation steps. In decision steps,
		a variable and its new value are chosen and added to the current
		partial assignment. After each decision step, the \bcp is performed to
		@@ -278,10 +280,12 @@ Notable examples of decidable first-order theories include
		all structures from $\teuf$ that are isomorphic to real numbers and
		interpret the function $+$ as addition, and the predicate $\leq$ as
		the real comparison;
		\item the theory of \emph{real arithmetic}, which consists of all
		structures from $\teuf$ that are isomorphic to real numbers and
		interpret the function $+$ as addition, $\times$ as multiplication,
		and the predicate $\leq$ as the real comparison;
		\item the theory of \emph{real arithmetic}\marginpar{In contrast to
		real arithmetic, integer arithmetic with multiplication was shown
		to be undecidable by Gödel.}, which consists of all structures
		from $\teuf$ that are isomorphic to real numbers and interpret the
		function $+$ as addition, $\times$ as multiplication, and the
		predicate $\leq$ as the real comparison;
		\item the theory of \emph{arrays}, which consists of all structures
		from $\teuf$ isomorphic to the set of arrays with a binary function
		$read(a, i)$ interpreted as a value in the index $i$ of the array
		@@ -293,7 +297,7 @@ For a detailed description of these theories and implementation of the
		respective $T$-solvers, we refer the reader for example to the book of
		Bradley and Manna~\cite{BM07}.\marginpar{TODO: paragraph about combination of theories}

		\subsection{DPLL(T)}
		\subsection{DPLL modulo theories}

		Most of the \smt approaches can be classified as \emph{eager} or
		\emph{lazy}~\cite{BSST09}. The eager \smt approach consists in
		@@ -364,8 +368,8 @@ Z3~\cite{Z3}.\marginpar{TODO: Přetok}
		\label{ssec:natDomainSat}
		Although the separation of the Boolean and theory reasoning in the
		\dpll approach allows the solver to be modular, it can be also
		restricting in some cases. In particular, the \dpllt based solvers can
		not directly reason about values of first-order variables, but have to
		restricting in some cases. In particular, \dpllt based solvers can not
		directly reason about values of first-order variables, but have to
		rely on the $T$-solver guiding the search over Boolean
		valuations. While there are some techniques like \emph{splitting on
		demand}~\cite{BNOT06} , which allow the $T$-solver to add new atoms
		@@ -472,19 +476,19 @@ and uninterpreted functions~\cite{McM11}.
		The \emph{theory of fixed sized bit-vectors (\BV)} is a multi-sorted
		first-order theory with infinitely many sorts $\sort{n}$ corresponding
		to bit-vectors of length $n$. The only predicate symbols in the \BV
		theory are $=$, $\leq_u$, and $\leq_s$, representing equality,
		theory are $=$, $\leq_u$, and $\leq_s$, interpretead as equality,
		unsigned inequality of binary-encoded natural numbers, and signed
		inequality of integers in $2$'s complement representation,
		respectively. Function symbols in the theory are
		$+, \times, \div, \&, \mid, \oplus, \ll, \gg, \cdot, \extract{n}{p}$,
		representing addition, multiplication, unsigned division, bit-wise
		interpreted as addition, multiplication, unsigned division, bit-wise
		and, bit-wise or, bit-wise exclusive or, left-shift, right-shift,
		concatenation, and extraction of $n$ bits starting from the position
		$p$, respectively. For detailed description of the \BV theory syntax
		and semantics, see for example Hadarean's PhD
		$p$, respectively. For the detailed description of the \BV theory
		syntax and semantics, see for example Hadarean's PhD
		thesis~\cite{Had15}. This section focuses on the problem of
		satisfiability of the quantifier-free fragment of the \BV theory,
		denoted \QFBV. %The the full \BV logic is treated in the next section.
		denoted \QFBV. The the full \BV logic is treated in the next section.

		Current state-of-the-art \smt solvers for the \QFBV logic rely on
		rewriting techniques, used to simplify the formula during the
		@@ -505,14 +509,16 @@ possibly faster but incomplete sub-solvers for equality and inequality
		reasoning and if the sub-solvers are not sufficient for deciding the
		satisfiability of the formula, theory lemmas and propagated literals
		generated by the sub-solvers are added to the formula and a lazy
		\dpllt bit-blasting solver is employed~\cite{HBJBT14}.
		\dpllt bit-blasting solver is
		employed~\cite{HBJBT14}. \marginpar{TODO: Nepřidat odstavec o
		propagating-complete \cnf encodings?}

		\subsection{Word-level techinques}
		Although bit-blasting is highly efficient for most of practical
		problems, it can exhaust memory if the input formula contains complex
		arithmetic or variables with large bit-width. Several techniques that
		which avoid the bit-blasting have been proposed to alleviate this
		problem.
		problems, it can exhaust memory of the solver if the input formula
		contains complex arithmetic or variables with large bit-width. Several
		techniques that avoid the bit-blasting have been proposed to alleviate
		this problem.

		Some useful fragments of the bit-vector theory can be solved by
		specialized algorithms for deciding satisfiability. For example,
		@@ -525,90 +531,93 @@ used for deciding satisfiability of bit-vector formulas that contain
		only arithmetic and no bit-wise operations~\cite{BM05}.

		Recently, an instance of the model-constructing satisfiability
		calculus introduced in subsection \ref{ssec:natDomainSat} was
		implemented by Zeljić et al. in the solver \mcbv~\cite{ZWR16}. They
		have extended the \mcsat framework by the ability to perform partial
		assignments and proposed heuristics for generalizing explanations of
		bit-vector conflicts. For example, the solver \mcsat can perform the
		partial assignment $\extract{2}{0}(x) \mapsto 10$, denoting that the
		two least significant bits of $x$ are $10$. To be able to efficiently
		use such partial assignments, the solver \mcbv mantains an
		over-approximation of the set of models that are compatible with the
		current partial assignment using bit-patterns and arithmetic
		intervals. Bit-patterns are sequences of $0$, $1$ and $u$, which
		represent undefined bits, and constrain the values of particular bits
		in the assignment. Arithmetic intervals are pairs of bit-vector values
		representing lower and upper bounds and constrain integral values of
		bit-vectors. Both bit-patterns and arithmetic intervals can be ordered
		to form a lattice, in which the solver performs a search if a conflict
		is detected in order to generalize from the conflict.
		calculus, which was introduced in subsection \ref{ssec:natDomainSat},
		was implemented by Zeljić et al. in the solver
		\mcbv~\cite{ZWR16}. Zeljić et al. have extended the \mcsat framework
		by the ability to perform partial assignments and have proposed
		heuristics for generalizing explanations of bit-vector conflicts. For
		example, the solver \mcsat can perform the partial assignment
		$\extract{2}{0}(x) \mapsto 10$, denoting that the two least
		significant bits of $x$ are $10$. To be able to efficiently use such
		partial assignments, the solver \mcbv mantains two over-approximations
		of the set of models that are compatible with the current partial
		assignment -- using \emph{bit-patterns} and \emph{arithmetic
		intervals}. Bit-patterns are sequences of $0$, $1$ and $u$, which
		represents undefined bit, and constrain the values of particular bits
		in the assignment. On the other hand, arithmetic intervals are pairs
		of bit-vector values representing lower and upper bounds and constrain
		integral values of bit-vectors. Both bit-patterns and arithmetic
		intervals can be ordered to form a lattice in which the solver
		performs a search for a more general explanation if a conflict is
		detected.

		Another word-level approach for the full bit-vector theory is
		\emph{stochastic local search} (\sls), proposed for solving
		bit-vectors by Frohlich et al.~\cite{FBWH15} and subsequently improved
		by Niemetz et al.~\cite{NPBF15,NPB16}. In the \sls approach, the
		solver randomly chooses initial values of bit-vector variables and
		tries to find a satisfying assignment by performing random bit flips
		guided by the scoring function based on the satisfaction of
		tries to find a satisfying assignment by performing random bit flips,
		which are guided by the scoring function based on the satisfaction of
		subformulas of the input formula. Niemetz et al. have improved the
		\sls technique with the path-based propagation, which, instead of
		relying solely on random modifications guided by the scoring function,
		allows computing values of bit-vector variables that are necessary to
		satisfy randomly selected subformulas. The \sls based solver has been
		shown to be able to decide several formulas not decided by
		bit-blasting solvers. To combine benefit of bit-blasting and \sls
		approaches, the latest version of Boolector, which have won the 2016
		SMT competition in category of unquantified bit-vectors, uses a
		protfolio approach which consists in first running a \sls based solver
		for a short period of time and then running a bit-blasting solver if
		the \sls solver fails to solve the formula~\cite{BoolectorComp}.
		\sls technique with the \emph{path-based propagation}, which, instead
		of relying solely on random bit modifications guided by the scoring
		function, allows computing values of bit-vector variables that are
		necessary to satisfy randomly selected subformulas. The \sls based
		solver has been shown to be able to decide several formulas not
		decided by bit-blasting solvers. To combine benefit of bit-blasting
		and \sls approaches, the latest version of Boolector, which have won
		the 2016 SMT competition in category of unquantified bit-vectors, uses
		a protfolio approach, which consists in first running a \sls based
		solver for a short period of time and then running a bit-blasting
		solver if the \sls solver fails to solve the
		formula~\cite{BoolectorComp}.

		\subsection{Preprocessing}
		For both bit-blasting and word-level approaches, a preprocessing of
		the input formula is necessary for the efficiency of the
		solver. Therefore, modern SMT solvers employ hundreds of rewrite rules
		in order to simplify the input formula. Franzén describes several
		simplifications implemented in the solver MathSAT5: canonization,
		unconstrained variables propagation, packet splitting, and disjunctive
		partitioning~\cite{Fra10}.
		the input formula is crucial for the efficiency of the
		solver. Therefore, modern \smt solvers employ hundreds of rewrite
		rules in order to simplify the input formula~\cite{Fra10}. Franzén
		describes several classes of simplification methods implemented in the
		solver MathSAT5: canonization, unconstrained variables propagation,
		packet splitting, and disjunctive partitioning~\cite{Fra10}.

		The aim of the \emph{canonization} is converting subterms of the
		formula to their canonical forms. For example $x + x = 0$ can be
		rewritten to $2*x = 0$. Although efficient algorithms exist for
		several fragments of the \BV theory, like the core theory of
		bit-vectors, the problem of computing the canonical form for the full
		\BV theory is NP-hard~\cite{BDL98}. Therefore, SMT solvers usually do
		not compute the canonical form, but rely on multiple heuristically
		chosen rewrite rules that produce the canonical form for simple
		terms, but are not required to produce the canonical form in general.
		formula to their canonical forms. For example, the term $x - x$ can be
		rewritten to $0$. Although efficient algorithms exist for several
		fragments of the \BV theory, like the core theory of bit-vectors, the
		problem of computing the canonical form for the full \BV theory is
		NP-hard~\cite{BDL98}. Therefore, SMT solvers usually do not compute
		the canonical form, but rely on multiple heuristically chosen rewrite
		rules that produce the canonical form for simple terms, but are not
		required to produce the canonical form of all terms.

		From the remaining simplification techniques presented by Franzén, we
		describe propagation of unconstrained variables in more detail, as it
		is highly relevant for software verification.
		focus on propagation of unconstrained variables, as it is highly
		relevant for software verification.

		A variable $x$ in the given formula is called \emph{unconstrained} if
		it occurs only once in the formula. Brummayer~\cite{Brum10} and
		Bruttomesso~\cite{Bru08} independently observed that if an
		A variable $x$ in a formula is called \emph{unconstrained} if it
		occurs only once in the formula. Brummayer~\cite{Brum10} and
		Bruttomesso~\cite{Bru08} have independently observed that if an
		unconstrained variable occurs as an argument to a function symbol,
		which can be \emph{inverted} with respect to this argument, replacing
		this function with a fresh variable yields an equi-satisfiable
		this function with a fresh variable yields an equisatisfiable
		formula. Moreover, unconstrained variables often occur in the
		industrial benchmarks and especially in benchmarks comming from a
		industrial benchmarks and especially in benchmarks produced during a
		verification of programs in a single static assignment form, such as
		LLVM bit-code.

		With a slight abuse of notation, which identifies interpreted function
		With a slight abuse of notation which identifies interpreted function
		symbols with their intended interpretations, a simple definition of
		invertibility for binary function symbols, which can be easily
		generalized, is as follows. A binary function $f$ can be inverted with
		respect to its first argument, if for every two values $a_i, a_o$,
		there exists a value $b$ such that $f(b,a_i) = a_o$. For example, as
		the addition is invertible with respect to its first argument, the
		formula $\varphi \equiv x + (y * y - 30 * z * y) = 0$ can be
		transformed to an equi-satisfiable formula $v = 0$, where $v$ is a
		fresh variable, because $x$ is unconstrained in $\varphi$. Note that in
		contrast with the theory of integers, the function $f(x) = k \times x$
		is invertible precisely if $k$ is odd.
		respect to its first argument if for every two values $a_i, a_o$ there
		exists a value $b$ such that $f(b,a_i) = a_o$. For example, as the
		addition is invertible with respect to its first argument, the formula
		$\varphi \equiv x + (y * y - 30 * z * y) = 0$ can be transformed to an
		equisatisfiable formula $v = 0$, where $v$ is a fresh variable,
		because $x$ is unconstrained in $\varphi$. Note that in contrast to
		the theory of integers, the function $f(x) = k \times x$ is invertible
		in bit-vectors precisely if $k$ is odd.

		\section{Satisfiability of quantified bit-vector formulas}

		@@ -717,11 +726,11 @@ computational complexity perspective, as it is the first problem that
		was shown to be \NP-complete. The complexity of the satisfiability
		problems for various variants of bit-vector theory have been recently
		studied and was shown to range from \NP-complete to
		2-NEXPTIME-complete~\cite{KFB16}. In particular, interesting variants
		\NNEXPTIME-complete~\cite{KFB16}. In particular, interesting variants
		of the bit-vector satisfiability problem differ in allowing
		uninterpreted functions, allowing quantifiers, and in encoding of the
		bit-widths (unary vs. binary). In the following, we follow the
		notation from of Kováznai et al~\cite{KFB16} -- decision problems for
		notation of Kováznai et al~\cite{KFB16} -- decision problems for
		quantifer-free fragments are dentoted by the prefix QF\_, the
		combination with the theory of uninterpreted functions is denoted by
		the prefix UF, and the problems with unary and binary encoded
		@@ -733,8 +742,8 @@ uninterpreted functions and binary encoded bit-withs.
		polynomial time reduction from QF\_BV1 to \sat, showing that QF\_BV1
		is in NP. A similar reduction from BV1 to \qbf can show that BV is in
		\PSPACE. For lower bounds, \NP-hardness of QF\_BV1 follows from a
		simple reduction from \sat, which encodes each propositional variable
		as a bit-vector of bit-width 1, and similarly, BV1 can be shown to be
		simple reduction from \sat, by encoding each propositional variable as
		a bit-vector of bit-width 1, and similarly, BV1 can be shown to be
		\PSPACE-hard.

		In quantifier-free formulas, uninterpreted functions can be eliminated
		@@ -742,26 +751,27 @@ by the Ackermann expansion with only quadratic increase in the size of
		the formula. Therefore, QF\_UFBV1 is in \NP. As the set of QF\_UFBV1
		formulas contains all QF\_BV1 formulas, QF\_UFBV1 is also NP-hard. The
		case of UFBV1 is more involved. Wintersteiger et al. have shown that
		UFBV1 is \NEXPTIME-complete by using \emph{effectively propositonal}
		UFBV1 is \NEXPTIME-complete by using \emph{effectively propositional}
		fragment of the first-order logic, which is well known to be
		\NEXPTIME-complete~\cite{WHD13}. The class of effectively
		propositional formulas, also known as the Bernays--Schönfinkel class,
		\marginpar{The class of effectively propositional formulas is also
		known the Bernays--Schönfinkel class.} propositional formulas
		consists only of formulas in form $\exists^\forall^\varphi$, where
		$\varphi$ does not contain any quantifiers or function symbols.

		\paragraph{Binary encoded bit-widths}
		With binary encoded bit-widths, the bit-blasting may yield
		propositional formula that is exponentially larger than the original
		input bit-vector formula, as the number of bits may be exponential
		with respect to the size of the formula. Therefore, bit-blasting shows
		that QF\_BV2 is in \NEXPTIME. On the other hand, Kovásznai et al. have
		presented a polynomial time reduction of satisfiability of
		\emph{dependent quantified boolean formulas} (\dqbf) to QF\_BV2. Since
		\dqbf is well known to be \NEXPTIME-complete, this reduction shows
		\NEXPTIME-hardness of QF\_BV2~\cite{KFB12}. In contrast, the precise
		complexity after adding quantifiers is not known. BV2 is known to be
		in \EXPSPACE and hence it contains all formulas from QF\_BV2, it is
		also \NEXPTIME-hard.
		For formulas with binary encoded bit-widths, the bit-blasting may
		yield propositional formula that is exponentially larger than the
		original input bit-vector formula, as the number of bits may be
		exponential with respect to the size of the formula. Therefore,
		bit-blasting shows that QF\_BV2 is in \NEXPTIME. On the other hand,
		Kovásznai et al. have presented a polynomial time reduction of
		satisfiability of \emph{dependent quantified boolean formulas} (\dqbf)
		to QF\_BV2. Since \dqbf is well known to be \NEXPTIME-complete, this
		reduction shows \NEXPTIME-hardness of QF\_BV2~\cite{KFB12}. In
		contrast, the precise complexity after adding quantifiers is not
		known. BV2 is known to be in \EXPSPACE and because it contains all
		formulas from QF\_BV2, it is also \NEXPTIME-hard.

		Similarly to the case with the unary encoding, the complexity of the
		quantifier-fre fragment stays the same when the uninterpreted

Includes/config.tex

+3 −1

Original line number	Diff line number	Diff line
		\theoremstyle{definition}
		\newtheorem{exmp}{Example}[section]

		%\newcommand{\quotegraffito}[2]{\graffito{#1\\--~\emph{#2}}}
		No newline at end of file

Includes/packages.tex

+2 −1

Original line number	Diff line number	Diff line
		\usepackage{amssymb}
		\usepackage{amsthm}
		\usepackage[strict]{changepage}
		\usepackage{cleveref}