forked from christhomson/lecture-notes
-
Notifications
You must be signed in to change notification settings - Fork 0
/
cs360.tex
3060 lines (2696 loc) · 194 KB
/
cs360.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
\documentclass[]{article}
\usepackage[margin = 1.5in]{geometry}
\setlength{\parindent}{0in}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{cleveref}
\usepackage{amsmath}
\usepackage{amsthm}
\usepackage{mathtools}
\usepackage{tikz}
\usepackage{qtree}
\usepackage{tikz-qtree}
\usepackage{float}
\usepackage[lined]{algorithm2e}
\usepackage[T1]{fontenc}
\usepackage{ae,aecompl}
\usepackage{color}
\usepackage{bussproofs}
\usepackage{enumerate}
\usepackage[pdftex,
pdfauthor={Chris Thomson},
pdftitle={CS 360: Introduction to the Theory of Computing},
pdfsubject={Lecture notes from CS 360 at the University of Waterloo},
pdfkeywords={CS 360, course notes, notes, Waterloo, University of Waterloo, theory, finite automata},
pdfproducer={LaTeX},
pdfcreator={pdflatex}]{hyperref}
\DeclarePairedDelimiter{\set}{\lbrace}{\rbrace}
\definecolor{darkish-blue}{RGB}{25,103,185}
\hypersetup{
colorlinks,
citecolor=darkish-blue,
filecolor=darkish-blue,
linkcolor=darkish-blue,
urlcolor=darkish-blue
}
\theoremstyle{definition}
\newtheorem*{defn}{Definition}
\newtheorem*{theorem}{Theorem}
\newtheorem*{thesis}{Thesis}
\newtheorem*{lemma}{Lemma}
\newtheorem*{corollary}{Corollary}
\newtheorem*{aside}{Aside}
\newtheorem{ex}{Example}[section]
\crefname{ex}{Example}{Examples}
\setlength{\marginparwidth}{1.5in}
\newcommand{\lecture}[1]{\marginpar{{\footnotesize $\leftarrow$ \underline{#1}}}}
\makeatletter
\def\blfootnote{\gdef\@thefnmark{}\@footnotetext}
\makeatother
\begin{document}
\let\ref\Cref
\title{\bf{CS 360: Introduction to the Theory of Computing}}
\author{Professor Shai Ben-David \\ \\ Fall 2013, University of Waterloo}
\date{Notes taken by Chris Thomson}
\blfootnote{I'd love to hear your feedback. Feel free to email me at \href{mailto:[email protected]}{[email protected]}.}
\blfootnote{See \href{http://cthomson.ca/notes}{cthomson.ca/notes} for updates.
\ifdefined\sha % Also, \commitDateTime should be defined.
Last modified: \commitDateTime{} ({\href{https://github.com/christhomson/lecture-notes/commit/\sha}{\sha}}).
\fi
}
\maketitle
\newpage
\tableofcontents
\newpage
\section{Introduction \& Course Structure} \lecture{September 10, 2013}
\subsection{Course Structure}
The grading scheme is 30\% assignments, 30\% midterm, and 40\% final. If your final grade is better than your midterm grade, your final will be worth 70\% and the midterm will be worth 0\%.
\\ \\
In terms of textbooks, any textbook relating to ``automata theory'', ``theory of computation'', or ``formal language theory'' is acceptable.
\\ \\
The professor's office is in DC 1311, and his office hours are on Tuesdays from 2:00 to 3:00 pm.
\\ \\
This course is mathematical. Consequently, relevant background courses are mainly mathematical (mostly logic) and (almost) no programming.
\subsection{Introduction}
The goal of the course is to develop the theory of computation. We want to understand the mathematics that govern computers. We will explore questions like:
\begin{itemize}
\item What are computers?
\item Which tasks can computers carry out?
\item Which tasks are easy or hard for computers?
\end{itemize}
We make abstractions and use them as common properties that are relevant to many phenomenon. We aren't going to be focusing on any particular programming language or machine architecture, but instead, we'll focus on issues that underlie any automated computation.
\\ \\
These abstractions allow us to form conclusions that are relevant to computing in general, and not just for today's hardware and software.
\\ \\
This course has two key goals:
\begin{enumerate}
\item To familiarize you with the fundamentals of computer science theory.
\item To develop the ability to reason formally and abstractly about computing.
\end{enumerate}
Throughout the course, we will follow two main threads concurrently: abstract models of computers, and computing tasks. We'll progress from simple models and simple tasks to complex models and complex tasks, and along the way connections will be made between the two.
\subsection{Computing Tasks}
We're going to focus on \textbf{decision problems}. Decisions problems are problems which have an input and produce a binary (yes/no) output. Some examples of decision problems are:
\begin{itemize}
\item Input: a number. Decision: is it a prime number?
\item Input: email message. Decision: is it spam?
\item Input: a graph. Decision: is it connected?
\end{itemize}
We're interested in seeing which of these questions a computer can carry out and how difficult are such tasks.
\subsubsection{Modeling Decision Problems}
We model decision problems using \textbf{formal languages}. A decision problem has two components:
\begin{itemize}
\item A domain set $X$.
\item A language $L \subseteq X$.
\end{itemize}
Our decision problem is then ``for some $x \in X$, is $x \in L$?'' Re-examining our examples from earlier, we would define:
\begin{itemize}
\item $X = \mathbb{Z}$ (the set of all integers), and $L = $ the set of all prime numbers.
\item $X = $ the set of all email messages, and $L = $ the set of all spam email messages.
\item $X = $ the set of all graphs, and $L = $ the set of all connected graphs.
\end{itemize}
For further concreteness, we will fix some finite set $\Sigma$, which we will call the \textbf{alphabet}. Some examples of alphabets are $\set{a, b}$, $\set{0, 1}$, and $\set{0, 1, 2, 3}$.
\\ \\
$\Sigma^\star$ represents the set of all finite strings over the alphabet $\Sigma$. In many cases, we define $X = \Sigma^\star$. For example, if $\Sigma = \set{0, 1}$ then $\Sigma^\star = \set{ \epsilon, 0, 1, 00, 11, 01, 10, \ldots }$. Note that $\Sigma^\star$ is an infinite set, but every member of $\Sigma^\star$ is finite.
\\ \\
For the time being, we will define just one operation for strings: \textbf{concatenation}. Given two strings, $\sigma$ and $\eta$, we get $\sigma \eta$. That is,
\AxiomC{$\sigma$}
\AxiomC{$\eta$}
\BinaryInfC{$\sigma \eta$}
\DisplayProof . For example, if $\sigma = 01$ and $\eta = 111$, then $\sigma \eta$ is 01111.
\\ \\
\textbf{Languages} are subsets of $\Sigma^\star$ (that is, a language is a collection of strings). Some examples of languages include:
\begin{itemize}
\item $L = \emptyset$. This is the \textbf{empty language}, which no strings belong to. Note that $|L| = 0$. This is analogous to an empty fridge.
\item $L = \set{ \epsilon }$. This is a non-empty language that contains one element ($|L| = 1$), which just happens to be $\epsilon$. This is analogous to having a fridge that contains only an empty pop can.
\item $L = \set{\epsilon, 0, 01, 11111}$.
\item $L = \Sigma^\star$.
\end{itemize}
A compiler's task of checking their input (the code of a program) to see if the program is valid (if the code is in the language), can be readily seen as checking membership in such a language (namely, the language of all binary sequences that represent valid code in the given programming language).
\section{Inductive Definition of Sets and Structural Induction}
We'll start off by reviewing some basic notions that were covered on CS 245, to ensure that everyone's on the same page.
\subsection{Formal Tools for Defining Sets}
We want languages to model every computational task, some of which are complex. We need several different methods for formally defining sets, since all languages are sets.
\begin{enumerate}
\item \textbf{List all members of the set}. This method is precise and concrete, but it makes look-ups difficult ($O(n)$) and it fails if the set is infinitely large.
\item \textbf{Common property that characterizes the members of the set}. For example, consider the set of all even numbers. This set is not definable by a list because it's an infinite set. However, there is a common property that's shared between all members of the set: if you divide by two and examine the remainder, every member in the set will have a remainder of zero and only members of the set have this property.
\item \textbf{Inductive definitions}. Methods (1) and (2) don't offer us a way to define a set of all your blood relatives, since there's no common property that characterizes that set. Instead, we could use \emph{inductive definitions}.
\\ \\
Inductive definitions of sets requires three components:
\begin{itemize}
\item A domain set $X$ (sometimes omitted if it is obvious).
\item A core set $A$ (as in ``atoms''), such as $\set{ \text{me} }$.
\item A set of operations $P$, such as $\set{ \text{``son of''}, \text{``father of''}, \text{``mother of''}}$. This set is a set of functions from $X \to X$.
\end{itemize}
Given these components, we define the \textbf{inductive set} $I(A, P)$ (say, the set of blood relatives), as all domain elements that can be reached from the core set by applying a finite sequence of operations from $P$.
\begin{defn}
$I(A, P)$ -- the \textbf{inductive set} defined by core $A$ and operations $P$ -- is the smallest set satisfying:
\begin{itemize}
\item Contains all members of $A$, and
\item Closed under the operations in $P$.
\end{itemize}
\end{defn}
\noindent \textbf{Examples:}
\begin{itemize}
\item Let the domain be the set of all real numbers ($\mathbb{R}$), the core set be $\set{2}$, and the set of operations be $\set{ \text{``add 2''}, \text{``subtract 2''}}$. The defined set is the set of all even numbers, both positive and negative.
\item To define the set of all algebraic expressions, our core set would be $\set{x, y, z, 1, 2, 3, \ldots, a, b, c, \ldots}$, and our operations would be:
\AxiomC{$\sigma$} \AxiomC{$\eta$} \BinaryInfC{$(\sigma + \eta)$} \DisplayProof,\AxiomC{$\sigma$} \AxiomC{$\eta$} \BinaryInfC{$(\sigma \cdot \eta)$} \DisplayProof.
\end{itemize}
Given a set (language) defined inductively, how can we tell if some $x \in X$ is in the language or not?
\begin{ex}
Let $X = \set{a, b}^\star$, $A = \set{a}$, and $P = $ \bigg\{ \AxiomC{X} \UnaryInfC{aX} \DisplayProof, \AxiomC{X} \UnaryInfC{Xa} \DisplayProof, \AxiomC{X} \AxiomC{Y} \BinaryInfC{bXY} \DisplayProof, \AxiomC{X} \AxiomC{Y} \BinaryInfC{XbY} \DisplayProof, \AxiomC{X} \AxiomC{Y} \BinaryInfC{XYb} \DisplayProof \bigg\}.
\\ \\
A typical question would be: is $abbaa \stackrel{?}{\in} I(A, P)$? The answer: yes! However, we must show how to generate $abbaa$ from $a$, using the operations in $P$, in a finite number of steps.
\begin{enumerate}[1.]
\item $a$, belongs to the core $A$.
\item $baa$, after applying $P_3(a, a)$.
\item $abbaa$, after applying $P_4(a, baa)$.
\end{enumerate}
Note that the first step must always be an element in the core set.
\\ \\
Is $\epsilon \in I(A, P)$? No. We could use simple reasoning to show that $\epsilon \not \in I(A, P)$ (each function in $P$ increases the length). However, in general, it is more difficult to prove non-membership of an element than it is to prove membership of another element.
\end{ex}
\begin{defn}
A \textbf{certificate} (or \textbf{generating sequence}) for $x$ is a sequence of elements of $X, \theta_1, \theta_2, \theta_3, \ldots, \theta_n$ such that:
\begin{itemize}
\item $\theta_n = x$, and
\item Every $\theta_i$ in the sequence is either an element of the core set $A$ or is the result of applying an operation from $P$ to a $\theta_j, \theta_l$ that appeared earlier in the sequence ($j < i$ and $l < i$). \lecture{September 12, 2013}
\end{itemize}
\end{defn}
\end{enumerate}
Our definition of $I(A, P)$ is still somewhat informal, however. We haven't formally defined what it means for a subset to be the \emph{smallest} subset, and we also haven't formally defined what it means for a set to be ``closed'' under a set of operations.
\subsection{More on Inductive Sets and Structural Induction}
We say that a set $B$ is \textbf{closed under the operations of $\boldsymbol P$} if for every $f \in P$ and every $x, y \in B$, $f(x, y) \in B$ (for concreteness, we considered here an $f$ that accepts two inputs, but it may also be a unary function, $f(x)$ or ternary $f(x,y, z)$, etc.). For example, the set of even numbers is closed under the operation +.
\\ \\
Now, let's redefine $I(A, P)$ in a more formal way:
$$
I(A, P) = \bigcap \left\{ B : A \subseteq B ~\text{ and } ~B ~\text{ is closed under } ~P \right\}
$$
Note that $I(A, P)$ is the intersection over the \emph{entire} collection of sets (and is therefore uniquely defined).
\begin{ex}
Let $X = \mathbb{N}, A = \set{10}, P = \set{+}$.
\\ \\
$I(A, P)$ is defined as the intersection of all sets $B$ such that $ A \subseteq B $ and $B$ is closed under $P$ (that is, $B$ is closed under $+$). So, we have:
\begin{itemize}
\item $B_1 = \set{\text{all even numbers}}$
\item $B_2 = \set{\text{all numbers divisible by 5}}$
\item $B_3 = \set{\text{all numbers divisible by 10}}$
\end{itemize}
In this case, $I(A, P) = \set{10, 20, 30, \ldots}$, since $I(A, P)$ must be the minimal set among all possible sets $B$ (or, in other words, the multiples of 10 are exactly the numbers that belong to each and every of these sets $B$).
\end{ex}
\begin{aside}
\begin{align*}
\bigcap \left\{ [0, r] : r > 0 \right\} &= \set{0} \\
\bigcap \left\{ (0, r) : r > 0 \right\} &= \emptyset \\
\bigcup \left\{ \set{1, \ldots, n} : n \in \mathbb{N} \right\} &= \mathbb{N} \\
\bigcap \left\{ \set{1, \ldots, n} : n \in \mathbb{N} \right\} &= \set{1}
\end{align*}
\end{aside}
The equivalence of the two definitions of $I(A, P)$ is not immediately clear. We claim that $I(A, P)$ is closed under the operations of $P$, and $A \subseteq I(A, P)$.
\begin{proof}
Pick any $x, y \in I(A, P)$ and operation $f \in P$. For every $B$ which contains $A$ and is closed under $P$, $x, y \in B$ (since $I(A,P)$ is the intersection of all such $B$'s).
\\ \\
Since every such $B$ is closed under $P$, $f(x, y) \in B$ for every such $B$. Therefore, $f(x, y) \in \cap \set{ B : A \subseteq B \text{ and } B \text{ is closed under } P}$, so $f(x,y) \in I(A, P)$.
\\ \\
\underline{Exercise}: prove that $A \subseteq I(A, P)$.
\end{proof}
\begin{corollary}[Proof by Structural Induction Theory]
Any $B \subseteq X$ which contains $A$ and is closed under $P$ is a superset of $I(A, P)$. Namely, $I(A, P) \subseteq B$.
\end{corollary}
We could rephrase this corollary as follows: given any set $B$, if you want to prove that $I(A, P) \subseteq B$, it suffices to show:
\begin{itemize}
\item $A \subseteq B$, and
\item $B$ is closed under $P$.
\end{itemize}
\subsubsection{Proof by Structural Induction}
Let's say we want to show that for every $n$, $n^2 + n$ is even. A proof by usual (mathematical) induction requires us to show two things:
\begin{itemize}
\item Prove the claim holds for $n = 1$. This is equivalent to showing that $A \subseteq B$.
\item Prove that if the claim holds for $n$, it also holds for $n + 1$. This is equivalent to showing that $B$ is closed under $P$.
\end{itemize}
How is this the same as structural induction? Usual induction shows that $\mathbb{N} = I(A, P)$, where $A = \set{1}$ and $P = \set{ \text{``+1''}}$. That is, mathematical induction is just one specific case of structural induction.
\subsection{Examples of Proofs by Structural Induction}
\begin{ex}
Imagine we have three paper cups placed on a desk, with two of them facing upwards and one upside down. You must flip exactly two cups at a time. We want to get all of the cups facing upwards. Can we prove that this goal is not achievable?
\\ \\
Let $X$ be the set of all possible configurations of the cups. Let our core set $A$ contain only the state ``[up] [down] [up]''. Let our set of operations $P$ contain:
\begin{itemize}
\item $P_1$: flip the two rightmost cups.
\item $P_2$: flip the two leftmost cups.
\item $P_3$: flip the rightmost and leftmost cups.
\end{itemize}
We claim that the state ``[up] [up] [up]'' $\not \in I(A, P)$.
\begin{proof}
Let $B$ be the set of all configurations with an even number of ``up'' cups.
\\ \\
We want to show that $I(A, P) \subseteq B$. This would show that ``[up] [up] [up]'' is not attainable because it is not a member of $B$. We will use structural induction to show this.
\\ \\
\textbf{Induction base}: $A \subseteq B$.
\\ \\
\textbf{Induction step}: $B$ is closed under $P$. Namely, if $c \in B$, then $P_1(c), P_2(c)$, and $P_3(c) \in B$. There are three cases:
\begin{itemize}
\item We flip one ``up'' cup and one ``down'' cup. This is a difference of +0 ``up'' cups.
\item We flip two ``up'' cups. This causes us to remove two (which is even) from an already even number, which gives us another even number of ``up'' cups.
\item We flip two ``down'' cups. This causes us to add two (which is even) to an already even number, which gives us another even number of ``up'' cups.
\end{itemize}
Note that these three cases show us that no matter which operation we use, it is not possible to obtain all three ``up'' cups, since the number of ``up'' cups will always be even.
\end{proof}
\end{ex}
\begin{ex}
Earlier, we defined $X = \set{a, b}^\star$, $A = \set{a}$, and $P = $ \bigg\{ \AxiomC{X} \UnaryInfC{aX} \DisplayProof, \AxiomC{X} \UnaryInfC{Xa} \DisplayProof, \AxiomC{X} \AxiomC{Y} \BinaryInfC{bXY} \DisplayProof, \AxiomC{X} \AxiomC{Y} \BinaryInfC{XbY} \DisplayProof, \AxiomC{X} \AxiomC{Y} \BinaryInfC{XYb} \DisplayProof \bigg\}.
\\ \\
For this example, let $\#_a(x)$ denote the number of ``a''s in the string $x$ and similarly, let $\#_b(x)$ denote the number of ``b''s in the string $x$.
\\ \\
\textbf{Claim}: every member of $I(A, P)$ has more ``a''s than ``b''s. To put this more simply, let $B = \set{ x : \#_a(x) > \#_b(x) }$. We wish to show that $I(A, P) \subseteq B$.
\begin{proof}
We will prove this claim by structural induction.
\\ \\
\textbf{Induction base}: $A \subseteq B$. It is trivial to see that the string ``a'' has more ``a''s than ``b''s (since it has no ``b''s).
\\ \\
\textbf{Induction step}: if $x, y \in B$, then we must show that $P_1(x) \in B, P_2(x) \in B, P_3(x, y) \in B, P_4(x,y) \in B$, and $P_5(x, y) \in B$.
\\ \\
It is clear that for $P_1$ and $P_2$, one ``a'' is being added to a string that must already include more ``a''s than ``b''s, so this trivially holds true.
\\ \\
The case of $P_3$ isn't as immediately obvious. Assume $\#_a(x) > \#_b(x)$, and $\#_a(y) > \#_b(y)$. So, $\#_a(x) \ge \#_b(x) + 1$ and $\#_a(y) \ge \#_b(y) + 1$. This gives us $\#_a(xy) \ge \#_b(xy) + 2$.
\\ \\
Now, note that $\#_a(bxy) = \#_a(xy)$, and $\#_b(bxy) = \#_b(xy) + 1$. We can conclude that $\#_a(bxy) \ge \#_b(bxy) + 1 > \#_b(bxy)$.
\\ \\
A similar argument applies for $P_4$ and $P_5$.
\end{proof}
\end{ex}
\subsection{Regular Languages}
For us, tasks will always be decision problems. That is, given $L \subseteq \set{a, b}^\star$ and input $x \in \set{a, b}^\star$, decide if $x \in L$.
\\ \\
The set of all possible tasks is $\set{L : L \subseteq \set{a, b}^\star}$. This set is an infinite set of finite strings.
\\ \\
Recall that not all infinite sets have the same size. The set of all finite strings $\set{a, b}^\star$ is a ``small'' infinite set, meaning it is \textbf{countable}. However, the set of all subsets of $\set{a, b}^\star$ is a ``large'' infinite set, which is \textbf{uncountable}.
\\ \\
The set of all possible computer programs is a subset of $\set{0, 1}^\star$. Note that since every program is a \emph{finite} string, the set of all possible programs is countable.
\\ \\
In summary, we have an uncountable number of tasks, but a countable number of programs (and each program is suitable for carrying out at most a single task). This implies that there are tasks (languages) for which no program exists (namely, for such languages, $L$, there exists no program that will figure out correctly for every string $x$ whether it is a member of $L$ or not).
\subsection{Defining Interesting Families of Tasks} \lecture{September 17, 2013}
\begin{defn}
The \textbf{empty string} can be denoted as either $\epsilon$ or $\lambda$.
\end{defn}
\subsubsection{Operations on Languages}
There are several operations that we can perform on languages.
\\ \\
We can use \textbf{standard set operations}, including $\bigcup, \bigcap$, complement ($\overline A$), and set difference ($A \textbackslash B$). You should be familiar with how these standard operations are defined.
\\ \\
Since we focus on subsets of the set of finite strings, $\Sigma^\star$, we can also use \textbf{operations on sets of such strings}. Let's go into more detail of how these string set operations are defined.
\begin{enumerate}
\item \textbf{The concatenation operation}. Given $L_1, L_2 \subseteq \Sigma^\star$, we define $L_1 L_2 = \set{ w_1 w_2 : w_1 \in L_1, w_2 \in L_2 }$.
\begin{ex}
Let $L_1 = \set{0, 011}$ and $L_2 = \set{ \epsilon, 110 }$. Then $L_1 L_2 = \set{0, 0110, 011, 011110 }$.
\end{ex}
Concatenation is not always commutative. Consider $L_1 = \set{0}$ and $L_2 = \set{1}$. We have $L_1 L_2 = \set{01}$ but $L_2 L_1 = \set{10}$, so $L_1 L_2 \ne L_2 L_1$.
\\ \\
Can $L_1 \subseteq L_1 L_2$? Yes, but only if $\epsilon \in L_2$. Otherwise, $L_1 \not \subseteq L_1 L_2$.
\item \textbf{The power operation}. Given any language $L$, define:
\begin{itemize}
\item $L^0 = \set{ \epsilon }$
\item $L^{n + 1} = L^n L$ for every $n > 0$
\end{itemize}
In particular, note that $L^1 = \set{ \epsilon} L = L$.
\begin{ex}
Let $L = \set{ 0, 1 }$. Then $L^n$ is the set of all ($\set{0, 1}$) strings.
\end{ex}
\item \textbf{The star (Kleene) operation}. For any language $L$, $L^\star = \displaystyle \bigcup_n L^n$. That is, $L^\star$ is the set of all finite concatenations of strings from $L$. Note that $L^\star$ includes $L^0$, which means $\epsilon \in L^\star$.
\end{enumerate}
\noindent \textbf{Examples:}
\begin{itemize}
\item $\set{1}^\star$ is the set of all finite strings of 1s.
\item $\set{0, 1}^\star$ is the set of all finite $\set{0, 1}$ strings.
\item Can $L^\star = L$? Yes. Consider $L = \set{ \epsilon }$, or $(L^\star)^\star = L^\star$ for any $L$.
\item Can $L^\star$ be finite? Yes. Consider $L = \set{ \epsilon }$ or $L = \emptyset$.
\end{itemize}
\subsubsection{The Set of Regular Languages}
Given some finite alphabet set $\Sigma$, let $X = \Sigma^\star$. We will also define our core set to be:
$$
A = \set{\emptyset, \set{\epsilon}} \cup \set{ \set{a} : a \in \Sigma }
$$
For instance, suppose $\Sigma = \set{0, 1}$. Then $A = \set{ \emptyset, \set{ \epsilon }, \set{ 0 }, \set{ 1 } }$. Next, we'll define our set of operations $P$:
$$
P = \text{ \bigg\{ \AxiomC{$L_1$} \AxiomC{$L_2$} \BinaryInfC{$L_1 \cup L_2$} \DisplayProof, \AxiomC{$L_1$} \AxiomC{$L_2$} \BinaryInfC{$L_1 \cdot L_2$} \DisplayProof, \AxiomC{$L$} \UnaryInfC{$L^\star$} \DisplayProof \bigg\}}
$$
A language is said to be \textbf{regular} if it belongs to $I(A, P)$, for this $A$ and $P$.
\\ \\
To prove a language is regular, you must be able to show how to generate it.
\begin{ex}
$\set{0, 1, 01}$ is regular.
\begin{proof}
\begin{align}
\set{0} &\text{ core} \\
\set{1} &\text{ core} \\
\set{01} &\text{ product} \\
\set{0, 1} &\text{ union of (1) and (2)} \\
\set{0, 1, 01} &\text{ union of (3) and (4)}
\end{align}
\setcounter{equation}{0}
\end{proof}
\end{ex}
\textbf{Claim 1}: for every (finite) string $\sigma$, $\set{\sigma}$ is a regular language. It's easy to see this is true, because you can generate the language by concatenating the languages that represent individual letters.
\\ \\
\textbf{Claim 2}: every finite language is regular. Since we know every $\set{\sigma}$ is regular, so is any finite unions of such languages.
\begin{ex}
Let $L_\text{even}$ be the language of all words of even length. $L_\text{even}$ is regular. We can form every word in $L_\text{even}$ using the expression $(\set{0, 1} \set{0, 1})^\star$.
\end{ex}
\begin{ex}
Let $L_\text{equal} = \set{ w : \#_0(w) = \#_1(w) }$. $L_\text{equal}$ is not regular. We can't prove this yet though.
\end{ex}
\subsection{Explicit Descriptions of Regular Languages: Regular Expressions}
A \textbf{regular expression} is just a word in a language that is designed to describe languages.
\subsubsection{Defining The Set of Regular Expressions}
Given some alphabet $\Sigma$, let $X$ be defined as
$$
X = (\Sigma \cup \set{+, \star, (, ), \emptyset, \epsilon})^\star
$$
We will define our core set $A$ as
$$
A = \set{\emptyset, \epsilon} \cup \set{ a : a \in \Sigma }
$$
Finally, we will define our set of operations $P$ as
$$
P = \left\{ \text{\AxiomC{$r_1$} \AxiomC{$r_2$} \BinaryInfC{$r_1 + r_2$} \DisplayProof, \AxiomC{$r_1$} \AxiomC{$r_2$} \BinaryInfC{$r_1 r_2$} \DisplayProof, \AxiomC{$r$} \UnaryInfC{$r^\star$} \DisplayProof, \AxiomC{$r$} \UnaryInfC{$(r)$} \DisplayProof } \right\}
$$
A regular expression is any member of this $I(A, P)$.
\\ \\
Suppose $\Sigma = \set{a, b}$. Some regular expressions over $\Sigma$ are $a$, $ab$, $a + b$, $a(a + \epsilon)^\star$, and $(a + b)(a + b)$.
\\ \\
\textbf{Claim}: in any regular expression, the number of $+$ symbols is at least one less than the number of symbols from the set $\Sigma \cup \set{ \epsilon, \emptyset }$. The proof of this (by structural induction) is left as an exercise.
\subsubsection{Mapping Regular Expressions to Languages}
We will define a mapping $L$ such that $L: \text{Regular Expression} \to \text{Language}$. We define such a mapping $L$ recursively over the definition of regular expressions.
\begin{itemize}
\item $L(\epsilon) = \set{ \epsilon }$
\item $L(a) = \set{ a }$ for every $a \in \Sigma$
\end{itemize}
Assume $L(r_1)$ and $L(r_2)$ are already defined. We then define the mappings for each operation:
\begin{itemize}
\item $L(r_1 r_2) = L(r_1) L(r_2)$
\item $L(r_1 + r_2) = L(r_1) \cup L(r_2)$
\item $L(r_1^\star) = (L(r_1))^\star$
\item $L((r_1)) = L(r_1)$
\end{itemize}
\noindent \textbf{Examples}:
\begin{itemize}
\item Let $r = (0 + 1)$. Then $L(r) = \set{0, 1}$.
\item Let $r = 01$. Then $L(r) = \set{01}$.
\item Let $r = (01)^\star$. Then $L(r) = \set{ \epsilon, 01, 010101, 01\ldots0101\ldots01, \ldots}$.
\item Let $r = (01)^\star + (10)^\star$. Then $L(r)$ is the set of all alternating strings that either start with 0 and end with 1, or start with 1 and end with 0.
\end{itemize}
Find an expression $r$ such that $L(r)$ is the set of \emph{all} alternating strings. One such expression is $r = (0 + \epsilon)(10)^\star(1 + \epsilon)$. Another expression to represent the same language is $r' = (01)^\star + (10)^\star + 1(01)^\star + 0(10)^\star$.
\\ \\
Since $L(r) = L(r')$ represent the same language, this shows that there is no \emph{unique} regular expression to describe a language. Multiple distinct regular expressions can represent the same language.
\subsubsection{Uses of Regular Expressions}
Regular expressions are often useful to search for patterns in code. For example, if you were looking for all strings that start with ``a'' and end with ``b'', you could use the regular expression $a(a + b)^\star b$.
\\ \\
Regular expressions are also used in compilers. Compilers need to find certain patterns in the code they're given to compile, in order to perform the parsing step of the compilation.
\\ \lecture{September 19, 2013} \\
\emph{The lecture from September 19, 2013 is omitted. If you have notes from this lecture, feel free to email me (\href{mailto:[email protected]}{[email protected]}) and I'll include them here.}
\section{Computing Machines} \lecture{September 24, 2013}
\subsection{Deterministic Finite Automata}
The first model of computing machines that we'll discuss are \textbf{deterministic finite automata} (also known as \text{DFA}s). A DFA is denoted by $A = (\Sigma, Q, q_0, \delta, F)$, where
\begin{itemize}
\item $\Sigma$ is the alphabet (the set of letters) of the DFA.
\item $Q$ is the set of all states in the DFA.
\item $q_0$ is the initial state.
\item $\delta$ is the transition function.
\item $F$ is the set of accepting states. (Note that there can be more than one accepting state.)
\end{itemize}
Every DFA can be represented with a bubble diagram, to make it easier to visualize.
\begin{ex}
A bubble diagram that represents the DFA that accepts all strings with an even number of $0$s and an odd number of $1$s is as follows:
% http://madebyevan.com/fsm/ is a great tool for generating these.
% It generates somewhat verbose code, unfortunately.
\begin{center}
\begin{tikzpicture}[scale=0.2]
\tikzstyle{every node}+=[inner sep=0pt]
\draw [black] (11.1,-17.2) circle (3);
\draw (11.1,-17.2) node {$q_0$};
\draw [black] (38.2,-14.3) circle (3);
\draw (38.2,-14.3) node {$q_1$};
\draw [black] (38.2,-14.3) circle (2.4);
\draw [black] (55.8,-13.6) circle (3);
\draw (55.8,-13.6) node {$q_2$};
\draw [black] (27.5,-32.2) circle (3);
\draw (27.5,-32.2) node {$q_3$};
\draw [black] (13.215,-15.077) arc (130.7768:61.43928:19.862);
\fill [black] (35.68,-12.67) -- (35.22,-11.85) -- (34.74,-12.73);
\draw (23.86,-9.78) node [above] {$1$};
\draw [black] (35.507,-15.62) arc (-66.28802:-101.4959:35.741);
\fill [black] (14.01,-17.92) -- (14.7,-18.57) -- (14.89,-17.59);
\draw (25.15,-19.02) node [below] {$1$};
\draw [black] (24.505,-32.212) arc (-95.55965:-169.33456:14.845);
\fill [black] (24.51,-32.21) -- (23.76,-31.64) -- (23.66,-32.63);
\draw (14.91,-28.88) node [below] {$0$};
\draw [black] (52.963,-14.571) arc (-74.43247:-101.01232:25.813);
\fill [black] (52.96,-14.57) -- (52.06,-14.3) -- (52.33,-15.27);
\draw (47.09,-16.04) node [below] {$0$};
\draw [black] (54.064,-16.046) arc (-37.38097:-75.98969:42.768);
\fill [black] (30.43,-31.58) -- (31.33,-31.87) -- (31.09,-30.9);
\draw (44.57,-26.32) node [below] {$1$};
\draw [black] (55.342,-16.562) arc (-12.78146:-100.5892:21.52);
\fill [black] (55.34,-16.56) -- (54.68,-17.23) -- (55.65,-17.45);
\draw (47.18,-30.28) node [below] {$1$};
\draw [black] (40.512,-12.399) arc (122.65769:61.89753:12.697);
\fill [black] (40.51,-12.4) -- (41.46,-12.39) -- (40.92,-11.55);
\draw (46.83,-9.86) node [above] {$0$};
\draw [black] (13.954,-18.118) arc (68.51255:26.59325:23.447);
\fill [black] (13.95,-18.12) -- (14.52,-18.88) -- (14.88,-17.95);
\draw (22.2,-22.15) node [above] {$0$};
\end{tikzpicture}
\end{center}
\end{ex}
To define $L(A)$, we extend $\delta: Q \times \Sigma \to Q$, to: $\hat{\delta}: Q \times \Sigma^\star \to Q$. Now, we define the language accepted by DFA $A$ as:
$$
L(A) = \set{ w : \hat{\delta}(q_0, w) \in F }
$$
\textbf{Claim}: For this automaton, $L(A) = \set{ w : \#_0(w) \text{ is even, and } \#_1(w) \text{ is odd}}$. But how do we prove this?
\\ \\
Let's rephrase our claim. For every $w \in \set{0, 1}^\star$, if $\#_0(w)$ is even and the $\#_1(w)$ is odd, then $A$ accepts $w$, and otherwise, $A$ does not accept $w$.
\\ \\
Since we said ``for every $w \in \set{0, 1}^\star$'', that should be a give away that we should use structural induction for this proof.
\\ \\
Let's define $\set{0, 1}^\star$ inductively. Note that we are \emph{not} yet defining the set of accepted members for the DFA, but instead, we are aiming to inductively define the set of all possible words across the alphabet $\Sigma = \set{0, 1}$.
\\ \\
We have our core set $A = \set{ \epsilon }$ (not to be confused with the $A$ we're using to denote the DFA). We also have two operations for any string $\sigma$,
$$
P = \bigg\{ \text{\AxiomC{$\sigma$} \UnaryInfC{$\sigma 1$} \DisplayProof, \AxiomC{$\sigma$} \UnaryInfC{$\sigma 0$} \DisplayProof} \bigg\}
$$
Rather than proving our claim directly, it will be easier to prove a stronger claim, which characterizes every state of the machine. This stronger claim (four claims, in fact) will be easier to prove because we'll have more to work with in our induction hypothesis.
\\ \\
For every $w \in \set{0, 1}^\star$:
\begin{itemize}
\item If $\#_0(w)$ is even and $\#_1(w)$ is even, then $\hat{\delta}(q_0, w) = q_0$.
\item If $\#_0(w)$ is even and $\#_1(w)$ is odd, then $\hat{\delta}(q_0, w) = q_1$.
\item If $\#_0(w)$ is odd and $\#_1(w)$ is odd, then $\hat{\delta}(q_0, w) = q_2$.
\item If $\#_0(w)$ is odd and $\#_1(w)$ is even, then $\hat{\delta}(q_0, w) = q_3$.
\end{itemize}
\begin{proof}
We will prove the four claims using structural induction on $\set{0, 1}^\star$.
\\ \\
\textbf{Base case}: $w = \epsilon$. It's easy to see that $\#_0(w)$ is even, $\#_1(w)$ is even, and indeed, $\hat{\delta}(q_0, \epsilon) = q_0$.
\\ \\
\textbf{Induction step}: assume the four claims hold for $w$. We need to show they hold for $w_0$ and $w_1$.
\\ \\
\underline{The operation $w_0$}: if $\#_0(w_0)$ is even and $\#_1(w_0)$ is even, then $\#_0(w)$ is odd and $\#_1(w)$ is even. By the induction hypothesis, $\hat{\delta}({q_0, w}) = q_3$.
\\ \\
By the definition of $\hat{\delta}$, $\hat{\delta}(q_0, w_0) = \delta( \hat{\delta}(q_0, w), 0) = \delta(q_3, 0) = q_0$.
\\ \\
In order to finish this proof, we still need to prove the other three cases for $w_0$ and all four cases for $w_1$.
\end{proof}
Every task carried out by a DFA $A$ has the general form: ``given some input string $w$, decide if $w \in L(A)$''. In particular, we can say each $A$ handles the decision problem of the language $L(A)$. DFAs perform this task by consuming one character of $w$ at a time.
\\ \\
Note that not every language can be represented by a DFA. The following languages can all be handled by a DFA:
\begin{itemize}
\item $\set{ \epsilon }$
\item $\Sigma^\star$
\item For every $a \in \Sigma, \set{a}$
\item For every $w \in \Sigma^\star, \set{w}$
\end{itemize}
\subsubsection{Closure Operations}
Is the family of languages that can be computed by a DFA closed under $\cup, \cap$, set difference, concatenation, $L^\star$, etc?
\\ \\
\textbf{Claim}: if each of $L_1, L_2$ can be computed by some DFA, then so can $L_1 \cap L_2, L_1 \cup L_2$, and $L_1 \backslash L_2$.
\begin{proof}
Let $A_1$ and $A_2$ be DFAs such that $L(A_1) = L_1$ and $L(A_2) = L_2$. Let's construct some DFA $A_3$ such that $L(A_3) = L_1 \cap L_2$.
\\ \\
The general idea is to run the two machines, $A_1$ and $A_2$, in parallel, and if $w \in F$ in both machines, accept $w$ for $L_1 \cap L_2$.
\\ \\
Given any two automata $A_1 = (\Sigma, Q_1, q_0^1, \delta_1, F_1)$ and $A_2 = (\Sigma, Q_2, q_0^2, \delta_2, F_2)$, we define the \textbf{product automaton} as
$$
A_1 \times A_2 = (\Sigma, Q_1 \times Q_2, (q_0^1, q_0^2), \delta_{A_1 \times A_2}, F_{A_1 \times A_2})
$$
where $\Sigma$ is the same alphabet as defined previously, $Q_1 \times Q_2 = \set{ (p, q) : p \in Q_1, q \in Q_2 }$, and $\delta_{A_1 \times A_2}((p, q), a) = (\delta_1(p, a), \delta_2(q, a))$.
\\ \\
$F$ will be defined differently depending on which combination of $L_1$ and $L_2$ we wish to compute. For example:
\begin{itemize}
\item $F_{A_1 \cap A_2} = \set{ (p, q) : p \in F_1 \text{ and } q \in F_2 }$
\item $F_{A_1 \cup A_2} = \set{ (p, q) : p \in F_1 \text{ or } q \in F_2 }$
\item $F_{A_1 \backslash A_2} = \set{ (p, q) : p \in F_1 \text{ and } q \not \in F_2 }$
\end{itemize}
\textbf{Conclusion}: if $L_1$ and $L_2$ both have DFAs, then so do $L_1 \cap L_2, L_1 \cup L_2, L_1 \backslash L_2, L_1 \Delta L_2$, etc. (any boolean operation on sets). However, note that the product automaton does not tell us if a language is closed under $L^\star$ or concatenation.
\\ \\
Is it true that whenever $L$ has a DFA, then so does its complement, $L^C = \Sigma^\star \backslash L$? Yes, its DFA is $A_{\Sigma^\star \times L}$. Let $A = (\Sigma, Q, q_0, \delta, F)$ be a DFA such that $L(A) = L$, then let $A^C = (\Sigma, Q, q_0, \delta, Q \backslash F)$. Then, $L(A^C) = L^C$.
\\ \\
Since in our definition of a DFA we required that for every state $q \in Q$ and every letter $a \in \Sigma$, $q$ has some outgoing edge labeled by $a$ (we are never stuck).
\end{proof}
\subsubsection{The Relationship Between Regular Expressions and DFAs} \lecture{September 26, 2013}
\textbf{Claim}: for every regular expression $r$, there exists some DFA $A$ such that $L(r) = L(A)$.
\begin{proof}
We will prove this claim by structural induction on the set of regular expressions.
\\ \\
\textbf{Base case}: the elements of the core set of regular expressions are $\set{ \epsilon, \underbrace{a, b, \ldots}_{\Sigma}, \emptyset}$. For each of these, we should show that there exists some DFA $A$ such that $L(A) = L(r)$.
\begin{itemize}
\item $L(\epsilon) = \set{ \epsilon }$. We have seen such an $A$.
\item $L(\emptyset) = \emptyset$. We have seen such an $A$.
\item For every $a \in \Sigma, L(a) = \set{a}$ and there exists some DFA $A$ such that $L(A) = \set{a}$.
\end{itemize}
\textbf{Inductive step}: we must show that under each operation the claim continues to hold. Recall the operations we defined on regular expressions to be
$$
P = \bigg\{ \text{\AxiomC{$r_1$} \AxiomC{$r_2$} \BinaryInfC{$r_1 + r_2$} \DisplayProof, \AxiomC{$r_1$} \AxiomC{$r_2$} \BinaryInfC{$r_1 r_2$} \DisplayProof, \AxiomC{$r$} \UnaryInfC{$r^\star$} \DisplayProof, \AxiomC{$r$} \UnaryInfC{$(r)$} \DisplayProof} \bigg\}
$$
\begin{enumerate}
\item \underline{Consider the + operation}. Assume there are automata $A_1, A_2$ such that $L(r_1) = L(A_1)$ and $L(r_2) = L(A_2)$. We need to show that there exists some $A_3$ such that $L(A_3) = L(r_1 + r_2)$.
\\ \\
Recall that $L(r_1 + r_2) = L(r_1) \cup L(r_2)$. We saw that given $A_1, A_2$ we can construct $A_3$ such that $L(A_3) = L(A_1) \cup L(A_2)$.
\item \underline{Consider the concatenation operation}. Assume $L(A_1) = L(r_1)$ and $L(A_2) = L(r_2)$. We need to define $A_3$ such that $L(A_3) = L(r_1 r_2) = L(A_1) L(A_2)$. Note that
$$
L(A_1) L(A_2) = \set{ w w' : w \in L(A_1), w' \in L(A_2) }
$$
One idea on how to approach this is to take DFAs $A_1$ and $A_2$ and merge each final state in $A_1$ with the initial state of $A_2$ (and make the states no longer final states). However, this approach will not suffice because we could then have multiple edges for the same letter, leading out of those newly-merged states.
\\ \\
In order to solve this problem, we need to take a detour to define a new family of machines. We'll return to this proof at a later point in time. \qedhere
\end{enumerate}
\end{proof}
\subsection{Non-deterministic Finite Automata (NFAs)}
NFAs are similar to deterministic finite automata, except they allow more edges on a state than there are letters in $\Sigma$. You can create multiple paths to be followed simultaneously.
\\ \\
We define an NFA $N$ as $N = (\Sigma, Q, q_0, \delta, F)$ where $\Sigma, Q, q_0$, and $F$ are exactly the same as in DFAs. However, for NFAs, we define $\Sigma: Q \times \Sigma \to \set{ B : B \subseteq Q }$.
\begin{ex}
Will this machine (an NFA) accept $a$?
\begin{center}
\begin{tikzpicture}[scale=0.2]
\tikzstyle{every node}+=[inner sep=0pt]
\draw [black] (10.5,-16.9) circle (3);
\draw (10.5,-16.9) node {$q_0$};
\draw [black] (24.6,-33) circle (3);
\draw (24.6,-33) node {$q_1$};
\draw [black] (39.3,-14.5) circle (3);
\draw (39.3,-14.5) node {$q_2$};
\draw [black] (39.3,-14.5) circle (2.4);
\draw [black] (13.278,-15.77) arc (110.16418:79.3631:43.633);
\fill [black] (36.37,-13.85) -- (35.68,-13.21) -- (35.49,-14.19);
\draw (24.57,-12.68) node [above] {$a$};
\draw [black] (9.177,-14.22) arc (234:-54:2.25);
\draw (10.5,-9.65) node [above] {$b$};
\fill [black] (11.82,-14.22) -- (12.7,-13.87) -- (11.89,-13.28);
\draw [black] (13.255,-18.082) arc (62.8233:19.59892:21.711);
\fill [black] (23.79,-30.11) -- (23.99,-29.19) -- (23.05,-29.53);
\draw (20.21,-21.64) node [right] {$a$};
\draw [black] (21.791,-31.955) arc (-114.74396:-162.83382:19.791);
\fill [black] (11.17,-19.82) -- (10.92,-20.73) -- (11.88,-20.44);
\draw (14.64,-28.47) node [left] {$b$};
\draw [black] (39.091,-17.489) arc (-8.58948:-68.35145:18.755);
\fill [black] (27.46,-32.12) -- (28.39,-32.29) -- (28.02,-31.36);
\draw (35.79,-27.78) node [right] {$b$};
\end{tikzpicture}
\end{center}
Whether this machine will accept $a$ is largely up to how we define acceptance. Generally, NFAs are defined so if there is \emph{any} valid path that leads to a final state, the string will be accepted. So, under that definition, $a$ would be accepted by this NFA.
\end{ex}
\subsubsection{Languages Accepted by an NFA}
Let's define the language $L(N)$ that is accepted by an NFA $N$. Intuitively, we will accept a string $w$ if and only if there is a path from $q_0$ following the letters of $w$ and ending inside $F$.
\\ \\
We will extend our formal definition of $\delta$ to $\hat{\delta}: Q \times \Sigma^\star \to \set{ B \subseteq Q }$. $\hat{\delta}(q, w)$ is the set of all states that can be reached from $q$ after reading $w$.
\\ \\
For $w = \epsilon$, $\hat{\delta}(q, \epsilon) = \set{q}$. Assume that $\hat{\delta}(q, w)$ is already defined; say $\hat{\delta}(q, w) = \set{q_{i_1}, q_{i_2}, \ldots, q_{i_k}} \subseteq Q$. Now, define $\hat{\delta}(q, wa)$ for any $a \in \Sigma$ as
$$
\hat{\delta}(q, wa) = \bigcup_{l = 1}^{k} \delta(q_{i_l}, a)
$$
We can then define the language accepted by a particular NFA $N$ as
$$
L(N) = \set{ w : \hat{\delta}(q_0, w) \cap F \ne \emptyset }
$$
Phrased another way, this set contains all words $w$ which, after reading all characters in $w$, ended up in \emph{at least} one accepting state.
\subsubsection{Uses of NFAs}
NFAs sometimes give us a very compact representation of a language that would otherwise be complicated to describe as a DFA.
\begin{ex}
NFAs could be used for keyword searches. For example, suppose we want to create an NFA that matches the keywords ``miley'' and ``britney'', and nothing else. We can form this NFA with the alphabet $\Sigma = {a, b, \ldots, z}$:
\begin{center}
\begin{tikzpicture}[scale=0.2]
\tikzstyle{every node}+=[inner sep=0pt]
\draw [black] (10.5,-16.9) circle (3);
\draw (10.5,-16.9) node {$q_0$};
\draw [black] (21.4,-17.3) circle (3);
\draw (21.4,-17.3) node {$q_1$};
\draw [black] (32.8,-17.3) circle (3);
\draw (32.8,-17.3) node {$q_2$};
\draw [black] (43.7,-17.3) circle (3);
\draw (43.7,-17.3) node {$q_3$};
\draw [black] (55,-17.3) circle (3);
\draw (55,-17.3) node {$q_4$};
\draw [black] (66,-17.3) circle (3);
\draw (66,-17.3) node {$q_5$};
\draw [black] (66,-17.3) circle (2.4);
\draw [black] (21.4,-27.7) circle (3);
\draw (21.4,-27.7) node {$q_6$};
\draw [black] (30.3,-27.7) circle (3);
\draw (30.3,-27.7) node {$q_7$};
\draw [black] (38.9,-27.7) circle (3);
\draw (38.9,-27.7) node {$q_8$};
\draw [black] (46.9,-27.7) circle (3);
\draw (46.9,-27.7) node {$q_9$};
\draw [black] (56.4,-27.7) circle (3);
\draw (56.4,-27.7) node {$q_{10}$};
\draw [black] (64.8,-27.7) circle (3);
\draw (64.8,-27.7) node {$q_{11}$};
\draw [black] (73.4,-27.7) circle (3);
\draw (73.4,-27.7) node {$q_{12}$};
\draw [black] (73.4,-27.7) circle (2.4);
\draw [black] (9.177,-14.22) arc (234:-54:2.25);
\draw (10.5,-9.65) node [above] {$a,\mbox{ }b,\mbox{ }\ldots,\mbox{ }z$};
\fill [black] (11.82,-14.22) -- (12.7,-13.87) -- (11.89,-13.28);
\draw [black] (13.247,-15.724) arc (104.33527:71.46142:9.724);
\fill [black] (18.75,-15.93) -- (18.15,-15.2) -- (17.83,-16.15);
\draw (16.05,-14.88) node [above] {$m$};
\draw [black] (24.091,-15.998) arc (107.31712:72.68288:10.11);
\fill [black] (30.11,-16) -- (29.49,-15.28) -- (29.2,-16.24);
\draw (27.1,-15.04) node [above] {$i$};
\draw [black] (35.544,-16.114) arc (105.09913:74.90087:10.387);
\fill [black] (40.96,-16.11) -- (40.31,-15.42) -- (40.05,-16.39);
\draw (38.25,-15.26) node [above] {$l$};
\draw [black] (46.433,-16.086) arc (105.88913:74.11087:10.656);
\fill [black] (52.27,-16.09) -- (51.63,-15.39) -- (51.36,-16.35);
\draw (49.35,-15.18) node [above] {$e$};
\draw [black] (57.63,-15.886) arc (108.67821:71.32179:8.962);
\fill [black] (63.37,-15.89) -- (62.77,-15.16) -- (62.45,-16.1);
\draw (60.5,-14.91) node [above] {$y$};
\draw [black] (18.453,-28.172) arc (-90.95847:-178.51347:8.544);
\fill [black] (18.45,-28.17) -- (17.66,-27.66) -- (17.64,-28.66);
\draw (11.56,-26.18) node [below] {$b$};
\draw [black] (23.705,-25.839) arc (113.15404:66.84596:5.455);
\fill [black] (27.99,-25.84) -- (27.46,-25.07) -- (27.06,-25.98);
\draw (25.85,-24.9) node [above] {$r$};
\draw [black] (32.633,-25.878) arc (111.77675:68.22325:5.302);
\fill [black] (36.57,-25.88) -- (36.01,-25.12) -- (35.64,-26.05);
\draw (34.6,-25) node [above] {$i$};
\draw [black] (41.4,-26.114) arc (106.30483:73.69517:5.343);
\fill [black] (44.4,-26.11) -- (43.77,-25.41) -- (43.49,-26.37);
\draw (42.9,-25.4) node [above] {$t$};
\draw [black] (49.53,-26.299) arc (106.51296:73.48704:7.458);
\fill [black] (53.77,-26.3) -- (53.14,-25.59) -- (52.86,-26.55);
\draw (51.65,-25.49) node [above] {$n$};
\draw [black] (58.643,-25.777) arc (113.25672:66.74328:4.956);
\fill [black] (62.56,-25.78) -- (62.02,-25) -- (61.62,-25.92);
\draw (60.6,-24.87) node [above] {$e$};
\draw [black] (67.387,-26.237) arc (105.81687:74.18313:6.286);
\fill [black] (70.81,-26.24) -- (70.18,-25.54) -- (69.91,-26.5);
\draw (69.1,-25.5) node [above] {$y$};
\end{tikzpicture}
\end{center}
\end{ex}
\begin{ex}
Suppose we want to define an NFA for the language $L_n$ = \{$w$ : the $n$th-before-last letter in $w$ is 1\} on the alphabet $\Sigma = \set{0, 1}$. We could form the following NFA to represent this:
\begin{center}
\begin{tikzpicture}[scale=0.2]
\tikzstyle{every node}+=[inner sep=0pt]
\draw [black] (12.3,-19.3) circle (3);
\draw (12.3,-19.3) node {$q_0$};
\draw [black] (23.5,-19.3) circle (3);
\draw (23.5,-19.3) node {$q_1$};
\draw [black] (34.8,-19.3) circle (3);
\draw (34.8,-19.3) node {$q_2$};
\draw (48.2,-19.3) node {$\ldots$};
\draw [black] (61.6,-19.3) circle (3);
\draw (61.6,-19.3) node {$q_n$};
\draw [black] (61.6,-19.3) circle (2.4);
\draw [black] (5.9,-19.3) -- (9.3,-19.3);
\fill [black] (9.3,-19.3) -- (8.5,-18.8) -- (8.5,-19.8);
\draw [black] (10.977,-16.62) arc (234:-54:2.25);
\draw (12.3,-12.05) node [above] {$0,\mbox{ }1$};
\fill [black] (13.62,-16.62) -- (14.5,-16.27) -- (13.69,-15.68);
\draw [black] (51.095,-18.525) arc (100.7586:79.2414:20.381);
\fill [black] (58.7,-18.53) -- (58.01,-17.88) -- (57.83,-18.87);
\draw (54.9,-17.67) node [above] {$0,\mbox{ }1$};
\draw [black] (37.713,-18.592) arc (99.80078:80.19922:22.248);
\fill [black] (45.29,-18.59) -- (44.58,-17.96) -- (44.41,-18.95);
\draw (41.5,-17.77) node [above] {$0,\mbox{ }1$};
\draw [black] (26.412,-18.595) arc (98.80333:81.19667:17.888);
\fill [black] (31.89,-18.59) -- (31.17,-17.98) -- (31.02,-18.97);
\draw (29.15,-17.88) node [above] {$0,\mbox{ }1$};
\draw [black] (15.143,-18.362) arc (101.85128:78.14872:13.424);
\fill [black] (20.66,-18.36) -- (19.98,-17.71) -- (19.77,-18.69);
\draw (17.9,-17.58) node [above] {$1$};
\end{tikzpicture}
\end{center}
When using an NFA to represent languages like $L_n$, we need just $n + 1$ states.
\\ \\
\textbf{Claim}: every DFA that accepts $L_n$ must have at least $2^n$ states. Why? For every word $w$ of length less than $n$, there should be a different state. In other words, if $w_1 \ne w_2$ and $\hat{\delta}(q_0, w_1) = \hat{\delta}(q_0, w_2)$, the DFA is bound to fail. Since $w_1 \ne w_2$, the last ``1'' must be in a different position in $w_1$ and $w_2$, which makes this DFA representation fail.
\end{ex}
\textbf{Claim}: a language $L$ has an NFA $N$ such that $L(N) = L$ if and only if there is a DFA $A$ such that $L(A) = A$. In other words, nondeterminism does not increase the power of finite automata.
\begin{proof}
First, note that any DFA can also be viewed as an NFA. For every $q \in Q$ and $a \in \Sigma$, if $\delta_N(q, a) = p$, define $\delta_D(q, a) = \set{p}$. In other words, let $\delta_N(q, a) = \set{\delta_D(q, a)}$.
\\ \\
For the other direction, if we're given some NFA $N = (\Sigma^N, Q^N, q_0^N, \delta^N, F^N)$, we wish to define a DFA $A = (\Sigma^A, Q^A, q_0^A, \delta^A, F^A)$ such that $L(A) = L(N)$. In particular:
\begin{itemize}
\item $\Sigma^A = \Sigma^N$
\item $Q^A = \set{ B \subseteq Q^N }$
\item $q_0^A = \set{ q_0^N }$
\item $\delta^A(q, a) = $ the set of all states that can be reached from $q$ by reading $a$.
\item $F^A = \set{ B \subseteq Q^N : B \cap F^N \ne \emptyset }$
\end{itemize}
\end{proof}
\begin{ex} \lecture{October 1, 2013}
Let's look at an example of converting an NFA to an equivalent DFA. Let $N$ be:
\begin{center}
\begin{tikzpicture}[scale=0.2]
\tikzstyle{every node}+=[inner sep=0pt]
\draw [black] (11.4,-14.5) circle (3);
\draw (11.4,-14.5) node {$q_0$};
\draw [black] (26,-14.4) circle (3);
\draw (26,-14.4) node {$q_1$};
\draw [black] (40.5,-14.4) circle (3);
\draw (40.5,-14.4) node {$q_2$};
\draw [black] (40.5,-14.4) circle (2.4);
\draw [black] (3.8,-13.1) -- (8.45,-13.96);
\fill [black] (8.45,-13.96) -- (7.75,-13.32) -- (7.57,-14.3);
\draw [black] (12.723,-17.18) arc (54:-234:2.25);
\draw (11.4,-21.75) node [below] {$0,\mbox{ }1$};
\fill [black] (10.08,-17.18) -- (9.2,-17.53) -- (10.01,-18.12);
\draw [black] (14.4,-14.48) -- (23,-14.42);
\fill [black] (23,-14.42) -- (22.2,-13.93) -- (22.2,-14.93);
\draw (18.7,-13.94) node [above] {$1$};
\draw [black] (29,-14.4) -- (37.5,-14.4);
\fill [black] (37.5,-14.4) -- (36.7,-13.9) -- (36.7,-14.9);
\draw (33.25,-13.9) node [above] {$0,\mbox{ }1$};
\end{tikzpicture}
\end{center}
$N$ accepts all strings where the second-to-last letter is 1. Note that $N$ is nondeterministic because $q_0$ has two edges for 1, and $q_2$ has no edges. By way of subset construction, we can construct an equivalent DFA:
\begin{center}
\begin{tikzpicture}[scale=0.2]
\tikzstyle{every node}+=[inner sep=0pt]
\draw [black] (11.4,-14.5) circle (3);
\draw (11.4,-14.5) node {$q_0$};
\draw [black] (26,-14.4) circle (3);
\draw (26,-14.4) node {$q_1^D$};
\draw [black] (40.5,-14.4) circle (3);
\draw (40.5,-14.4) node {$q_2^D$};
\draw [black] (40.5,-14.4) circle (2.4);
\draw [black] (40.5,-25.9) circle (3);
\draw (40.5,-25.9) node {$q_3^D$};
\draw [black] (40.5,-25.9) circle (2.4);
\draw [black] (3.8,-13.1) -- (8.45,-13.96);
\fill [black] (8.45,-13.96) -- (7.75,-13.32) -- (7.57,-14.3);
\draw [black] (12.723,-17.18) arc (54:-234:2.25);
\draw (11.4,-21.75) node [below] {$0$};
\fill [black] (10.08,-17.18) -- (9.2,-17.53) -- (10.01,-18.12);
\draw [black] (14.4,-14.48) -- (23,-14.42);
\fill [black] (23,-14.42) -- (22.2,-13.93) -- (22.2,-14.93);
\draw (18.7,-13.94) node [above] {$1$};
\draw [black] (37.875,-15.838) arc (-68.18865:-111.81135:12.449);
\fill [black] (37.88,-15.84) -- (36.95,-15.67) -- (37.32,-16.6);
\draw (33.25,-17.23) node [below] {$0$};
\draw [black] (28.785,-13.296) arc (106.2451:73.7549:15.962);
\fill [black] (28.78,-13.3) -- (29.69,-13.55) -- (29.41,-12.59);
\draw (33.25,-12.16) node [above] {$1$};
\draw [black] (13.144,-12.064) arc (139.32878:41.06501:16.924);
\fill [black] (13.14,-12.06) -- (14.04,-11.78) -- (13.29,-11.13);
\draw (25.92,-5.67) node [above] {$0$};
\draw [black] (41.823,-28.58) arc (54:-234:2.25);
\draw (40.5,-33.15) node [below] {$1$};
\fill [black] (39.18,-28.58) -- (38.3,-28.93) -- (39.11,-29.52);
\draw [black] (37.539,-26.328) arc (-89.29985:-167.53627:11.412);
\fill [black] (37.54,-26.33) -- (36.73,-25.84) -- (36.75,-26.84);
\draw (29.3,-24.35) node [below] {$1$};
\draw [black] (40.5,-22.9) -- (40.5,-17.4);
\fill [black] (40.5,-17.4) -- (40,-18.2) -- (41,-18.2);
\draw (41,-20.15) node [right] {$0$};
\end{tikzpicture}
\end{center}
where $q_0^D = \set{q_0}, q_1^D = \set{q_0, q_1}, q_2^D = \set{q_0, q_2}, q_3^D = \set{q_0, q_1, q_2}$.
\\ \\
The accepting states of this DFA are all states which contain at least one accepting state from the NFA.
\end{ex}
Note that in the worst case, an NFA with $n$ states could be equivalent to a DFA with $2^n$ states. This intuitively follows from the way we construct NFAs -- there are $2^n$ subsets of $n$ items.
\subsection{Nondeterministic Finite Automata with Epsilon Transitions}
We will now introduce a third type of machine, \textbf{the $\boldsymbol{\epsilon}$-NFA}. An $\epsilon$-transition is essentially a transition that can occur spontaneously. That is, an $\epsilon$-transition is a transition that can be followed without reading any characters.
\begin{ex}
This is what an $\epsilon$-NFA could look like:
\begin{center}
\begin{tikzpicture}[scale=0.2]
\tikzstyle{every node}+=[inner sep=0pt]
\draw [black] (11.4,-14.5) circle (3);
\draw (11.4,-14.5) node {$q_0$};
\draw [black] (29,-14.5) circle (3);
\draw (29,-14.5) node {$q_1$};
\draw [black] (43.8,-7.3) circle (3);
\draw [black] (43.8,-18.8) circle (3);
\draw [black] (43.8,-30.3) circle (3);
\draw [black] (29,-26.6) circle (3);
\draw (29,-26.6) node {$q_2$};
\draw [black] (3.8,-13.1) -- (8.45,-13.96);
\fill [black] (8.45,-13.96) -- (7.75,-13.32) -- (7.57,-14.3);
\draw [black] (10.077,-11.82) arc (234:-54:2.25);
\draw (11.4,-7.25) node [above] {$b$};
\fill [black] (12.72,-11.82) -- (13.6,-11.47) -- (12.79,-10.88);
\draw [black] (14.4,-14.5) -- (26,-14.5);
\fill [black] (26,-14.5) -- (25.2,-14) -- (25.2,-15);
\draw (20.2,-15) node [below] {$a,\mbox{ }\epsilon$};
\draw [black] (31.7,-13.19) -- (41.1,-8.61);
\fill [black] (41.1,-8.61) -- (40.16,-8.51) -- (40.6,-9.41);
\draw (37.39,-11.41) node [below] {$b$};
\draw [black] (31.88,-15.34) -- (40.92,-17.96);
\fill [black] (40.92,-17.96) -- (40.29,-17.26) -- (40.01,-18.22);
\draw (35.58,-17.2) node [below] {$b$};
\draw [black] (41.308,-28.631) arc (-125.87866:-147.86483:41.443);
\fill [black] (41.31,-28.63) -- (40.95,-27.76) -- (40.37,-28.57);
\draw (34.82,-24.85) node [left] {$b$};
\draw [black] (13.87,-16.2) -- (26.53,-24.9);
\fill [black] (26.53,-24.9) -- (26.15,-24.04) -- (25.59,-24.86);
\draw (19.25,-21.05) node [below] {$a$};
\draw [black] (40.805,-30.418) arc (-91.63068:-116.44181:22.112);
\fill [black] (40.8,-30.42) -- (40.02,-29.9) -- (39.99,-30.9);
\draw (35.39,-30.33) node [below] {$\epsilon$};
\end{tikzpicture}
\end{center}
\end{ex}
Let's consider a more simplistic example to understand $\epsilon$-NFAs better.
\begin{ex}
Consider $N_1$:
\begin{center}
\begin{tikzpicture}[scale=0.2]
\tikzstyle{every node}+=[inner sep=0pt]
\draw [black] (11.4,-14.5) circle (3);
\draw (11.4,-14.5) node {$q_0$};
\draw [black] (29,-14.5) circle (3);
\draw (29,-14.5) node {$q_1$};
\draw [black] (29,-14.5) circle (2.4);
\draw [black] (3.8,-13.1) -- (8.45,-13.96);
\fill [black] (8.45,-13.96) -- (7.75,-13.32) -- (7.57,-14.3);
\draw [black] (14.4,-14.5) -- (26,-14.5);
\fill [black] (26,-14.5) -- (25.2,-14) -- (25.2,-15);
\draw (20.2,-15) node [below] {$1$};
\end{tikzpicture}
\end{center}
It's trivial to see that $L(N_1) = \set{1}$. Now, consider $N_2$, the same NFA but adding an $\epsilon$-transition:
\begin{center}
\begin{tikzpicture}[scale=0.2]
\tikzstyle{every node}+=[inner sep=0pt]
\draw [black] (11.4,-14.5) circle (3);
\draw (11.4,-14.5) node {$q_0$};
\draw [black] (29,-14.5) circle (3);
\draw (29,-14.5) node {$q_1$};
\draw [black] (29,-14.5) circle (2.4);
\draw [black] (3.8,-13.1) -- (8.45,-13.96);
\fill [black] (8.45,-13.96) -- (7.75,-13.32) -- (7.57,-14.3);
\draw [black] (14.4,-14.5) -- (26,-14.5);