Core 139
Fall, 2004
Algorithms and Complexity:
Why Encryption Systems Work, Part 1
Introduction
In order to understand why modern encryption systems
work, it is necessary to understand some basic conceptsof computer science
involving algorithms and their complexity. These concepts are not
difficult, just clever counting really. Full analysis of some complex
algorithms can, however, get quite difficult. Our objective will
be to understand the following concepts:
-
What is an algorithm?
-
What does it mean to relate the time needed to carry out an algorithm to
the size of the problem to which it is applied -- the complexity relationship?
-
What are "hard" problems in a computational sense?
-
What is a "trapdoor" algorithm and how is it used for encryption?
Algorithms
An algorithm is simply a systematic method for solving
a particular type of problem. It is like a recipe, directions for
cooking a dish which may be varied to cook more or less. It is like
a knitting pattern, directions for knitting a sweater or other garmet
which may be varied to produce different sizes.
The key aspect of an algorithm is that it is precise,
more precise than some recipes (how precise is "a pinch of salt," "season
to taste," or "cook until tender"?), much like a knitting pattern -- "knit
one, pearl two." If you are given an algorithm for a problem and
an instance of that problem, then you need only follow the steps of the
algorithm in order to solve the given instance of the problem.
The analogies between algorithms and recipes and
knitting patterns can be carried further. Just as some recipes are
very simple -- how hard is it to boil an egg -- and some knitting patterns
are quite easy -- knitting a pair of thumbless mittens, for example, some
algorithms are almost too obvious to be worth analyzing -- how hard is
it to describe how to add up a list of N numbers? On the other hand,
some recipes are very complex -- Chicken Kiev Cordon Bleu -- and did you
ever knit an Irish sweater? Some algorithms are also complex to understand;
there is one called Fast Fourier Transforms (FFT) which I have coded, but
I still have difficulty understanding it. The FFT algorithm is important
because it has applications in many areas, such as signal processing and
although it is complex, it is very fast. We will start by looking
at a very simple algorithm and then consider some slightly more complex
algorithms.
Summing a list of numbers
Consider the problem of adding up a list of N numbers.
Here is a simple algorithm:
-
Set the sum to zero.
-
For each number in the list, add that number to the sum.
This algorithm shows a typical characteristic: When you describe
an algorithm for a problem type for which there can be differing amounts
of data (the number of items in the list to be added), the steps of the
algorithm often include some repetition. Here the repetition is the
fact that the step "add that number to the sum" is repeated once for each
item in the list.
Sorting a list of names
A classic algorithm is a method for sorting a list of
N names. In fact, there are many sorting algorithms of which we will
consider two. The first is called selection sort:
-
Start with the whole list, and continue with each unsorted list which is
one item shorter after the smallest item has been moved to the front.
-
For the unsorted list, go through it one item at a time, keeping track
of which item is the first in alphabetical order (the "smallest" name).
-
After completing a pass through the list, switch the smallest name with
the first name.The next unsorted list includes only those items after the
one just switched to the front.
-
When the unsorted list is reduced to one item, the whole list has been
placed in sorted order.
Selection sort in effect works by picking the smallest element in
the list and putting it at the front, then picking the smallest of the
rest and putting it next, etc., until all the items have been placed.
This sorting algorithm can be used on a list of any things for which an
order relationship is described, numbers, names, etc.
Bin sort
A different sorting algorithm can be applied to numbers
or names with a fixed number of digits or letters. The idea of this
algorithm, called radixsort, is to sort the items based on the right-most
(least significant) digit into ten bins for digits (it would be similar
with names with 26 bins for letters). After the first separation
into bins, combine the items in the order of the bins, then sort again
using the same procedure on the second digit, then on the third digit,
etc. Here are the steps:
-
Carry out the following steps first for the right-most digit in the numbers,
then the next digit to the left, and so on to the left-most digit:
-
For each number, place it into the bin according to the digit being checked
on this round.
-
In the next round, sort first the items in the digit "0" bin, then the
items in the "1" bin, on up to the items in the "9" bin.
This sort takes advantage of the fact that we know the numbers we are dealing
with have a limited number of digits. Consequently the number of
times we need to pass through the list is dependent on the number of digits,
not on the number of items in the list, a little different from the selection
sort algorithm.
Factoring an integer
A problem that is important in modern encryption is
factoring a number into its prime factors. We will describe a simple
algorithm for factoring; there are several much more sophisticated ways
to factor numbers. Our algorithm is based on the fact that we can
just try the possible factors one after the other and divide them into
the target number, until the target number is reduced to one. Here
are the steps:
-
First keep dividing the number by 2, counting how many times this works
until 2 will no longer go into the target number evenly. The count
is the number of times the factor 2 occurs in the number.
-
For every odd number starting with 3 and counting up (by twos), carry out
the same procedure, dividing the target by the odd number until it will
no longer divide evenly, counting the number of divides. The count
is the number of times that odd number is a factor.
-
Continue step 2 for successive odd umbers until the target is reduced to
1. You now have a list of the prime factors of the target number
and the "multiplicity" of the factors (the number of times each occurs
as a factor).
This algorithm is inefficient, because many of the numbers we try dividing
by cannot possibly work, because they are not prime. For example,
when we try dividing by 15, it will go into the target number zero times,
that is, it is not a factor. This is because we already divided the
number by 3 and 5 as many times as possible. However, the process
of determining primes as we check out numbers itself takes time.
We can speed up the algorithm by using our knowledge of some of the early
primes, but if we factor large numbers, we will eventually get into territory
where we do not know the primes. Here is an example of the process
of factoring. Suppose our target number is 4840.
-
Divide by 2 as many times as possible, getting 2420, then 1210, then 605.
So 2 is a prime factor three times.
-
For each odd number, do the same process:
-
for 3, we find it does not go evenly, not a factor;
-
for 5, we divide once and get 121,which is not divisible again by 5,
so 5 is a prime factor once;
-
for 7, we cannot divide 121, not a factor;
-
for 9, we cannot divide 121, not a factor;
-
for 11, we divide once and get 11, and we divide again and get 1, so 11
is a factor twice.
-
Since out target number has been reduced to 1, we are done.
The factors of 4840 are 2 (three times), 5(once), 11 (twice). That
is 4840 = (23)(5)(112).
The above example give an idea of what it means to provide an algorithm:
a complete, step-by-step, method for solving a particular type of problem.
Next we need to consider how long different algorithms take.