Algorithms, Week 1: Analysis of Algorithms
Introduction
Cast of characters.
- Programmer needs to develop a working solution.
- Client wants to solve problem efficiently.
- Theoretician wants to understand.
- Basic blocking and tackling is sometimes necessary.
- Student might play any or all of these roles someday.
Running time.
- Analytical Engine: how many times do you have to turn the crank?
Reasons to analyze algorithms:
- Predict performance.
- Compare algorithms.
- Provide guarantees.
- Understand theoretical basis.
- Primary practical reason: avoid performance bugs.
Discrete Fourier transform.
- Break down waveform of samples into periodic components.
- Applications: DVD, JPEG, MRI, astrophysics, ….
- Brute force: N^2 steps.
- FFT algorithm: N log N steps, enables new technology.
N-body simulation.
- Simulate gravitational interactions among bodies.
- Brute force: N^2 steps.
- Barnes-Hut algorithm: N log N steps, enables new research.
The challenge.
Q. Will my program be able to solve a large practical input?
Insight. [Knuth 1970s] Use scientific method to understand performance.
A framework for predicting performance and comparing algorithms.
Scientific method.
- Observe some feature of the natural world.
- Hypothesize a model that is consistent with the observations.
- Predict events using the hypothesis.
- Verify the predictions by making further observations.
- Validate by repeating until the hypothesis and observations agree.
Huge problem for quick find.
- 10^9 union commands on 10^9 objects.
- Quick-find takes more than 10^18 operations.
- 30+ years of computer time!
Principles.
- Experiments must be reproducible.
- Hypotheses must be falsifiable.
Feature of the natural world. Computer itself.
Observations
Example: 3-SUM. Given N distinct integers, how many triples sum to exactly zero?
% more 8ints.txt
(Sample input file of 8 integers, together with a table of the triples a[i], a[j], a[k] whose sum is 0.)
Brute-force algorithm (ThreeSum). Check all triples of distinct indices with three nested for loops and count those whose entries sum to zero; a sketch follows.
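A minimal sketch of this brute-force count, assuming the integers are already in an array (the class and method names follow the ThreeSum example named above; the sample input in main is illustrative):

```java
public class ThreeSum {
    // Count the triples (i, j, k), with i < j < k, whose entries sum to 0.
    public static int count(int[] a) {
        int n = a.length;
        int count = 0;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                for (int k = j + 1; k < n; k++)
                    if (a[i] + a[j] + a[k] == 0)   // this check runs about N^3/6 times
                        count++;
        return count;
    }

    public static void main(String[] args) {
        int[] a = { 30, -40, -20, -10, 40, 0, 10, 5 };   // illustrative sample input
        System.out.println(count(a));                     // 4 zero-sum triples for this sample
    }
}
```

The inner if statement executes about N^3/6 times, which is what makes the brute-force algorithm cubic.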
Measuring the running time.
Q. How to time a program?
A. Manually, with a physical stopwatch.
Q. How to time a program?
A. Automatically, with a Stopwatch data type: Stopwatch() starts the clock, and double elapsedTime() returns the seconds elapsed since creation. A typical client (sketched below) brackets the computation between these two calls.
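A sketch of such a timing client, reusing the ThreeSum.count method sketched earlier and the Stopwatch, StdIn, and StdOut types from the course's algs4 library (the class name ThreeSumTimer is illustrative):

```java
import edu.princeton.cs.algs4.StdIn;
import edu.princeton.cs.algs4.StdOut;
import edu.princeton.cs.algs4.Stopwatch;

public class ThreeSumTimer {
    public static void main(String[] args) {
        int[] a = StdIn.readAllInts();          // read the integers from standard input
        Stopwatch stopwatch = new Stopwatch();  // start timing
        StdOut.println(ThreeSum.count(a));      // the computation being timed
        StdOut.println("elapsed time = " + stopwatch.elapsedTime() + " seconds");
    }
}
```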
Empirical analysis.
Run the program for various input sizes and measure running time.
(Table of running times, in seconds, measured for a range of input sizes N.)
Data analysis.
Standard plot. Plot running time T(N) vs. input size N.
Log-log plot. Plot running time T(N) vs. input size N using a log-log scale.
Regression. Fit a straight line through the data points: lg(T(N)) = b lg N + c, i.e., a power law T(N) = a N^b with a = 2^c and slope b.
Hypothesis. The running time is about 1.006 × 10^–10 × N^2.999 seconds.
Prediction and validation.
Predictions (from the hypothesis 1.006 × 10^–10 × N^2.999).
- 51.0 seconds for N = 8,000.
- 408.1 seconds for N = 16,000.
Observations. The measured running times at these sizes match the predictions, which validates the hypothesis!
Doubling hypothesis. Quick way to estimate b in a power-law relationship.
Run the program, doubling the size of the input each time.
The ratio T(2N) / T(N) of successive running times converges to a constant, 2^b.
Hypothesis. Running time is about a N^b with b = lg(ratio).
Caveat. Cannot identify logarithmic factors with the doubling hypothesis.
Q. How to estimate a (assuming we know b)?
A. Run the program (for a sufficiently large value of N) and solve for a = T(N) / N^b.
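A sketch of a doubling experiment along these lines, assuming the ThreeSum class sketched earlier; the class name and the choice of random input values are illustrative:

```java
import java.util.Random;

public class DoublingRatio {
    private static final Random random = new Random();

    // Time ThreeSum.count on n random integers; return elapsed seconds.
    private static double timeTrial(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++)
            a[i] = random.nextInt(2_000_000) - 1_000_000;
        long start = System.currentTimeMillis();
        ThreeSum.count(a);
        return (System.currentTimeMillis() - start) / 1000.0;
    }

    public static void main(String[] args) {
        double prev = timeTrial(125);
        for (int n = 250; true; n += n) {   // double the input size each round
            double time = timeTrial(n);
            double ratio = time / prev;     // approaches 2^b once times are measurably large
            System.out.printf("%8d %9.3f %6.2f%n", n, time, ratio);
            prev = time;
        }
    }
}
```

Once the measured times are large enough to be meaningful, lg of the printed ratio estimates the exponent b, and a follows by solving T(N) = a N^b for the largest N measured.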
System-independent effects. Determine both the exponent b and the constant a in the power law.
- Algorithm.
- Input data.
System-dependent effects. Determine the constant a (but not the exponent b) in the power law.
- Hardware: CPU, memory, cache, …
- Software: compiler, interpreter, garbage collector, …
- System: operating system, network, other apps, …
Bad news. Difficult to get precise measurements.
Good news. Much easier and cheaper than other sciences.
Mathematical Models
Total running time: sum of cost × frequency for all operations.
- Need to analyze program to determine set of operations.
- Cost depends on machine, compiler.
- Frequency depends on algorithm, input data.
In principle, accurate mathematical models are available.
Cost of basic operations.
operation | example | nanoseconds † |
---|---|---|
integer add | a + b | 2.1 |
integer multiply | a * b | 2.4 |
integer divide | a / b | 5.4 |
floating-point add | a + b | 4.6 |
floating-point multiply | a * b | 4.2 |
floating-point divide | a / b | 13.5 |
sine | Math.sin(theta) | 91.3 |
arctangent | Math.atan2(y, x) | 129.0 |
... | ... | ... |
operation | example | nanoseconds † |
---|---|---|
variable declaration | int a | c1 |
assignment statement | a = b | c2 |
integer compare | a < b | c3 |
array element access | a[i] | c4 |
array length | a.length | c5 |
1D array allocation | new int[N] | c6 N |
2D array allocation | new int[N][N] | c7 N² |
string length | s.length() | c8 |
substring extraction | s.substring(N/2, N) | c9 |
string concatenation | s + t | c10 N |
Novice mistake. Abusive string concatenation.
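A small illustration of why that mistake hurts (this example is not from the lecture): each String concatenation copies the whole string built so far, costing about c10 N per operation, so the loop below does roughly N²/2 character copies overall, while StringBuilder.append is amortized constant per character.

```java
public class ConcatCost {
    public static void main(String[] args) {
        int n = 50_000;

        // Quadratic: each += copies the entire string built so far.
        String s = "";
        for (int i = 0; i < n; i++)
            s += "x";

        // Linear: StringBuilder appends into a resizable buffer.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++)
            sb.append("x");
        String t = sb.toString();

        System.out.println(s.length() + " " + t.length());   // both print n
    }
}
```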
Example: 1-SUM.
Q. How many instructions does it use, as a function of input size N?
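A sketch of the 1-SUM computation being analyzed, taken here to mean "how many array entries are zero?"; the comments tally the operation frequencies as a function of N = a.length:

```java
public class OneSum {
    // Count the entries of a[] that are equal to zero; N = a.length.
    public static int count(int[] a) {
        int count = 0;                        // 1 declaration, 1 assignment
        for (int i = 0; i < a.length; i++)    // i: 1 declaration; N + 1 compares; N increments
            if (a[i] == 0)                    // N array accesses, N equality compares
                count++;                      // between 0 and N increments
        return count;
    }
}
```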
Proposition. [Weighted quick-union] The depth of any node x is at most lg N. (lg = base-2 logarithm)
Pf. When does the depth of x increase?
It increases by 1 when the tree T1 containing x is merged into another tree T2.
- The size of the tree containing x at least doubles, since |T2| ≥ |T1|.
- The size of the tree containing x can double at most lg N times. (Start with 1 node, double lg N times, and you get a tree with N nodes.)
algorithm | initialize | union | find |
---|---|---|---|
quick-find | N | N | 1 |
quick-union | N | N | N |
weighted QU | N | lg N | lg N |

(All entries are worst-case order of growth; union includes the cost of finding the roots.)
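For reference, a minimal sketch of weighted quick-union (the "weighted QU" row above), assuming the parent-link id[] representation used throughout this lecture plus a size array for union by size; the class name is illustrative:

```java
public class WeightedQuickUnion {
    private final int[] id;   // id[i] = parent of i; a root points to itself
    private final int[] sz;   // sz[i] = number of nodes in the tree rooted at i

    public WeightedQuickUnion(int n) {
        id = new int[n];
        sz = new int[n];
        for (int i = 0; i < n; i++) { id[i] = i; sz[i] = 1; }
    }

    private int root(int i) {
        while (i != id[i]) i = id[i];   // chase parent links up to the root
        return i;
    }

    public boolean connected(int p, int q) { return root(p) == root(q); }

    public void union(int p, int q) {
        int i = root(p), j = root(q);
        if (i == j) return;
        // link the root of the smaller tree to the root of the larger tree
        if (sz[i] < sz[j]) { id[i] = j; sz[j] += sz[i]; }
        else               { id[j] = i; sz[i] += sz[j]; }
    }
}
```

Because union always links the smaller tree below the larger one, the lg N depth bound from the proposition above applies to root(), connected(), and union().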
Q. Stop at guaranteed acceptable performance?
A. No, easy to improve further.
Improvement 2: path compression
Quick union with path compression. Just after computing the root of p, set the id[] of each examined node to point to that root.
Two-pass implementation: add a second loop to root() to set the id[] of each examined node to the root.
Simpler one-pass variant: Make every other node in path point to its grandparent (thereby halving path length).
private int root(int i) — sketched below.
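A sketch of root() with the one-pass path-halving variant, assuming the id[] parent-link array of the weighted quick-union sketch above; replacing root() there with this version gives weighted quick-union with path compression:

```java
private int root(int i) {
    while (i != id[i]) {
        id[i] = id[id[i]];   // path halving: point i to its grandparent
        i = id[i];           // move up, now skipping a level each time
    }
    return i;
}
```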
In practice. No reason not to! Keeps tree almost completely flat.
Proposition. [Hopcroft-Ulman, Tarjan] Starting from an empty data structure, any sequence of M union-find ops on N objects makes at most c (N + M lg* N) array accesses.
- Analysis can be improved to N + M α(M, N), where α is the inverse Ackermann function.
- Simple algorithm with fascinating mathematics.
lg* N is the iterated log function: the number of times lg must be applied to N before the result is at most 1 (e.g., lg* 65536 = 4).
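A tiny helper to make the iterated log concrete (illustrative, not from the lecture); it applies floor(lg n) repeatedly, which agrees with lg* on powers of two such as 2, 16, and 65536:

```java
public class LgStar {
    // Number of times floor(lg n) must be applied before the value is <= 1.
    public static int lgStar(long n) {
        int count = 0;
        while (n > 1) {
            n = 63 - Long.numberOfLeadingZeros(n);   // floor(lg n)
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(lgStar(1) + " " + lgStar(2) + " " + lgStar(16) + " " + lgStar(65536));
        // prints 0 1 3 4
    }
}
```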
Linear-time algorithm for M union-find ops on N objects?
- Cost within constant factor of reading in the data.
- In theory, WQUPC is not quite linear.
- In practice, WQUPC is linear.
Bottom line. Weighted quick-union (with path compression) makes it possible to solve problems that could not otherwise be addressed.
M union-find operations on a set of N objects

algorithm | worst-case time |
---|---|
quick-find | M N |
quick-union | M N |
weighted QU | N + M log N |
QU + path compression | N + M log N |
weighted QU + path compression | N + M lg* N |
Ex. [10^9 unions and finds with 10^9 objects]
- WQUPC reduces time from 30 years to 6 seconds.
- Supercomputer won't help much; good algorithm enables solution.
Applications
Percolation
A model for many physical systems:
- N-by-N grid of sites.
- Each site is open with probability p (or blocked with probability 1 − p).
- System percolates iff top and bottom are connected by open sites.
Likelihood of percolation. Depends on the site vacancy probability p.
When N is large, theory guarantees a sharp threshold p*.
- p > p*: almost certainly percolates.
- p < p*: almost certainly does not percolate.
Q. What is the value of p*?
Monte Carlo simulation:
- Initialize all sites of an N-by-N grid to be blocked.
- Declare random sites open until the top is connected to the bottom.
- The vacancy percentage estimates p*.
Q. How to check whether an N-by-N system percolates?
- Create an object for each site, named 0 to N² − 1.
- Sites are in the same component iff connected by open sites.
- Percolates iff any site on the bottom row is connected to any site on the top row. (Brute-force algorithm: N² calls to connected().)
Clever trick. Introduce 2 virtual sites (and connections to top and bottom).
- Percolates iff virtual top site is connected to virtual bottom site.
Q. How to model opening a new site?
A. Connect newly opened site to all of its adjacent open sites.
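A minimal sketch of this idea, assuming the WeightedQuickUnionUF type from the course's algs4 library; it is not the full programming-assignment API (indices here are 0-based, and the virtual top and bottom sites are stored at N² and N² + 1):

```java
import edu.princeton.cs.algs4.WeightedQuickUnionUF;

public class Percolation {
    private final int n;
    private final boolean[] open;           // open[index(row, col)] = is that site open?
    private final WeightedQuickUnionUF uf;  // sites 0..n*n-1, plus virtual top and bottom

    public Percolation(int n) {
        this.n = n;
        open = new boolean[n * n];
        uf = new WeightedQuickUnionUF(n * n + 2);   // +2 for the virtual sites
    }

    private int index(int row, int col) { return row * n + col; }   // 0-based row and column

    public void open(int row, int col) {
        int p = index(row, col);
        if (open[p]) return;
        open[p] = true;
        if (row == 0)     uf.union(p, n * n);       // connect to virtual top
        if (row == n - 1) uf.union(p, n * n + 1);   // connect to virtual bottom
        // connect to each adjacent open site
        if (row > 0     && open[index(row - 1, col)]) uf.union(p, index(row - 1, col));
        if (row < n - 1 && open[index(row + 1, col)]) uf.union(p, index(row + 1, col));
        if (col > 0     && open[index(row, col - 1)]) uf.union(p, index(row, col - 1));
        if (col < n - 1 && open[index(row, col + 1)]) uf.union(p, index(row, col + 1));
    }

    public boolean percolates() {
        return uf.find(n * n) == uf.find(n * n + 1);   // virtual top connected to virtual bottom
    }
}
```

A Monte Carlo client would repeatedly open random blocked sites until percolates() returns true and record the fraction of open sites as one estimate of p*.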
Q. What is the percolation threshold p*?
A. About 0.592746 for large square lattices.
Steps to developing a usable algorithm.
- Model the problem.
- Find an algorithm to solve it.
- Fast enough? Fits in memory?
- If not, figure out why.
- Find a way to address the problem.
- Iterate until satisfied.