Iteration & Arrays

LOOPS

`while` Loops

A while statement is like an if statement, but the body of the statement is executed repeatedly, as long as the condition remains true. The following example tests whether n is a prime number by attempting to divide it by every integer in the range 2...n - 1.

public static boolean isPrime(int n) {
    int divisor = 2;
    while (divisor < n) {
        if (n % divisor == 0) { |
            return false;
        }
        divisor++;
    }
    return true;
}

Here's how the loop executes.

When Java reaches this while loop, it tests whether the loop condition divisor < n is true.
If divisor < n, Java executes the loop body {in braces}.
When Java finishes the loop body (i.e. after executing divisor++), it tests again whether divisor < n is true.
If it's still true, Java jumps back up to the beginning of the loop body and executes it again.
If Java tests the loop condition and finds that divisor < n is false, Java continues execution from the next line of code after the loop body.

An iteration is a pass through the loop body. In this example, if n is 2 or less, the loop body won't iterate even once.

`for` Loops

for loops are a convenient shorthand that can be used to write some while loops in a more compact way. The following for loop is equivalent to the following while loop.

1
2
3

for (initialize; condition; next) {
    statement
}

initialize;
while (condition) {
    statements;
    next;
}

By convention, the initialize and next are both expressions that affect a variable that changes every loop iteration and is central to the test. Most commonly, for statements are used to iterate while advancing an index variable over a fixed range of values. isPrime can be rewritten thus:

public static boolean isPrime(int n) {
    for (int divisor = 2; divisor < n; divisor++) {
        if (n % divisor == 0) {
            return false;
        }
    }
    return true;
}

A common mistake among beginning Java and C programmers is to get the condition wrong and do one loop iteration too few. For example, suppose you want to print all the prime numbers in the range 2...n.

public static void printPrimes(int n) {
    int i;
    for (i = 2; i < n; i++) { // ERROR!!! Condition should be i <= n.
        if (isPrime(i)) {
            System.out.print(" " + i);
        }
    }
}

Suppose we correct this method so the loop condition is i <= n. Think carefully: what is the value of i when the printPrimes method ends?

We'll come back to iteration, but first let's investigate something more interesting to iterate on.

ARRAYS

An array is an object consisting of a numbered list of variables, each of which is a primitive type or a reference to another object. The variables in an array are always indexed from zero in increments of one. For example, here is an array of characters.

array

Like any object, an array is only useful if we can reference it, usually through some reference variable like c above. We declare c thusly:

1	char[] c; // Reference to an array (of any length) of characters.

We can construct an array of four characters as follows.

1	c = new char[4];

Now that we have an array object, we may fill in its values by indexing c.

c[0] = 'b'; // Store the character 'b' at index 0.
c[1] = 'l';
c[2] = 'u';
c[3] = 'e';

The characters in a four-element array are indexed from 0 to 3. If we try to address any index outside this range, we will trigger a run-time error.

1	c[4] = 's'; // Program stops with ArrayIndexOutOfBoundsException

A run-time error is an error that doesn't show up when you compile the code, but does show up later when you run the program and the Java Virtual Machine tries to access the out-of-range index.

When c references an array, you can find out its length by looking at the field c.length. You can never assign a value to the length field, though. Java will give you a compile-time error if you try.

Primes Revisited

The printPrimes method is embarrassingly slow when n is large. Arrays can help us write a faster method to identify the primes from 2 to n.

The method uses an ancient algorithm called the Sieve of Eratosthenes. All integers are assumed prime until proven composite. The algorithm iterates through all possible divisors, and marks as non-prime every integer divisible by a given divisor. Here's the beginning of the method.

public static void printPrimes(int n) {
    boolean[] prime = new boolean[n + 1]; // Numbered 0...n.
    int i;
    for (i = 2; i <= n; i++) {
        prime[i] = true; // Prime until proven composite.
    }

Why did we construct an array of length n + 1? Because if we'd constructed an array of length n, its elements would be numbered from 0 to n - 1. But we'd like to have an element numbered n.

To continue the method, we iterate over all possible divisors from 2 to the square root of n. For each prime value of divisor, we mark as non-prime all integers divisible by divisor, except divisor itself.

for (int divisor = 2; divisor * divisor <= n; divisor++) {
    if (prime[divisor]) {
        for (i = 2 * divisor; i <= n; i = i + divisor) {
            prime[i] = false; // i is divisible by divisor.
        }
    }
}

Math question: why do we only need to consider divisors up to the square root of n?

Finally, we print every integer from 2 to n that hasn't been marked non-prime.

    for (i = 2; i <= n; i++) {
        if (prime[i]) {
            System.out.print(" " + i);
        }
    }
}

Observe that elements 0 and 1 of the array are never used. A tiny bit of memory is wasted, but the readability of the code is better for it.

Multi-Dimensional Arrays

A two-dimensional array is an array of references to arrays. A three-dimensional array is an array of arrays of arrays. As an example, consider Pascal's Triangle.

Each entry is the sum of the two nearest entries in the row immediately above. If the rows are numbered from zero, row i represents the coefficients of the polynomial $(x + 1)^i$ . For example, $(x + 1)^4 = x^4 + 4x^3 + 6x^2 + 4x + 1$ .

The following method returns an array of arrays of ints that stores the first n rows of Pascal's Triangle.

1 2	public static int[][] pascalTriangle(int n) { int[][] pt = new int[n][];

Here, we've just declared pt to reference an array of arrays, and constructed an array for it to reference. However, the arrays that this array will reference do not yet exist. They are constructed and filled in by the following loop.

    for (int i = 0; i < n; i++) {
        pt[i] = new int[i + 1]; // Construct row i.
        pt[i][0] = 1; // Leftmost value of row i.
        for (int j = 1; j < i; j++) {
            pt[i][j] = pt[i - 1][j - 1] + pt[i - 1][j]; // Sum 2 entries above.
        }
        pt[i][i] = 1; // Rightmost value of row i.
    }
    return pt;
}

Our array objects look like this:

ourarray

MORE ARRAYS

Automatic Array Construction

Last lecture, we used a loop to construct all the arrays that the top-level array references. This was necessary to construct a triangular array. But if you want a rectangular multi-dimensional array, rather than a triangular one, Java can construct all of the arrays for you at once.

1	int[][] table = new int[x][y];

This declaration constructs an array of x references to arrays. It also constructs x arrays of y ints. The variable "table" references the array of arrays; and each entry in the array of arrays references one of the arrays of ints. All the arrays are constructed for you at once. Similarly, Java can construct three- or ten-dimensional arrays for you, memory permitting.

We could have used a square array to store Pascal's Triangle, but that would have unnecessarily wasted memory. If you have enough memory, you might not care.

When you declare a variable, you can also construct array entries by using initializers.

1 2	Human[] b = {amanda, rishi, new Human("Paolo")}; int[][] c = {{7, 3, 2}, {x}, {8, 5, 0, 0}, {y + z, 3}};

In the second example, Java constructs a non-rectangular two-dimensional array, composed of one array of arrays and four arrays of ints.

Outside of declarations, you need a more complicated notation.

1 2	d = new int[] {3, 7}; f(new int[] {1, 2, 3});

Another subtlety of array declarations is the following.

1
2
3

int[] a, b, c; // a, b, and c all reference arrays.
int a[], b, c[][]; // a is 1D; c is 2D; b is not a reference/array.
int[] a, b[]; // a references a 1D array; b references a 2D array.

Arrays of Objects

When you construct a multi-dimensional array, Java can construct all the arrays for you. But when you construct an array of objects, Java does not construct the objects automatically. The array contains space for references to the objects. (arrays of nulls) You must construct the objects yourself.

1
2
3

String[] sentence = new String[3];
sentence[0] = "Word";
sentence[2] = new String();

arrayofobj

`main()'s` Parameter

What is the array of Strings that the main() method takes as a parameter? It's a list of command-line arguments sent to your Java program, prepared for you by Java. Consider the following program.

class Echo {
    public static void main(String[] args) {
        for (int i = 0; i < args.length; i++) {
            System.out.println(args[i]);
        }
    }
}

If we compile this and type "java Echo kneel and worship Java", java prints

kneel
and
worship
Java

sysargs

MORE LOOPS

`do` Loops

A do loop has just one difference from a while loop. If Java reaches a do loop, it always executes the loop body at least once. Java doesn't check the loop condition until the end of the first iteration. do loops are appropriate for any loop you always want executed at least once, especially if the variables in the condition won't have meaningful assignments until the loop body has been executed.

do {
    s = keybd.readLine();
    process(s);
} while (s.length() > 0); // Exit loop if s is an empty String.

The `break` and `continue` Statements

A break statement immediately exits the innermost loop or switch statement enclosing the break, and continues execution at the code following the loop or switch.

In the loop example above, we might want to skip process(s) when s is a signal to exit (in this case, an empty String). We want a "time-and-a-half" loop--we want to enter the loop at a different point in the read-process cycle than we want to exit the loop at. Here are two alternative loops that do the right thing. They behave identically. Each has a different disadvantage.

s = keybd.readLine();
while (s.length() > 0) {
    process(s);
    s = keybd.readLine();
}

Disadvantage: The line s = keybd... is repeated twice. It's not really a disadvantage here, but if input took 100 lines of code, the duplication would make the code harder to maintain. Why? Because a programmer improving the code might change one copy of the duplicated code without noticing the need to change the other to match.

while (true) { // Loop forever.
    s = keybd.readLine();
    if (s.length() == 0) {
        break;
    }
    process(s);
}

Disadvantage: Somewhat obfuscated for the reader, because the loop isn't aligned with its natural endpoint.

Some loops have more than one natural endpoint. Suppose we want to iterate the read-process loop at most ten times. In the example at left below, the break statement cannot be criticized, because the loop has two natural endpoints. We could get rid of the break by writing the loop as at right below, but the result is longer and harder to read.

for (int i = 0; i < 10; i++) {
    s = keybd.readLine();
    if (s.length() == 0) {
        break;
    }
    process(s);
}

int i = 0;
do {
    s = keybd.readLine();
    if (s.length() > 0) {
        process(s);
    }
    i++;
} while ((i < 10) && (s.length() > 0));

There are anti-break zealots who claim that the loop below is the "correct" way to do things. I disagree, because the above loop is clearly more readable.

Some of the zealots feel this way because break statements are a little bit like the go to statements found in some languages like Basic and Fortran (and the machine language that microprocessors really execute). go to statements allow you to jump to any line of code in the program. It sounds like a good idea at first, but it invariably leads to insanely unmaintainable code. For example, what happens if you jump to the middle of a loop? Turing Award winner Edsger Dijkstra wrote a famous article in 1968 entitled "Go To Statement Considered Harmful", which is part of the reason why many modern languages like Java don't have go to statements.

Both break and return are limited forms of go to statements. Their limitations prohibit the worst abuses of go to. They allow control flow to jump in your program in ways that are straightforward to understand.

WARNING: It's easy to forget exactly where a break statement will jump to. For example, break does not jump to the end of the innermost enclosing if statement. An AT&T programmer introduced a bug into telephone switching software in a procedure that contained a switch statement, which contained an if clause, which contained a break, which was intended for the if clause, but instead jumped to the end of the switch statement. As a result, on January 15, 1990, AT&T's entire U.S. long distance service collapsed for eleven hours. (That code was actually written in C, but Java and C use identical syntax and semantics for loops, switch, and break.)

The continue statement is akin to the break statement, except:

it only applies to loops.
it jumps to the end of the loop body but it doesn't necessarily exit the loop; another iteration will commence if the loop condition is satisfied.

Finally, I told you that for loops are identical to certain while loops, but there's actually a subtle difference when you use continue. What's the difference between the following two loops?

int i = 0;
while (i < 10) {
    if (condition(i)) {
        continue;
    }
    call(i);
    i++;
}

for (int i = 0; i < 10; i++) {
    if (condition(i)) {
        continue;
    }
    call(i);
}

Answer: when continue is called in the while loop, i++ is not executed. In the for loop, however, i is incremented at the end of every iteration, even iterations where continue is called.

CONSTANTS

Java's final keyword is used to declare a value that can never be changed. If you find yourself repeatedly using a numerical value with some meaning in your code, you should probably turn it into a final constant.

BAD: if (month == 2) {

GOOD:

1
2
3

public final static int FEBRUARY = 2; // Usually near top of class.
...
if (month == FEBRUARY) {

Why? Because if you ever need to change the numerical value assigned to February, you'll only have to change one line of code, rather than hundreds.

You can't change the value of FEBRUARY after it is declared and initialized. If you try to assign another value to FEBRUARY, you'll have a compiler error.

The custom of rendering constants in all-caps is long-established and was inherited from C. (The compiler does not require it, though.)

For any array x, x.length is a final field.

You can declare local parameters final to prevent them from being changed.

1
2
3

void myMethod(final int x) {
    x = 3; // Compiler ERROR. Don't mess with X's!
}

final is usually used for class variables (static fields) and parameters, but it can be used for instance variables (non-static fields) and local variables too. It only makes sense for these to be final if the variable is declared with an initializer that calls a method or constructor that doesn't always return the same value.

1
2
3

class Bob {
    public final long creationTime = System.currentTimeMillis();
}

SCOPE

The scope of a variable is the portion of the program that can access the variable. Here are some of Java's scoping rules.

Local variables and parameters are in scope only inside the method that declares them. Furthermore, a local variable is in scope only from the variable declaration down to the innermost closing brace that encloses it. A local variable declared in the initialization part of a for loop is in scope only in the loop body.
Class variables (static fields) are in scope everywhere in the class, except when shadowed by a local variable or parameter of the same name.
Instance variables (non-static fields) are in scope in non-static methods of the class, except when shadowed.