Encapsulation
A module is a set of methods that work together as a whole to perform some task or set of related tasks. A module is encapsulated if its implementation is completely hidden, and it can be accessed only through a documented interface.
As you know, an abstract data type (ADT) is an encapsulated data structure. Not all encapsulated modules are ADTs, though. Algorithms (like list sorters) and applications (like network routing software) can also be encapsulated, even if they are distinct from the data structures they use.
So far, I’ve discussed encapsulation as a way of preventing "evil tamperers" from corrupting your data structures. Who are these evil tamperers? Sometimes, they’re your coworkers, or other programmers who will work on a project long after you’re gone. Often the evil tamperer is you.
A Cautionary Tale
Doug Whole, a programmer at a Silicon Valley startup, implements a singly-linked list much like the one you used in Homework 3, but all its fields are public. Doug also writes application code that uses linked lists. One day, Doug needs to write code that splices the second node out of a list. It would only take one line, and he doesn’t foresee ever needing to use the same operation anywhere else. Being lazy, Doug doesn’t feel like adding a new method to the List class. Instead, he just does the work directly.
public class ListMangler {
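As a rough illustration of the kind of one-line splice described here, the following is a minimal sketch. The names SListNode, SList, head, next, and mangle are assumptions made for this example; they are not taken from Doug's actual code.

```java
// Hypothetical stand-ins for Doug's classes; every name here is an assumption.
class SListNode {
    public Object item;
    public SListNode next;

    SListNode(Object item, SListNode next) {
        this.item = item;
        this.next = next;
    }
}

class SList {
    public SListNode head;   // public, so any caller can rewire the list
}

class ListMangler {
    // Splices the second node out of the list by editing fields directly.
    // Assumes the list has at least two nodes.
    void mangle(SList l) {
        l.head.next = l.head.next.next;
    }
}
```

On a singly-linked list this does what Doug wants, because "next" is the only link that has to change. The code silently depends on that fact about the implementation.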
Two years later, another programmer, Jeannie Yess, decides to improve the speed of their list data structure. After careful thought, she decides to reprogram the List class so that it uses doubly-linked lists internally. A "previous" field is added to ListNode, and the List methods are rewritten.
Jeannie tests her new List implementation extensively, and can find no bugs. But when she replaces Doug’s List class with her own, the company’s landmark ListMangler application repeatedly produces the wrong results. After two long days of debugging, Jeannie discovers the culprit: Doug’s single line of code.
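To see why a line like Doug's could break a doubly-linked list, here is a small sketch. It assumes a hypothetical DList class with public head and tail fields and a DListNode with next and previous fields, and it reproduces the splice as a hypothetical mangle method. None of these names come from Jeannie's real code.

```java
// Hypothetical doubly-linked list; all names are assumptions for illustration.
class DListNode {
    public Object item;
    public DListNode next;
    public DListNode previous;   // the field added by the rewrite

    DListNode(Object item) { this.item = item; }
}

class DList {
    public DListNode head;
    public DListNode tail;

    // Appends an item, keeping next and previous links consistent.
    void insertEnd(Object item) {
        DListNode node = new DListNode(item);
        if (head == null) {
            head = node;
        } else {
            node.previous = tail;
            tail.next = node;
        }
        tail = node;
    }

    // Items from head to tail, following next links.
    String forwardString() {
        StringBuilder sb = new StringBuilder();
        for (DListNode n = head; n != null; n = n.next) sb.append(n.item).append(' ');
        return sb.toString().trim();
    }

    // Items from tail to head, following previous links.
    String backwardString() {
        StringBuilder sb = new StringBuilder();
        for (DListNode n = tail; n != null; n = n.previous) sb.append(n.item).append(' ');
        return sb.toString().trim();
    }
}

class StaleLinkDemo {
    // The same direct splice, written for a singly-linked list.
    static void mangle(DList l) {
        l.head.next = l.head.next.next;   // never updates any previous link
    }

    public static void main(String[] args) {
        DList l = new DList();
        l.insertEnd("a");
        l.insertEnd("b");
        l.insertEnd("c");
        mangle(l);
        System.out.println(l.forwardString());    // prints "a c"
        System.out.println(l.backwardString());   // prints "c b a"
    }
}
```

The forward walk looks correct, but the third node's previous link still points at the node that was supposedly removed, so any method that walks backward sees it again. The list's own methods are all correct; the damage comes from outside the class.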
This kind of bug is one of the most difficult to find and fix. It’s also very common in commercial software systems, and it can have far-reaching effects.
You see, Doug’s line of code is not the only one that reads or modifies the list data structure directly. Jeannie still has to debug 100,000 lines of Doug’s code in other failing applications, as well as 500,000 lines more written by other programmers who also directly manipulated ListNodes. The List improvement project is abandoned.
A Remedy: Encapsulation
You "encapsulate" a module by defining an interface through which the outside world can use, inspect, or manipulate it. Recall that the interface is the set of prototypes and behaviors of the methods (and sometimes fields) that access the module or data structure. Think of a module or an ADT as a closed box. Data can ONLY go in and out through the interface. Other attempts to access the internals of the module or ADT are outlawed.
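One way the closed box could look in Java is sketched below. The interface ItemList, the class LinkedItemList, and all method names are hypothetical, chosen only for this example. Callers see the interface; the nodes and links are private.

```java
// The documented interface: the only way callers may touch the list.
interface ItemList {
    void insertEnd(Object item);
    Object get(int index);        // zero-based
    Object remove(int index);     // removes and returns the item at index
    int size();
}

class LinkedItemList implements ItemList {
    // Nodes are private, so no outside code can depend on how they are linked.
    private static class Node {
        Object item;
        Node next;
        Node(Object item) { this.item = item; }
    }

    private Node head;
    private Node tail;
    private int size;

    public void insertEnd(Object item) {
        Node node = new Node(item);
        if (head == null) {
            head = node;
        } else {
            tail.next = node;
        }
        tail = node;
        size++;
    }

    public Object get(int index) {
        checkIndex(index);
        Node n = head;
        for (int i = 0; i < index; i++) n = n.next;
        return n.item;
    }

    public Object remove(int index) {
        checkIndex(index);
        Object removed;
        if (index == 0) {
            removed = head.item;
            head = head.next;
            if (head == null) tail = null;
        } else {
            Node before = head;
            for (int i = 0; i < index - 1; i++) before = before.next;
            removed = before.next.item;
            if (before.next == tail) tail = before;
            before.next = before.next.next;
        }
        size--;
        return removed;
    }

    public int size() { return size; }

    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
    }
}
```

With this design, splicing out the second node is list.remove(1). If the class were later rewritten to use doubly-linked nodes, that call would keep working, because it never mentions a node or a link.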
Why encapsulation is your friend:
- The implementation is independent of the functionality. A programmer who has the documentation of the interface can implement a new version of the module or ADT independently. A new, better implementation can replace an old one.
- Encapsulation prevents Doug from writing applications that corrupt a module’s internal data. In real-world programming, encapsulation reduces debugging time. A lot.
- ADTs can guarantee that their invariants are preserved.
- Teamwork. Once you’ve rigorously defined interfaces between modules, each programmer can independently implement a module without having access to the other modules. A large, complex programming project can be broken up into dozens of pieces.
- Documentation and maintainability. By defining an unambiguous interface, you make it easier for other programmers to fix bugs that arise years after you’ve left the company. Many bugs are a result of unforeseen interactions between modules. If there’s a clear specification of each interface and each module’s behavior, bugs are easier to trace.
- When your Project 2 doesn’t work, it will be easier to figure out which teammate to blame.
An interface is a CONTRACT between module writers, specifying exactly how they will communicate.
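The point about invariants can be made concrete with a small sketch. SortedIntList below is a hypothetical class, not one from the course code. Its invariant is that the stored values are always in nondecreasing order. Because the array is private and insert is the only method that writes to it, no caller can break that invariant.

```java
// Hypothetical ADT whose invariant is: items[0..size-1] is sorted.
class SortedIntList {
    private int[] items = new int[4];
    private int size;

    // Inserts value at the position that keeps the array sorted.
    void insert(int value) {
        if (size == items.length) {
            items = java.util.Arrays.copyOf(items, 2 * size);
        }
        int i = size;
        while (i > 0 && items[i - 1] > value) {
            items[i] = items[i - 1];   // shift larger values right
            i--;
        }
        items[i] = value;
        size++;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return items[index];
    }

    int size() { return size; }

    // Binary search is only correct because the invariant always holds.
    boolean contains(int value) {
        return java.util.Arrays.binarySearch(items, 0, size, value) >= 0;
    }
}
```

If items were public, a single outside assignment such as items[0] = 99 could leave the array unsorted, and contains would start returning wrong answers with no bug anywhere in SortedIntList itself.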
Enforcing Encapsulation
Many languages offer only one construct for enforcing the encapsulation of ADTs: self-discipline.
As we’ve seen, Java offers facilities that fortify your self-discipline, especially Java packages and the access levels for field and method declarations: the private and protected modifiers, and package access, which is what you get when you write no modifier at all.
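A short sketch of these access levels on one class follows. IntStack and its members are hypothetical names used only for illustration.

```java
// Hypothetical class showing each access level on a member.
class IntStack {
    // private: only code inside IntStack can read or write these fields.
    private int[] items = new int[2];
    private int size;

    // public: part of the documented interface.
    public void push(int value) {
        if (size == items.length) grow();
        items[size++] = value;
    }

    public int pop() {
        if (size == 0) throw new IllegalStateException("stack is empty");
        return items[--size];
    }

    public int size() { return size; }

    // protected: visible to subclasses and to classes in the same package.
    protected void grow() {
        items = java.util.Arrays.copyOf(items, items.length * 2);
    }

    // No modifier: package access, visible only within the same package.
    int capacity() { return items.length; }
}
```

Code in another package can call push, pop, and size, but the compiler rejects any attempt to touch items, size, or capacity from there. Within the same package, only the private members are protected by the compiler.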
Java’s facilities aren’t always enough, though. There are circumstances in which you’ll want to have multiple modules in the same package. For instance, in Project 2 it would be reasonable to put all your modules in the "player" package. If you do that, you’ll have to fall back on self-discipline. This means defining your modules and interfaces before you start programming, and resisting the temptation to let one module snoop through or change another module’s data structures.
One way to find this self-discipline is, wherever one module uses another, to have a different team member work on each module. If neither team member reveals their code to the other, it’s much harder to yield to temptation.