Code Complete is a programming classic. It is 900 pages of intelligent and fascinating discussion about coding software.
Introduction
This book is not about how to program in a particular language, or how to use SSADM or other methodologies. This book is about improving the way that you work throughout the development cycle.
The author describes the development process as being everything from the technical, detailed design stage, right through to the integrated testing. This is the jurisdiction, or domain, of the programmer.
With an immense bibliography, the author has combined theories, common practices and hard data to make his points. Sometimes, however, he completely contradicts the commonplace practices. Here is a man who supports established practices, but backs his own convictions where they differ. Subjectivity surfaces on occasion with statements such as "If you come across one of these clowns, ask him the following". On the whole, there is the distinct impression that the author was motivated to write this book in order to help programmers, and related IT staff.
References to personal experience punctuate bibliographic references in order to put the point across. Clearly this man knows his stuff, and is not simply trying to peddle his own particular point of view that has never been proven. The author has developed his techniques over time, and has then decided to write a book on the subject. Where the purpose of his methodologies is nebulous, he backs them up with hard data.
Summary of Points
Here is a summary of the points that I believe are particularly of note.
- Software Accretion Method
The author recommends an accretion approach to software construction. This means the method of initially building the most basic working system possible, and then adding on layers piece by piece.
- Prerequisites
It seems obvious, but it is important to have all of the requirements in place before time is spent on detailed design or construction. Ensure that all the system prerequisites are outlined.
The author also describes the "Myth of Stable Requirements". The important thing is to manage changes properly.
- Use of PDL
I found this section, which deals with designing routines, particularly fascinating. PDL stands for Program Design Language. PDL is similar in purpose to pseudo code, but is more abstract, using statements that resemble spoken English. The method here is to design a routine in PDL and then use the statements to form the basis of the routine. The PDL statements become the comment lines in the routine, and the functionality is filled in between the comments. This method allows the design of the routine to remain constant, and to be clearly visible when viewing the code.
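As a minimal sketch of this method in Python (the routine, its name and its data are hypothetical and are not taken from the book), the PDL statements are written first and survive as the comments, with the code filled in beneath each one:

```python
def average_order_value(order_totals):
    """Return the mean of the order totals, or 0.0 if there are none."""
    # If there are no orders, report an average of zero
    if not order_totals:
        return 0.0

    # Add up the value of every order
    total = sum(order_totals)

    # Divide the total by the number of orders
    return total / len(order_totals)
```

Reading only the comments gives back the original design, which is the point of the technique.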
- Routines and Modules
The author spends a lot of time describing quality. This is particularly apparent with coding modules and routines. The author lists good and bad reasons for writing routines, and how to write quality routines and modules.
- Abstraction and Naming
One of the most frequently recurring themes throughout the book is abstraction. This theme is prominent when the author discusses naming standards. For routines, modules, variables, constants and literals, abstract naming standards are extremely useful. They make code easier to read, and help to describe the program in the problem domain.
- Layout and Style
After mentioning quality in routine design and use, the author outlines the correct layout and styles of coding to use. Good and bad examples are shown, and the reasons for the choice of layout are mentioned.
The author also notes how sensitive a subject layout and programming style can be, including the hundred years' war that is the GOTO debate.
- Management
The book is not only focused on the programmer. There is a section that is for the attention of the Programming Manager. This deals with everything from planning and scheduling right through to managing the people. When discussing planning, the importance of measurement is also mentioned. Apart from progress, quality also needs to be measured. Several methods for measuring are suggested, including formal and informal reviews.
- Testing
There are more opinions and suggestions for testing than for just about anything else in software development, and this book is no exception. The author describes testing in sections on Unit Testing, Functional Testing, Integration Testing and Live Testing. Methodologies for all of these areas are outlined.
- Optimisation
There may be times when a program needs to become more streamlined. The author discusses the use of code tuning techniques, and the most important issue of when to optimise and when to leave it alone.
Organising Straight Line Code.
Statements that must be in a specific order.
- Organise the code so that dependencies are obvious.
- Name routines so that dependencies are obvious.
- Use routine parameters to make dependencies obvious.
- Document unclear dependencies.
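The following Python sketch illustrates the points above under some assumptions: the routine names and the expense data are hypothetical. Each routine takes the result of the previous step as a parameter, so the required order is visible in the calling code rather than hidden in shared data.

```python
def load_expenses(raw_lines):
    """Parse 'name, amount' lines into (name, amount) pairs."""
    expenses = []
    for line in raw_lines:
        name, amount = line.split(",")
        expenses.append((name.strip(), float(amount)))
    return expenses


def total_expenses(expenses):
    """Sum the amounts; needs the parsed output of load_expenses."""
    return sum(amount for _, amount in expenses)


def format_report(total):
    """Build the report line; needs the output of total_expenses."""
    return f"Total expenses: {total:.2f}"


# Each call consumes the result of the one before it, so the
# statements cannot be reordered without the error being obvious.
raw = ["travel, 120.50", "hotel, 300.00"]
expenses = load_expenses(raw)
total = total_expenses(expenses)
report = format_report(total)
```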
Statements whose order does not matter.
- Make Code read from top to bottom.
- Localise variable references.
- Keep variable references close to where they are used.
- Keep variables live for as short a time as possible.
- (Span = number of lines between successive uses of a variable)
- (Live time = number of lines from the first use of a variable to its last)
- Group Related Statements.
Using Conditions.
IF statements.
- Write the normal program path, and then introduce conditions.
- Branch correctly on equality.
- Put the normal path in the IF condition, and the unusual conditions in the ELSE.
- Follow an IF with a meaningful statement.
- Consider what to put in the ELSE statement. Implement the ELSE with a comment if code is not required.
- Check for inadvertent confusion between the IF and ELSE conditions.
IF chains:
- Simplify complex IF blocks with a boolean function.
- Put the most common cases in the IFs and less common in the ELSE.
- Ensure that all possible cases are covered.
- Consider replacing with CASE statements.
CASE statements.
- Order cases alphabetically or numerically.
- Place the normal case first.
- Order cases by likely frequency.
Tips:
- Keep actions simple.
- Don't make up fake variables just to use CASE. Use IFs if it is correct to do so.
- Use the ELSE/Default clause to identify unexpected cases.
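As an illustration of the last tip, here is a small Python sketch in which an if/elif chain stands in for a CASE statement. The regions and costs are invented for the example, and the ordering assumes that "UK" is the most frequent case.

```python
def shipping_cost(region):
    """Return the shipping cost for a region code (hypothetical values)."""
    if region == "UK":          # assumed to be the most common case
        return 5
    elif region == "EU":
        return 10
    elif region == "WORLD":
        return 20
    else:
        # The default clause is reserved for cases that should never
        # happen, so a bad value is reported instead of silently handled.
        raise ValueError(f"Unexpected region: {region!r}")
```

Because the default branch raises an error, a new region that is added elsewhere but forgotten here shows up immediately during testing.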
Controlling Loops.
Select the kind of Loop to use.
- If you know the number of times to loop, use FOR/NEXT.
- If you do not know, use DO/WHILE.
When to use WHILE:
- Always test the condition at the start of the loop.
- Only test at the end of the loop if it is impossible to test at the start.
- It is correct to test at the end of the loop when the loop will always require at least one pass.
When to use EXIT DO:
- Sometimes it is necessary to conditionally exit a loop part way through the code block.
- Put all EXIT DOs together in one place.
- Use comments.
- Don't perform a GOTO into the middle of a loop.
When to use FOR/NEXT:
- When the number of times to loop is known in advance.
Controlling the Loop.
- When building a loop code block, use the same principles for building a routine.
- Simplify the code.
Entry points:
- There should be one point of entry.
- Initialise the loop variables directly before the start of the loop.
- Don't use a FOR/NEXT loop when a WHILE loop is more appropriate.
Process the Middle of the Loop:
- Use BEGIN/END, or {} (where language supports it).
- Avoid having empty loops.
- Perform loop housekeeping operations at either the start or end of the loop.
- Each loop should perform only one function.
Exiting the Loop:
- Be assured that the loop will terminate, and not run infinitely.
- Make termination conditions obvious.
- Don't bodge a FOR/NEXT loop just to force it to exit.
- Avoid code that relies on the final value of a loop counter.
- Consider the use of safety counters.
- Only use EXIT DOs with care.
- Be wary of loops with lots of EXIT DOs
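A safety counter can be sketched as follows in Python. The routine and the limit of 1000 passes are assumptions chosen for the example; the idea is simply that the loop gives up loudly if it runs for longer than should ever be needed.

```python
MAX_ITERATIONS = 1000  # assumed upper bound for this example


def collatz_steps(start):
    """Count the steps needed to reach 1 by halving or tripling-plus-one."""
    value = start
    steps = 0
    while value != 1:
        # Safety counter: stop rather than loop forever on bad input.
        if steps >= MAX_ITERATIONS:
            raise RuntimeError("safety counter exceeded; loop did not terminate")
        value = value // 2 if value % 2 == 0 else 3 * value + 1
        steps += 1
    return steps
```

Calling `collatz_steps(0)` would otherwise never finish, because halving zero always gives zero; with the counter it raises an error instead.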
Checking End Points.
- Manually check all exit points.
- Check for errors.
Using Loop Variables.
- Use enumerations for loop limits.
- Use integers (whole number variables) only.
- Use meaningful variable names when nesting loops.
- Use meaningful variable names to avoid referencing the wrong variable in the wrong loop (Cross Talk).
How Long should a Loop be?
- Short enough to view on one page (printed or screen).
- Avoid nesting more than three levels.
- Make long loops especially clear.
Creating Loops from the inside out.
- Code the Main condition.
- Code other conditions.
- Fill in the rest of the code.
Correspondence between loops and arrays.
Use FOR EACH or other language functions where possible.
Unusual Control Structures.
GOTO.
- Use GOTO only in non structured languages.
- Other use violates structured programming.
- GOTOs can eliminate duplicate code.
- ON ERROR GOTO use is acceptable.
- It is best not to use GOTOs.
RETURN/EXIT.
- Minimise their use.
- Use only if it enhances readability.
Recursion.
These are routines that call themselves.
They can provide very elegant solutions.
These can fill up stack space.
Tips:
- Ensure that recursion stops.
- Use safety counters to prevent infinite recursion.
- Limit recursion to one routine.
- Watch the stack.
- Don't use recursive routines for Factorial or Fibonacci numbers.
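To illustrate the last tip, both calculations can be written with a plain loop. This is a sketch in Python rather than code from the book; the loops use no extra stack space and their termination is obvious.

```python
def factorial(n):
    """Return n! using a loop instead of recursion."""
    if n < 0:
        raise ValueError("factorial is not defined for negative numbers")
    result = 1
    for factor in range(2, n + 1):
        result *= factor
    return result


def fibonacci(n):
    """Return the n-th Fibonacci number, counting from fibonacci(0) == 0."""
    previous, current = 0, 1
    for _ in range(n):
        previous, current = current, previous + current
    return previous
```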
General Control Issues.
Boolean Expressions.
- Use "True" and "False" not "1" and "0".
- Use implicit comparisons.
- Simplifying Boolean Expressions:
- Break them up into multiple tests.
- Move complicated tests into boolean functions. These are easier to read.
- Consider using lookup tables.
- Formatting Boolean Expressions positively:
- Write the IF statement to read as a true condition.
- This can contradict writing the IF with the normal condition first.
- Consider what is best to use.
- Convert: "IF NOT A OR NOT B THEN" to "IF ( NOT (A AND B) ) THEN".
- Use parentheses to clarify expressions.
Write Numeric expressions in numerical value order (MIN < X AND X < MAX)
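Several of these guidelines can be shown together in a short Python sketch. The names and limits are hypothetical: a complicated test is moved into a boolean function, the range check is written in number-line order, and the negated expression is rewritten as in the conversion described above.

```python
MIN_AGE = 18  # hypothetical limits for the example
MAX_AGE = 65


def is_working_age(age):
    """Range test written in number-line order: MIN, then value, then MAX."""
    return MIN_AGE <= age and age <= MAX_AGE


def must_hold_order_original(in_stock, paid):
    """The harder-to-read form: NOT A OR NOT B."""
    return not in_stock or not paid


def must_hold_order(in_stock, paid):
    """The equivalent form: NOT (A AND B)."""
    return not (in_stock and paid)
```

The two `must_hold_order` versions return the same result for every combination of inputs, so the choice between them is purely one of readability.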
Compound Statements.
- Write BEGIN/END or {} first.
- Fill in the code.
NULL Statements.
- These are conditional statements that do nothing.
- Call attention to them.
- Use comments.
Taming Dangerously Deep Nesting.
- If there is dangerous nesting, restructure the tests:
- Simplify by re-testing conditions.
- Convert nesting to IF/THEN/ELSE.
- Consider replacing with CASE statements.
- Factor Code into routines.
- Simply Redesign the tests!
The Power of Structured Programming.
- Have one entry point and one exit point.
- Use three components:
- Sequence.
- Selection.
- Iteration.
Control Structures and Complexity.
Good design reduces complexity.
How to reduce complexity:
Measure complexity by using a count:
- Start with 1 for the main path.
- Add 1 for each IF, DO, FOR, AND, OR, ELSE statement.
- Add 1 for each case in a CASE statement.
Evaluate Complexity:
- 0-5 OK.
- 6-10 Consider simplifying code.
- Over 10 Split the routine up into smaller routines.
Consider other ways of measuring complexity:
- The amount of data.
- Nesting levels.
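The counting rule above can be sketched in Python as a deliberately naive keyword counter. It follows the list of keywords given in this summary, works on BASIC-style pseudo code, and makes no attempt to handle comments, string literals or lines such as END IF. The function names are hypothetical, and a score of exactly 10 is assumed to fall in the middle band.

```python
import re

# Keywords that each add one decision point, as listed in the summary above.
DECISION_KEYWORDS = {"IF", "DO", "FOR", "AND", "OR", "ELSE", "CASE"}


def decision_count(source):
    """Start with 1 for the main path, then add 1 per decision keyword."""
    words = re.findall(r"[A-Za-z_]+", source.upper())
    return 1 + sum(1 for word in words if word in DECISION_KEYWORDS)


def evaluate(score):
    """Map a score onto the three bands described above."""
    if score <= 5:
        return "OK"
    if score <= 10:
        return "Consider simplifying code"
    return "Split the routine up into smaller routines"


sample = """
IF total > credit_limit AND NOT approved THEN
    reject_order
ELSE
    accept_order
"""
score = decision_count(sample)  # 1 + IF + AND + ELSE
```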
The Power of Data Names
Considerations in choosing good names
The main point of naming variables is that the name accurately describes
the purpose.
The name should be problem-orientated. This means that the name is
related to what is being achieved not how it is being achieved.
The optimum name length is suggested to be between 8 and 20 characters,
typically about 16.
The Effect of Scope on Variable names.
Typically, longer names are better than shorter ones. This is not
always the case: if a variable is to be used as a loop counter, it is
acceptable to call it "a", or "i". The reason for this is that the
variable is a temporary variable only, which will not exist outside of
the scope of the procedure in which it is used.
If a variable is to be used throughout a module, then it should be named
more meaningfully.
Basically, the length of a variable name should reflect its importance.
Computed-Value Qualifiers in Variable Names.
Many programmers use prefixes to denote calculated values (e.g. Ttl,
Sum, Avg). If using this approach, the important thing to remember is
to be consistent.
Naming Specific Types of Data
- Loop Indexes. Use simple names, except if they are used afterwards.
- Booleans. The name should imply a true condition. Don't use the word "Not" at the beginning, as this could be confusing: IF NOT (NotSomething).
- Enumerations. Ensure that the names are similar.
- Constants. These should always have an abstract name.
The Power of Naming Conventions
- More can be taken for granted. If a good convention is used, then it is possible to make assumptions about variables.
- Knowledge can be transferred more easily across projects, by using the
same convention for each project.
- Code can be learnt more quickly on a new project.
- Naming proliferation is reduced.
- Naming Conventions can compensate for language weaknesses. Named
constants and enumerations can be emulated.
- If non-structured data is used, naming conventions can emphasise
relationships between variables (e.g. All Customer fields start with
"Cus").
Any naming convention is better than none at all.
Informal Naming Conventions
Here are some guidelines for creating a language-independent naming
convention.
- Identify Global Variables. Ensure that all Global Variables have a
specific prefix.
- Identify Module Variables. As above.
- Identify Type Definitions.
- Identify Named Constants.
- Identify Enumerations.
- Identify I/O variables in languages that don't enforce them.
- Format all names to enhance readability.
Hungarian Notation
This naming convention consists of naming the variable in three parts:
- Variable Type.
- Prefix.
- Qualifier.
This convention is widely used in C.
The main disadvantage with this is that the variables will never have
abstract names.
Short Names
When using short names, follow these guidelines:
- Use standard abbreviations.
- Remove all non-leading vowels.
- Use the first (typically four) letters of each word.
- Truncate words.
- Use up to three significant words.
- Remove useless suffixes.
- Keep the most noticeable sound in each syllable.
It is noted that some programmers use phonetic names (before = b4), but
I would not recommend this.
Kinds of Names to avoid
- Misleading names and abbreviations.
- Names with similar meanings.
- Similar names with different meanings.
- Similar sounding names.
- Numerals.
- Bad spelling.
- Commonly misspelled words.
- Don't differentiate by using capitals.
- Avoid Library routine names.
- Don't use irrelevant names.
- Don't use hard to read characters.
Summary
- Establish a naming convention and stick to it.
- Any Naming Convention is better than none at all.
General Issues in Using Variables
Scope
The key is to minimise the scope of variables. If variables are global,
the program is likely to be easier to write, but if they are not, the
program will be easier to read.
Ensure that variable references are kept together.
Persistence
- Avoid misreading the persistence of variables.
- Use Debugging code and assertions.
- Use code that doesn't make assumptions about variables.
- Initialise Data just before use.
Binding Time
There are three types of data binding.
- Code Time Binding
This refers to hard-coded variables that are assigned values in the
source code.
- Compile Time Binding
This refers to variables that are assigned values from constants.
- Run Time Binding
This occurs when the program is running, and variables are assigned
dynamic values.
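The three binding times can be sketched side by side. Python has no separate compile step of the kind the book has in mind, so the named constant below only stands in for compile time binding; the routine names and the settings dictionary are hypothetical.

```python
MAX_LOGIN_ATTEMPTS = 3  # named constant, defined once for the whole module


def attempts_left_code_time(attempts_used):
    # Code time binding: the literal 3 is hard-coded where it is used.
    return 3 - attempts_used


def attempts_left_constant(attempts_used):
    # Binding through a named constant: one place to change the value.
    return MAX_LOGIN_ATTEMPTS - attempts_used


def attempts_left_run_time(attempts_used, settings):
    # Run time binding: the limit is supplied while the program is running.
    return settings["max_login_attempts"] - attempts_used
```

The later the binding, the more flexible the code, at the cost of a little more complexity in the routine.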
Relationship between Data Structures and Control Structures
- Data Structured design: Modify the Input to get the Output.
- Sequential Data. This refers to a list of sequential statements and actions.
- Selective Data. This refers to IF statements.
- Iteration. This refers to repeated actions, such as for..next, do..until.
Use each Variable for one purpose only
It is confusing to use a single variable for more than one purpose. It
is possible to do so subtly, but it is not recommended.
Avoid variable names with hidden meanings. The meaning may be clear to
the developer, but not to anybody else.
Declare all used variables and remove declarations of unused variables.
Global Variables
Global variables are tricky. They can be very useful, but also extremely risky to use.
- Common Problems
- Inadvertent changes to global data.
- Aliasing. This is a strange situation, where a global variable is
passed to a routine as a parameter, and the routine changes the global
data.
- Re-entrant code problems. These can occur when multiple threads of an
application use the same global data.
- Hinders code re-use. A routine can't be plugged in if global data
needs to be set up.
- Can't Modularise. If global data is used, the system can't be
separated into modules.
- Reasons to use Global Data
- Preserves Global Values.
- Allows substitution of named constants in languages that do not
support this.
- Streamlines use of very commonly used data.
- Reduces "Tramp Data". This refers to data that is passed as a
parameter to a routine only so that it can be passed on to another
routine. It is not actually used within the first routine.
- How to Reduce the Risks of Using Global Data
- Only use Global Data if necessary.
- Differentiate between global data and module data.
- Use Naming Conventions.
- Create a list of Global Variables.
- Lock Global Variables when they are in use. Do this by implementing a
status variable for each global variable.
- Don't simply create one single global structure and pass it
everywhere. This is just pretending to not use Global Data.
- Use Access Routines instead of Global Data.
Advantages are:
- Centralised control.
- Ensures that References are firewalled.
- Promotes Information Hiding.
- Allows Abstract Naming.
Guidelines for use:
- All Routines must go through one Access Routine.
- Splits Module Data and Global Data.
- Builds a level of abstraction into the code.
- Keeps accesses at the same level of abstraction.
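A minimal Python sketch of access routines follows. The module data and routine names are hypothetical; the leading underscore is only a convention that marks the variable as private to the module.

```python
_current_user = None  # module data; other code should not touch it directly


def set_current_user(name):
    """Single point of control for changing the current user."""
    global _current_user
    if not name:
        raise ValueError("user name must not be empty")
    _current_user = name


def get_current_user():
    """Single point of control for reading the current user."""
    if _current_user is None:
        raise RuntimeError("no user is logged in")
    return _current_user
```

Because every read and write goes through these two routines, validation and any later change to how the value is stored stay in one place.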
Summary
- Use Variables correctly, for one purpose only.
- Name variables appropriately.
Fundamental Data Types
General Numbers
General things to remember:
- Use Named Constants instead of literals.
- Use 0s and 1s if necessary, for initialising loops and incrementing.
- Anticipate divisions by zero.
- Make type conversions obvious.
- Don't make comparisons between different data types.
Integers
- Check integer division for lost remainders and decimal values.
- Check for overflow values.
- Check for overflows in intermediate results.
Floating Point Numbers
- Avoid + and - operations on numbers with vastly different magnitudes.
(5,000,000.01 + 0.01)
- Avoid exactly equal comparisons. Instead define an acceptable range
of accuracy, such as + or - 0.01.
- Check for rounding errors.
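The comparison advice can be sketched in Python as follows. The tolerance of 0.01 is the one suggested above, and the helper name is hypothetical.

```python
TOLERANCE = 0.01  # acceptable range of accuracy, as suggested above


def nearly_equal(a, b, tolerance=TOLERANCE):
    """Treat two floating point values as equal if they differ by at most tolerance."""
    return abs(a - b) <= tolerance


total = 0.1 + 0.2
exactly_equal = (total == 0.3)           # False, because of a rounding error
close_enough = nearly_equal(total, 0.3)  # True
```

Python's standard library offers a similar check in `math.isclose`, which accepts relative and absolute tolerances.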
Characters and Strings
Use Named constants.
- These allow for changes more easily.
- Allow for changes in international versions.
- Strings can take up a lot of memory. These problems are easier to
solve if the string values are independent of the code.
- Cryptic values can be easier to understand with abstract naming.
Booleans
- Use these to document programs.
- Use them to simplify complex evaluations.
- They can be defined if needed.
Enumerations
- These improve readability.
- Also improve reliability.
- Programs can be modified more easily.
- These can be used as alternatives to booleans for system state
variables. This is especially useful where a new state is introduced.
- Invalid values can be checked for.
- When using enumerations, ensure that the first value is invalid/unset.
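As a sketch of the last point, here is a Python enumeration whose first value is reserved for "unset". The states and the routine are hypothetical.

```python
from enum import Enum


class OrderState(Enum):
    UNSET = 0       # first value is deliberately invalid
    PENDING = 1
    SHIPPED = 2
    DELIVERED = 3


def describe(state):
    """Return a readable label, rejecting a state that was never set."""
    if state is OrderState.UNSET:
        raise ValueError("order state was never set")
    return state.name.lower()
```

A variable that was initialised to zero and never assigned a real state is then caught by the check, and a new state can be added without disturbing the existing values.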
Named Constants
- Use them in Data declarations, to define lengths, array limits etc.
- Avoids using literals.
- These can be simulated with global variables if named constants are not supported.
- Use them consistently.
Arrays
- Ensure that all indexes are within the bounds of the array.
- Consider the use of sequential data structures, such as stacks and
queues if dynamic access is not required.
- Check the end points of the array.
- If the array is multidimensional, ensure that the subscripts are
used in the correct order.
- Ensure that the correct subscript value/variable is used.
- Add an extra element at the end of the array.
Summary
Always ensure that the correct data type is used for the correct
purpose.
©John Mann, 2000