Code Complete
Steve McConnell
Microsoft Press, 1993
Summary
Code Complete is a programming classic. It is 900 pages of intelligent and fascinating discussion about coding software.
Introduction
This book is not about how to program in a particular language, or how to use SSADM or other methodologies. It is about improving the way that you work throughout the development cycle.
The author describes the development process as everything from the detailed technical design stage right through to integrated testing. This is the jurisdiction, or domain, of the programmer.
With an immense bibliography, the author has combined theories, common practices and hard data to make his points. Sometimes, however, he completely contradicts commonplace practices. Here is a man who supports established practices, but backs his own convictions where they differ. Subjectivity surfaces on occasion with statements such as "If you come across one of these clowns, ask him the following". On the whole, there is the distinct impression that the author was motivated to write this book in order to help programmers and related IT staff.
References to personal experience punctuate the bibliographic references to put the point across. Clearly this man knows his stuff, and is not simply trying to peddle an unproven point of view of his own. The author has developed his techniques over time, and has then decided to write a book on the subject. Where the purpose of his methodologies is nebulous, he backs them up with hard data.
Summary of Points
Here is a summary of the points that I believe are particularly of note.
1. Understanding Software Construction
1.1. Metaphors
1.2. Writing Code
1.3. Summary
1.4. Forwarding Actions
2. Prerequisites to Construction
2.1. Importance
2.2. Problem Definition
2.3. Requirements
2.4. Architecture
2.5. Language
2.6. Programming Conventions
2.7. Time to spend on Pre-Requisites
2.8. Adapting Pre-Requisites
2.9. Summary
2.10. Forwarding Actions
3. Building a Routine
3.1. Summary of Steps
3.2. PDL for Pros
3.3. Design the Routine
3.4. Code the Routine
3.5. Formal Checking
3.6. Summary
3.7. Forwarding Actions
4. High Quality Routines
4.1. Valid Reasons to Create a Routine
4.2. Good Routine Names
4.3. Strong Cohesion
4.4. Loose Coupling
4.5. How Long Can a Routine be?
4.6. Defensive Programming
4.7. Use of Routine Parameters
4.8. Consider the use of Functions
4.9. Summary
4.10. Forwarding Actions
5. Modules
5.1. Modularity: Cohesion and Coupling
Cohesion
Coupling
5.2. Information Hiding
5.2.1. Secrets and the Right to Privacy
5.2.2. Common Secrets
5.2.3. Barriers to Information Hiding
5.3. Good Reasons to Create a Module
5.4. Summary
5.5. Forwarding Actions
6. High Level Design
6.1. Introduction to Software Design
6.2. Structured Design
6.2.1. Choosing Components to Modularise
6.3. Object-Oriented Design
6.3.1. Key Ideas
6.3.2. Design Steps
6.3.3. Typical Components
6.4. Comments on Popular Methodologies
6.4.1. When to use Structured Design
6.4.2. When to use Information Hiding
6.4.3. When to use Object Oriented Design
6.5. Round Trip Design
6.6. Design is a Heuristic
6.7. How to solve it
6.8. Summary
6.9. Forwarding Actions
7. Creating Data
7.1. Reasons to create your own Data Types
7.2. Guidelines for creating Data Types
7.3. Making Variable Declarations Easy
7.4. Guidelines for Initialising Data
7.5. Summary
7.6. Forwarding Actions
8. The Power of Data Names
8.1. Considerations in choosing good names
8.1.1. The Effect of Scope on Variable names
8.1.2. Computed-Value Qualifiers in Variable Names
8.2. Naming Specific Types of Data
8.3. The Power of Naming Conventions
8.4. Informal Naming Conventions
8.5. Hungarian Notation
8.6. Short Names
8.7. Kinds of Names to avoid
8.8. Summary
8.9. Forwarding Actions
9. General Issues in Using Variables
9.1. Scope
9.2. Persistence
9.3. Binding Time
9.3.1. Code Time Binding
9.3.2. Compiler Time Binding
9.3.3. Run Time Binding
9.4. Relationship between Data Structures and Control Structures
9.5. Use each Variable for one purpose only
9.6. Global Variables
9.6.1. Common Problems
9.6.2. Reasons to use Global Data
9.6.3. How to Reduce the Risks of Using Global Data
9.6.4. Use Access Routines instead of Global Data
9.7. Summary
9.8. Forwarding Actions
10. Fundamental Data Types
10.1. General Numbers
10.2. Integers
10.3. Floating Point Numbers
10.4. Characters and Strings
10.5. Booleans
10.6. Enumerations
10.7. Named Constants
10.8. Arrays
10.9. Summary
10.10. Forwarding Actions
11. Organising Straight Line Code
11.1. Statements that must be in a specific order
11.2. Statements whose order does not matter
12. Using Conditions
12.1. IF statements
12.2. CASE statements
13. Controlling Loops
13.1. Select the kind of Loop to use
13.2. Controlling the Loop
13.3. Creating Loops from the inside out
13.4. Correspondence between loops and arrays
14. Unusual Control Structures
14.1. GOTO
14.2. RETURN/EXIT
14.3. Recursion
15. General Control Issues
15.1. Boolean Expressions
15.2. Compound Statements
15.3. NULL Statements
15.4. Taming Dangerously Deep Nesting
15.5. The Power of Structured Programming
15.6. Control Structures and Complexity
16. Layout and Style
16.1. Fundamentals
16.2. Layout Techniques
16.2.1. White Space
16.2.2. Parentheses
16.3. Layout Styles
16.3.1. Pure Blocks
16.3.2. Endline Layout
16.3.3. BEGIN-END Block Boundaries
16.4. Laying Out Control Structures
16.4.1. Fine Points of Formatting Control Structures
16.4.2. Other Considerations
16.5. Laying Out Individual Statements
16.5.1. Statement Length
16.5.2. Using spaces for clarity
16.5.3. Aligning Related Statements
16.5.4. Format Continuation Lines
16.5.5. Use Only One Statement Per Line
16.5.6. Laying Out Data Declarations
16.6. Laying Out Comments
16.7. Laying Out Routines
16.8. Laying Out Files, Modules and Programs
16.9. Summary
16.10. Forwarding Actions
17. Self-Documenting Code
17.1. External Documentation
17.2. Programming Styles as Documentation
17.3. Commenting
17.3.1. Types of Comments
17.3.2. Commenting Efficiency
17.4. Commenting Techniques
17.4.1. Individual Lines
17.4.2. Commenting Paragraphs
17.4.3. Commenting Data Declarations
17.4.4. Commenting Control Structures
17.4.5. Commenting Routines
17.4.6. Commenting Files, Modules and Programs
17.4.7. Using the "Book" Paradigm for commenting
17.5. Summary
17.6. Forwarding Actions
18. Programming Tools
18.1. Design Tools
18.2. Source Code Tools
18.2.1. Editing
18.2.2. Browsing
18.2.3. Analysing Code Quality
18.2.4. Restructuring Source Code
18.2.5. Version Control
18.2.6. Data Dictionaries
18.3. Executable Code Tools
18.3.1. Code Creation
18.3.2. Debugging
18.3.3. Testing
18.3.4. Code Tuning
18.4. Tool-Orientated environments
18.5. Building your own tools
18.6. Summary
18.7. Forwarding Actions
19. How Program size Affects Construction
19.1. Effect of Project Size on Development Activities
19.2. Effect of Project Size on Errors
19.3. Effect of Project Size on Productivity
20. Managing Construction
20.1. Encouraging Good Coding
20.1.1. Considerations in Setting Standards
20.1.2. Techniques
20.2. Configuration Management
20.2.1. What is configuration Management?
20.2.2. Software Design Changes
20.2.3. Software Code changes
20.3. Estimating a Construction Schedule
20.3.1. Approaches
20.3.2. Establish Objectives
20.3.3. Influences on Schedule
20.3.4. Estimation vs. Control
20.3.5. What to do if you are behind
21. Software Metrics
21.1. Treating Programmers as people
21.2. Variations in performance and quality
21.3. Religious Issues
21.4. Physical Environment
21.5. Summary
22. The Software Quality Landscape
22.1. Characteristics of Software Quality
22.2. Techniques for improving Software Quality
22.3. Relative Effectiveness of techniques
22.3.1. Percentage of Errors found
22.3.2. Cost of finding defects
22.3.3. Cost of fixing defects
22.4. When to do a QA
22.5. General Principle of Software Quality
22.6. Summary
22.7. Forwarding Actions
23. Reviews
23.1. The role of reviews
23.1.1. Reviews complement other QA techniques
23.1.2. Reviews remove corporate structure
23.1.3. Reviews assess Quality and Progress
23.1.4. Reviews also apply before construction
23.2. Inspections
23.2.1. Roles During Inspections
23.2.2. Procedure for Inspections
23.3. Other kinds of reviews
23.3.1. Walkthroughs
23.3.2. Code Reading
23.3.3. "Dog and Pony shows"
24. Unit Testing
24.1. The Role of Unit Testing
24.2. Testing During Construction
24.3. The Testing Bag of Tricks
24.3.1. Incomplete Testing
24.3.2. Structured Basis Testing
24.3.3. Data Flow Testing
24.3.4. Equivalence Partitioning
24.3.5. Error Guessing
24.3.6. Boundary Analysis
24.3.7. Classes of Bad Data
24.3.8. Classes of Good Data
24.3.9. Use test cases that allow easy manual checks
24.4. Typical Errors
24.4.1. Which routines contain the most errors?
24.4.2. Errors by Classification
24.4.3. Proportion of Errors Resulting from faulty construction
24.4.4. How many errors should you expect to find
24.4.5. Testing itself
24.5. Test Support Tools
24.5.1. Scaffolding
24.5.2. Results Comparators
24.5.3. Test Data Generators
24.5.4. Coverage Monitors
24.5.5. Symbolic Debuggers
24.5.6. System Perturbers
24.5.7. Error Databases
24.6. Improving Testing
24.7. Planning to test
24.7.1. Re Testing
24.8. Keeping Test Records
24.9. Summary
24.10. Forwarding Actions
25. Debugging
Overview of Issues
25.0.1. Role of Debugging
25.0.2. Variations in Debugging Performance
25.0.3. Errors as Opportunities
25.0.4. An Ineffective approach
25.0.5. Debugging by superstition
25.1. Finding an Error
25.1.1. Use a Scientific Method
25.1.2. Tips on Finding Errors
25.1.3. Syntax Errors
25.2. Fixing an Error
25.3. Psychological Considerations
25.4. Debugging Tools
25.5. Summary
26. System Integration
26.1. Importance of the Integration Method
26.2. Phased vs. Incremental Integration
26.2.1. Phased Integration
26.2.2. Incremental Integration
26.2.3. Benefits of Incremental Integration
26.3. Incremental Integration strategies
26.3.1. Top Down Integration
26.3.2. Bottom Up Integration
26.3.3. Sandwich Integration
26.3.4. Risk Orientated Integration
26.3.5. Feature Orientated Integration
26.4. Evolutionary Delivery
26.4.1. General Approach
26.4.2. Benefits
26.4.3. Relationship of Evolutionary Delivery to Prototyping
26.4.4. Limitations
26.5. Summary
26.6. Forwarding Actions
27. Code Tuning Strategies
27.1. Performance Overview
27.1.1. Quality Characteristics and Performance
27.1.2. Performance and Code Tuning
27.2. Introduction to Code Tuning
27.2.1. Old Wives' Tales
27.2.2. The Pareto Principle
27.2.3. Measurement
27.2.4. Compiler Optimisations
27.2.5. When to use Code Tuning
27.2.6. Iteration
27.3. Common Sources of Inefficiency
27.4. Summary of Approach to Code Tuning
27.5. Summary
28. Code Tuning Techniques
28.1. Loops
28.2. Logic
28.3. Data Transformation
28.4. Expressions
28.5. Routines
28.6. Re-Code in Assembler
29. Software Evolution
29.1. Guidelines
29.2. Making New Routines
30. Themes in Software Craftsmanship
30.1. Conquer Complexity
30.1.1. Ways to reduce complexity
30.1.2. Hierarchies and Complexity
30.1.3. Abstraction and Complexity
30.1.4. Pick your Process
30.2. Write programs for people first, and computers second
30.3. Focus your attention with the help of conventions
30.4. Programming in terms of the problem domain
30.5. Watch for "Falling Rocks"
30.6. Iterate
30.7. "Thou Shalt Render Religion and Software Asunder"
30.7.1. Software Oracles
30.7.2. Eclecticism
30.7.3. Experimentation
1. Understanding Software Construction
1.1. Metaphors
This section notes the importance of metaphors in software construction. Metaphors are a highly useful means of communicating ideas and concepts.
When technical people need to communicate with non-technical people, a language barrier typically separates them. This is because non-technical people do not have a deep understanding of the fundamentals of software construction, nor should they. It is nevertheless important for technical people to be able to communicate with them.
A metaphor makes this possible: it communicates an idea to another person by using something that both people already understand.
If a technical person is trying to explain an idea to a non-technical person, metaphors and analogies can help to establish a clear understanding of the concept.
The use of metaphors can be referred to as “Modelling”.
Many different metaphors are used when interpreting technical issues. It is important to remember that there is never a “perfect” metaphor, so there is no definitive “right” or “wrong” metaphor. Over time, new metaphors will inevitably arise that describe technical issues better than older ones. This does not mean that the old metaphor was wrong and the new one is right. One is simply “better” and the other “worse”.
Software metaphors are not algorithms; they are heuristics. To define:
· An algorithm is a definite method that gives an answer.
· A heuristic is an approach that helps to find an answer.
1.2. Writing Code
Many errors that occur in software construction are conceptual errors. They are problems with what is being achieved, not with how it is being achieved.
Quotes:
· A report issued in 1980 stated that on average, 50% of software development happens after the first release.
· A book in 1975 stated that programmers should “plan to throw one away”.
These quotes imply that a lot of work proves to be pointless, and is typically discarded and started again. This view comes from the old school of programming.
A new approach to software construction should be considered. Take the metaphor of farming. Code is generated gradually, a little at a time. The author states that “If you buy into this idea, you will end up talking about Fertilising the system plan, Thinning the detailed design, increasing code yield through effective land management and Harvesting code.”
This metaphor is useful, but a better one is that of oyster farming. An oyster makes a pearl by the gradual addition of material around an irritant. In the same way, a computer program can initially be developed to perform the most basic function possible, as long as the full cycle of the program is followed. This is essentially a skeleton program. From here, more functionality can be added little by little. The whole process is incremental, and is referred to as accretion.
This can help to identify conceptual or design issues earlier than developing the entire system on a screen by screen, or module by module, basis. It is an early check on the integrity of the design.
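The skeleton idea can be sketched in a few lines. This is only an illustration, written in Python for brevity; the word-counting task and the function names are hypothetical and do not come from the book. Every stage of the cycle exists from the start, and each can then be grown independently.

```python
# Hypothetical skeleton program: every stage exists, but each does the least it can.


def read_input(raw_text):
    """Input stage: split the text into words (no validation yet)."""
    return raw_text.split()


def process(words):
    """Processing stage: count the words (richer analysis can be added later)."""
    return len(words)


def write_output(word_count):
    """Output stage: produce a plain one-line report."""
    return f"{word_count} words"


def run(raw_text):
    """The full cycle, end to end, using the simplest version of each stage."""
    return write_output(process(read_input(raw_text)))


print(run("plan to throw one away"))
```

Because the whole cycle runs from the first day, a flaw in how the stages fit together shows up before much code depends on it.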
1.3. Summary
· Metaphors help to break down the language barrier between technical and non-technical people.
· Metaphors explain one thing, using something else as a model.
· There is no “Right” and “Wrong”, only “Worse” and “Better”.
· Metaphors can help to educate non-technical people.
· The Accretion approach to software development will help to identify any problems with the integrity of the design sooner than other methods.
1.4. Forwarding Actions
· Use the Accretion approach to development.
2. Prerequisites to Construction
2.1. Importance
Why is software written? The answer should be “To solve a problem”. At the most basic level, this should be the goal of a software product. With this in mind, it is important to have a few essentials in place before development starts. At the very least, the reason for writing the software must be known.
Prerequisites are a list of items that the developer really should have before coding starts. Knowing what the problem is does not always amount to everything the developer needs before coding starts.
Identifying all of the prerequisites before coding starts allows problems and issues to be resolved before they are encountered, so that they do not hold up the development process.
There are two main contributors to the problem of coding starting too early:
· Developers.
Developers have a tendency to want to start coding as soon as possible. This should be corrected with self-discipline to ensure that the prerequisites are met first.
· Managers.
Managers want to see Developers doing what they’re paid for – Developing. If developers are seen writing documents or doing other non-coding activities, managers may demand that they start coding.
Developers have had a tendency to think “Well, he must know what he’s talking about…”, and then get the grief when problems arise later.
The author describes an occasion when he was working on a US military project. The Project Manager was a large Army General. He arrived at the office one day and wanted to see some code. He was told that the project was in its requirements phase, and that everyone was writing documents and talking to customers. Nevertheless, the general wanted to see some code. He went around all 100 developers until he found one writing what looked like code. In fact, the developer was writing a document-formatting utility. However, because the general wanted to see some code, and found what looked like code, he went away happy.
If a manager demands that a developer starts coding, there are four options identified by the author:
· Say “No”. Refuse to start coding before the prerequisites are in place. “This is dependent on your relationship with your manager, and the state of your bank account.”
· Pretend. Dig out some old program listings and place them on your desk. Meanwhile do the prerequisites. Let the manager think that you’re coding.
· Educate the manager. Explain the reasons for the prerequisites. “There are few enough managers that understand how things should be done, and why”.
· Change Jobs. “There is a shortage of good programmers”.
A useful metaphor is to think of the development process as a food chain. It passes from requirements, through to architecture, and down to the developers. If the environment is polluted, that is to say, the requirements are erroneous, these problems pass down to the developer.
Problems detected in the requirements phase are much less costly than ones discovered during development.
A report identified that if an error in requirements is found in development or maintenance, then it will cost 50-200 times more than if it is found during requirements or design.
IBM issued the following table to show the increase in cost of resolving errors during particular phases.
| Time Detected | Introduced in Requirements | Introduced in Design | Introduced in Coding |
| Analysis | 1x | - | - |
| Design | 2x | 1x | - |
| Testing (Passive) | 5x | 2x | 1x |
| Testing (Structural) | 15x | 5x | 2x |
| Testing (Functional) | 25x | 10x | 5x |
This table clearly communicates the reason for fulfilling prerequisites before coding: it costs far less to start things correctly than to keep changing them.
Another consideration is the time involved, and late nights that may arise. Getting it right at the start makes life a lot easier.
2.2. Problem Definition
As mentioned earlier, software is written to solve a problem. Ensure that the problem is clearly articulated. The problem definition should be just that: a definition of a problem. It should not be a statement of how something is to be improved; that is an entirely different statement.
The problem definition is the foundation of the software development.
A clear definition ensures that the software does not solve a different problem from the one that is required.
2.3. Requirements
Once the problem is identified, the requirements of the solution can be identified. As mentioned earlier, it costs more to fix fundamental requirements problems further down the development cycle than it does to correct them at the requirements stage.
Getting the requirements right will reduce the number of changes later.
The following table describes the increases in cost of an error in requirements:
| Stage | Cost Ratio |
| Requirements | 1x |
| Design | 5x |
| Coding | 10x |
| Testing | 50x |
| Maintenance | 100x |
“Stable Requirements” are a myth, or at least the developer’s equivalent of the Holy Grail. A customer who never changes their mind is unheard of. A report concluded that, on an average system of 1 million lines of code, 25% will be changed.
The trick is to manage the customer. The author says “When customers get an idea for a new feature, their blood thins, they become excited and giddy, and completely forget the many meetings that they have had before. The best way to deal with them is to say ‘Gee that sounds like a good idea, I’ll write a specification and change control for it, and provide you with a revised schedule and cost estimate’. The words ‘Schedule’ and ‘Cost’ are much more sobering than a cup of black coffee and a cold shower under those circumstances”.
When presented with these circumstances, there are several options available.
· Implement Change Control Procedures.
· Accommodate the changes by using short life-cycles, and prototypes.
· Dump the Project.
The third option is not always appropriate. However, it is worth identifying the hypothetical state that would justify dumping the project, and then working out how close things are to that state.
2.4. Architecture
The architecture provides a high level design of the system. It does not concern itself with issues such as screen layouts, report layouts and field definitions. The purpose of the architecture is to test the conceptual integrity of the system.
The main components of architecture are:
· Program Organisation. How does the system fit together? Where are the boundaries of the modules?
· Change Strategy. This shows the allowance made for new features: how easy they will be to implement, and which ones are likely to come up.
· Buy vs. Build. What is to be developed, and what is to be bought from a 3rd party. The biggest gains in software come from re-use of software.
· Major Data Structures. This will determine what data is to be held. Alternatives are to be included, and justifications for the choices.
· Major Objects. Containing Interactions, Hierarchies, States and Persistence. Include alternatives and justifications.
· Key Algorithms. Again, specify alternatives and justification. Also state any assumptions made.
· Generic Functionality. State which routines are modularised. Which forms, menus and reports have the same look/feel and functionality.
· Error Processing. Over engineering – make modules more robust than they need to be. Include Assertions – Self checking code, e.g. if a number will never be more than 256, assert an error if it is larger. Assumptions must be included.
· Fault Tolerance. Detecting Errors and recovering. Choices are typically: Backing up through steps until no errors have occurred; Auxiliary code, to be used if an error occurs; Replace an erroneous value with a “safe” value that will not corrupt any other parts of the system.
· Performance. List goals.
· General Quality. Discuss modules and data. Don’t do things in a certain way “because we’ve always done it that way”. Make software environment independent. Don’t compromise time and quality in one area for another. Nothing about the system should make you uneasy, if it does, discuss it.
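The assertion and “safe” value ideas from the Error Processing and Fault Tolerance items can be sketched as follows. This is a minimal illustration in Python, not code from the book: the limit of 256 comes from the example above, while the lower bound of 0, the default safe value and the function names are assumptions made for the sketch.

```python
MAX_READING = 256  # upper limit taken from the "never more than 256" example
MIN_READING = 0    # assumed lower limit, for illustration only


def checked_reading(value):
    """Assertion: fail loudly if the stated assumption about the value is broken."""
    assert MIN_READING <= value <= MAX_READING, f"reading out of range: {value}"
    return value


def safe_reading(value, safe_value=0):
    """Fault tolerance: replace an erroneous value with a harmless 'safe' one."""
    if MIN_READING <= value <= MAX_READING:
        return value
    return safe_value
```

The first routine suits development, where a broken assumption should stop the program at once; the second suits situations where the system must keep running with a value that cannot corrupt anything downstream.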
2.5. Language
It is a good idea to define the language that is to be used to develop the software. Using the wrong tool for the job is an inefficient use of time. Use a language that will help to get the job done.
Using the right language increases productivity, as does using a familiar language. It has also been shown that using high level languages is five times as productive as using low level languages. This is due to the nature of the languages.
2.6. Programming Conventions
It is good practice to identify programming standards for a system. In-house development usually has its own set of standards. A system written to good programming standards is always easier to maintain than one that is not.
2.7. Time to spend on Pre-Requisites
Project planning should typically take 20%-30% of the time. This does not include detailed design, which is done as part of development.
It is important to allow for uncertainty when planning. It is impossible to schedule time for tasks that are uncertain.
2.8. Adapting Pre-Requisites
Every project is likely to be different, so it is up to the developer to decide how formal the pre-requisites should be, based on the current project.
2.9. Summary
· Developers need to know what is required before development starts.
· Correcting errors in design is much cheaper at the start of the development process than at the end.
· Prerequisites help identify problems and issues before they arise in development.
2.10. Forwarding Actions
· Define the Problem.
· Get the requirements right at the start. Correcting them later costs more.
· Manage changes to requirements correctly.
· Ensure the Architecture is sound.
· Use the correct language.
· Define Programming standards.
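3. Building a Routine
3.1. Summary of Steps
Begin -> Design the Routine -> Check the Design -> Code the Routine -> Check the Code -> Done (returning to an earlier step as many times as necessary)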
3.2. PDL for Pros
PDL stands for Program Design Language. It is similar to pseudocode, but it is further removed from the programming language level. When using PDL, use the following guidelines:
· Use English statements.
· Avoid programming language syntax.
· Write PDL at the level of intent. Describe the meaning of the approach, not how the approach will look in the target language.
· Don’t make it too high-level. It should be simple to generate code from PDL in any language.
PDL statements should be easy to follow. A good test is to write a routine in PDL, then use the PDL statements as comments in the target language.
Benefits of using PDL:
· Reviews are easier. It is easier to correct the approach in PDL than in code.
· PDL designs can be refined again and again. It is easier to get the design right before coding starts.
· It is easier to change PDL than code.
· Commenting the final code is easier. PDL forms the basis of the comments.
· PDL is easier to maintain than other forms of design documentation. If the comments in the code match the PDL statements, then the two are in line.
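The practice of turning PDL statements into comments can be sketched as below. Python is used only for illustration, and the routine, its name and the error table are hypothetical rather than taken from the book. Each comment is a line of PDL written at the level of intent, and the code beneath it fills in the detail.

```python
from typing import Tuple

# Hypothetical lookup table, included only so that the sketch runs.
ERROR_MESSAGES = {1: "File not found", 2: "Disk full"}


def lookup_error_message(error_code: int) -> Tuple[bool, str]:
    # Assume the lookup fails until it is shown to succeed
    found = False
    message = "Unrecognised error code"

    # Look up the message that matches the error code
    if error_code in ERROR_MESSAGES:
        # The code is valid, so use its message and record success
        found = True
        message = ERROR_MESSAGES[error_code]

    # Give the caller both the status and the message
    return found, message
```

Read without the code, the four comments are the original design; read with it, they explain why each block exists.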
3.3. Design the Routine
· Check the Prerequisites. Ensure the purpose is clearly defined.
· Define the Problem. Architecture should provide this.
· Name the routine. This should clearly describe what it does.
· Decide how to test it.
· Efficiency. If design is modular, then routines can be replaced later with faster low-level components.
· Re-use code. This is efficient, and a great time saver.
· Write the PDL.
· Think about the data. If data manipulation is required, design the data structures.
· Check the PDL. Regress. Ask someone else. Review.
· Iterate. Review the design as many times as necessary. Investigate alternatives.
3.4. Code the Routine
· Write the Procedure Declaration. Name and Parameters.
· Comment introduction, description.
· Use the PDL for comments.
· Fill in the code between comments.
· Informally (mentally) check the code.
· Clean up leftovers. These include unused parameters, inaccurate variable names, infinite loops, off-by-one errors and documentation.
3.5. Formal Checking
· Mentally check the routine for errors.
· Compile the routine. This will highlight errors and warnings. The goal is to have no errors or warnings.
· Use a debugger to test the routine.
· Remove errors.
3.6. Summary
· Design the routine outside of code (PDL).
· Naming should be clear.
· Re-use any available code.
· Loop the design cycle as many times as necessary.
· Code the routine when the design is sound.
3.7. Forwarding Actions
· When designing, regress, or ask someone else’s opinion.
· Explore alternatives.
· Check the routine informally, and formally.
4. High Quality Routines
4.1. Valid Reasons to Create a Routine
· Reducing complexity.
· Allowing re-use of code.
· Avoiding duplication of large chunks of code.
· Limiting the effect of change. Changing 20 lines of code in 10 places takes longer than changing them once.
· Hiding sequences. If possible, hide the order in which events happen. Decide the order once and place it in a routine, which is then called from wherever it is needed.
· Performance. Code is optimised in one place, rather than in 50.
· Centralise Control Points. Reading or writing to a file can be controlled from a single point.
· Hiding data structures. Complex data that is used only in one area will be contained.
· Hiding global data. If the data format is changed, the change only needs to be implemented once.
· Hiding pointer operations. These are difficult to follow. A routine would have a name that reflects the intent.
· Planning for additions to the program. These additions can use the same routines.
· Readability. Bundling code into a routine with a sensible name is more readable than a block of code.
· Improves portability.
· Isolates complex operations.
· Isolates non-standard language functions.
· Simplifies complicated Boolean tests.
Do not write a routine for the sake of it.
Even small routines are valid. Even 2 or 3 lines of code in a routine can be helpful.
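As a small illustration of the last reason in the list, and of a very short routine earning its place, the sketch below wraps a compound Boolean test in a named function. It is written in Python for illustration; the discount rule and all of the names are invented.

```python
def is_eligible_for_discount(age, is_member, order_total):
    """Give a compound test a single descriptive name (hypothetical rule)."""
    return is_member and (age >= 65 or order_total > 100.0)


def price_after_discount(age, is_member, order_total):
    """The calling code now reads as intent rather than as a Boolean puzzle."""
    if is_eligible_for_discount(age, is_member, order_total):
        return order_total * 0.5
    return order_total
```

If the rule changes, only the one-line routine needs to be edited, which also limits the effect of change.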
4.2. Good Routine Names
· Use a strong verb followed by an object.
· For a function, use a description of the return value.
· For routines that belong to objects, just use the verb, since the object itself supplies the noun.
· The name can be as long as necessary.
· Establish naming conventions.
4.3. Strong Cohesion
Strong cohesion means that each routine does one thing, and does it well. If it does more than one thing, then it may need to be split into other routines. If the name can clearly define what the routine does, then it is OK.
Acceptable Cohesion:
· Functional Cohesion. The routine does one thing.
· Sequential Cohesion. Routines do things in a pre-defined order (OpenFile, ReadFile, CloseFile). If all of the steps are required, then convert them to a single routine (GetFileData, which does all three steps).
· Communicational Cohesion. Two different types of operation are carried out on the same data. If this happens often, then it is OK, although the routine does not do just one thing.
· Temporal Cohesion. Things that happen at the same time (e.g. Startup). Have Startup call other routines that are functionally cohesive.
Unacceptable Cohesion:
· Procedural Cohesion. The routine was created to perform things in a certain order, for the sake of it.
· Logical Cohesion. Many different sections of code are in the same routine, and only one is run each time, based on a control parameter.
· Coincidental Cohesion. The operations have no reason to be in the same routine.
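Two of the acceptable cases can be sketched in Python. The sketch is illustrative only: `get_file_data` mirrors the GetFileData example above, while `load_settings`, `open_log` and their return values are hypothetical stand-ins.

```python
def get_file_data(path):
    """Wrap the open/read/close sequence so that callers cannot get the order wrong."""
    with open(path, encoding="utf-8") as handle:  # the file is closed on leaving the block
        return handle.read()


def load_settings():
    """Functionally cohesive: only supplies the settings (hard-coded for the sketch)."""
    return {"language": "en"}


def open_log():
    """Functionally cohesive: only creates an in-memory log."""
    return ["log opened"]


def startup():
    """Temporally cohesive, but it only delegates to functionally cohesive routines."""
    settings = load_settings()
    log = open_log()
    return settings, log
```

`startup` still groups things by when they happen, but each piece of real work sits in a routine that does one thing and is named for it.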
4.4. Loose Coupling
Coupling refers to the link between routines.
Good Coupling Criteria:
· Size. Small is better. The fewer parameters there are, the less connection there is.
· Intimacy. Data passed as parameters, as opposed to global data.
· Visibility. Point out the connections. Use a definite parameter list of values required.
· Flexibility. If a routine needs two parts of a data structure, pass the parts as parameters, don’t use the structure. This way, the routine can be used without the need for a data structure.
Levels of Coupling:
· Simple Data Coupling. Also called “Normal Coupling”. Individual values are passed as parameters, and are not structured. This is best.
· Data Structure Coupling. Values are passed as a single parameter, which is a data structure. This is not as flexible, but it is acceptable.
· Control Coupling. This is when one routine passes data to another that tells the second routine what to do. This is bad.
· Global Data Coupling. Data is passed via global data, not parameters. This is risky, because other parts of the system can access global data.
· Pathological Coupling. One routine uses another routine’s internal data. This fails all of the requirements for good coupling; there is a definite connection between the two.
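The difference between the first two levels can be sketched as follows. The example is in Python and is purely illustrative; the `Employee` structure, its fields and the pay calculation are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    """Hypothetical data structure, used only to illustrate the coupling levels."""
    name: str
    department: str
    hours_worked: float
    hourly_rate: float


def gross_pay(hours_worked, hourly_rate):
    """Simple data coupling: only the two values that are needed are passed."""
    return hours_worked * hourly_rate


def gross_pay_for(employee):
    """Data structure coupling: acceptable, but every caller must build an Employee."""
    return gross_pay(employee.hours_worked, employee.hourly_rate)
```

`gross_pay` can be re-used anywhere that two numbers are available, which is the flexibility criterion in practice; `gross_pay_for` is tied to the shape of `Employee`.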
4.5. How Long Can a Routine be?
This debate has been going on since routines were invented! Theoretically, 66 to 132 lines is sufficient. IBM once limited routine size to 50 lines. It is said that smaller is better; however, some reasons for larger routines are:
· The number of errors in a routine is inversely proportional to its size.
· Larger routines (65+ lines) are cheaper to develop, and have a lower fault rate.
· Small routines (under 145 lines) have 23% more errors than large ones.
· A student comprehension study showed that, compared with programs with no routines, programs with routines of 10 lines showed no increase in comprehension. However, programs with routines of 25+ lines showed a 65% increase.
· Routines of 100 to 150 lines are changed least.
4.6. Defensive Programming
Defensive programming means that the routine checks any parameters passed to it, to see whether they are damaging. The routine checks even though it is someone else’s fault that the data is bad.
· Allow for bad data. Check it.
· Use assertions. Check assumptions.
· Take responsibility for checking data, parameters and return values.
· Don’t produce an error if passed bad data.
· Decide how to handle bad parameters.
· Handle exceptions.
· Anticipate changes. If change is likely, there should be little impact.
· Remove debugging aids. Have debug and release versions.
· Use Debugging aids early.
· Contain Errors (Firewall). Hide information about routines, so that fewer assumptions are made about them and therefore fewer errors can be produced. Define “safe” areas of the program.
How much defensive code to leave in a released version:
· Leave checks for important errors.
· Remove trivial error checks.
· Remove code that causes the program to crash.
· Leave “Gracefully” crashing code, e.g. code that saves everything first.
· Ensure that error messages are helpful.
Be defensive about defensive programming. Otherwise programs will be slow.
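A minimal sketch of several of these points is given below, in Python. It is an illustration under stated assumptions rather than the book's own code: the routine, the `DEBUG` switch and the policy of returning 0 for a bad total are all hypothetical choices.

```python
import sys

DEBUG = True  # hypothetical switch: a release version would set this to False


def percent_complete(done, total):
    """Return progress as a whole percentage between 0 and 100."""
    # Allow for bad data: a total of zero or less cannot be divided into.
    if total <= 0:
        if DEBUG:
            # Debugging aid, reported only in the debug version.
            print(f"percent_complete: bad total {total!r}", file=sys.stderr)
        return 0  # chosen policy for bad parameters: a safe value, not a crash

    # Work on a local copy, clamped into the valid range.
    clamped_done = min(max(done, 0), total)
    result = clamped_done * 100 // total

    # Check the return value as well as the parameters.
    assert 0 <= result <= 100, f"result out of range: {result}"
    return result
```

The routine decides in one place how bad parameters are handled, so callers never receive a percentage outside 0 to 100.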
4.7. Use of Routine Parameters
Guidelines:
· Make sure that local variables match parameter variables (e.g. not: parameter = integer, local = byte).
· Order parameters by Input, Changeable, Output.
· For similar routines, use the same parameter order.
· Don’t pass parameters that are never used.
· Don’t use parameters as working variables. Copy them into local variables and work with those.
· Pass Status and Error parameters last.
· Document any assumptions about parameters.
· Limit the number of parameters. For psychological reasons, seven is the most to use. Use data structures if more are required.
· Use a naming convention for parameters.
· Pass only the parts of a data structure that a routine needs.
· Don’t assume anything about the parameter passing mechanism.
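Some of these guidelines can be sketched in Python. Python returns values rather than using output parameters, so the status is returned last instead of being passed last; that adaptation, the routine and its names are assumptions made for this illustration.

```python
from typing import List, Tuple


def normalise_scores(scores: List[float], top_score: float) -> Tuple[List[float], bool]:
    """Scale each score by top_score. Inputs come first; the status comes last."""
    # Documented assumption about the parameters: top_score must be positive.
    if top_score <= 0:
        return [], False

    # Work on a local copy, so that the caller's list is left untouched.
    working = list(scores)
    for index, score in enumerate(working):
        working[index] = score / top_score

    return working, True
```

The caller's `scores` list is unchanged after the call, so the parameter keeps the meaning it had on entry.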
4.8. Consider the use of Functions
Functions are routines that return a value. Include a check of the return value as well as of the parameters.
4.9. Summary
· Even though there are other types of cohesion, Functional Cohesion is the best to use. The routine performs one function, and is named accordingly.
· Loose Coupling. Limit interface to only the required parameters.
· Be defensive. Check parameters and return values.
· Use parameters well.
5. Modules
A module is a collection of data structures and routines. Modules are essentially “black boxes” containing large chunks of reusable code.
5.1. Modularity: Cohesion and Coupling
By following the rules of Cohesion and Coupling, “black box” routines can be created that are reusable and reliable. If all similar operations within a large program use the same modular routines, then maintainability is increased. It has also been shown that system comprehension is made easier.
It is not always possible to create the perfect module, because there may be shared data between modules.
Cohesion
Module Cohesion is similar to Routine Cohesion. Essentially, the principle of Modular Cohesion is to place together routines and data that belong together.