Refactoring
When developing an application, inelegantly structured sections can accumulate in the source code, impairing the usability and compatibility of the program. The solution is either to rewrite the source code from scratch or to restructure it in small steps. Many programmers and companies increasingly opt for code refactoring in order to optimize functioning software over the long term and make it clearer and more legible for other programmers.
During refactoring, the question arises as to which problem in the code should be solved with which method. Refactoring is now considered one of the basics of learning to code and is becoming more and more important. Which methods are used, and what are their advantages and disadvantages?
What is refactoring?
Programming software is a lengthy process that can involve multiple developers. Written source code is often revised, changed, and expanded during this work. As a result of time pressure or outdated practices, inelegant sections can accumulate in the source code. These are known as code smells. These weak spots that accrue over time endanger the usability and compatibility of the program. To prevent this gradual erosion and deterioration of the software, refactoring is necessary.
In principle, refactoring is similar to editing a book. The practice of editing does not create a completely new book, but instead a more understandable text. Just as editing involves various approaches such as cutting, reformulating, deleting, and restructuring, code refactoring encompasses a number of methods such as encapsulation, reformatting, and extraction in order to optimize code without changing its function.
This process is much more cost-effective than preparing an entirely new code structure. Especially in iterative and incremental software development, as well as agile software development, refactoring plays a major role, since programmers frequently need to alter software in these cyclical models. In this context, refactoring is a fixed step in the workflow.
When source code deteriorates: spaghetti code
First, it’s important to understand how code can age and mutate into spaghetti code. Whether due to time pressure, lack of experience, or unclear instructions, unnecessarily complicated commands make code harder to work with and can eventually lead to a loss of functionality. The faster-moving and more complex the area of application, the more quickly code deteriorates.
Spaghetti code refers to confusing, unreadable source code that programmers can interpret only with great difficulty. Simple examples of confusing code include superfluous jump statements (GOTO) that instruct the program to skip back and forth in the source code, or unnecessary for/while loops and if statements.
Projects involving many software developers are particularly susceptible to unclear source code. When code passes through many hands and if the original already contains some weak points, a growing mess resulting from “workaround solutions” can hardly be avoided, necessitating a costly code review. In severe cases, spaghetti code can jeopardize the entire development of software. If the problem gets that far, it may even be too late for code refactoring.
Code smells and code rot are not quite so disastrous. Over time, code can start to smell, metaphorically speaking, as inelegant sections accumulate. Difficult-to-understand parts become worse as other programmers intervene or add new code. If refactoring is not performed at the first signs of code smell, the source code will gradually lose functionality as a result of code rot.
The aim of refactoring
The intention behind refactoring is simply to achieve better code. Effective code allows new code elements to be integrated better without introducing new errors. Programmers who can effortlessly read the code will be able to familiarize themselves with a developing application faster and remove or avoid bugs more easily. Another goal of refactoring is to improve error analysis and the maintainability of software. The work of programmers reviewing code is therefore simplified considerably.
What sources of errors does refactoring solve?
The techniques applied in refactoring are as varied as the errors they’re intended to remove. Essentially, each refactoring is defined by the problem it addresses and comprises the steps required to shorten or remove a flawed solution. Sources of errors that can be resolved with refactoring methods include:
- Confusing or excessive code: Command strings and blocks are so long that external programmers will be unable to understand the internal logic of the software.
- Code duplications (redundancies): Unclear code often contains redundancies that have to be changed separately at each occurrence during maintenance, thereby wasting time and resources.
- Excessive parameter lists: Instead of passing an object to a method directly, its attributes are passed one by one in a long parameter list.
- Classes with too many functions: Classes with too many functions defined as methods (also known as god objects) make adjusting the software almost impossible.
- Classes with too few functions: Classes with so few functions defined as methods that they are unnecessary.
- Overly general code with special cases: Functions that handle overly specific special cases that hardly ever occur, if at all, and therefore make adding necessary extensions more difficult.
- Middle man: A separate class acts as a “middle man” between methods and various classes, instead of directing calls from methods directly to a class.
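As a minimal sketch of one of these smells, the excessive parameter list, the following Java snippet contrasts a method that receives an object's attributes one by one with a version that receives the object itself. The `Address` and `LabelPrinter` classes and the `formatLabel` methods are hypothetical names invented for this illustration.

```java
// Hypothetical example: all names are invented for illustration.
final class Address {
    final String street;
    final String city;
    final String postalCode;

    Address(String street, String city, String postalCode) {
        this.street = street;
        this.city = city;
        this.postalCode = postalCode;
    }
}

final class LabelPrinter {
    // Smell: the attributes of one object are passed individually.
    static String formatLabel(String street, String city, String postalCode) {
        return street + ", " + postalCode + " " + city;
    }

    // Refactored: the object itself is passed, so the signature stays
    // stable even if Address later gains further attributes.
    static String formatLabel(Address address) {
        return address.street + ", " + address.postalCode + " " + address.city;
    }

    public static void main(String[] args) {
        Address office = new Address("1 Main Street", "Springfield", "12345");
        System.out.println(formatLabel(office.street, office.city, office.postalCode));
        System.out.println(formatLabel(office));
    }
}
```

Both calls produce the same label; only the way the data reaches the method changes.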
What approach does refactoring involve?
Refactoring should always be performed before changing a program function. It ideally involves very small steps, with code changes tested using software development processes like test-driven development (TDD) and continuous integration (CI). In a nutshell, TDD and CI refer to the continuous testing of small, new code sections that programmers build, integrate, and test in terms of their functionality – often with automated test runs.
As a rule, only change the program internally and in small steps, without affecting its external behavior. After each change, run the automated tests if possible.
What techniques exist?
A range of refactoring techniques exists. A comprehensive overview can be found in the book Refactoring: Improving the Design of Existing Code by Martin Fowler, with contributions from Kent Beck. Here’s a brief summary:
Red-green development
Red-green development is a test-driven method of agile software development. It is used when a new function is to be integrated into existing code. Red stands for a test that is written before the new function is implemented and therefore fails at first. Green stands for the simplest possible code that makes this test pass. The code is then refactored while the tests stay green. In this way, an extension is built up under constant test runs that filter out defective code. Red-green development provides a foundation for continuous refactoring in continuous software development.
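The cycle can be sketched in plain Java without a test framework. This is only an illustration under stated assumptions: `PriceCalculator`, `grossCents`, and the hand-written test class are hypothetical names, and a real project would more likely use a test framework.

```java
// Hypothetical example: all names are invented for illustration.
final class PriceCalculator {
    // Green step: the simplest implementation that makes the test below pass.
    // Before this method existed, the test could not pass (red step).
    static int grossCents(int netCents, int vatPercent) {
        return netCents + netCents * vatPercent / 100;
    }
}

final class PriceCalculatorTest {
    // Written first: describes the desired behavior before it is implemented.
    static void grossPriceIncludesVat() {
        int actual = PriceCalculator.grossCents(1000, 19);
        if (actual != 1190) {
            throw new AssertionError("expected 1190 but was " + actual);
        }
    }

    public static void main(String[] args) {
        grossPriceIncludesVat();
        System.out.println("green: all tests passed");
    }
}
```

Once the test is green, the implementation can be restructured freely as long as the test keeps passing.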
Branching by abstraction
This refactoring method describes a gradual change to a system in which old code is converted into new sections step by step. Branching by abstraction is typically used for large applications that involve class hierarchies, inheritance, and extraction. An abstraction is introduced that initially remains linked to the old implementation. Other methods and classes are then linked to the abstraction, so that the old code section can be replaced behind it.
This often involves pull-up or push-down refactorings. They connect the abstraction to a new, better implementation and transfer the links to it. In doing so, they either move a method or attribute from a subclass up into its superclass (pull-up) or move it from a superclass down into the subclasses that need it (push-down).
You can then delete the old functions without endangering the overall functionality. With these small changes, the system works unchanged while you gradually replace inelegant code with neat code, section by section.
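The steps described above can be sketched roughly as follows. The interface `NameJoiner`, its two implementations, and the `Greeter` client are hypothetical names chosen for this example, and the `useNew` toggle is an assumption about how the switch might be made.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example: all names are invented for illustration.

// Step 1: introduce an abstraction that callers depend on.
interface NameJoiner {
    String join(List<String> parts);
}

// Step 2: the old implementation stays in place behind the abstraction.
final class LegacyNameJoiner implements NameJoiner {
    @Override
    public String join(List<String> parts) {
        String result = "";
        for (int i = 0; i < parts.size(); i++) {
            if (i > 0) {
                result += " ";
            }
            result += parts.get(i);
        }
        return result;
    }
}

// Step 3: the new implementation is built alongside the old one.
final class StreamlinedNameJoiner implements NameJoiner {
    @Override
    public String join(List<String> parts) {
        return String.join(" ", parts);
    }
}

final class Greeter {
    private final NameJoiner joiner;

    // The client only knows the abstraction, so the implementation
    // can be swapped without touching this class.
    Greeter(NameJoiner joiner) {
        this.joiner = joiner;
    }

    String greet(List<String> nameParts) {
        return "Hello " + joiner.join(nameParts);
    }

    public static void main(String[] args) {
        // Step 4: switch over in one place; LegacyNameJoiner can be
        // deleted once nothing refers to it any more.
        boolean useNew = true; // assumed toggle for this sketch
        NameJoiner joiner = useNew ? new StreamlinedNameJoiner() : new LegacyNameJoiner();
        System.out.println(new Greeter(joiner).greet(Arrays.asList("Ada", "Lovelace")));
    }
}
```

Because both implementations return the same result, the system behaves identically before, during, and after the switch.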
Composing methods
Refactoring is intended to make the methods in the code as legible as possible. Ideally, external programmers should be able to grasp the internal logic of a method when reading the code. There are a number of different techniques for composing methods efficiently. The aim of each change is to harmonize methods, remove redundancies, and split excessively long methods into separate sections, thereby opening them up to future changes.
Such techniques include:
- Method extraction
- Method inlining
- Removing temporary variables
- Replacing temporary variables with a query method
- Introducing descriptive variables
- Separating temporary variables
- Removing assignments to parameter variables
- Replacing a method with a method object
- Replacing an algorithm
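To illustrate method extraction, the first technique in the list, here is a small sketch. The `Invoice` class and its methods are hypothetical; the "before" and "after" variants are kept side by side only so that their results can be compared.

```java
// Hypothetical example: all names are invented for illustration.
final class Invoice {
    private final String customer;
    private final int[] itemCents;

    Invoice(String customer, int... itemCents) {
        this.customer = customer;
        this.itemCents = itemCents.clone();
    }

    // Before: one method mixes summing and formatting.
    String summaryBefore() {
        int total = 0;
        for (int cents : itemCents) {
            total += cents;
        }
        int rest = total % 100;
        return customer + " owes " + (total / 100) + "." + (rest < 10 ? "0" : "") + rest;
    }

    // After: each step has been extracted into a method with a descriptive name.
    String summaryAfter() {
        return customer + " owes " + formatCents(totalCents());
    }

    // Extracted: sums all item prices in cents.
    private int totalCents() {
        int total = 0;
        for (int cents : itemCents) {
            total += cents;
        }
        return total;
    }

    // Extracted: renders an amount in cents with two decimal places.
    private static String formatCents(int cents) {
        int rest = cents % 100;
        return (cents / 100) + "." + (rest < 10 ? "0" : "") + rest;
    }

    public static void main(String[] args) {
        Invoice invoice = new Invoice("Ada", 1250, 399);
        System.out.println(invoice.summaryBefore());
        System.out.println(invoice.summaryAfter());
    }
}
```

The extracted `totalCents()` method also stands in for the temporary variable `total`, which is the idea behind replacing a temporary variable with a query method.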
Moving attributes between classes
To improve code, you sometimes need to move attributes or methods between classes. The following techniques are used here:
- Move method
- Move attribute
- Extract class
- Inline class
- Hide delegate
- Remove middle man
- Introduce foreign method
- Introduce local extension
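As one example from this list, "hide delegate" might look like the following sketch. The `Employee` and `Department` classes are hypothetical and serve only to show the shape of the change.

```java
// Hypothetical example: all names are invented for illustration.
final class Department {
    private final String manager;

    Department(String manager) {
        this.manager = manager;
    }

    String getManager() {
        return manager;
    }
}

final class Employee {
    private final Department department;

    Employee(Department department) {
        this.department = department;
    }

    // Before "hide delegate", clients navigate the object graph themselves:
    //   employee.getDepartment().getManager()
    Department getDepartment() {
        return department;
    }

    // After: a delegating method hides Department from the client.
    String getManager() {
        return department.getManager();
    }

    public static void main(String[] args) {
        Employee employee = new Employee(new Department("Grace"));
        System.out.println(employee.getManager());
    }
}
```

If a class ends up with too many such delegating methods, the opposite refactoring from the list applies and the class in the middle is removed again.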
Data organization
This method aims to divide data into classes and keep them as neat and clear as possible. You should remove unnecessary links between classes, which impair the software functionality in the event of minor changes, and divide them into coherent classes.
Examples of techniques include:
- Encapsulating own attribute accesses
- Replacing own attributes with an object reference
- Replacing a value with a reference
- Replacing a reference with a value
- Linking observable data
- Encapsulating attributes
- Replacing a record with a data class
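A minimal sketch of encapsulating an attribute is shown below. The `Thermostat` class and the permitted range of 5 to 30 degrees are assumptions made purely for illustration.

```java
// Hypothetical example: all names and limits are invented for illustration.
final class Thermostat {
    // Before: "public double targetCelsius;" could be set to any value
    // by any other class.
    private double targetCelsius = 20.0;

    double getTargetCelsius() {
        return targetCelsius;
    }

    // After: all writes go through one method, so a rule can be enforced here.
    void setTargetCelsius(double value) {
        if (value < 5.0 || value > 30.0) {
            throw new IllegalArgumentException("target must be between 5 and 30 degrees Celsius");
        }
        targetCelsius = value;
    }

    public static void main(String[] args) {
        Thermostat thermostat = new Thermostat();
        thermostat.setTargetCelsius(22.5);
        System.out.println(thermostat.getTargetCelsius());
    }
}
```

Other classes no longer depend on how the value is stored, which removes one of the unnecessary links between classes.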
Simplifying conditional expressions
While refactoring, you should simplify conditional expressions as far as possible. The following techniques can be applied to this end:
- Decomposing conditions
- Merging conditional expressions
- Merging repeated instructions in conditional expressions
- Removing control flags
- Replacing nested conditions with guard clauses
- Replacing case distinctions with polymorphism
- Introducing null objects
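The guard-clause technique from this list can be sketched as follows. The `ShippingCost` class, its prices, and the membership rule are hypothetical; both variants are kept so that their identical behavior can be checked.

```java
// Hypothetical example: all names and prices are invented for illustration.
final class ShippingCost {
    // Before: nested conditions bury the normal case.
    static int costBefore(boolean isMember, int weightGrams) {
        int cost;
        if (weightGrams > 0) {
            if (!isMember) {
                if (weightGrams <= 1000) {
                    cost = 499;
                } else {
                    cost = 899;
                }
            } else {
                cost = 0;
            }
        } else {
            throw new IllegalArgumentException("weight must be positive");
        }
        return cost;
    }

    // After: guard clauses handle the special cases first and return early.
    static int costAfter(boolean isMember, int weightGrams) {
        if (weightGrams <= 0) {
            throw new IllegalArgumentException("weight must be positive");
        }
        if (isMember) {
            return 0;
        }
        if (weightGrams <= 1000) {
            return 499;
        }
        return 899;
    }

    public static void main(String[] args) {
        System.out.println(costBefore(false, 750));
        System.out.println(costAfter(false, 750));
    }
}
```

The flat version reads from top to bottom, and each condition can be understood without keeping track of the surrounding branches.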
Simplifying method calls
Method calls can be made simpler and easier to understand using the following techniques, for example:
- Renaming methods
- Adding parameters
- Removing parameters
- Replacing parameters with explicit methods
- Replacing error codes with exceptions
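The last item, replacing error codes with exceptions, might look like this in a simple case. The `Account` class, the error code `-1`, and the choice of `IllegalStateException` are assumptions for this sketch.

```java
// Hypothetical example: all names are invented for illustration.
final class Account {
    private int balanceCents;

    Account(int balanceCents) {
        this.balanceCents = balanceCents;
    }

    // Before: the caller must remember to check the return value
    // (-1 means failure, 0 means success).
    int withdrawWithErrorCode(int amountCents) {
        if (amountCents > balanceCents) {
            return -1;
        }
        balanceCents -= amountCents;
        return 0;
    }

    // After: failure is signalled by an exception and cannot be
    // silently ignored by the caller.
    void withdraw(int amountCents) {
        if (amountCents > balanceCents) {
            throw new IllegalStateException("insufficient funds");
        }
        balanceCents -= amountCents;
    }

    int getBalanceCents() {
        return balanceCents;
    }

    public static void main(String[] args) {
        Account account = new Account(1000);
        account.withdraw(400);
        System.out.println(account.getBalanceCents());
    }
}
```

With the exception-based variant, the normal path of the calling code is no longer interleaved with return-value checks.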
Refactoring example: renaming methods
In the following example, the method name in the original code does not make its purpose clear. The method is intended to return the ZIP code for an office address, but its name does not say so. To make the code clearer, it’s a good idea to rename the method as part of refactoring.
Before:
    // The name does not say whose postal code is returned.
    String getPostalCode() {
        return (theOfficePostalCode + "/" + theOfficeNumber);
    }
    // Call site
    System.out.print(getPostalCode());
After:
    // The new name states that the office postal code is returned.
    String getOfficePostalCode() {
        return (theOfficePostalCode + "/" + theOfficeNumber);
    }
    // Call site
    System.out.print(getOfficePostalCode());
Refactoring: advantages and disadvantages
Advantages | Disadvantages |
---|---|
Better comprehensibility facilitates maintenance and extensibility of the software | Imprecise refactoring could introduce new bugs and errors into the code |
Restructuring the source code is possible without altering the functionality | There is no clear definition of “neat code” |
Improved legibility makes the code easier for other programmers to understand | The customer often cannot see the improvement, since the functionality stays the same, i.e. the benefit is not self-evident |
Removed redundancies and duplications improve the effectiveness of the code | In larger teams working on refactoring, the coordination effort required could be surprisingly high |
Self-contained methods prevent local changes from having an effect on other parts of the code | |
Clean code with shorter, self-contained methods and classes is characterized by better testability | |
As a general rule, keep the two activities separate. When you add new functions, leave the structure of the existing source code unchanged. When you alter the structure of the source code, i.e. carry out refactoring, do not add any new functions.