Legacy Code: Treasure to Cherish, Pain to Maintain, Part 3
By Koos Huijssen
This blog is part of a series that is also available as a whitepaper.
In our practice as scientific software engineers at VORtech, we work for organizations that have computational software as an essential part of their intellectual capital. If successful, such a computational software package can exist for many years, even decades. Over time, it evolves. At the same time, however, the code base grows and becomes more and more difficult to manage. The effort required to maintain and extend this legacy code becomes extensive, possibly even to the point that, despite the enormous investments already made, a from-scratch replacement seems the only way forward.
This is part 3 of a series of blogs in which we discuss how to deal with legacy code: how to keep it in a manageable state, and how to revive it if necessary. In the first part, we presented our vision on legacy software: respect for its value, understanding of its history and present state, and a clear view of the shortcomings of aged software.
In the second part, we discussed typical undesirable features of legacy code that we have come across, and how these features have slipped into the code over time. In this final blog, we will describe what approaches can be taken to improve the code sustainability, and our general strategy to make the legacy treasure shine and prosper again.
Three approaches for dealing with legacy code
On a high level, three approaches can be selected to continue working with the legacy application:
- Encapsulation. The legacy kernel is treated as a black box: it is packed into a container that is not opened anymore. A wrapper, preferably written in a modern language such as Python, takes care of all access to and from the kernel.
Typically, this approach serves specific use cases such as parallel execution, interaction with other software, or a new user interface.
An advantage of this approach is that the risk of introducing errors and bugs is relatively low: the kernel code is not touched, and the focus lies on the development of the new feature. This also makes the initial investment smaller than for the approaches below.
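As a minimal sketch of such a wrapper, consider calling a compiled routine through Python’s ctypes. Here the standard C math library stands in for the legacy kernel (the library name, the wrapped routine and the validation rule are illustrative assumptions, not part of any real project):

```python
import ctypes
import ctypes.util

# Locate the shared library. In a real project this would be the compiled
# legacy kernel (e.g. a hypothetical "legacy_kernel.so"); here the C math
# library plays that role.
libname = ctypes.util.find_library("m") or "libm.so.6"
kernel = ctypes.CDLL(libname)

# Declare the C signature explicitly; ctypes assumes int arguments otherwise.
kernel.cbrt.argtypes = [ctypes.c_double]
kernel.cbrt.restype = ctypes.c_double

def cube_root(x: float) -> float:
    """Pythonic facade over a legacy routine: input validation and error
    handling live in the wrapper, the numerics stay inside the kernel."""
    if x < 0.0:
        # Illustrative validation rule enforced by the wrapper.
        raise ValueError("legacy kernel expects non-negative input")
    return kernel.cbrt(x)
```

The kernel itself is never modified: all modern conveniences (exceptions, type hints, documentation) live in the thin Python layer around it.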
On the other hand, this approach can in the end lead to a more complex (multi-application, cross-language) situation. And since the kernel remains untouched, no insight is gained into its functionality. The legacy code is ‘mummified’ rather than ‘revived’, and in the long term this may result in an even more inflexible and unworkable situation.
- Revival. The code is gradually refactored into a manageable state and towards higher flexibility. Typical steps in this refactoring process include (not necessarily in this order):
- Gradual modularization by separating functionality into program units such as the user interface, file communication, the computational kernel, and the administrative layer. Work towards the single-responsibility principle, where each section of code (a procedure or function) performs one task alone.
- Extension of the test suite. The test suite contains test cases that exercise the workflows, including all options that should be supported. It is the condensation of the desired functionality: it consolidates the application’s behavior and guarantees that this behavior is preserved in future developments.
- Potentially: migration of (parts of) the code to a modern language. With a higher degree of modularity, this can be done in a partial/gradual approach, which makes it more manageable than a full migration of the entire application.
- Incorporation of new demands. With the modernization of the code through modularization, improved testing and migration, present-day needs and future perspectives can be met at a significantly lower cost. Examples of such demands include:
- new applications, models, and features
- new interfaces or GUIs (for example web-based)
- thread-safety and concurrency
- improved performance
- running on new platforms, such as operation in the cloud or running on accelerators (GPUs)
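The test-suite step above often starts with characterization (or ‘golden master’) tests: the current output of a fixed workflow is recorded once, and every later refactoring must reproduce it. A minimal sketch, in which `legacy_simulation` is a hypothetical stand-in for running the real kernel on a fixed input:

```python
import json
import pathlib

def legacy_simulation(n_steps: int, dt: float) -> list:
    """Hypothetical stand-in for the legacy kernel; in practice this would
    execute the real application on a fixed input deck."""
    value, history = 1.0, []
    for _ in range(n_steps):
        value *= 1.0 - 0.5 * dt  # simple decay model as placeholder numerics
        history.append(value)
    return history

def check_against_golden_master(golden_path: pathlib.Path) -> None:
    """On the first run, today's behavior is recorded as the baseline;
    every later run must reproduce it within a stated tolerance."""
    result = legacy_simulation(n_steps=5, dt=0.1)
    if not golden_path.exists():
        golden_path.write_text(json.dumps(result))
    golden = json.loads(golden_path.read_text())
    assert len(result) == len(golden)
    assert all(abs(a - b) < 1e-12 for a, b in zip(result, golden))
```

Such tests pin down behavior before any refactoring starts, so that modularization, migration and new features can proceed without silently changing results.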
The main advantages of this approach are its gradual nature and its aim of reviving the legacy code. It is relatively easy to divide the development into separate phases and to manage these in terms of time and result. Along the way, insight into the purpose of the code grows, and the code documentation is gradually improved with it. The legacy is unlocked step by step, and its original luster is restored. More often than not, hidden bugs that have lurked in the code for a long time are uncovered and resolved. With the growing modularization, new features and demands can be integrated into the code much more easily.
Downsides of the revival approach are that the risk of introducing bugs and the effort of managing the process are higher than with the encapsulation approach. Moreover, it can feel like the ‘hard’ way: diving into the code to understand and refactor it. But this is often the only way, and the other approaches do not remove the need to dive into the code either, as both also require knowledge of what the code is actually supposed to do.
- Full replacement. This is typically the last resort, and it comes with high cost and high risk. However, especially when the legacy code depends on obsolete platforms or technologies, full replacement is sometimes unavoidable. The desired outcome is that, at the end of the replacement project, an application is realized that performs the same as the legacy application but is ready for the future.
It is typically hard to sell to management that the replacement, after a considerable investment, will (hopefully!) provide the same capabilities as the original application. For applications that are still actively being developed, this entails a period of double work, as the latest improvements must also be incorporated into the replacement. On top of this, the risks of running into technical complications and consequent project delays are high. Apart from successful endeavors, we have also seen projects that dragged on and in which the replacement software itself turned into legacy code, sometimes even before it was released.
How to relieve the pain in legacy code while keeping the treasure
To determine the best strategy for a specific application, several aspects must be clarified: the state of the code, its longer-term perspective, and the expectations and road map laid out by the application owner. Whenever we embark on a new software modernization project, we typically begin with an assessment of the application and its perspectives. We analyze the state of the code and its technical surroundings, and through discussions with the development team and the management we obtain clarity on the current and desired use, on today’s requirements and on tomorrow’s perspective. From this assessment, we draw up our recommendations for the best renovation strategy and present an action plan. This analysis, which we refer to as the ‘Model Scan’, provides a long-term vision for the legacy application and a guide for prioritizing and planning the application’s development. A Model Scan ranges from as little as two days when the situation is clear up to ten days for more complex situations.
Conclusion
With a revived code base, the legacy has turned into an active project that is ready to support new applications, features and platforms for the coming decades. As an added benefit, the renovation process has increased your insight into the internal operation of the application, which inspires further advancements. Moreover, the cost of maintaining and expanding the software package has been considerably reduced. The legacy code has regained its luster and has turned from an innovation blocker into an enabler.
Want to know more?
If you are dealing with a situation around legacy code, feel free to contact us. We can assist you with consultancy, project management and hands-on work. Contact us through info@vortech.nl or +31(0)15 285 01 25 for an appointment with one of our experts.