According to Robin Yeman, there are several challenges in building hardware-reliant cyber-physical systems, such as hardware lead times, organisational structure, common language, system decomposition, cross-team communication, alignment, and culture. A solution to such challenges is to apply agile at the systems level, and to architect both hardware and software into modular components.
Robin Yeman spoke about building cyber-physical systems with agile at QCon New York 2023.
When building hardware-reliant cyber-physical systems, we frequently get the software done rapidly, but it has to sit on the shelf until the hardware is ready, Yeman mentioned. Cyber-physical systems often have to manage long lead times for hardware, delaying feedback. Once the hardware is ready we need to update the software because of technology changes during that time, Yeman said.
There are several challenges when using agile for large-scale initiatives that build safety-critical cyber-physical systems. Yeman explained that companies are often organised by function, not by product or outcome, leading to multiple handoffs and delays in delivery:
When we try to organise by outcome, the different functions such as hardware engineering and software engineering do not have a common language, making communication hard.
Another challenge Yeman mentioned was that many engineers struggle to decompose the system into small enough modules that allow them to incrementally implement them within timeboxes while building a cohesive piece of functionality that can be validated.
Many organisations experience cross-team communication challenges based on varying priorities, Yeman said. Instead of looking at the system as a whole, each team is only aware of their small portion of the system and is less concerned with timelines from other teams, leading to a lack of alignment.
Yeman mentioned that culture remains the biggest challenge for organisations. Large companies that have been established for many years have experienced success in how they have completed work in the past.
In addition, their existing organisation structure and the architecture of the system have a significant impact on dependencies:
If you move all of your teams to cross-functional teams but you have not changed your architecture, you could bring everything to a stop. If you change the architecture without updating the organisation, we may increase the number of bottlenecks.
Adapting to apply agile at the systems level requires a change in organisational structure, educating engineers on modular open system decomposition, applying multiple horizons of planning, ensuring all of your teams are on the same cadence, and ensuring you have test environments early in development, Yeman said. Hardware-in-the-loop, software-in-the-loop, and model-based testing environments are often not invested in until much later in the project lifecycle, which reduces our ability to obtain the benefits of agile.
Yeman suggested intentionally architecting both hardware and software into modular components that can be modified independently, and only connect to other modules through standardised interfaces.
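As a rough illustration of what a standardised interface between modules could look like in software, the sketch below lets a controller depend only on an interface, so a simulated sensor can stand in until the physical part arrives. All names here (`TemperatureSensor`, `SimulatedSensor`, `ThermalController`) and the temperature values are hypothetical and were not part of Yeman's talk; this is one possible shape, not a prescribed design.

```python
from typing import Protocol


class TemperatureSensor(Protocol):
    """Hypothetical standardised interface between a sensor module and its consumers."""

    def read_celsius(self) -> float:
        ...


class SimulatedSensor:
    """Software stand-in that replays canned readings while the hardware is on order."""

    def __init__(self, readings: list[float]) -> None:
        self._readings = iter(readings)

    def read_celsius(self) -> float:
        # Return the next canned reading; raises StopIteration when exhausted.
        return next(self._readings)


class ThermalController:
    """Depends only on the interface, so the sensor module can be replaced independently."""

    def __init__(self, sensor: TemperatureSensor, limit_celsius: float) -> None:
        self._sensor = sensor
        self._limit = limit_celsius

    def cooling_required(self) -> bool:
        # Take one reading and compare it against the configured limit.
        return self._sensor.read_celsius() > self._limit


# Exercise the controller against simulated readings (illustrative values only).
controller = ThermalController(SimulatedSensor([20.0, 85.5]), limit_celsius=80.0)
first = controller.cooling_required()
second = controller.cooling_required()
```

Because `ThermalController` never refers to a concrete sensor class, a driver for the real hardware could later be substituted without touching the controller, provided it offers the same `read_celsius` method.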
InfoQ interviewed Robin Yeman about using agile for building cyber-physical systems.
InfoQ: What made you decide to explore agile for building cyber-physical systems?
Robin Yeman: In order to deliver capabilities to our customers at the speed of relevance, which is the frequency at which customers can take deliveries, we needed to apply agile practices at the system level. As I applied agile practices to larger systems, I found that the majority of practices were agnostic to the type of work to be completed; for example, timeboxing or iterative development can be applied to any type of work, from writing a book to building a satellite. The speed of relevance varies based on the type of system; for example, we may need less frequent updates to a weapon system than we do for a phone application.
InfoQ: How do you adapt and apply agile for planning large systems?
Yeman: Agile practices break down into over a hundred practices; we would never use them all, but identifying the specific goals for your system will allow you to pick the right practices. For example, if the ability to adapt to changing needs is important, use multiple planning levels with feedback loops. For large complex space vehicles, we would have a five-year plan decomposed into an annual plan, decomposed into a quarterly plan, decomposed into a fortnightly plan, and finally decomposed into a daily plan. Each of these planning horizons will yield data that we need to use to further inform the next planning horizon.
For example, if my sprint plan is consistently only completing 75% of the planned backlog, it’s likely that the quarterly plan needs to be reduced or the staffing profile and tools adjusted. We use empirical data to adjust the plan.
You might wonder, with such a large plan, how does one maintain agility? The key to this form of planning is a regular cadence dedicated to updating the plan based on empirical data, with the fidelity of the plan decreasing with each planning horizon.
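The 75% example above can be worked through as a small calculation. The sketch below is only an illustration under stated assumptions: work is measured in story points, the sprint figures are invented, and scaling the quarterly scope by the observed completion ratio is just one of the responses Yeman mentions (adjusting staffing or tools is another). The function names are hypothetical.

```python
def completion_ratio(planned: list[int], completed: list[int]) -> float:
    """Fraction of planned work actually completed across recent sprints."""
    return sum(completed) / sum(planned)


def adjusted_quarterly_scope(quarterly_points: int, ratio: float) -> int:
    """Scale the quarterly plan by the observed ratio, never above the original plan."""
    return round(quarterly_points * min(ratio, 1.0))


# Invented data: three fortnightly sprints, in story points (an assumption).
planned = [40, 40, 40]
completed = [30, 31, 29]

ratio = completion_ratio(planned, completed)      # 90 / 120
new_scope = adjusted_quarterly_scope(240, ratio)  # 240 points scaled by the ratio
```

With these numbers the teams complete 90 of 120 planned points, so a 240-point quarterly plan would be cut to 180 points at the next planning cadence.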