By Jeanine Murphy and Michael Sioufas
Many enterprises are pursuing cloud transformations to increase agility and keep pace with evolving technology and customer expectations. However, the agile product teams supporting these efforts often struggle to meet the technical demands of modern cloud-native architectures, including a rapid release schedule for secure and reliable applications.
This challenge largely exists because enterprise product teams lack a structured and standardized approach to prioritizing technical excellence during application development. Technical excellence focuses on an app’s nonfunctional requirements (NFRs) or operational aspects, such as performance, reliability, and cost. Often, product teams assess NFRs only after a significant release or when problems, such as frequent outages, occur. Yet NFRs can also affect how well an application fulfills its functional requirements, the behavioral aspects that product teams use to assess whether the app is successful.
Consider the case of one product team that wanted to move an application from a containerized on-premises environment to the cloud. Unfortunately, the team realized that the existing application infrastructure would not work in the new environment, and it had not considered the cost implications of such a transition during planning. This example illustrates a common assumption that cloud-computing applications will be more reliable and cost effective, but this is true only if they are designed for the cloud environment from the ground up.
To better identify these kinds of obstacles and improve NFRs that might affect application performance, teams should employ a well-architected framework based on five pillars—operational excellence, security, reliability, performance efficiency, and cost optimization—against which to evaluate technical excellence. Indeed, using the framework to help build solutions for the cloud environment can enable companies to achieve all the benefits of a cloud transformation.
The well-architected framework at work
A well-architected framework is the North Star for technical excellence. Over time, companies that use this framework will develop consistent and standardized criteria and guidelines that are easily adoptable by product teams and help assess the success of NFRs. As performance improves, the criteria will evolve as well. Therefore, regularly evaluating NFRs and technical improvements should become part of such established agile practices as sprint planning, backlog[1] prioritization, and backlog grooming.[2] The well-architected framework can even be used with commercial, off-the-shelf software or software as a service (SaaS) during the transformation’s evaluation phase to provide a basis for probing questions and comparative analysis.
The well-architected framework for evaluating technical excellence is based on five pillars:
- Operational excellence: run and monitor systems to provide business value while continually improving support processes and procedures
- Security: protect information, systems, and assets while delivering business value through risk assessments and mitigation strategies
- Reliability: ensure systems can recover from infrastructure or service disruptions, dynamically acquire computing resources to meet demand, and mitigate disruptions such as misconfigurations or transient network issues
- Performance efficiency: use resources efficiently to meet system requirements and to maintain performance as demand changes and technologies evolve
- Cost optimization: run systems that provide business value at the lowest price point by minimizing or avoiding unnecessary costs
In our experience, cost optimization is often the most neglected pillar, though it is critical in today’s pay-as-you-go cloud-deployment model. Organizations should understand what systems or operating practices contribute to application costs to make sound financial-management decisions. These findings can help product teams better define criteria to assess the cost implications of a business decision and balance resources more efficiently across both functional and nonfunctional requirements.
Organizations could, for instance, analyze service-usage patterns to identify workloads that could move to different geographic regions and even limit or reduce unnecessary or underused infrastructure during nonpeak periods to decrease operating costs. Automation and other cost models from cloud providers could also reduce costs.
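The usage-pattern analysis described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any cloud provider's tooling: the sample data, the resource names, the peak-hour window, and the 10 percent cutoff are all assumptions chosen for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical hourly samples: (resource_id, hour_of_day, cpu_percent).
# In practice these would come from a monitoring export; the values are made up.
SAMPLES = [
    ("web-1", 3, 4.0), ("web-1", 14, 72.0), ("web-1", 22, 6.0),
    ("batch-1", 3, 65.0), ("batch-1", 14, 10.0), ("batch-1", 22, 58.0),
    ("report-1", 3, 2.0), ("report-1", 14, 5.0), ("report-1", 22, 3.0),
]

PEAK_HOURS = range(8, 20)  # assumed peak window: 08:00 to 19:59
IDLE_THRESHOLD = 10.0      # assumed cutoff, in percent CPU


def underused_off_peak(samples, peak_hours=PEAK_HOURS, threshold=IDLE_THRESHOLD):
    """Return {resource_id: mean off-peak CPU} for resources below the threshold."""
    off_peak = defaultdict(list)
    for resource_id, hour, cpu in samples:
        # Keep only samples taken outside the peak window.
        if hour not in peak_hours:
            off_peak[resource_id].append(cpu)
    return {
        resource_id: mean(values)
        for resource_id, values in off_peak.items()
        if mean(values) < threshold
    }


# Resources that look like candidates for scaling down during nonpeak periods.
candidates = underused_off_peak(SAMPLES)
```

With this sample data, the web and reporting resources are flagged as candidates, while the batch resource is not because it does most of its work off-peak. A real analysis would likely weigh more signals than CPU alone, such as memory, network traffic, and business calendars.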
Security is less likely to be overlooked, as an insecure system or application can substantially damage business reputation and application performance. However, inconsistent or ill-defined assessment criteria make it hard to identify issues, even when security teams spend significant time outlining, codifying, and documenting the necessary controls and measures to protect an enterprise’s data, systems, intellectual property, and assets. The well-architected framework can guide product teams toward typical security requirements, including standardized identity- and access-management approaches, infrastructure and data protection, and incident detection and response.
One product team was moving a traditional (three-tier) virtualized application stack to a new cloud environment. The group used the framework as a guide to rethink the platforms used in each of the tiers (for instance, whether to use Oracle or AWS RDS) and to review the security considerations and controls implemented in the new cloud environment. The team also considered additional security requirements and cost-optimization approaches. Ultimately, the framework saved the team hundreds of hours during both the planning and development phases by providing guidance on how to evaluate different techniques and approaches.
In another organization, matrices based on the well-architected framework help measure how product teams are incorporating NFRs in their individual products. Heat maps then show and compare products to help leadership prioritize, budget, and plan releases. This approach has helped measure the progress and maturity of applications and the teams developing them.
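One way to picture such a matrix and heat map is the short Python sketch below. It is an illustration only and does not reproduce the organization's actual method: the product names, the 1-to-5 scoring scale, and the color bands are hypothetical.

```python
PILLARS = [
    "operational excellence",
    "security",
    "reliability",
    "performance efficiency",
    "cost optimization",
]

# Hypothetical maturity scores per product, one per pillar, in PILLARS order
# (assumed scale: 1 = ad hoc, 5 = optimized).
SCORES = {
    "checkout": [4, 5, 3, 4, 2],
    "search": [3, 4, 4, 2, 1],
    "reporting": [2, 3, 2, 3, 3],
}


def heat_level(score):
    """Map a 1-5 score to an assumed color band."""
    if score <= 2:
        return "red"
    if score == 3:
        return "amber"
    return "green"


def heat_map(scores):
    """Return {product: {pillar: color}} for side-by-side comparison."""
    return {
        product: dict(zip(PILLARS, (heat_level(s) for s in row)))
        for product, row in scores.items()
    }


def weakest_pillar(scores):
    """Return the pillar with the lowest average score across all products."""
    averages = [
        sum(row[i] for row in scores.values()) / len(scores)
        for i in range(len(PILLARS))
    ]
    return PILLARS[averages.index(min(averages))]


for product, colors in heat_map(SCORES).items():
    print(product, colors)
print("lowest average:", weakest_pillar(SCORES))
```

Leadership could read the rows to compare products and the pillar averages to see where investment across the portfolio is most needed. How scores are assigned (self-assessment, review board, automated checks) is a design choice the sketch leaves open.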
Teams that use the well-architected framework to improve NFRs and overall technical excellence can begin by asking specific questions that align their focus with the five pillars, such as the following:
- Is the application designed to use resources in the most cost-effective way?
- Have the necessary security controls been implemented to ensure the integrity of the application’s data and to detect and address security events?
- Are we taking measures to meet the reliability and performance expectations of our user community?
As organizations increase their reliance on technology and transition to the cloud, performance, reliability, and security expectations also increase, making the well-architected framework an indispensable tool.
Jeanine Murphy is director of data architecture in McKinsey’s New York office, where Michael Sioufas is director of domain architecture.
[1] A backlog is a prioritized list of the features that an agile product team plans to build.
[2] Backlog grooming is an agile team’s review of the backlog to ensure it has only relevant items before assigning the next steps.