
Load testing often doesn’t work as intended. In this episode of the SMC Journal, Scott Moore draws on 30 years of consulting experience to highlight five major pitfalls that organizations continue to encounter. These aren’t the only reasons for failure, but they are some of the most common and detrimental.

Moore’s perspective, refined over countless engagements with global companies, aims to help performance testers, SREs, and performance engineers avoid these common mistakes. When load testing goes bad, it often leads to questions from management about why production problems still occur despite testing. The reasons can be complex, potentially involving the tools, the people, the process, or a combination.

Why Load Testing Still Fails

Here are five key reasons load testing often fails, as discussed in the podcast:

  • Wrong Roles or Unskilled People: A frequent issue is assigning load and performance testing responsibilities to people who lack specialized training or experience in the discipline. Functional QA testers, marketing personnel, or even developers and SREs are expected to be performance “unicorns” who inherently know everything about the field simply because they work in other areas of IT. That expectation, combined with a lack of proper training, commercial-grade tools, or sufficient time, is a “recipe for failure”. Organizations often point people at free online resources or open-source tools and expect results on unrealistic two-week deadlines, a fundamental mismatch between expectations and investment.
  • Bad Environments: Testing in environments that do not accurately mirror production is a major roadblock. Development, QA, and staging environments are often “nothing like production”. Container- and Kubernetes-based applications can make spinning up production-like infrastructure easier through infrastructure as code, but challenges remain around data modeling and setup complexity. Legal and regulatory concerns often prevent using actual production data, which forces teams to build complex data models instead. Even when environments are deliberately scaled down (e.g., to 10% of production), the differences must be documented and testers must understand those limitations. The ideal, though often expensive and complicated, is testing in the production environment before it goes live. Fixing environments is crucial to effective testing.
  • Too Developer-Centric or Pipeline-Centric: The emphasis on “shifting left” and integrating performance testing into CI/CD pipelines has sometimes led to tests that are convenient for developers but do not reflect real-world user behavior or business processes. Running a simple test with a handful of virtual users (e.g., 50 VUs with JMeter) in a pipeline provides a checkmark, but it fails to capture how integrated features behave under realistic load or how users actually interact with the system (see the first sketch following this list). That narrow focus lets problems slip through the cracks, such as issues with CDNs or the combined impact of multiple features. Shifting left and continuous performance testing are important, but the tests designed early in the development cycle can differ significantly from the integrated, full-scale testing needed before production deployment.
  • Weak Analysis: Moore strongly emphasizes that the true value of any load testing tool lies in the analysis. Merely generating load or requests (“load throwers”) to see whether a system breaks doesn’t provide enough insight; effective analysis requires understanding why something failed. That means collecting comprehensive metrics, including end-user response times such as page rendering, and understanding the resource cost or impact across the entire stack: back-end resources, browser resources, JavaScript execution, and third-party calls. Without robust analysis and visualization capabilities, it’s difficult to “tell that story” and get to the root causes of performance issues (see the second sketch following this list).
  • No Perceived Value by the Business: This is identified as the primary reason (“numero uno”) load testing fails to have an impact. If business leaders and executives do not recognize the value of performance engineering, they will minimize investment. This often translates to seeking the least costly ways to perform testing, using less skilled personnel, providing insufficient training, and purchasing cheaper tools with potentially weak analysis capabilities. Even if testing is mandated by regulations or SLAs, a lack of perceived value means the results may not be reviewed or acted upon.
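
To make the pipeline-centric point concrete, here is a minimal sketch of the kind of small, fixed-load check that often gets wired into CI. It uses Locust, a Python load-testing tool, rather than the JMeter setup mentioned in the episode, and the host URL, endpoints, and user count are all hypothetical. A test like this is useful as a smoke check, but it says little about how mixed business workflows behave at production-scale load:

    from locust import HttpUser, task, between

    class SmokeUser(HttpUser):
        """A single, simplistic user profile: hits one or two endpoints in a loop."""
        wait_time = between(1, 3)  # think time between requests, in seconds

        @task(3)
        def browse_home(self):
            self.client.get("/")

        @task(1)
        def view_product(self):
            # Hypothetical endpoint; a real business process would chain many steps
            # (search, add to cart, checkout) with realistic data per virtual user.
            self.client.get("/products/123")

    # Typical pipeline invocation (hypothetical values):
    #   locust -f smoke_test.py --headless -u 50 -r 10 --run-time 2m \
    #          --host https://staging.example.com
    # Fifty users for two minutes produces a pass/fail checkmark, not a picture
    # of production behavior.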
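
And to illustrate the first step beyond being a “load thrower”, here is a small sketch that turns raw per-request timings into percentile response times. The file name and column name are assumptions, not from the episode; most tools (including JMeter’s JTL output) can export per-request results that can be summarized this way:

    import csv
    import statistics

    def summarize(results_csv: str) -> None:
        # Load per-request response times in milliseconds from a results export.
        # The column name "elapsed_ms" is hypothetical; adjust to your tool's output.
        times = []
        with open(results_csv, newline="") as f:
            for row in csv.DictReader(f):
                times.append(float(row["elapsed_ms"]))
        times.sort()
        # Percentiles tell a far richer story than a single average or a pass/fail flag.
        cuts = statistics.quantiles(times, n=100)
        p50, p90, p99 = cuts[49], cuts[89], cuts[98]
        print(f"requests={len(times)}  p50={p50:.0f} ms  p90={p90:.0f} ms  "
              f"p99={p99:.0f} ms  max={times[-1]:.0f} ms")

    summarize("results.csv")  # hypothetical file exported by the load tool

Real analysis goes much further, correlating these numbers with back-end resource metrics, browser timings, and third-party calls, but even percentiles make it easier to start telling the story of why something failed.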

Beyond Load Testing: Showing Value

To combat these issues and ensure load testing makes a difference, performance professionals must demonstrate their value constantly. That goes beyond simply running tests and producing reports: it requires understanding the analysis, interpreting the data, and effectively communicating the findings to the business. By connecting performance issues to potential business impacts (e.g., lost users or revenue if a system fails under load, as illustrated by an example of finding missing database indexes), performance testers can show that they have “paid for [themselves] with a single exercise”. Being able to tell this story makes a performance engineer a “vital” and valuable asset that cannot easily be automated away. The ability to provide information that enables valid business decisions is key to survival and influence in the industry. The host also briefly touched on the importance of working for companies that value performance and are willing to invest in their performance professionals.
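
As a rough illustration of what “telling that story” can look like in practice, a back-of-the-envelope calculation is often enough to connect a test finding to money. Every number below is hypothetical and would come from the business, not from the load test itself:

    # Hypothetical revenue-at-risk estimate for a peak hour in which the system
    # degrades under load (all inputs are assumptions supplied by the business).
    peak_hour_visitors = 20_000   # expected visitors during the peak hour
    conversion_rate    = 0.03     # baseline share of visitors who complete a purchase
    avg_order_value    = 80.00    # average order value, in dollars
    abandonment_rate   = 0.40     # share of would-be buyers lost when the site slows or fails

    revenue_at_risk = peak_hour_visitors * conversion_rate * avg_order_value * abandonment_rate
    print(f"Estimated revenue at risk per degraded peak hour: ${revenue_at_risk:,.0f}")
    # -> Estimated revenue at risk per degraded peak hour: $19,200

A finding like the missing database index, framed this way, is far harder for an executive to ignore than a latency chart on its own.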

Summary

In a nutshell:

  • 5 – Wrong Roles with Wrong Skills
  • 4 – Bad Environments
  • 3 – Too Developer-Centric (Focused on Pipelines)
  • 2 – Weak Analysis
  • 1 – No Perceived Value by the Business

Conclusion

In conclusion, while tools and processes are important, the success of load testing ultimately hinges on having skilled people, testing in realistic environments, focusing on relevant scenarios beyond simple pipeline checks, conducting thorough analysis, and, most importantly, ensuring the business understands and values the critical insights performance testing provides. There is hope that roles like Site Reliability Engineers (SREs) and a strong DevOps culture may help address some of these challenges.

Check out this other episode about load testing.

🔥 Like and Subscribe 🔥

Connect with me 👋
TWITTER ► https://bit.ly/3HmWF8d
LINKEDIN COMPANY ► https://bit.ly/3kICS9g
LINKEDIN PROFILE ► https://bit.ly/30Eshp7

Want to support the show? Buy Me A Coffee! https://bit.ly/3NadcPK

🔗 Links: