How to Successfully Address Your Scaling Problems

If you’re thinking about how to scale server capacity, the first step is figuring out what problem you’re trying to solve.
Tammy Xu
August 3, 2020
Updated: October 6, 2020

At the end of March, Slack CEO Stewart Butterfield posted a screenshot that graphed the number of newly created work teams on the platform from January through March. The graph is flat for the first two and a half months, then suddenly shoots up around March 12, about the time offices around the country began to close and employees started working from home because of COVID-19.

Slack uses AWS, so it was able to autoscale and keep up with demand. But while using a cloud service can be a good way of handling a sudden increase in usage, not all companies use external cloud services, and scaling problems can be complex.

Looking at performance metrics and usage data together helps determine whether you're seeing a scaling problem. | Image: Shutterstock

Would It Help to Scale Your Application?

First of all, how do you even know whether your company has a scaling problem? If a company’s software isn’t behaving correctly, how do you tell whether that’s due to scaling or other code errors?

“There are lots of [tools] out there now for monitoring your systems,” said Roger Campbell, owner of Albuquerque-based load testing company LoadStorm. The tools work similarly to Google Analytics, collecting data from a company’s servers.

“Google Analytics measures what your users are doing, keeps track of what pages they go to and whether those pages are functioning properly. These other monitoring services measure what’s happening on your servers or in your infrastructure.”

These tools, made by companies like New Relic and Datadog, provide useful information about the performance of servers that can be used to determine which issues are scaling problems.

“They’ll be able to measure things like whether your CPU levels are high or low, whether your memory usage is high or low,” Campbell said.

CPU usage levels correspond to how well the computer’s processor is able to handle all the software’s necessary calculations, and memory usage corresponds to whether the computer has space to store all the data the application needs to function at any given time.


“If CPU and memory use are very high, that means you’re probably having a surge in traffic — or it could just mean that something is not functioning very well,” Campbell said. “That combination of looking at that kind of tool and seeing in hindsight that, ‘Well, we had 1,000 users at 11 o’clock yesterday morning and our servers showed that the CPU levels were high at that time’ — those two things would indicate that we probably need to prepare for higher normal usage, because our normal usage is growing.”

In other words, high CPU and memory usage indicate that a program is having performance problems, but only after checking whether those problems correlate with increased user activity can they be pinned down as a scalability issue.
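The check Campbell describes can be sketched in a few lines of Python. This is a minimal illustration, not how New Relic or Datadog work internally: the function names, the 80 percent CPU threshold, the 0.8 correlation cutoff and the sample numbers are all assumptions made up for the example.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no correlation signal
    return cov / (var_x * var_y) ** 0.5


def looks_like_scaling_problem(users, cpu_percent, cpu_threshold=80.0, min_corr=0.8):
    """True when CPU gets high AND rises and falls together with user activity.

    Both thresholds are illustrative assumptions, not industry standards.
    """
    cpu_is_high = max(cpu_percent) >= cpu_threshold
    return cpu_is_high and pearson(users, cpu_percent) >= min_corr


# Hypothetical hourly samples: concurrent users and CPU utilization (%).
users = [120, 150, 400, 1000, 950, 300]
cpu_tracks_load = [15, 18, 42, 91, 88, 33]   # CPU follows traffic
cpu_always_hot = [90, 92, 89, 91, 93, 90]    # CPU is high regardless of traffic

print(looks_like_scaling_problem(users, cpu_tracks_load))
print(looks_like_scaling_problem(users, cpu_always_hot))
```

In the second series the CPU is busy even when few users are online, which points toward something other than load, such as inefficient code.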

“Performance and scalability are really kind of related,” Campbell said. “When we think about performance, we’re usually thinking about how fast the system responds to a user if it’s a user interface. If you’re working with a web app these days and it doesn’t respond for a few seconds, that’s pretty noticeable.”

What counts as being too slow, of course, has changed over time. Campbell said that, 20 years ago, developers followed the 15-second rule: applications were expected to load within 15 seconds, and anything longer was considered too slow.

“That’s kind of unthinkable now,” he said.

“When we test an application, usually the web page loads in a few seconds without much load on it,” Campbell said. “But once we start adding concurrent users — and it might be 100, might be 10,000 concurrent users — you’ll see that performance number slow down.”
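One rough way to see why response times climb as load rises is a textbook queueing model. The sketch below assumes a single server handling requests one at a time with random arrivals (the M/M/1 model), and it is expressed in requests per second rather than concurrent users. Real applications are more complicated, and the 0.1-second service time is a made-up figure.

```python
def mean_response_time(service_time_s, arrival_rate_per_s):
    """Mean response time for a single-server queue (M/M/1 assumption).

    service_time_s: average time to serve one request on an idle server.
    arrival_rate_per_s: average number of requests arriving per second.
    """
    utilization = arrival_rate_per_s * service_time_s
    if utilization >= 1.0:
        return float("inf")  # requests arrive faster than they can be served
    return service_time_s / (1.0 - utilization)


# Hypothetical server that needs 0.1 s per request, so it tops out at 10 req/s.
for rate in (1, 5, 9, 9.9):
    print(f"{rate} req/s -> {mean_response_time(0.1, rate):.2f} s")
```

Under this model the slowdown is gentle at first and then steep as the server approaches its limit, which matches the shape load testers tend to see.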

Once companies determine cause and effect, working to scale up correctly will allow the application to handle large loads and alleviate any performance issues caused by load as well.

“The ability to handle massive use and the ability to deliver fast response, they go together, and you can solve them by scaling better,” Campbell said.

 

Hosting applications on the cloud can be helpful when it comes to scaling up. | Image: Shutterstock

Different Ways of Scaling Up

Solutions for scaling up depend on the type of application and the size of the usage load on it.

“It kind of depends on what scale you’re talking about,” Campbell said. “If you’ve got a little WordPress site on a shared, cheap hosting, that’s likely to fall apart at 10 to 100 concurrent users. Something like that can be scaled up pretty quickly and simply just by moving to a dedicated server.”

Larger servers can handle more traffic, but they’re also more expensive. Campbell said that, for sites like WordPress, which are generally pretty uniform and don’t involve more complex operations such as receiving user input, hosting on content delivery networks (CDNs) works well for scaling. CDNs work by serving cached copies of a site’s content from different locations, which cuts down on the amount of traffic that reaches individual servers.
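The caching idea can be illustrated with a toy model of a single CDN location. This is a sketch under simplifying assumptions: the `EdgeCache` class, the `origin` function and the 60-second time-to-live are hypothetical, and real CDNs handle many more concerns, such as cache invalidation and geographic routing.

```python
import time


class EdgeCache:
    """Toy stand-in for one CDN edge location with a time-to-live (TTL) cache."""

    def __init__(self, fetch_from_origin, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # path -> (expires_at, body)

    def get(self, path):
        now = self._clock()
        cached = self._store.get(path)
        if cached is not None and cached[0] > now:
            return cached[1]  # cache hit: the origin server is not contacted
        body = self._fetch(path)  # miss or expired entry: ask the origin
        self._store[path] = (now + self._ttl, body)
        return body


hits = {"origin": 0}


def origin(path):
    """Hypothetical origin server that counts how often it is asked."""
    hits["origin"] += 1
    return f"<html>content of {path}</html>"


edge = EdgeCache(origin, ttl_seconds=60.0)
for _ in range(1000):
    edge.get("/blog/hello-world")

print(hits["origin"])  # number of requests that actually reached the origin
```

A thousand visitors ask for the same page, but only the first request within the TTL window travels back to the site's own server.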

But for more complex applications, it can be more difficult to scale well.

“When you’re talking about e-commerce, where you have to do checkout interactions and there’s a database involved, getting both read and write access at the same time, it’s a lot more complicated,” Campbell said. “The old way — 20, 30 years ago — of scaling a system like that was just to get bigger servers. You could get a lot more horsepower, a lot more memory, a more expensive database.”


Campbell said that method of scaling — where companies pay for bigger servers and scale applications up as monolithic systems — is still the easiest way of scaling, “but it only takes you so far.”

Larger companies that serve more users can move from vertical to horizontal scaling. Vertical scaling means buying bigger, more powerful servers, whereas horizontal scaling means spreading the load across many smaller servers.
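Spreading load across many small servers requires something to decide which server gets each request. A minimal sketch of one common policy, round-robin, might look like the following. The class name and the server names are hypothetical, and production load balancers also track server health and capacity.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next server in a fixed rotation."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(servers)

    def route(self, request_id):
        # request_id is unused here; round-robin ignores request contents.
        return next(self._rotation)


def load_per_server(server_names, request_count):
    """Count how many of request_count requests land on each server."""
    balancer = RoundRobinBalancer(server_names)
    return Counter(balancer.route(i) for i in range(request_count))


# Doubling the number of servers halves the share each one has to handle.
print(load_per_server(["web-1", "web-2"], 1000))
print(load_per_server(["web-1", "web-2", "web-3", "web-4"], 1000))
```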

“In the case of a Facebook or Amazon or Google, they’ve got thousands and thousands of machines that are handling their load on a regular basis,” Campbell said.

Although most companies don’t have the usage level of those companies, they can still take the same basic approach. Having sites hosted in the cloud allows companies to horizontally scale, and to quickly allocate more servers if usage suddenly increases.
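How a cloud platform decides to allocate more servers varies by provider, but a simple proportional rule captures the idea: grow or shrink the fleet so that average CPU lands near a target. The function below is an illustrative sketch, not any provider's actual API, and the 60 percent target and the server limits are assumptions.

```python
import math


def desired_server_count(current_servers, observed_cpu, target_cpu=60.0,
                         min_servers=1, max_servers=20):
    """Scale the fleet in proportion to how far CPU is from the target.

    observed_cpu and target_cpu are average utilization percentages.
    The result is clamped to the [min_servers, max_servers] range.
    """
    if current_servers < 1:
        raise ValueError("current_servers must be at least 1")
    if target_cpu <= 0:
        raise ValueError("target_cpu must be positive")
    wanted = math.ceil(current_servers * observed_cpu / target_cpu)
    return max(min_servers, min(max_servers, wanted))


# Four servers running hot at 90% CPU: add servers to get back near 60%.
print(desired_server_count(4, 90.0))
# The same four servers idling at 30% CPU: the fleet can shrink.
print(desired_server_count(4, 30.0))
```

The upper limit matters in practice because every extra server costs money, and a runaway bug can look like a traffic surge.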

Campbell said cloud providers offer services such as Amazon’s Lambda and Microsoft’s Azure Functions, which allow companies to pay for usage rather than for dedicated servers.

“The system is built for scalability at the beginning,” Campbell said. “It’s pretty trivial, you’re just changing configurations to scale it.”

 

When figuring out complex scaling problems, nothing beats having experienced developers. | Image: Shutterstock

Scaling Up Won’t Fix Your Code Problems

But even hosting an application on the cloud and scaling up won’t fix all performance problems. Sometimes high CPU and memory usage is a result of bad code rather than high load. When LoadStorm does load testing for its customers, the hardest performance fixes to make are code changes.

“That’s usually kind of a last step, or a worst case, when you’ve tried everything else,” Campbell said. “There are situations where the system is coded poorly in the beginning, and you can’t really fix that by scaling up.”

He gave the example of an application that searches a database in two steps, with the first step getting a list of products and the following step getting the details of each product. If the code is written so that these are separate queries, the application could end up waiting for a large number of database queries to finish. The better approach is to consolidate queries into one or two larger ones instead of a lot of smaller ones, which cuts down on time-consuming database calls.
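Campbell's example can be reproduced with Python's built-in `sqlite3` module. The table names, column names and the 100 sample products below are invented for illustration. The first function issues one query for the list plus one per product, while the second gets the same rows with a single join.

```python
import sqlite3


def make_db():
    """Build an in-memory database with 100 hypothetical products."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE details (product_id INTEGER, price_cents INTEGER);
    """)
    conn.executemany("INSERT INTO products VALUES (?, ?)",
                     [(i, f"item-{i}") for i in range(1, 101)])
    conn.executemany("INSERT INTO details VALUES (?, ?)",
                     [(i, i * 100) for i in range(1, 101)])
    return conn


def fetch_one_by_one(conn):
    """Slow pattern: one query for the list, then one more per product."""
    queries = 1
    rows = []
    products = conn.execute("SELECT id, name FROM products ORDER BY id").fetchall()
    for product_id, name in products:
        price = conn.execute(
            "SELECT price_cents FROM details WHERE product_id = ?",
            (product_id,),
        ).fetchone()[0]
        queries += 1
        rows.append((product_id, name, price))
    return rows, queries


def fetch_with_join(conn):
    """Consolidated pattern: a single query returns the same rows."""
    rows = conn.execute("""
        SELECT p.id, p.name, d.price_cents
        FROM products AS p
        JOIN details AS d ON d.product_id = p.id
        ORDER BY p.id
    """).fetchall()
    return rows, 1


conn = make_db()
slow_rows, slow_queries = fetch_one_by_one(conn)
fast_rows, fast_queries = fetch_with_join(conn)
print(slow_queries, fast_queries, slow_rows == fast_rows)
```

With an in-memory database the difference is barely measurable, but when each query is a network round trip to a busy database server, the per-product pattern multiplies that delay by the number of products.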


“If you’ve done it the wrong way in a bunch of different places while you’re building your system and it works well for a small number of users, it’s a very difficult one to fix after the fact,” Campbell said. “It might be the kind of thing where it’s going to take weeks to get it fixed.”

Preventing those types of issues, he said, is something that comes with experience and having good engineers.

“The newer technologies certainly make a big difference if you know how to use them well,” Campbell said.

But scaling can still be complex and tricky to get right, and much of the knowledge of how to write an application both quickly and flexibly comes from trial and error.

“That’s one of the reasons that a lot of us are in computer science. It’s fascinating, and the rate that the internet and storage and all the related technologies are growing makes it challenging,” Campbell said. “The scalability problems of today are similar to the ones that we had 10 years ago, but the scale is a whole lot different, and that keeps it interesting.”
