What Is Latency? (Definition, How to Reduce, Examples)

Summary: Latency is the delay in data transmission between devices, influenced by distance, network type and infrastructure. Low latency is essential for real-time applications like gaming, high-frequency trading and cloud computing, where speed directly affects performance.

Latency, or delay, is measured in units of time, usually milliseconds or microseconds. Minimizing it is a critical requirement for many modern software applications across most major industry sectors. Gaming and finance are two areas where this requirement matters most.

How Can I Reduce Latency?

Common methods for reducing latency are:

Use content delivery networks (CDNs) to cache data closer to the users who request it.
Minimize the physical distance between input and output computers (points A and B).
Switch connection types (for example, from Wi-Fi to wired Ethernet) for a more reliable experience and/or upgrade network infrastructure with optimized cabling.
Minimize the number of applications running at the same time.
Optimize application code.
Use pre-fetching methods, which means sending data from point A to point B before point B has requested it, in anticipation that the request will arrive in the near future.
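
As a rough illustration of the caching and pre-fetching ideas in the list above, here is a minimal Python sketch. The names PrefetchCache and slow_origin_fetch are hypothetical and exist only for this example; the 50-millisecond sleep stands in for a distant server and is an assumption, not a measurement.

```python
import time
from typing import Callable, Dict, Iterable


class PrefetchCache:
    """Tiny local cache that can load values before they are requested."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch                 # slow loader, e.g. a network call
        self._store: Dict[str, str] = {}    # values already held locally
        self.origin_fetches = 0             # how often the slow loader ran

    def prefetch(self, keys: Iterable[str]) -> None:
        """Load keys ahead of time so that later reads are served locally."""
        for key in keys:
            if key not in self._store:
                self._store[key] = self._load(key)

    def get(self, key: str) -> str:
        """Return the cached value, falling back to the slow loader."""
        if key not in self._store:
            self._store[key] = self._load(key)
        return self._store[key]

    def _load(self, key: str) -> str:
        self.origin_fetches += 1
        return self._fetch(key)


def slow_origin_fetch(key: str) -> str:
    """Hypothetical stand-in for a distant server (sleep simulates delay)."""
    time.sleep(0.05)
    return f"content for {key}"


if __name__ == "__main__":
    cache = PrefetchCache(slow_origin_fetch)
    cache.prefetch(["/home", "/pricing"])   # done before the user asks

    start = time.perf_counter()
    cache.get("/pricing")                   # already local, no origin trip
    print(f"prefetched read: {(time.perf_counter() - start) * 1000:.3f} ms")

    start = time.perf_counter()
    cache.get("/contact")                   # not prefetched, pays the delay
    print(f"cold read: {(time.perf_counter() - start) * 1000:.3f} ms")
```

The trade-off is that pre-fetching spends bandwidth and memory on data that may never be requested, so it tends to pay off only when the next request is reasonably predictable.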

How Does Latency Work?

Latency is rooted in the architecture of the modern internet, an integrated network of computers that constantly send and receive chunks of data (commonly referred to as “packets”) across the network.

Latency is the time delay of information transmission, which is a function of many factors including:

the physical distance between the point of origin and the destination
the medium (cable, satellite) over which data travels
the network’s architecture itself (3G vs 4G vs 5G in cellular networks)

We can measure latency as the time it takes for information to get from point A to point B (one-way latency) or from A to B and back to A (round-trip latency). Round-trip latency is the figure people most often mean.
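
To make round-trip measurement concrete, here is a minimal Python sketch that times small messages against an echo server running on the same machine. The function names start_echo_server and measure_rtt_ms are hypothetical, and a loopback connection is used only so the example is self-contained; real measurements would target a remote host.

```python
import socket
import threading
import time
from typing import List, Tuple


def start_echo_server() -> Tuple[socket.socket, int]:
    """Start a one-connection TCP echo server on localhost."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))       # port 0 lets the OS pick a free port
    server.listen(1)

    def serve() -> None:
        conn, _ = server.accept()
        with conn:
            while True:
                data = conn.recv(1024)
                if not data:            # client closed the connection
                    break
                conn.sendall(data)      # send every payload straight back

    threading.Thread(target=serve, daemon=True).start()
    return server, server.getsockname()[1]


def measure_rtt_ms(host: str, port: int, samples: int = 5) -> List[float]:
    """Return round-trip times in milliseconds for small echo messages."""
    payload = b"ping"
    rtts: List[float] = []
    with socket.create_connection((host, port), timeout=2) as sock:
        # Disable Nagle's algorithm so small messages are sent immediately.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(samples):
            start = time.perf_counter()
            sock.sendall(payload)
            received = b""
            while len(received) < len(payload):
                chunk = sock.recv(len(payload) - len(received))
                if not chunk:
                    raise ConnectionError("echo server closed the connection")
                received += chunk
            rtts.append((time.perf_counter() - start) * 1000)
    return rtts


if __name__ == "__main__":
    server, port = start_echo_server()
    rtts = measure_rtt_ms("127.0.0.1", port)
    server.close()
    print(f"min {min(rtts):.3f} ms, max {max(rtts):.3f} ms")
```

One-way latency is harder to measure directly because it requires synchronized clocks at both ends. Halving the round-trip time is a common rough estimate, but it assumes the path is symmetric.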

It is important to note that latency is an inherent consequence of the physical limits of our world, where no signal can travel faster than the speed of light.

Therefore, any data communication will experience some latency. The important thing to remember is that latency varies with many factors, such as the ones above, and that we can control those factors to varying degrees.
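
The physical floor on latency can be estimated with simple arithmetic: distance divided by signal speed. The Python sketch below does that calculation. The function name propagation_delay_ms is hypothetical, the two-thirds velocity factor for optical fiber is a commonly used approximation rather than an exact figure, and the 5,570 km distance is an illustrative, roughly transatlantic value.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458   # speed of light in a vacuum
FIBER_VELOCITY_FACTOR = 2 / 3           # assumed approximation for fiber


def propagation_delay_ms(distance_km: float, velocity_factor: float = 1.0) -> float:
    """Return the minimum one-way delay in milliseconds over a distance.

    velocity_factor is the signal speed as a fraction of the speed of light
    (1.0 for a vacuum, lower for real media).
    """
    if distance_km < 0:
        raise ValueError("distance_km must be non-negative")
    if not 0 < velocity_factor <= 1:
        raise ValueError("velocity_factor must be in the range (0, 1]")
    speed_km_per_s = SPEED_OF_LIGHT_KM_PER_S * velocity_factor
    return distance_km / speed_km_per_s * 1000


if __name__ == "__main__":
    distance_km = 5_570   # illustrative distance, not a measured cable route
    vacuum = propagation_delay_ms(distance_km)
    fiber = propagation_delay_ms(distance_km, FIBER_VELOCITY_FACTOR)
    print(f"one-way, vacuum: {vacuum:.1f} ms")
    print(f"one-way, fiber:  {fiber:.1f} ms")
    print(f"round trip, fiber: {2 * fiber:.1f} ms")
```

This is only a lower bound. Real routes are longer than the straight-line distance, and routers, switches and end systems add processing and queuing delay on top.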

Why Is Low Latency Important?

Achieving low latency is of huge importance for most modern applications in the software world, since receiving information promptly is critical to sustaining the flow of communications and the user experience across a computer network.

Imagine talking to a person over the phone, and every time you finish a sentence it takes 10 seconds for the other person to hear it. Compared with the close to real-time communication you experience most days, the delay would make the conversation unbearable.

The same principle applies to sending text messages, video communication and video game interactions since these are all expressions of data that need to be communicated and sent across a network.

That’s why latency, or more accurately, achieving low latency, is a technical foundation on which most modern communication technologies rely. Latency influences the decisions of end customers and data consumers (for example, when deciding whether to buy the latest 5G iPhone or an older 4G model), as well as those of any company that needs reliable data communication speeds to operate effectively.

Latency: Key Applications and Current Trends

As we’ve seen, minimizing latency is crucial for most industry sectors, but is especially important for a few key areas. Here’s a brief overview of those industries and why latency plays such a critical role in ensuring operational success.

High-Frequency Trading (HFT)

High-frequency trading (HFT) is a subset of the finance field. In HFT, large numbers of orders (typically for securities of different kinds, like stocks and bonds) are placed algorithmically, that is, by powerful computer programs that generate and execute the orders.

These orders typically travel over high-performance links, usually optical fiber and, in some cases, even faster media such as microwave links. The links connect HFT firms’ systems with the public markets in which the orders are ultimately executed.

In HFT, tiny fractions of a second make the difference between selling at a loss or at a profit. As a result, the technical infrastructure on which HFT firms are built is of vital importance for their success in the market. Minimizing the time to order execution by minimizing latency is crucial.

Gaming

The gaming sector is also particularly sensitive to latency since more and more games are played online in multiplayer experiences.

Many bestselling titles, such as Call of Duty, rely on players connecting to a shared experience with other players. Every player needs to see effectively the same game state so that everyone can play together in one shared version of the virtual world. This means latency needs to be minimized so that game event data reaches all players in the gaming pool at nearly the same time.

It’s clear then why latency is a key technical concern in this field, and gaming firms spend a lot of their time devising workarounds to minimize it. These workarounds might include pooling gamers into specific geographical regions to reduce physical distance, or introducing late-rendering solutions that effectively ask certain players to wait until the player with the highest latency catches up to the latest game state.
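
The regional pooling idea can be sketched in a few lines of Python. This is a simplified illustration rather than how any particular game does matchmaking: the function name assign_regions, the region labels and the ping values are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def assign_regions(pings_ms: Dict[str, Dict[str, float]]) -> Dict[str, List[str]]:
    """Group players by the region where their measured ping is lowest.

    pings_ms maps each player to a {region: round-trip time in ms} table.
    """
    pools: Dict[str, List[str]] = defaultdict(list)
    for player, region_pings in pings_ms.items():
        best_region = min(region_pings, key=region_pings.get)
        pools[best_region].append(player)
    return dict(pools)


if __name__ == "__main__":
    # Hypothetical measurements; a real client would ping each region itself.
    measured = {
        "alice": {"europe": 24.0, "us-east": 95.0, "asia": 210.0},
        "bob": {"europe": 110.0, "us-east": 18.0, "asia": 180.0},
        "chen": {"europe": 230.0, "us-east": 190.0, "asia": 12.0},
        "dana": {"europe": 31.0, "us-east": 88.0, "asia": 240.0},
    }
    for region, players in assign_regions(measured).items():
        print(region, players)
```

Using measured ping rather than geographic distance accounts for routing detours, although a real system would also have to balance pool sizes and waiting times.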

Cloud Computing

The cloud-computing industry has exploded in recent years, and today’s cloud-computing giants (AWS, Microsoft, Google, and Alibaba among others) provide the technical backbone to many applications and online experiences.

This technical backbone is formed by many data centers spread across regions of the world, which allow clients to rent computing power and build networking architectures for their software and data communication needs.

As such, these tech giants tend to develop a strong focus on minimizing latency for their clients by:

routing end-users to regional data centers that cache data from the company’s main geographical hub
offering vast geographical coverage to their clients so end users can be served as closely as possible for optimal user experiences
building and offering custom services that are specifically developed to solve latency problems
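
As a simplified illustration of routing users to a nearby regional data center, the Python sketch below picks the region with the shortest great-circle distance to the user. The region names and coordinates in DATA_CENTERS are hypothetical placeholders, not any provider's actual regions, and real providers typically rely on DNS, anycast or measured latency rather than distance alone.

```python
import math
from typing import Dict, Tuple

EARTH_RADIUS_KM = 6371.0

# Hypothetical regions with approximate (latitude, longitude) coordinates.
DATA_CENTERS: Dict[str, Tuple[float, float]] = {
    "region-eu": (50.11, 8.68),
    "region-us": (39.04, -77.49),
    "region-asia": (1.35, 103.82),
}


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points on Earth, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_region(user: Tuple[float, float],
                   centers: Dict[str, Tuple[float, float]] = DATA_CENTERS) -> str:
    """Return the name of the data center closest to the user's location."""
    return min(centers, key=lambda name: haversine_km(*user, *centers[name]))


if __name__ == "__main__":
    paris = (48.86, 2.35)
    tokyo = (35.68, 139.69)
    print("Paris ->", nearest_region(paris))
    print("Tokyo ->", nearest_region(tokyo))
```

Distance is only a proxy for latency, so a production system would more likely probe each region and route on the measured round-trip time.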

Future Trends

Looking into the near and medium term, latency minimization is expected to play an even more significant role in enabling key tech trends, including augmented reality (AR) and virtual reality (VR) experiences. Rendering convincing AR and VR interactions relies heavily on minimal latency, so we can expect much more innovation and technological investment on the latency side as the race to lead these nascent industries continues.

Frequently Asked Questions

What is latency in information technology?

Latency refers to the time delay between the transmission and receipt of data between two points, such as computers or servers.

How is latency measured?

Latency is typically measured in milliseconds or microseconds and can be assessed as one-way or round-trip time between devices.

Why does latency occur?

Latency is affected by several factors, including the physical distance between devices, the type of transmission medium (like cables or satellites) and the network architecture.

How can I reduce latency?

Common methods include using content delivery networks (CDNs), reducing physical distance between systems, switching to wired connections, optimizing code, limiting background applications and using pre-fetching techniques.