Definition
Latency is the time delay between a request being sent and the response arriving. When a visitor clicks a link on your website, latency is the gap between that click and the moment the server begins delivering the page. It is measured in milliseconds and affected by physical distance, network congestion, server processing time, and how many steps the request passes through. Lower latency means faster responses; higher latency means noticeable delays.
Why It Matters
Latency is one of the biggest factors in how fast your website or application feels. Even if your server can process requests quickly, high network latency adds a delay that users experience as sluggishness. Research consistently shows that users abandon pages that take more than a few seconds to load, and much of that load time is often latency rather than actual processing. Latency also matters for real-time features like live chat, video calls, and payment processing, where even small delays create a poor experience. Reducing latency through server location choices, CDNs, and efficient architecture is one of the most effective ways to improve user experience.
Example
A UK-based SaaS company hosts their application on a single server in London. UK users experience fast response times, but their growing US client base reports that the dashboard feels sluggish. The delay is caused by latency — every request from a US user travels across the Atlantic and back. By deploying the application to an additional server in the eastern United States, the company halves the latency for US users and the sluggishness disappears.