Behind the Scenes of YouTube's Massive Video Delivery Network

The paper "Vivisecting YouTube: An Active Measurement Study" examines the design and deployment of YouTube's scalable, distributed delivery infrastructure, which lets the platform deliver videos efficiently to its users worldwide. By combining basic web protocols such as DNS and HTTP with anycast and unicast routing, YouTube designed a flexible and scalable 3-tier Content Delivery Network (CDN) spanning over 38 locations worldwide. This post walks through the system architecture and design principles to better understand physical vs. logical server grouping, tiered caching infrastructure, DNS namespacing, anycast/unicast routing, and HTTP redirection, and how they combine into a global caching infrastructure that serves billions of users at scale.

System Design Principles

Video servers are logically grouped into 5 DNS namespaces mapped to a 3-tier physical cache hierarchy consisting of primary, secondary, and tertiary servers.

YouTube's Logical and Physical Cache Architectural Design

One of the main concepts to grasp in this paper is the logical vs. physical grouping of servers. YouTube’s video servers are physically dispersed worldwide in different data centers close to users. Video servers in the same data center might be physically located together but logically serve different roles in the caching infrastructure.

YouTube organized its video servers into a multi-layered cache hierarchy. The layers are strictly ordered in one direction, so if one layer does not have a given video, the request is redirected to the next layer in the hierarchy. Within the primary physical tier, there are two logical DNS namespaces called lscache and nonxt. They contain 192 logical video servers mapped to the primary cache locations in YouTube’s physical cache. There is one DNS namespace in the secondary layer called tccache. The final tertiary layer has two DNS namespaces, cache and altcache.
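The one-directional fallback through the tiers can be sketched in a few lines of Python. This is a minimal illustration only: the namespace and tier names come from the text above, while the function name, the per-tier content sets, and the redirect counting are assumptions made for the example, not YouTube's actual logic.

```python
from typing import Dict, Optional, Set, Tuple

# Namespace-to-tier grouping as described in the post.
TIER_NAMESPACES = {
    "primary": ["lscache", "nonxt"],
    "secondary": ["tccache"],
    "tertiary": ["cache", "altcache"],
}

# Tiers are strictly ordered: a miss only ever moves "down" this list.
TIER_ORDER = ["primary", "secondary", "tertiary"]


def find_serving_tier(
    video_id: str, tier_contents: Dict[str, Set[str]]
) -> Optional[Tuple[str, int]]:
    """Return (tier, redirects_needed) for the first tier holding the video.

    tier_contents is a hypothetical stand-in for "which videos each tier has
    cached". Returns None when no tier holds the video.
    """
    for redirects, tier in enumerate(TIER_ORDER):
        if video_id in tier_contents.get(tier, set()):
            return tier, redirects
    return None


# Made-up cache contents, only for illustration.
contents = {
    "primary": {"popular"},
    "secondary": {"popular", "warm"},
    "tertiary": {"popular", "warm", "rare"},
}
print(find_serving_tier("warm", contents))
```

A video that is only present in the tertiary tier costs two redirects in this model, which is why popular content is worth keeping in the primary tier.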

The instances in each physical layer might have different hardware and storage capabilities depending on their role in the caching infrastructure. Servers in the primary layer might have Solid State Drives (SSDs), which use flash memory to store data instead of traditional hard drives for faster read/write speed. By organizing physical machines into different logical groupings, you can ensure that each layer is optimized for a given role. You can also dynamically scale each logical layer by adding more physical servers without changing the DNS hostname.

YouTube Anycast (first five) and Unicast (last two) Namespaces

YouTube leverages basic web protocols such as DNS resolution and HTTP redirects with anycast and unicast routing to deliver a scalable, geographically distributed, multi-tiered video caching infrastructure.

YouTube routes a request to stream a video to a physical video server using DNS resolution and HTTP redirection. Anycast routing is used between different tiers, and unicast routing is used within a given tier. Anycast routing is a method where a single destination IP is mapped to multiple nodes in a network. Load balancers use anycast routing to map a static IP address to a group of healthy nodes, a one-to-many mapping. Unicast routing assigns a single IP to an individual node in a network, which is a one-to-one mapping.

YouTube logically groups servers using anycast DNS namespaces, so a request to one cache layer is routed to the best node in the cache hierarchy to serve the request. The authors found that each hostname in the primary lscache namespace, such as v23.lscache5.c.youtube.com, is mapped to more than 75 unique IP addresses distributed across the 38 primary cache locations. Within a caching layer such as the primary layer, requests are routed to individual nodes using unicast routing to find a server that can stream the requested video to a user.

Only the hostnames within the first lscache namespace are visible in the URLs or HTML pages sent to the user. The rest of the anycast and unicast namespaces are accessed only via dynamic HTTP request redirections when a user tries to play a video. YouTube uses locality-aware DNS resolution to perform fine-grained load balancing, where a user’s request is routed to the closest server that contains the video the user wants to stream.
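To make the idea of locality-aware resolution concrete, here is a small Python sketch in which one hostname maps to several IP addresses and the resolver answers with the one closest to the client. Everything except the hostname is an assumption for illustration: the IP addresses are from reserved documentation ranges, the round-trip times are invented, and the sketch models only closeness, not whether the server actually holds the video.

```python
from typing import Dict, List, Tuple


def resolve_locality_aware(
    hostname: str,
    client_region: str,
    records: Dict[str, List[str]],
    rtt_ms: Dict[Tuple[str, str], float],
) -> str:
    """Pick the candidate IP with the lowest round-trip time to the client."""
    candidates = records.get(hostname)
    if not candidates:
        raise LookupError(f"no records for {hostname}")
    return min(candidates, key=lambda ip: rtt_ms[(client_region, ip)])


# One logical hostname, many physical servers (one-to-many mapping).
RECORDS = {
    "v23.lscache5.c.youtube.com": ["192.0.2.10", "198.51.100.10", "203.0.113.10"],
}

# Hypothetical round-trip times per (client region, server IP).
RTT_MS = {
    ("eu", "192.0.2.10"): 120.0,
    ("eu", "198.51.100.10"): 15.0,
    ("eu", "203.0.113.10"): 210.0,
    ("us", "192.0.2.10"): 20.0,
    ("us", "198.51.100.10"): 110.0,
    ("us", "203.0.113.10"): 160.0,
}

print(resolve_locality_aware("v23.lscache5.c.youtube.com", "eu", RECORDS, RTT_MS))
```

The same hostname yields different answers for clients in different regions, which is what lets the web server hand out one URL to everyone.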

YouTube uses a fixed-length (unique) identifier for each video and employs fixed hashing to map the video id space to logical namespaces.

When you visit a YouTube video like https://www.youtube.com/watch?v=OwHeBvU6oEg, the URL contains a unique video id in the v query parameter, here OwHeBvU6oEg. This is a flat identifier 11 characters long, where each character can be [A-Z], [a-z], [0-9], - or _. That gives 64 possible characters per position, so the total size of the video id space is 64^11 = 2^66, or roughly 7.4 × 10^19 possible ids.
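The size of the id space is easy to check with a few lines of Python. The sketch assumes the 64-character alphabet is made of upper- and lowercase letters, digits, "-" and "_"; the helper name is_valid_video_id is hypothetical and only checks the shape of an id, not whether a video exists.

```python
import re
import string

# Assumed alphabet: 26 uppercase + 26 lowercase + 10 digits + "-" and "_".
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_"
ID_LENGTH = 11

ID_SPACE = len(ALPHABET) ** ID_LENGTH

_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")


def is_valid_video_id(video_id: str) -> bool:
    """Check only the format: exactly 11 characters from the alphabet."""
    return _ID_RE.fullmatch(video_id) is not None


print(len(ALPHABET))                      # 64
print(ID_SPACE)                           # 73786976294838206464
print(ID_SPACE == 2 ** 66)                # True
print(is_valid_video_id("OwHeBvU6oEg"))   # True
```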

Video ids are mapped to a fixed hostname in the lscache layer out of 192 possible names. The video id space is uniformly divided into 192 sections, and each lscache DNS name represents a logical video cache server. This fixed mapping makes it easier for backend web servers or frontend clients to generate HTML pages with URLs pointing to the relevant videos without knowing where a user is located or how a logical server is mapped to a physical server that has the video to stream.
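The following Python sketch shows what such a fixed mapping could look like. The actual hash function YouTube uses is not described here, so SHA-256 is a stand-in chosen only because it is deterministic across runs. The naming layout (24 "v" names times 8 "lscache" groups, giving 192) is an assumption inferred from the example hostname v23.lscache5.c.youtube.com.

```python
import hashlib

NUM_HOSTNAMES = 192
HOSTS_PER_GROUP = 24  # assumption: v1..v24 within each of lscache1..lscache8


def lscache_index(video_id: str) -> int:
    """Map a video id to a slot in [0, 192) with a fixed, stateless hash.

    SHA-256 is a stand-in; the real hash function is not public.
    """
    digest = hashlib.sha256(video_id.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_HOSTNAMES


def lscache_hostname(video_id: str) -> str:
    """Build a hostname for the slot, using the assumed naming layout."""
    group, v = divmod(lscache_index(video_id), HOSTS_PER_GROUP)
    return f"v{v + 1}.lscache{group + 1}.c.youtube.com"


# The same id always produces the same hostname, wherever this runs.
print(lscache_hostname("OwHeBvU6oEg"))
```

Because the function needs no state beyond the video id, any web server can compute the URL to embed without consulting a directory of where videos are stored.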

There is also a fixed and consistent map between the anycast DNS namespaces in different caching tiers. For example, there is a one-to-one mapping between the 192 hostnames in the primary lscache namespace and the hostnames in the secondary tccache layer. Keeping the fixed mapping from video id to logical namespace separate from the actual physical servers gives YouTube a lot of flexibility in managing its infrastructure and load balancing. They can easily add more physical servers around the world in new or existing cache locations. The hostname in a different tier can also be computed using static mapping without knowledge of the server load. DNS resolution can then perform fine-grained load balancing to the correct server within a cache tier.
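A static cross-tier mapping of this kind can be sketched as follows. The sketch parses an lscache hostname into a slot number and reuses that slot in the secondary namespace. The 24 by 8 naming layout is an assumption inferred from the example hostname, and the result is expressed as a (namespace, slot) pair rather than a hostname because the tccache hostname format is not given in this post.

```python
import re
from typing import Tuple

HOSTS_PER_GROUP = 24  # assumption: v1..v24 per lscache group
NUM_GROUPS = 8        # assumption: lscache1..lscache8

_HOST_RE = re.compile(r"v(\d+)\.lscache(\d+)\.c\.youtube\.com")


def lscache_slot(hostname: str) -> int:
    """Turn an lscache hostname into a slot number in [0, 192)."""
    match = _HOST_RE.fullmatch(hostname)
    if match is None:
        raise ValueError(f"not an lscache hostname: {hostname}")
    v, group = int(match.group(1)), int(match.group(2))
    if not (1 <= v <= HOSTS_PER_GROUP and 1 <= group <= NUM_GROUPS):
        raise ValueError(f"hostname out of range: {hostname}")
    return (group - 1) * HOSTS_PER_GROUP + (v - 1)


def secondary_slot(hostname: str) -> Tuple[str, int]:
    """One-to-one mapping: the same slot number, in the tccache namespace."""
    return ("tccache", lscache_slot(hostname))


print(secondary_slot("v23.lscache5.c.youtube.com"))  # ('tccache', 118)
```

Nothing in the computation depends on server load or location, so a primary server can work out its secondary counterpart locally and leave the choice of physical machine to DNS.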

System Architecture

YouTube Video Delivery System Diagram
  • Web Server
    • Serves HTML web pages to users with embedded video URLs pointing to the lscache layer, generated using a fixed hashing to 192 hostnames.
  • Flash video server
    • Streams videos to the user’s browser
  • Locality-aware DNS resolution with HTTP redirection
    • Requests are routed to 38 cache locations close to the users.
  • Video ID Space
    • 11-character fixed-length (unique) identifier mapped to a space of 64^11.
  • Multi-layered organization with 5 anycast DNS namespaces mapped to a 3-tier physical server cache hierarchy consisting of Primary, Secondary, and Tertiary.
  • Cache misses are handled by either fetching the content from a backend data center and serving it to the client or redirecting the client to a server that has the content.

User Flow

  • The user visits www.youtube.com
    • The browser resolves the hostname using a local DNS server (LDNS)
  • The HTTP request is directed to one of YouTube’s web servers, as controlled by YouTube’s DNS system.
  • The web server returns an HTML page with one or more embedded URLs containing hostnames such as v23.lscache5.c.youtube.com, directing the browser to fetch the video from a Flash video server.
  • The user clicks play, and another round of DNS resolution occurs, resolving v23.lscache5.c.youtube.com to a YouTube Flash video server that has a public IP address. This video server streams the video to the user’s browser.
    • The first YouTube video server might redirect via HTTP request redirection to another video server, which may again redirect to another video server, until the request reaches a server that can serve the video to the browser.
    • Each HTTP redirection adds latency, as the client has to resolve the hostname being redirected to and start a new HTTP session with the new server. Also, the new server might be located farther away than the first server a user was routed to. All of these factors add delay to the video download time.
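The cost of a redirect chain in the flow above can be modeled with a short Python sketch. Each hop pays for a DNS lookup plus a new connection before the client learns whether that server can stream the video. The Hop fields, the ".example" hostnames, and all timing numbers are invented for illustration and are not measurements from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Hop:
    hostname: str      # hypothetical server name
    dns_ms: float      # time to resolve this hostname
    connect_ms: float  # time to open a new HTTP session with it
    has_video: bool    # whether this server can stream the video


def follow_redirects(hops: List[Hop], max_redirects: int = 5) -> Tuple[str, int, float]:
    """Return (serving hostname, redirects taken, total setup delay in ms)."""
    elapsed = 0.0
    for redirects, hop in enumerate(hops):
        if redirects > max_redirects:
            raise RuntimeError("too many redirects")
        # Every hop costs a DNS lookup and a fresh connection.
        elapsed += hop.dns_ms + hop.connect_ms
        if hop.has_video:
            return hop.hostname, redirects, elapsed
    raise LookupError("no server in the chain had the video")


chain = [
    Hop("primary.example", dns_ms=20.0, connect_ms=30.0, has_video=False),
    Hop("secondary.example", dns_ms=25.0, connect_ms=60.0, has_video=False),
    Hop("tertiary.example", dns_ms=30.0, connect_ms=90.0, has_video=True),
]
print(follow_redirects(chain))  # ('tertiary.example', 2, 255.0)
```

With these made-up numbers, a hit on the first server would cost 50 ms of setup, while two redirects push that to 255 ms, partly because the later servers are assumed to be farther away.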

Key Terms

DNS resolution

DNS resolution is the process by which a domain name is translated into an IP address that a computer can use to communicate over the Internet. DNS, which stands for Domain Name System, is a hierarchical and distributed database that maps domain names to IP addresses.
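As a small illustration, Python's standard library can perform this translation through socket.getaddrinfo. The helper name resolve is hypothetical. Resolving a real hostname requires network access and the answer depends on where the query comes from, so the call on a public hostname is left as a comment.

```python
import socket
from typing import List


def resolve(hostname: str) -> List[str]:
    """Return the sorted, de-duplicated IP addresses a hostname resolves to."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({info[4][0] for info in infos})


# An IP literal resolves to itself without touching the network.
print(resolve("127.0.0.1"))

# With network access, a hostname may return several addresses, and
# different clients may see different answers:
# print(resolve("www.youtube.com"))
```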

HTTP Redirection

HTTP redirection is a technique used to redirect users from one URL to another URL, usually to make an old or outdated URL point to a new or updated one. When a user tries to access a web page that has been redirected, the web server sends a response to the user's web browser, indicating that the requested resource has been moved or replaced. The server also includes the new URL in the response, and the user's web browser automatically redirects the user to the new URL. There are several types of HTTP redirection, including permanent (301 and 308) and temporary (302 and 307) redirects; 307 and 308 additionally require the client to repeat the request with the same method. HTTP redirection is commonly used for various purposes, such as to improve the user experience, to redirect outdated or non-existent URLs to new ones, or to redirect traffic from one domain to another.
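The standard redirect status codes are available in Python's http.HTTPStatus enum, which makes for a quick reference. The helper names describe and is_redirect are hypothetical and exist only for this example.

```python
from http import HTTPStatus

REDIRECT_CODES = [
    HTTPStatus.MOVED_PERMANENTLY,   # 301
    HTTPStatus.FOUND,               # 302
    HTTPStatus.TEMPORARY_REDIRECT,  # 307
    HTTPStatus.PERMANENT_REDIRECT,  # 308
]


def describe(code: int) -> str:
    """Return the numeric code followed by its standard reason phrase."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"


def is_redirect(code: int) -> bool:
    """All 3xx codes belong to the redirection class."""
    return 300 <= code < 400


for status in REDIRECT_CODES:
    print(describe(status))
```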

Anycast Routing

Anycast routing is a network addressing and routing technique that allows multiple servers to share the same IP address. When a client sends a request to an anycast IP address, the network routes the request to the closest server that is advertising the anycast address. The routing protocol used for anycast is typically the same as the one used for unicast routing, such as Border Gateway Protocol (BGP). Anycast routing is often used for high availability and load balancing, as it distributes traffic across multiple servers and provides automatic failover in the event of a server failure. It is commonly used for services such as content delivery networks (CDNs), Domain Name System (DNS), and distributed denial-of-service (DDoS) mitigation.

Unicast Routing

Unicast routing is a network communication technique that is used to transfer data between a single sender and a single receiver over a network. When a sender transmits data, it specifies the IP address of the intended recipient, and the data is sent directly to that recipient's IP address. Unicast routing can be performed using a variety of routing protocols, such as Routing Information Protocol (RIP), Open Shortest Path First (OSPF), and Border Gateway Protocol (BGP), among others. Unicast routing is the most common type of routing used in computer networks, and it is typically used for applications such as email, web browsing, and file sharing, where data needs to be transmitted between two specific devices.
