YouTube grew incredibly fast, to over 100 million video views per day, with only a handful of people responsible for scaling the site. How did they manage to deliver all that video to all those users? And how have they evolved since being acquired by Google?
What’s Inside the Company?
* Supports the delivery of over 100 million videos per day.
* Founded 2/2005
* 3/2006 30 million video views/day
* 7/2006 100 million video views/day
* 2 sysadmins, 2 scalability software architects
* 2 feature developers, 2 network engineers, 1 DBA
Recipe for handling rapid growth: identify the current bottleneck, fix it, and watch a new one appear. This loop runs many times a day.
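The loop is easy to sketch. A minimal illustration, assuming hypothetical per-component latencies; the function and numbers below are placeholders, not YouTube's real tooling:

```python
def identify_and_fix_bottleneck(metrics):
    """Find the slowest component and 'fix' it (here, halving its latency)."""
    worst = max(metrics, key=metrics.get)  # component with the highest latency
    metrics[worst] *= 0.5                  # the fix halves its latency in this sketch
    return worst

# Simulated per-component latencies in milliseconds (illustrative values).
latencies = {"db": 120.0, "cache": 15.0, "disk_io": 80.0}

# Each pass fixes the worst offender, and a new bottleneck surfaces.
fixes = [identify_and_fix_bottleneck(latencies) for _ in range(3)]
print(fixes)  # ['db', 'disk_io', 'db']
```

Note how the database shows up twice: fixing a bottleneck only reveals the next one, which is exactly why the loop never ends.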
* NetScaler is used for load balancing and caching static content.
* Run Apache with mod_fastcgi.
* The Python web code is usually NOT the bottleneck; it spends most of its time blocked on RPCs.
* Python allows rapid, flexible development and deployment. This is critical given the competition they face.
* Creative and risky tricks can help you cope in the short term while you work out longer term solutions.
* Know what’s essential to your service and concentrate your resources and efforts there.
* Pick your battles. Don’t be afraid to outsource some essential services. YouTube uses a CDN to distribute their most popular content. Creating their own network would have taken too long and cost too much. You may have similar opportunities in your system. Take a look at Software as a Service for more ideas.
* Keep it simple! Simplicity allows you to rearchitect more quickly so you can respond to problems.
* Sharding helps to isolate and constrain storage, CPU, memory, and IO. It’s not just about getting more write performance.
* Constant iteration on bottlenecks:
– Software: DB, caching
– OS: disk I/O
– Hardware: memory, RAID
* Have a good cross-discipline team that understands the whole system and what’s underneath it. People who can set up printers, machines, install networks, and so on. With a good team all things are possible.
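The Apache front end mentioned above could be wired to a Python FastCGI backend roughly like this. A minimal sketch assuming the classic mod_fastcgi module; the module path, application path, and port are illustrative, not YouTube's configuration:

```apache
# Load the FastCGI module (path varies by distribution).
LoadModule fastcgi_module modules/mod_fastcgi.so

# Declare an external FastCGI application: Apache proxies requests
# for this virtual path to a long-running process on localhost:8080.
FastCgiExternalServer /var/www/app.fcgi -host 127.0.0.1:8080

# Send application URLs to that handler.
Alias / /var/www/app.fcgi/
```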
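The claim that the Python web code mostly blocks on RPCs can be seen by comparing wall-clock time against CPU time. In this sketch, `fake_rpc` is a hypothetical stand-in for a remote backend call, not a real API:

```python
import time

def fake_rpc(delay=0.05):
    """Stand-in for a remote call; the web tier mostly waits on these."""
    time.sleep(delay)

start_wall = time.perf_counter()
start_cpu = time.process_time()
for _ in range(10):
    fake_rpc()
wall = time.perf_counter() - start_wall
cpu = time.process_time() - start_cpu

# Wall time is ~0.5 s while CPU time is near zero: the interpreter is
# blocked waiting on RPCs, not burning cycles in Python code.
print(wall > 10 * cpu)  # True
```

When a profile looks like this, rewriting the Python for speed buys little; the wins come from caching, batching, or parallelizing the remote calls.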
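The sharding point above can be sketched with hash-based shard selection. The shard count and hash function here are illustrative choices, not YouTube's actual scheme:

```python
import hashlib

NUM_SHARDS = 4  # illustrative; real systems choose and migrate this carefully

def shard_for(user_id: str) -> int:
    """Map a user to one shard so their data, cache, and IO stay on one set of machines."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# The same user always lands on the same shard, so reads and writes for
# that user touch only 1/NUM_SHARDS of the total dataset.
print(shard_for("alice") == shard_for("alice"))  # True
```

Routing every query for a user to a single shard bounds storage, memory, and IO per machine, not just write throughput, which is the broader point the bullet makes.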