A single Google query uses 1000 machines in 0.2 seconds

What do you do when you want to check something online? Most of us enter a search string in the Google search bar and wait for the results. There’s hardly any “wait” these days, as the turnaround for most searches is next to nothing in practical terms.

Did you know that a single Google query actually uses 1000 machines in just 0.2 seconds? Google is usually secretive about its search infrastructure, but it shared certain details in 2009. During a keynote talk at WSDM 2009, a Google Fellow described Google’s exponential growth over the decade from 1999 to 2009. Compared with the 12 machines used earlier, 1000 machines were employed by 2009 in order to hold the complete search index in memory. Crawler updates, which used to take months in 1999, were down to just minutes by 2009. Search queries and processing power went up by a factor of 1000, while latency went down from around 1000 ms to 200 ms, or 0.2 seconds.
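To put those figures side by side, here is a small arithmetic sketch. It uses only the numbers quoted above; the variable names are just for illustration.

```python
# Figures quoted in the paragraph above (earlier setup vs. 2009).
machines_before, machines_2009 = 12, 1000
latency_before_ms, latency_2009_ms = 1000, 200

# How many times more machines were involved by 2009.
machine_growth = machines_2009 / machines_before

# How many times faster a query came back.
latency_speedup = latency_before_ms / latency_2009_ms

# The 2009 latency expressed in seconds.
latency_2009_s = latency_2009_ms / 1000

print(f"machines: about {machine_growth:.0f}x more")
print(f"latency: {latency_speedup:.0f}x lower, {latency_2009_s} s per query")
```

In other words, roughly 83 times as many machines were involved, and the answer came back about five times faster.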

According to Jeff Dean, Google moved the entire search index into memory several years ago so that results appear almost instantly for the person searching, so each query is now said to be handled by thousands of machines working in tandem rather than a few dozen. Google has developed a series of index compression techniques over the past several years and eventually settled on a format that groups four position deltas together in order to minimize the number of operations required for decompression. Google also pays attention to where its data is located on disk: data that needs to be read immediately is placed on the outer edge of the platter, where even a hard disk can read at a higher speed, while cold data (data that does not need to be read quickly or is read infrequently) and short data are apparently placed on the inner tracks.
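The idea of grouping four position deltas can be illustrated with a simplified sketch. This is not Google's actual code; it is a minimal Python illustration that assumes a layout of one tag byte (two bits per value, giving each value's length in bytes) followed by the four deltas in little-endian order. All function names here are hypothetical.

```python
def to_deltas(positions):
    """Turn sorted positions into gaps between consecutive positions."""
    deltas, prev = [], 0
    for p in positions:
        deltas.append(p - prev)
        prev = p
    return deltas


def from_deltas(deltas):
    """Rebuild the original positions by summing the gaps."""
    positions, total = [], 0
    for d in deltas:
        total += d
        positions.append(total)
    return positions


def byte_len(value):
    """Number of bytes (1 to 4) needed to store a value below 2**32."""
    if not 0 <= value < 1 << 32:
        raise ValueError("value must fit in 32 bits")
    return max(1, (value.bit_length() + 7) // 8)


def encode_group(deltas):
    """Encode exactly four deltas as one tag byte plus 4 to 16 data bytes."""
    if len(deltas) != 4:
        raise ValueError("a group holds exactly four deltas")
    tag, body = 0, bytearray()
    for i, d in enumerate(deltas):
        n = byte_len(d)
        tag |= (n - 1) << (2 * i)  # two bits per value: length minus one
        body += d.to_bytes(n, "little")
    return bytes([tag]) + bytes(body)


def decode_group(buf, offset=0):
    """Decode one group; return the four deltas and the next offset."""
    tag = buf[offset]
    offset += 1
    deltas = []
    for i in range(4):
        n = ((tag >> (2 * i)) & 0b11) + 1
        deltas.append(int.from_bytes(buf[offset:offset + n], "little"))
        offset += n
    return deltas, offset


positions = [3, 10, 300, 70000]
encoded = encode_group(to_deltas(positions))
decoded, next_offset = decode_group(encoded)
assert from_deltas(decoded) == positions
print(len(encoded))  # 8 bytes, versus 16 for four fixed 32-bit integers
```

Because a single tag byte describes all four lengths, a decoder in this layout reads the sizes once per group instead of inspecting every byte for a continuation flag, which is the kind of saving the passage above alludes to.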
Also, typical server applications use ECC memory, which costs more than ordinary memory and can correct errors by itself, but Google uses non-parity memory, so it wrote its own software to recover from errors, as well as its own disk scheduler. It has also made a number of modifications to the Linux kernel to meet its needs.
The physical servers have changed as well: the first generation consisted of home-built servers without cases, these were followed by servers designed to fit in standard racks, and Google has now gone back to custom servers without cases.

Credit: Gigazine

Picture Credit: Google
