The Anatomy Of Search
Last Updated on 17 March 2018
I have always been fascinated by how search engines work. How wondrous it is that we can type a few keywords into a search box and, as if by magic, be presented with numerous results, or what the search engine perceives to be our desired resource, ranked according to relevance.
Most of us use this feature of the modern web without a second thought, without considering what happens behind the scenes to make this seemingly routine task possible at the click of a button. To gain a better understanding of the series of events and the technical mastery that comes into play, let us look at the anatomy of search with the Google search engine as a basis, considering that it holds about 91% of the search market at the time of updating this post.
If you do not have a background in computer programming of any sort, it may be easy to assume that bringing relevant search results to your device is a simple task. Far from it: a large and complex array of tasks is executed at incomprehensible speed to bring you results within milliseconds of you pressing the Go button. In a simplified breakdown of events, when you click the button, your web browser sends a request to the search engine, which processes it in four steps: querying the search index, analysing the web pages in the index for relevance, evaluating the reputation of the site behind each relevant page and, finally, ranking the pages before returning the results to you. Note that at each of these steps several servers are queried and their results correlated, which is no mean feat considering that you get the result in the blink of an eye.
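To make the first step, querying the search index, a little more concrete, here is a minimal Python sketch of an inverted index: a lookup table from each word to the pages that contain it. The function names and the three toy pages are made up for illustration, and a real search engine's index is vastly larger and more sophisticated than this.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index


def query_index(index, query):
    """Return the ids of pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start with the pages for the first word, then narrow down.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Toy "web" of three pages (hypothetical data, for illustration only).
pages = {
    "a": "how search engines rank pages",
    "b": "baking bread at home",
    "c": "search tips for home cooks",
}
index = build_index(pages)
print(query_index(index, "search home"))
```

Because the index is built ahead of time, answering a query is a matter of a few lookups rather than reading every page again, which is part of why results can come back so quickly.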
Guessing and Crawling
Bearing in mind that each individual is different and has different interests, the algorithm that determines search results has to do a fine job of anticipating what pages you are looking for when you carry out a search. However, this is not a wild guess. The anticipation is based on rankings and reputation, which are established in two main ways. One way is for the search engine to go through each and every website, a process called web crawling, and then index the results over time so that it has an idea of what is out there. This is carried out routinely and repeatedly, with each site being revisited at a frequency determined by how often it is observed to change, to see whether it has new changes and additions. Because the web is always expanding at an incredible rate, search engines undertake to index as large a part of the web as they possibly can and to keep revisiting websites that have already been indexed in order to stay relevant.
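The crawling idea can be sketched in a few lines of Python. The example below is only an illustration: the `crawl` function, the `FAKE_WEB` dictionary and the example addresses are invented for this post, and the "web" is held in memory so the code runs without a network connection. A real crawler also has to cope with politeness rules, duplicate content, revisit schedules and enormous scale.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(fetch, start_url, max_pages=100):
    """Breadth-first crawl; fetch(url) returns HTML text or None."""
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # Page could not be fetched; move on.
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


# A tiny in-memory "web" (hypothetical addresses and content).
FAKE_WEB = {
    "http://a.example/": '<a href="http://b.example/">B</a> '
                         '<a href="http://c.example/">C</a>',
    "http://b.example/": '<a href="http://a.example/">A</a>',
    "http://c.example/": "<p>no links here</p>",
}
crawled = crawl(FAKE_WEB.get, "http://a.example/")
print(list(crawled))
```

Starting from a single address, the crawler discovers the other two pages purely by following links, which is how a search engine builds up its picture of what is out there.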
The second way establishes reputation: search engines look at the number of links coming to a website or page from other websites. These are known as inbound links. The rationale is that if someone finds the content on your website interesting or useful, they are likely to create a link to it on their own website. A website owner is unlikely to link to another website unless they feel that it has something worthwhile for their own visitors. Take Wikipedia or YouTube, for example: a lot of people link to pages on Wikipedia as references or link to videos on YouTube from their websites. This means that they think these links will add value to their own websites, which is a high endorsement for Wikipedia and YouTube. Therefore, by looking at how many websites point to a particular website, its reputation can be compared with that of other websites that have fewer or more inbound links.
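As a rough illustration of counting inbound links, the Python sketch below tallies how many distinct sites link to each target. The function name and the example sites are hypothetical, and raw counting is a deliberate simplification: production search engines weigh links in far more elaborate ways than a simple tally.

```python
from collections import Counter


def inbound_link_counts(outbound_links):
    """Count how many distinct sites link to each target site."""
    counts = Counter()
    for source, targets in outbound_links.items():
        # set() so repeated links from one site count only once.
        for target in set(targets):
            if target != source:  # Ignore a site linking to itself.
                counts[target] += 1
    return counts


# Which sites each site links out to (made-up data).
links = {
    "blog.example": ["wikipedia.org", "youtube.com"],
    "news.example": ["wikipedia.org"],
    "shop.example": ["wikipedia.org", "wikipedia.org", "shop.example"],
}
counts = inbound_link_counts(links)
print(counts.most_common())
```

In this toy data wikipedia.org is linked to by three different sites and youtube.com by one, so wikipedia.org would be treated as having the stronger reputation.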
Determining the relevance of search results is no easy task. How does a search engine determine that the results it is offering you are relevant to the search? This is achieved by analysing the contents of the page itself after crawling and indexing have already occurred. The search engine analyses aspects of the page such as the occurrence of the search words and phrases in the article title as well as in the body of the text. The frequency and appropriate use of the words in the page are also taken into account. Search engines also use the metadata attached to pages, in the form of the article description and keywords, to aid in determining relevance. Metadata is data that accompanies a page and describes it, giving more information about the page without necessarily being visible to the typical reader. The search engine factors in all these aspects, according to the weighting determined by the proprietary mechanisms of its search algorithm, to return results in order of relevance.
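The weighting idea can be illustrated with a small Python sketch that counts how often the search words occur in a page's title, body and description, and gives each place a different weight. The function name and the weights of 3, 1 and 2 are invented for this example; the actual weightings used by search engines are proprietary and take many more signals into account.

```python
def relevance_score(query, title, body, description="",
                    title_weight=3.0, body_weight=1.0, meta_weight=2.0):
    """Weighted count of query words in the title, body and description.

    The weights are arbitrary illustrative values, not real ones.
    """
    terms = query.lower().split()

    def count(text):
        words = text.lower().split()
        return sum(words.count(term) for term in terms)

    return (title_weight * count(title)
            + body_weight * count(body)
            + meta_weight * count(description))


# Two made-up pages scored against the same query.
score_a = relevance_score(
    "search",
    title="Search basics",
    body="search engines search the web",
    description="about search",
)
score_b = relevance_score(
    "search",
    title="Baking bread",
    body="you can search for recipes online",
)
print(score_a, score_b)
```

The first page mentions the search word in its title, body and description and so scores higher than the second page, which mentions it only once in the body.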
The Future of Search
It may be said that change is the only constant. Nowhere does this ring truer than in technology circles, and search is no different. If we cast an eye towards the near future, what sort of improvements can we anticipate on the horizon? Can we expect a radical change in how we conduct searches online, or are we forever tied to typing phrases into a search box and receiving a page of results ranked by relevance?
Although it is already underway, it has not yet matured significantly: search context, delivered through semantic search, is the next frontier. Context in search is the idea of being able to provide relevant results to users as and when they are needed. For example, if you are booking a flight, then you are probably also interested in taxi fares and hotel rates in the vicinity of your destination airport, and all of this should be looked up and presented in an integrated way. Emerging technologies such as neural networks and deep learning (machine learning), coupled with the Internet of Things, will allow search parameters to be collected from user devices, especially mobile devices, and results to be presented in a format that is integrated seamlessly with our everyday activities.
Another interesting development is the Voice User Interface, or Voice UI. Instead of typing phrases into a search box, why not simply ask the search engine a question? The trends accompanying personal assistants such as Google Assistant and Alexa mean that we can now talk to search engines instead of communicating with them via text input.
In Search We Trust
Search has evolved significantly over the years, from its humble beginnings of aggregating links across the web and indexing them so that we no longer have to remember all those important web addresses, to the point where search engines are presently inching towards perfecting semantic search. Search engines bring us indispensable convenience: they allow us to find relevant information at the click of a button and return results ever faster, despite the technological feats taking place behind the scenes. As search evolves, we are likely to see it gain even more prominence in our online lives than it already has, and we should anticipate trusting and relying on it even more in all areas of our daily lives.