What is a Search Engine Algorithm?
A search algorithm is defined as a mathematical procedure that takes a problem as input and returns a solution to it, usually after evaluating a number of possible solutions. A search engine algorithm takes keywords as the input problem and returns relevant search results as the solution, matching those keywords against the pages stored in its database. Search engine spiders analyze web page content to determine which keywords each page is relevant to, using a formula that varies from one search engine to the next.
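As a rough illustration of matching keywords against stored pages, the sketch below builds a tiny inverted index (a map from each word to the pages containing it) and returns the pages that contain every keyword in a query. The page names, the sample text, and the functions build_index and search are made up for this example; real engines use their own, far more elaborate formulas.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index

def search(index, query):
    """Return the ids of pages containing every keyword in the query, sorted."""
    words = query.lower().split()
    if not words:
        return []
    # Start from the pages matching the first keyword, then narrow down.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)

# Hypothetical pages that a spider might have collected.
pages = {
    "a.html": "cheap flights to paris",
    "b.html": "paris hotel deals",
    "c.html": "cheap hotel deals in rome",
}
index = build_index(pages)
print(search(index, "cheap hotel"))
```

This only decides whether a page matches. Ordering the matches by relevance is where the factors described in the rest of this article come in.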
Types of Information that Factor into Algorithms
Some services collect information on the queries individual users submit to search services, the pages they look at subsequently, and the time spent on each page. This information is used to return the pages that most users visit after submitting a given query. For this technique to succeed, large amounts of data need to be collected for each query. Unfortunately, only a small set of queries is submitted often enough for the technique to apply, and the method is open to spamming.
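The usage-based approach above might be sketched as follows. The log format, the MIN_SAMPLES threshold, the URLs, and the choice to rank by total time on page are all assumptions made for illustration, not a description of any particular service.

```python
from collections import defaultdict

MIN_SAMPLES = 3  # assumed threshold: below this there is too little data to trust

def rank_by_clicks(log, query, min_samples=MIN_SAMPLES):
    """Rank pages for a query by total time users spent on them.

    log is an iterable of (query, page, seconds_on_page) records.
    Returns an empty list when the query has too few recorded visits.
    """
    dwell = defaultdict(float)
    visits = 0
    for logged_query, page, seconds in log:
        if logged_query == query:
            dwell[page] += seconds
            visits += 1
    if visits < min_samples:
        return []
    # Longest total time first; ties are broken alphabetically.
    return sorted(dwell, key=lambda page: (-dwell[page], page))

# Hypothetical usage log.
log = [
    ("python tutorial", "docs.example/tutorial", 120),
    ("python tutorial", "blog.example/post", 15),
    ("python tutorial", "docs.example/tutorial", 90),
    ("rare query", "x.example", 30),
]
print(rank_by_clicks(log, "python tutorial"))
print(rank_by_clicks(log, "rare query"))
```

The second call returns nothing because the query was seen only once, which shows why the technique covers so few queries. It also hints at the spam problem: anyone who can generate fake visits can inflate a page's totals.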
Another approach involves analyzing the links between pages on the web, on the assumption that pages on the same topic link to each other and that authoritative pages tend to point to other authoritative pages. By analyzing how pages link to each other, an engine can determine both what a page is about and whether that page is considered relevant. Similarly, some search engine algorithms figure internal link navigation into the picture. Search engine spiders follow internal links to weigh how each page relates to the others, and they consider the ease of navigation. If a spider runs into a dead-end page with no way out, this can be weighed into the algorithm as a penalty.
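One well-known way to turn link structure into a score is a PageRank-style iteration, in which each page repeatedly passes a share of its score to the pages it links to. The article does not name a specific method, so the sketch below is a simplified stand-in: the damping value, the iteration count, the sample site, and the function names are assumptions. A second helper lists dead-end pages, the ones with no outgoing links.

```python
def all_pages(links):
    """Collect every page that appears as a link source or a link target."""
    return set(links) | {t for targets in links.values() for t in targets}

def link_scores(links, damping=0.85, iterations=50):
    """Simplified PageRank-style scores for a {page: [linked pages]} graph."""
    pages = all_pages(links)
    n = len(pages)
    scores = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score among the pages it links to.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # Dead end: spread its score evenly so the total stays at 1.
                share = damping * scores[page] / n
                for p in pages:
                    new[p] += share
        scores = new
    return scores

def dead_ends(links):
    """Return pages with no outgoing links, sorted by name."""
    return sorted(p for p in all_pages(links) if not links.get(p))

# Hypothetical internal link structure of a small site.
links = {
    "home": ["about", "products"],
    "about": ["home"],
    "products": ["home", "orphan"],
    "orphan": [],
}
scores = link_scores(links)
print(max(scores, key=scores.get))
print(dead_ends(links))
```

In this sample the home page ends up with the highest score because the other pages link back to it. How a real engine would penalize the dead-end page is not specified here, so the sketch only reports it.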
The original search engine databases were made up entirely of human-classified data. This is a fairly archaic approach, but there are still many directories that make up search engine databases, like the Open Directory (also known as DMOZ), that are entirely classified by people. Some search engine data are still managed by humans, but only after the algorithmic spiders have collected the information.
One of the elements that a search engine algorithm scans for is the frequency and location of keywords on a web page. Pages on which a keyword appears more frequently are typically considered more relevant to it. This is referred to as keyword density. Some search engine algorithms also take into account where on the page the keywords are located.
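Keyword density is commonly computed as the number of times a keyword appears divided by the total number of words on the page. The sketch below uses that definition and also reports how far into the text the keyword first appears. The tokenizing rule, the function names, and the sample sentence are assumptions for illustration, and only single-word keywords are handled.

```python
import re

def tokenize(text):
    """Split text into lowercase words (letters, digits and apostrophes)."""
    return re.findall(r"[a-z0-9']+", text.lower())

def keyword_density(text, keyword):
    """Fraction of the words in text that equal the keyword."""
    words = tokenize(text)
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

def first_position(text, keyword):
    """Relative position (0.0 = very start) of the keyword's first occurrence.

    Returns None when the keyword does not appear at all.
    """
    words = tokenize(text)
    try:
        return words.index(keyword.lower()) / len(words)
    except ValueError:
        return None

text = "Search engines rank pages. Search spiders crawl pages."
print(keyword_density(text, "search"))
print(first_position(text, "crawl"))
```

An engine that weighs location might, for example, favor keywords that appear near the top of a page, but the exact weighting differs from one engine to the next.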
Like keywords and usage information, meta tag information has been abused. Because of web spam, many search engines no longer factor in meta tags. Some still do, however, and most look at the title and description. There are many other factors that search engine algorithms figure into the calculation of relevant results. Some use information such as how long the website has been on the Internet, and others may weigh structural issues, errors encountered, and more.
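To show what reading the title and description involves, the sketch below pulls both out of a page using Python's standard html.parser module. The class name HeadInfo, the helper extract_head_info, and the sample page are made up for this example, and what an engine does with the extracted text is left open.

```python
from html.parser import HTMLParser

class HeadInfo(HTMLParser):
    """Collect the <title> text and the description meta tag of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            # Attribute values can be None, so fall back to an empty string.
            if (attr.get("name") or "").lower() == "description":
                self.description = attr.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so append each one.
        if self._in_title:
            self.title += data

def extract_head_info(html):
    """Return (title, description) for an HTML document string."""
    parser = HeadInfo()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.description.strip()

# Hypothetical page source.
html = (
    "<html><head><title>Blue Widgets</title>"
    '<meta name="description" content="Hand-made blue widgets.">'
    "</head><body>Welcome</body></html>"
)
print(extract_head_info(html))
```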