Algorithm of the Yahoo search engine

A search engine's algorithm is complex. It has to analyze the text of every page it indexes and return the results most relevant to what the user searched for. The factors built into the algorithm itself are the hard factors, and external influences also play a part. Ranking signals are usually divided into on-page and off-page factors, and the search engine analyzes both.

At its most basic level, the search engine works with Boolean operations. For example, to check whether a term is present on a page, a Boolean test returns 1 for true and 0 for false. The AND, OR, and NOT operators combine these tests. The first basic step is tokenization of the text. The web page is then analyzed with the help of zone indexes, which divide the page into zones such as title, description, and body. A score is calculated for each zone. This is difficult because the page structure differs from document to document.
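The steps above (tokenization, Boolean term tests, and per-zone scoring) can be sketched in a few lines of Python. This is only an illustration, not Yahoo's actual implementation: the sample documents, the zone weights, and the function names `tokenize`, `contains`, `has_all`, `has_any`, `has_none`, and `zone_score` are all assumptions made for the example.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

# Hypothetical documents, already split into zones.
DOCS = {
    "d1": {
        "title": "Yahoo search engine",
        "description": "How a search engine ranks pages",
        "body": "The engine builds an index of terms.",
    },
    "d2": {
        "title": "Cooking pasta",
        "description": "A simple recipe",
        "body": "Search for fresh tomatoes at the market.",
    },
}

# Assumed weights: a hit in the title counts more than a hit in the body.
ZONE_WEIGHTS = {"title": 3, "description": 2, "body": 1}

def contains(zone_text, term):
    """Boolean test: 1 if the term occurs in the zone, otherwise 0."""
    return 1 if term in tokenize(zone_text) else 0

def doc_tokens(doc):
    """Collect the set of tokens from every zone of a document."""
    tokens = set()
    for zone_text in doc.values():
        tokens.update(tokenize(zone_text))
    return tokens

def has_all(doc, terms):
    """AND: every term must be present."""
    tokens = doc_tokens(doc)
    return all(t in tokens for t in terms)

def has_any(doc, terms):
    """OR: at least one term must be present."""
    tokens = doc_tokens(doc)
    return any(t in tokens for t in terms)

def has_none(doc, terms):
    """NOT: none of the terms may be present."""
    return not has_any(doc, terms)

def zone_score(doc, term):
    """Weighted zone score: sum the weights of the zones containing the term."""
    return sum(
        weight * contains(doc.get(zone, ""), term)
        for zone, weight in ZONE_WEIGHTS.items()
    )
```

With these sample documents, `zone_score(DOCS["d1"], "search")` adds the title and description weights, while the second document only matches in the body, so it scores lower for the same term.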

To determine the context of a page, the search engine divides it into blocks, which makes it easier to judge whether each block is important. A basic way to tell blocks apart is the text/code ratio. A major benefit of the zone index method is that a score can easily be calculated for each block.
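As a rough illustration of the text/code ratio, the sketch below compares the length of a block's visible text with the length of its full HTML. The regular-expression tag stripping, the 0.5 threshold, and the names `text_code_ratio` and `important_blocks` are assumptions for this example, and a real engine would use a proper HTML parser.

```python
import re

# Naive pattern for an HTML tag; good enough for a sketch only.
TAG_RE = re.compile(r"<[^>]+>")

def text_code_ratio(block_html):
    """Return visible-text length divided by total markup length (0.0 to 1.0)."""
    if not block_html:
        return 0.0
    text = TAG_RE.sub("", block_html)
    text = " ".join(text.split())  # collapse runs of whitespace
    return len(text) / len(block_html)

def important_blocks(blocks, threshold=0.5):
    """Keep the blocks whose ratio meets the (assumed) threshold."""
    return [b for b in blocks if text_code_ratio(b) >= threshold]

# Hypothetical blocks: a navigation menu and an article paragraph.
nav = '<ul><li><a href="/a">A</a></li><li><a href="/b">B</a></li></ul>'
article = "<p>Search engines split a page into blocks and score each block.</p>"
```

The navigation block is almost entirely markup, so its ratio is close to zero, while the article paragraph is mostly text and passes the threshold.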

Term frequency and the vector model are essential components of a search engine; however, they are too complex to explain fully here. If the process is very slow, it can be sped up by adding static values. Relevance feedback is also employed: a term can be given more or less weight based on its importance.
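Without going into the full theory, the basic idea can be shown with a small sketch: each text becomes a vector of term counts, and a query is compared with each document by cosine similarity. The function names `term_frequencies`, `cosine_similarity`, and `rank`, and the sample documents, are assumptions for illustration. Real engines add further weighting on top of raw counts.

```python
import math
import re
from collections import Counter

def term_frequencies(text):
    """Count how often each token occurs in the text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank(query, docs):
    """Return document names ordered from most to least similar to the query."""
    q = term_frequencies(query)
    scored = [
        (cosine_similarity(q, term_frequencies(text)), name)
        for name, text in docs.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]

# Hypothetical documents for the example.
SAMPLE_DOCS = {
    "a": "search engine ranking search",
    "b": "cooking recipes for pasta",
}
```

Here `rank("search engine", SAMPLE_DOCS)` places document `a` first, because it shares terms with the query and document `b` shares none.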

