# Page ranking algorithm and its implementation

Python Methods and Functions

PageRank (PR) — it is the algorithm used by Google search to rank sites in search results. PageRank was named after Larry Page, one of the founders of Google. PageRank — it is a way of measuring the importance of the pages on a site. According to Google:

PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites.

This is not the only algorithm used by Google to organize search results, but it is the first algorithm used by the company. and it is the most famous.

The above measure of centrality is not implemented for multigraphs.

Algorithm
PageRank Algorithm displays a probability distribution used to represent the probability that a person accidentally clicks on links to land on any particular page. PageRank can be calculated for collections of documents of any size. Several research papers suggest that the distribution is evenly distributed among all documents in a collection at the beginning of the computational process. PageRank calculations require several passes, called “iterations”, through the collection to adjust the estimated PageRank values ​​to more closely reflect the theoretical true value.

Simplified algorithm
Assume that the small universe consists of four web pages: A, B, C, and D. Links from a page to itself or multiple outbound links from one page to another page are ignored. PageRank is initialized to the same value for all pages. In the original form of PageRank, the sum of PageRank across all pages was the total number of pages on the Internet at that time, so each page in this example would have an initial value of 1. However, later versions of PageRank and For the remainder of this section, assume a probability distribution between 0 and 1 Therefore, the starting value for each page in this example is 0.25.

The PageRank carried over from this page to its outbound link targets in the next iteration is split equally among all outbound links.

If the only links in the system were from pages B, C and D to A, each link would pass 0.25 PageRank to A on the next iteration, for a total of 0.75.

Suppose instead that page B contains a link to pages C and A, page C contains a link to page A, and page D contains links on all three pages. So in the first iteration, page B will pass half of its existing value, or 0.125, to page A, and the other half, or 0.125, to page C. Page C will pass all of its existing value, 0.25, only page, to which he links, A. Since D had three outbound links, it would move one third of its existing value, or approximately 0.083, to A. At the end of this iteration, page A will have a PageRank of approximately 0.458.

In other words, the PageRank assigned to an outbound link is equal to the document's own PageRank divided by the number of outbound links L ().

In general, the PageRank value for any page u can be expressed as:

,

i.e. the PageRank value for page u depends on the PageRank values ​​for each page v contained in the set Bu (the set containing all pages that link to page u) divided by the number L (v) of links from page v. The algorithm includes a damping factor to calculate the page rank. This is similar to the income tax that the government extracts from it, despite the fact that it pays it itself.

Below is the code to calculate the page rank.

` `

` def pagerank (G, alpha = 0.85 , personalization = None , max_iter = 100 , tol = 1.0e - 6 , nstart = None , weight = 'weight' ,   dangling = None ) : & quot; & quot; & quot; Returns the PageRank of nodes in a graph.   PageRank calculates the ranking of nodes in column G based on Structure of incoming links. It was originally designed as a web page ranking algorithm.   parameters ---------- G: graph Chart NetworkX. Non-directional plots will be converted to oriented plots a graph with two directed edges for each undirected edge.   alpha: float, optional Damping parameter for PageRank, default = 0.85.   personalization: dict, optional "Personalization vector" consisting of a dictionary with a key for each graph node and a non-zero personalization value for each node. Equal distribution is used by default.      max_iter: integer, optional Maximum number of iterations in the power method for solving eigenvalues.   tol: float, optional used to test convergence in the power-law solver.   nstart: dictionary, optional Initial value iterate PageRank for each node.     weight: key, optional Edge data key to use as weight. If No, the weights are set to 1.   hanging: dict, optional Highlights should be assigned to any "dangling" nodes, that is, nodes without any side effects The dict key is the node pointed to by the output and the key The value is the weight of this output. Dangling nodes are given by default crashes according to personalization vector (if not to be specified). This must be selected to result in an irreducible transition matrix (see notes under google_matrix) ... It may be common to have the hanging dictate will be the same as the personalization dictate.   Returns ------- Pagerank: dictionary Dictionary of nodes with PageRank as value   Notes   --- - Eigenvector calculation is performed by the power iteration method and has no convergence guarantee. The iteration will stop after max_iter iterations or error number_of_nodes (G) * Tol was achieved.   The PageRank algorithm was designed for directed graphs, but this the algorithm does not check the directionality of the input graph and   execute on undirected graphs by converting each edge to directed graph with two edges.     "" "   if len (G) = = 0 :   return {}   if not G.is_directed (): D = G.to_directed () else : D = G      # Create a copy in (right) stochastic form W = nx.stochastic_graph (D, weight = weight) N = W.number_of_nodes ()   # Select a fixed starting vector if not specified   if nstart is None : x = dict . fromkeys (W, 1.0 / N) else : # Normalized nstart vector   s = float ( sum (nstart.values ​​()))   x = dict ((k, v / s) for k, v in nstart.items ())      if personalization is None :     # Assign a single personalization vector if not specified   p = dict . fromkeys (W , 1.0 / N) else : missing = set (G) - set (personalization) if missing: raise NetworkXError ( 'Personalization dictionary' 'must have a value for every node ... '   ' Missing nodes% s' % missing) s = float ( sum (personalization.values ​​())) p = dict ((k, v / s) for k , v in personalization.items ())     if dangling is None :   # Use personalization vector if no dangling vector is specified dangling_weights = p else : missing = set (G) - set (dangling)   if missing: raise NetworkXError ( 'Dangling node dictionary' 'must have a value for every node. '   ' Missing nodes% s' % missing) s = float ( sum (dangling.values ​​())) dangling_weights = dict ((k, v / s) for k , v in dangling.items ()) dangling_nodes = [n for n in W if W.out_degree (n, weight = weight) = = 0.0 ]   # power iteration: before max_iter iterations for _ in range (max_iter):   xlast = x x = dict . fromkeys (xlast.keys (), 0 ) danglesum = alpha * sum (xlast [n] for n in dangling_nodes) for n in x:   # this matrix multiplies looks weird because it # do left multiplication x ^ T = xlast ^ T * W for nbr in W [n]: x [nbr] + = alpha * xlast [n] * W [n] [nbr] [weight] x [n] + = danglesum * dangling_weights [n] + ( 1.0 - alpha) * p [n]   # convergence check, l1 norm err = sum ([ abs (x [n] - xlast [n]) for n in x] ) if err & lt; N * tol: return x raise NetworkXError ( 'pagerank: power iteration failed to converge' 'in% d iterations.' % max_iter) `

The above code is function that was implemented in the networkx library.
To implement the above in networkx, you will need to do the following:

 ` & gt; & gt; & gt; ` ` import ` ` networkx as nx ` ` & gt; & gt ; & gt; G ` ` = ` ` nx.barabasi_albert_graph (` ` 60 ` `, ` ` 41 ` `) ` ` & gt; & gt; & gt; pr ` ` = ` ` nx.pagerank (G, ` ` 0.4 ` `) ` ` & gt; & gt; & gt; pr `

Below is the output you get on IDLE after the required settings.

 ` {` ` 0 ` `: ` ` 0.012774147598875784 ` `, ` ` 1 ` `: ` ` 0.013359655345577266 ` `, ` ` 2 ` ` : ` ` 0.013157355731377924 ` `, ` ` 3 ` `: ` ` 0.012142198569313045 ` `, ` ` 4 ` `: ` ` 0.013160014506830858 ` `, ` ` 5 ` `: ` ` 0.012973342862730735 ` `, ` ` 6 ` `: ` ` 0.012166706783753325 ` `, ` ` 7 ` `: ` ` 0.011985935451513014 ` `, ` ` 8 ` `: ` ` 0.012973502696061718 ` `, ` ` 9 ` `: ` ` 0.013374146193499381 ` `, ` ` 10 ` `: ` ` 0.01296354505412387 ` `, ` ` 11 ` `: ` ` 0.013163220326063332 ` `, ` ` 12 : 0.013368514624403237 , 13 : 0.013169335617283102 , 14 : 0.012752071800520563 ,  ```` 15 : 0.012951601882210992 , 16 : 0.013776032065400283 , 17 : 0.012356820581336275 ,  18 : 0.013151652554311779 , 19 : 0.012551059531065245 , 20 : 0.012583415756427995 , 21 : 0.013574117265891684 , 22 : 0.013167552803671937 , 23 : 0.013165528583400423 , 24 : 0.012584981049854336 , 25 : 0.013372989228254582 , 26 : 0.012569416076848989 ,                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      27 : 0.013165322299539031 , 28 : 0.012954300960607157 , 29 : 0.012776091973397076 , 30 : 0.012771016515779594 , 31 : 0.012953404860268598 , 32 : 0.013364947854005844 ,  33 (adsbygoogle = window.adsbygoogle || []).push({}); Tutorials (adsbygoogle = window.adsbygoogle || []).push({}); Python OpenCV | cv2.putText () method numpy.arctan2 () in Python Python | os.path.realpath () method Python OpenCV | cv2.circle () method Python OpenCV | cv2.cvtColor () method Python - Move item to the end of the list time.perf_counter () function in Python Check if one list is a subset of another in Python Python | os.path.join () method Python | os.path.expanduser () method Replace negative value with zero in Numpy array Python | Convert nested dictionary list to Pandas dataframe TimeField — Django Models Python | os.path.isdir () method Python OpenCV cv2.line () method numpy.poly1d () in Python Numpy.sqrt () in Python Split nested list into two lists in Python Python program to convert camel case string to snake case Python | Pandas DataFrame.reset_index () Binning in Data Mining Python | Add suffix / prefix to strings in a list Dunn index Python Python - Print list vertically Books for developers Python Data Science Handbook 22/08/2021 CEH v11 Certified Ethical Hacker Study Guide 12/08/2021 Data Science from Scratch. First Principles with Python 23/09/2020 Python Workout: 50 TEN-MINUTE EXERCISES 23/09/2020 © 2022 Python.Engineering Best Python tutorials books for beginners and professionals Python.Engineering is a participant in the Amazon Services LLC Associates Program, an affiliate advertising program designed to provide a means for sites to earn advertising fees by advertising and linking to amazon.com Python in Italiano Python auf Deutsch Python en Français Python en Español Türk dilinde Python Python: мануалы на русском ITIL v4 Computations Development Cryptography For dummies Machine Learning Big Data Loops Counters NumPy NLP PHP Regular Expressions File Handling Arrays String Variables Knowledge Database X Submit new EBook \$(document).ready(function () { \$(".modal_galery").owlCarousel({ items: 1, itemsCustom: false, itemsDesktop: [1300, 1], itemsDesktopSmall: [960, 1], itemsTablet: [768, 1], itemsTabletSmall: false, itemsMobile: [479, 1], singleItem: false, itemsScaleUp: false, pagination: false, navigation: true, rewindNav: true, autoPlay: true, stopOnHover: true, navigationText: [ "", "" ], }); \$(".tel_mask").mask("+9(999) 999-99-99"); }) ```