PageRank (PR) is the algorithm used by Google Search to rank web pages in its search results, named after Larry Page, one of the founders of Google. PageRank is a way of measuring the importance of a website's pages. According to Google:
PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites.
This is not the only algorithm Google uses to order search results, but it was the company's first, and it remains the best known.
This measure of centrality is not implemented for multigraphs.
Algorithm
The PageRank algorithm outputs a probability distribution representing the likelihood that a person randomly clicking on links will arrive at any particular page. PageRank can be computed for collections of documents of any size. Several research papers assume that at the start of the computation the distribution is divided evenly among all documents in the collection. Computing PageRank requires several passes through the collection, called "iterations", to adjust the estimated PageRank values so that they more closely reflect the theoretical true value.
Simplified algorithm
Assume a small universe of four web pages: A, B, C, and D. Links from a page to itself, and multiple outbound links from one page to the same target page, are ignored. PageRank is initialized to the same value for all pages. In the original form of PageRank, the sum of PageRank over all pages was the total number of pages on the web at that time, so each page in this example would have an initial value of 1. However, later versions of PageRank, and the remainder of this section, assume a probability distribution between 0 and 1. Therefore the initial value for each page in this example is 0.25.
The PageRank transferred from a given page to the targets of its outbound links in the next iteration is divided equally among all of those links.
If the only links in the system were from pages B, C, and D to A, each link would transfer 0.25 PageRank to A in the next iteration, for a total of 0.75.
Suppose instead that page B links to pages C and A, page C links only to page A, and page D links to all three other pages. In the first iteration, page B transfers half of its existing value, 0.125, to page A, and the other half, 0.125, to page C. Page C transfers all of its existing value, 0.25, to the only page it links to, A. Since D has three outbound links, it transfers one third of its existing value, approximately 0.083, to A. At the end of this iteration, page A will have a PageRank of approximately 0.458.
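The iteration just described can be checked with a few lines of plain Python (a sketch of the simplified, undamped update only; the dictionary names are illustrative):

```python
# One iteration of the simplified (undamped) PageRank update on the
# four-page example. Link structure: B -> A, C;  C -> A;  D -> A, B, C.
outlinks = {"B": ["A", "C"], "C": ["A"], "D": ["A", "B", "C"]}
pr = {page: 0.25 for page in "ABCD"}  # uniform initial value

new_pr = {page: 0.0 for page in pr}
for page, targets in outlinks.items():
    share = pr[page] / len(targets)   # split PageRank equally among outlinks
    for target in targets:
        new_pr[target] += share

print(round(new_pr["A"], 3))  # 0.458, matching the text above
```

Note that A receives 0.125 from B, 0.25 from C, and about 0.083 from D, which is exactly the sum worked out in the paragraph above.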
In other words, the PageRank conferred by an outbound link is equal to the document's own PageRank divided by the number of its outbound links L(·).
In general, the PageRank value for any page u can be expressed as:

    PR(u) = Σ_{v ∈ B_u} PR(v) / L(v)

i.e. the PageRank value of page u depends on the PageRank values of each page v in the set B_u (the set of all pages linking to page u), divided by the number L(v) of links going out of page v. The algorithm also includes a damping factor d (typically 0.85), which models a random surfer who occasionally jumps to a random page instead of following a link; with damping and N total pages, the update becomes PR(u) = (1 - d)/N + d · Σ_{v ∈ B_u} PR(v)/L(v).
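As a small numeric check, the damped update for page A of the four-page example can be computed directly. This is a sketch; `damped_rank` is an illustrative helper, not a library function:

```python
# Damped PageRank update for a single page u:
#   PR(u) = (1 - d)/N + d * sum(PR(v)/L(v) for v in B_u)
# `damped_rank` is an illustrative helper, not part of any library.
def damped_rank(inbound, d=0.85, N=4):
    """inbound: list of (PR(v), L(v)) pairs for the pages v linking to u."""
    return (1 - d) / N + d * sum(pr_v / l_v for pr_v, l_v in inbound)

# Page A of the example: linked from B (PR 0.25, 2 outlinks),
# C (PR 0.25, 1 outlink) and D (PR 0.25, 3 outlinks).
rank_a = damped_rank([(0.25, 2), (0.25, 1), (0.25, 3)])
print(round(rank_a, 3))  # 0.427 with d = 0.85
```

With d = 1 this reduces to the undamped value 0.458 from the simplified example; damping pulls it toward the uniform value 1/N.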
Below is the code to calculate PageRank.
import networkx as nx
from networkx.exception import NetworkXError


def pagerank(G, alpha=0.85, personalization=None, max_iter=100,
             tol=1.0e-6, nstart=None, weight='weight', dangling=None):
    """Returns the PageRank of the nodes in the graph.

    PageRank computes a ranking of the nodes in the graph G based on
    the structure of the incoming links. It was originally designed as
    an algorithm to rank web pages.

    Parameters
    ----------
    G : graph
        A NetworkX graph. Undirected graphs will be converted to a
        directed graph with two directed edges for each undirected edge.
    alpha : float, optional
        Damping parameter for PageRank, default=0.85.
    personalization : dict, optional
        The "personalization vector" consisting of a dictionary with a
        key for every graph node and nonzero personalization value for
        each node. By default, a uniform distribution is used.
    max_iter : integer, optional
        Maximum number of iterations in power method eigenvalue solver.
    tol : float, optional
        Error tolerance used to check convergence in power method solver.
    nstart : dictionary, optional
        Starting value of PageRank iteration for each node.
    weight : key, optional
        Edge data key to use as weight. If None, weights are set to 1.
    dangling : dict, optional
        The outedges to be assigned to any "dangling" nodes, i.e., nodes
        without any outedges. The dict key is the node the outedge points
        to and the dict value is the weight of that outedge. By default,
        dangling nodes are given outedges according to the personalization
        vector (uniform if not specified). This must be selected to result
        in an irreducible transition matrix (see notes under google_matrix).
        It may be common to have the dangling dict be the same as the
        personalization dict.

    Returns
    -------
    pagerank : dictionary
        Dictionary of nodes with PageRank as value

    Notes
    -----
    The eigenvector calculation is done by the power iteration method
    and has no guarantee of convergence. The iteration will stop after
    max_iter iterations or an error tolerance of
    number_of_nodes(G) * tol has been reached.

    The PageRank algorithm was designed for directed graphs, but this
    algorithm does not check if the input graph is directed and will
    execute on undirected graphs by converting each undirected edge to
    two directed edges.
    """
    if len(G) == 0:
        return {}

    if not G.is_directed():
        D = G.to_directed()
    else:
        D = G

    # Create a copy in (right) stochastic form
    W = nx.stochastic_graph(D, weight=weight)
    N = W.number_of_nodes()

    # Choose fixed starting vector if not given
    if nstart is None:
        x = dict.fromkeys(W, 1.0 / N)
    else:
        # Normalized nstart vector
        s = float(sum(nstart.values()))
        x = dict((k, v / s) for k, v in nstart.items())

    if personalization is None:
        # Assign uniform personalization vector if not given
        p = dict.fromkeys(W, 1.0 / N)
    else:
        missing = set(G) - set(personalization)
        if missing:
            raise NetworkXError('Personalization dictionary '
                                'must have a value for every node. '
                                'Missing nodes %s' % missing)
        s = float(sum(personalization.values()))
        p = dict((k, v / s) for k, v in personalization.items())

    if dangling is None:
        # Use personalization vector if dangling vector not specified
        dangling_weights = p
    else:
        missing = set(G) - set(dangling)
        if missing:
            raise NetworkXError('Dangling node dictionary '
                                'must have a value for every node. '
                                'Missing nodes %s' % missing)
        s = float(sum(dangling.values()))
        dangling_weights = dict((k, v / s) for k, v in dangling.items())
    dangling_nodes = [n for n in W if W.out_degree(n, weight=weight) == 0.0]

    # power iteration: make up to max_iter iterations
    for _ in range(max_iter):
        xlast = x
        x = dict.fromkeys(xlast.keys(), 0)
        danglesum = alpha * sum(xlast[n] for n in dangling_nodes)
        for n in x:
            # this matrix multiply looks odd because it is
            # doing a left multiply x^T = xlast^T * W
            for nbr in W[n]:
                x[nbr] += alpha * xlast[n] * W[n][nbr][weight]
            x[n] += danglesum * dangling_weights[n] + (1.0 - alpha) * p[n]
        # check convergence, l1 norm
        err = sum([abs(x[n] - xlast[n]) for n in x])
        if err < N * tol:
            return x
    raise NetworkXError('pagerank: power iteration failed to converge '
                        'in %d iterations.' % max_iter)
The code above is the pagerank function as implemented in the networkx library. To use it in practice, install networkx (for example with pip install networkx), build a directed graph, and call the function on it; it returns a dictionary mapping each node to its PageRank score.
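The whole procedure can also be seen end to end in a minimal pure-Python sketch that mirrors the structure of the networkx function above (uniform personalization, dangling mass redistributed uniformly). With networkx installed, nx.pagerank on the same graph yields similar values; `simple_pagerank` here is an illustrative stand-in, not a library function:

```python
# Minimal pure-Python power-iteration PageRank, mirroring the networkx
# implementation above. `simple_pagerank` is an illustrative sketch.
def simple_pagerank(outlinks, nodes, alpha=0.85, max_iter=100, tol=1.0e-6):
    N = len(nodes)
    x = {n: 1.0 / N for n in nodes}              # uniform starting vector
    dangling = [n for n in nodes if not outlinks.get(n)]
    for _ in range(max_iter):
        xlast, x = x, {n: 0.0 for n in nodes}
        danglesum = alpha * sum(xlast[n] for n in dangling)
        for n in nodes:
            for nbr in outlinks.get(n, []):      # spread rank along outlinks
                x[nbr] += alpha * xlast[n] / len(outlinks[n])
            x[n] += danglesum / N + (1.0 - alpha) / N
        if sum(abs(x[n] - xlast[n]) for n in nodes) < N * tol:  # l1 norm
            return x
    raise RuntimeError('power iteration failed to converge')

# The four-page example: B -> A, C;  C -> A;  D -> A, B, C;  A is dangling.
ranks = simple_pagerank({"B": ["A", "C"], "C": ["A"], "D": ["A", "B", "C"]},
                        nodes="ABCD")
print(max(ranks, key=ranks.get))  # A accumulates the most rank
```

As expected from the worked example, page A, which receives links from all three other pages, ends up with the highest PageRank, and the values still sum to 1.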
