
Clustering coefficient
In graph theory, a clustering coefficient is a measure of the degree to which nodes in a graph tend to cluster together. Evidence suggests that in most real-world networks, and in particular social networks, nodes tend to create tightly knit groups characterised by a relatively high density of ties; this likelihood tends to be greater than the average probability of a tie randomly established between two nodes (Holland and Leinhardt, 1971;[1] Watts and Strogatz, 1998[2]). Two versions of this measure exist: the global and the local. The global version was designed to give an overall indication of the clustering in the network, whereas the local gives an indication of the embeddedness of single nodes. Global clustering coefficient. The global clustering coefficient is based on triplets of nodes: it is the number of closed triplets (three times the number of triangles) divided by the total number of connected triplets. Watts and Strogatz defined the clustering coefficient as follows: suppose that a vertex $v$ has $k_v$ neighbours; then at most $k_v(k_v - 1)/2$ edges can exist between them (this occurs when every neighbour of $v$ is connected to every other neighbour of $v$). Let $C_v$ denote the fraction of these allowable edges that actually exist, and define the clustering coefficient $C$ as the average of $C_v$ over all vertices $v$.
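A minimal sketch of both the local and the averaged Watts–Strogatz measure, assuming an undirected graph stored as a dict of neighbour sets (the representation is an illustrative choice, not part of the article):

```python
from itertools import combinations

def local_clustering(graph, v):
    """Fraction of possible edges among v's neighbours that actually exist."""
    neighbours = graph[v]
    k = len(neighbours)
    if k < 2:
        return 0.0  # fewer than two neighbours: no edges are possible
    links = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    return 2.0 * links / (k * (k - 1))

def average_clustering(graph):
    """Watts-Strogatz clustering coefficient: mean of the local values."""
    return sum(local_clustering(graph, v) for v in graph) / len(graph)

# Example: a triangle plus one pendant vertex.
g = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(average_clustering(g))  # (1 + 1 + 1/3 + 0) / 4 = 0.583...
```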

Small-world experiment. [Figure: the "six degrees of separation" model.] The small-world experiment comprised several experiments conducted by Stanley Milgram and other researchers examining the average path length for social networks of people in the United States. The research was groundbreaking in that it suggested that human society is a small-world-type network characterized by short path lengths. Historical context of the small-world problem. Guglielmo Marconi's conjectures based on his radio work in the early 20th century, articulated in his 1909 Nobel Prize address,[1] may have inspired Hungarian author Frigyes Karinthy to write a challenge to find another person to whom he could not be connected through at most five people.[2] This is perhaps the earliest reference to the concept of six degrees of separation and to the search for an answer to the small-world problem. Milgram sought to devise an experiment that could answer it.

Degree distribution. [Figure: in/out degree distribution for Wikipedia's hyperlink graph, on logarithmic scales.] Definition. The degree distribution P(k) of a network is defined to be the fraction of nodes in the network with degree k. The same information is also sometimes presented in the form of a cumulative degree distribution: the fraction of nodes with degree greater than or equal to k. Observed degree distributions. The degree distribution is very important in studying both real networks, such as the Internet and social networks, and theoretical networks. In a random graph in which each pair of n nodes is independently connected with probability p, the degree distribution is binomial (or Poisson in the limit of large n).
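A short sketch of computing the empirical distribution P(k), again assuming an adjacency-dict representation with invented data:

```python
from collections import Counter

def degree_distribution(graph):
    """Return {k: fraction of nodes with degree k}."""
    n = len(graph)
    counts = Counter(len(neighbours) for neighbours in graph.values())
    return {k: c / n for k, c in sorted(counts.items())}

g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(degree_distribution(g))  # {1: 0.25, 2: 0.5, 3: 0.25}
```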

Power law. [Figure: an example power-law graph demonstrating ranking of popularity; to the right is the long tail, and to the left are the few that dominate (also known as the 80–20 rule).] In statistics, a power law is a functional relationship between two quantities, where a relative change in one quantity results in a proportional relative change in the other quantity, independent of the initial size of those quantities: one quantity varies as a power of another. For instance, considering the area of a square in terms of the length of its side, if the length is doubled, the area is multiplied by a factor of four.[1] Properties of power laws. Scale invariance. One attribute of power laws is their scale invariance. Given a relation $f(x) = ax^{-k}$, scaling the argument $x$ by a constant factor $c$ causes only a proportionate scaling of the function itself: $f(cx) = a(cx)^{-k} = c^{-k}f(x) \propto f(x)$. That is, scaling by a constant $c$ simply multiplies the original power-law relation by the constant $c^{-k}$. It follows that all power laws with a particular scaling exponent are equivalent up to constant factors, since each is simply a scaled version of the others.
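A two-line numerical check of the identity $f(cx) = c^{-k}f(x)$, with arbitrary illustrative values for a, k and c:

```python
# Hypothetical power law f(x) = a * x**-k; the constants are arbitrary.
a, k = 2.0, 1.5

def f(x):
    return a * x ** -k

c, x = 10.0, 3.0
print(f(c * x))        # direct evaluation at the scaled argument
print(c ** -k * f(x))  # identical: scaling only multiplies f by c**-k
```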

Average path length. Concept. The average path length distinguishes an easily negotiable network from one that is complicated and inefficient, with a shorter average path length being more desirable. However, the average path length is simply what the path length will most likely be; the network itself might have some very remotely connected nodes alongside many nodes that are neighbours of each other. Definition. Consider an unweighted graph $G$ with the set of vertices $V$. Let $d(v_1, v_2)$ denote the shortest distance between $v_1, v_2 \in V$, with $d(v_1, v_2) = 0$ if $v_2$ cannot be reached from $v_1$. Then the average path length $l_G$ is: $l_G = \frac{1}{n(n-1)} \sum_{i \neq j} d(v_i, v_j)$, where $n$ is the number of vertices in $G$. Applications. In a real network like the World Wide Web, a short average path length facilitates the quick transfer of information and reduces costs. Most real networks have a very short average path length, leading to the concept of a small world where everyone is connected to everyone else through a very short path. As a result, most models of real networks are created with this condition in mind.
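A direct sketch of the definition above: breadth-first search from every vertex, averaging d(u, v) over all ordered pairs of distinct vertices (the adjacency-dict input and the connected example graph are assumptions for illustration):

```python
from collections import deque

def shortest_distances(graph, source):
    """Unweighted shortest distances from source via breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def average_path_length(graph):
    n = len(graph)
    total = sum(d for u in graph
                for v, d in shortest_distances(graph, u).items() if v != u)
    return total / (n * (n - 1))

g = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}  # a path graph on 4 vertices
print(average_path_length(g))  # 20 / 12 = 1.666...
```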

The Small-World Phenomenon: An Algorithmic Perspective. Jon Kleinberg. Abstract: Long a matter of folklore, the "small-world phenomenon" -- the principle that we are all linked by short chains of acquaintances -- was inaugurated as an area of experimental study in the social sciences through the pioneering work of Stanley Milgram in the 1960s. But existing models are insufficient to explain the striking algorithmic component of Milgram's original findings: that individuals using local information are collectively very effective at actually constructing short paths between two points in a social network. The Small-World Phenomenon. A social network exhibits the small-world phenomenon if, roughly speaking, any two individuals in the network are likely to be connected through a short sequence of intermediate acquaintances. Milgram's basic small-world experiment remains one of the most compelling ways to think about the problem. Modeling the Phenomenon. The Present Work. Let us return to Milgram's experiment. Kleinberg's model begins with an n × n square lattice in which each node has (i) local contacts: edges to every node within a small lattice distance; (ii) a small number of long-range contacts; and (iii) each long-range contact chosen with probability proportional to $d^{-r}$, where $d$ is the lattice distance and $r$ is a clustering exponent.
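The following sketch illustrates the decentralized-search idea on such a grid. For brevity it redraws each node's long-range contact on the fly instead of fixing the network in advance, and the grid size and the exponent r = 2 are illustrative choices, so this is an approximation of the setup described above, not a faithful reimplementation of the paper:

```python
import random

N = 30  # side length of the square lattice (assumed for the demo)

def lattice_distance(u, v):
    return abs(u[0] - v[0]) + abs(u[1] - v[1])

def long_range_contact(u):
    """Pick one long-range contact with Pr(v) proportional to d(u, v)**-2."""
    nodes = [(x, y) for x in range(N) for y in range(N) if (x, y) != u]
    weights = [lattice_distance(u, v) ** -2 for v in nodes]
    return random.choices(nodes, weights=weights)[0]

def greedy_route(source, target):
    """Forward to whichever known contact is closest to the target."""
    steps, current = 0, source
    while current != target:
        x, y = current
        neighbours = [(x + dx, y + dy)
                      for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                      if 0 <= x + dx < N and 0 <= y + dy < N]
        candidates = neighbours + [long_range_contact(current)]
        current = min(candidates, key=lambda v: lattice_distance(v, target))
        steps += 1
    return steps

print(greedy_route((0, 0), (N - 1, N - 1)))  # typically far below the 58 pure-grid steps
```

Since a local neighbour always reduces the lattice distance by one, the greedy rule is guaranteed to terminate; the long-range links are what make the resulting paths short.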

Random graph. Random graph models. A random graph is obtained by starting with a set of n isolated vertices and adding successive edges between them at random. The aim of the study in this field is to determine at what stage a particular property of the graph is likely to arise.[2] Different random graph models produce different probability distributions on graphs. Most commonly studied is the one proposed by Edgar Gilbert, denoted G(n,p), in which every possible edge occurs independently with probability 0 < p < 1. The probability of obtaining any one particular random graph with m edges is $p^m (1-p)^{N-m}$, where $N = \binom{n}{2}$ is the number of possible edges.[3] A closely related model, the Erdős–Rényi model denoted G(n,M), assigns equal probability to all graphs with exactly M edges: with 0 ≤ M ≤ N, G(n,M) has $\binom{N}{M}$ elements and every element occurs with probability $1/\binom{N}{M}$.[2] The latter model can be viewed as a snapshot at a particular time (M) of the random graph process, which starts with no edges and adds them one at a time uniformly at random. A classical property holding for almost every such graph: given any n + m vertices $a_1, \ldots, a_n, b_1, \ldots, b_m \in V$, there is a vertex c in V that is adjacent to each of $a_1, \ldots, a_n$ and is not adjacent to any of $b_1, \ldots, b_m$.
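A minimal sampler for the Gilbert model G(n, p): each of the $\binom{n}{2}$ possible edges is included independently with probability p.

```python
import random
from itertools import combinations

def gilbert_graph(n, p):
    """Return the edge set of one sample from G(n, p)."""
    return {(u, v) for u, v in combinations(range(n), 2) if random.random() < p}

edges = gilbert_graph(100, 0.05)
print(len(edges))  # close to p * C(100, 2) = 247.5 on average
```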

Exclusive: How Google's Algorithm Rules the Web | Wired Magazine. Want to know how Google is about to change your life? Stop by the Ouagadougou conference room on a Thursday morning. It is here, at the Mountain View, California, headquarters of the world’s most powerful Internet company, that a room filled with three dozen engineers, product managers, and executives figures out how to make their search engine even smarter. This year, Google will introduce 550 or so improvements to its fabled algorithm, and each will be determined at a gathering just like this one. The decisions made at the weekly Search Quality Launch Meeting will wind up affecting the results you get when you use Google’s search engine to look for anything — “Samsung SF-755p printer,” “Ed Hardy MySpace layouts,” or maybe even “capital Burkina Faso,” which just happens to share its name with this conference room. Udi Manber, Google’s head of search since 2006, leads the proceedings. You might think that after a solid decade of search-market dominance, Google could relax.

Scale-free network. A scale-free network is a network whose degree distribution follows a power law, at least asymptotically. That is, the fraction P(k) of nodes in the network having k connections to other nodes goes for large values of k as $P(k) \sim k^{-\gamma}$, where $\gamma$ is a parameter whose value is typically in the range $2 < \gamma < 3$, although occasionally it may lie outside these bounds.[1][2] History. In studies of the networks of citations between scientific papers, Derek de Solla Price showed in 1965 that the number of links to papers—i.e., the number of citations they receive—had a heavy-tailed distribution following a Pareto distribution or power law, and thus that the citation network is scale-free. Barabási and Albert proposed a generative mechanism to explain the appearance of power-law distributions, which they called "preferential attachment" and which is essentially the same as that proposed by Price: denoting the degree of a vertex (that is, the number of edges incident to it) by k, each new node connects to existing nodes with probability proportional to k. Characteristics. [Figure: random network (a) and scale-free network (b).]
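A rough sketch of growth by preferential attachment in the spirit of the Barabási–Albert mechanism; the minimal one-edge seed graph is a simplifying assumption.

```python
import random

def preferential_attachment(n, m):
    """Grow a graph to n nodes; each new node adds m degree-biased edges."""
    edges = [(0, 1)]        # minimal seed: a single edge
    endpoints = [0, 1]      # each node appears once per unit of its degree
    for new in range(2, n):
        targets = set()
        while len(targets) < min(m, new):
            targets.add(random.choice(endpoints))  # degree-proportional pick
        for t in targets:
            edges.append((new, t))
            endpoints.extend([new, t])
    return edges

g = preferential_attachment(1000, 2)
print(len(g))  # 1 + 2 * 998 edges; the degree sequence is heavy-tailed
```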

PageRank. Algorithm used by Google Search to rank web pages. PageRank (PR) is an algorithm used by Google Search to rank web pages in their search engine results. It is named after both the term "web page" and co-founder Larry Page. PageRank is a way of measuring the importance of website pages. According to Google: "PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is." Currently, PageRank is not the only algorithm used by Google to order search results, but it is the first algorithm that was used by the company, and it is the best known.[2][3] As of September 24, 2019, all patents associated with PageRank have expired.[4] Description. A PageRank results from a mathematical algorithm based on the webgraph, created by all World Wide Web pages as nodes and hyperlinks as edges, taking into consideration authority hubs such as cnn.com or mayoclinic.org. Algorithm. Simplified algorithm. The simplified version divides each page's rank equally among its outbound links: $PR(u) = \sum_{v \in B_u} \frac{PR(v)}{L(v)}$, where $B_u$ is the set of pages linking to $u$ and $L(v)$ is the number of outbound links from page $v$. At $t = 0$, an initial probability distribution is assumed, usually $PR(p_i; 0) = 1/N$ for $N$ documents.
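A minimal power-iteration sketch of the formula above, extended with a damping factor d = 0.85 (the conventional choice in the full algorithm, assumed here); the three-page web graph is invented for the demonstration, and every page is assumed to have at least one outgoing link:

```python
def pagerank(links, d=0.85, iterations=50):
    """links maps each page to the pages it links to."""
    n = len(links)
    pr = {page: 1.0 / n for page in links}        # uniform start, PR = 1/N
    for _ in range(iterations):
        nxt = {page: (1.0 - d) / n for page in links}
        for page, outlinks in links.items():
            share = d * pr[page] / len(outlinks)  # split rank over out-links
            for target in outlinks:
                nxt[target] += share
        pr = nxt
    return pr

web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
print(pagerank(web))  # "c" edges out "a": it receives links from both other pages
```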

Kirchhoff's theorem. Kirchhoff's theorem relies on the notion of the Laplacian matrix of a graph, which is equal to the difference between the graph's degree matrix (a diagonal matrix with vertex degrees on the diagonal) and its adjacency matrix (a (0,1)-matrix with 1's at places corresponding to entries where the vertices are adjacent and 0's otherwise). Equivalently, the number of spanning trees is equal to any cofactor of the Laplacian matrix of G. An example using the matrix-tree theorem. The Matrix-Tree Theorem can be used to compute the number of labeled spanning trees of a small "kite" graph. First, construct the Laplacian matrix Q for the example kite graph G; with the kite's two degree-3 vertices in the middle rows, this is $Q = \begin{pmatrix} 2 & -1 & -1 & 0 \\ -1 & 3 & -1 & -1 \\ -1 & -1 & 3 & -1 \\ 0 & -1 & -1 & 2 \end{pmatrix}$. Next, construct a matrix Q* by deleting any row and any column from Q. Finally, take the determinant of Q* to obtain t(G), which is 8 for the kite graph. Proof outline. First notice that the Laplacian has the property that the sum of its entries across any row and any column is 0. The proof proceeds by factoring the Laplacian as $Q = EE^{\mathsf{T}}$, where $E$ is an oriented incidence matrix of the graph, an n-by-m matrix.
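A short numerical check of the example, assuming the kite-graph Laplacian reconstructed above and using numpy (not mentioned in the text) for the determinant:

```python
import numpy as np

# Laplacian of the kite graph: degree matrix minus adjacency matrix.
Q = np.array([[ 2, -1, -1,  0],
              [-1,  3, -1, -1],
              [-1, -1,  3, -1],
              [ 0, -1, -1,  2]])

Q_star = Q[1:, 1:]  # delete the first row and the first column
print(round(np.linalg.det(Q_star)))  # 8 spanning trees, as stated
```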

Nash equilibrium. In game theory, the Nash equilibrium is a solution concept of a non-cooperative game involving two or more players, in which each player is assumed to know the equilibrium strategies of the other players, and no player has anything to gain by changing only their own strategy.[1] If each player has chosen a strategy and no player can benefit by changing strategies while the other players keep theirs unchanged, then the current set of strategy choices and the corresponding payoffs constitutes a Nash equilibrium. Whether a game's Nash equilibrium is actually played can be tested using the methods of experimental economics. Stated simply, Amy and Will are in Nash equilibrium if Amy is making the best decision she can, taking Will's decision as fixed, and Will is making the best decision he can, taking Amy's decision as fixed. History. The Nash equilibrium was named after John Forbes Nash, Jr. Formally, let $(S, f)$ be a game with $n$ players, where $S_i$ is the strategy set of player $i$, $S = S_1 \times \cdots \times S_n$ is the set of strategy profiles and $f(x) = (f_1(x), \ldots, f_n(x))$ gives the payoffs; a profile $x^* \in S$ is a Nash equilibrium if $f_i(x^*) \ge f_i(x_i, x^*_{-i})$ for every player $i$ and every alternative strategy $x_i \in S_i$.
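A small sketch that finds the pure-strategy Nash equilibria of a two-player game by checking the deviation condition directly; the payoff matrix below is the textbook Prisoner's Dilemma, chosen purely for illustration:

```python
# Strategies: 0 = cooperate, 1 = defect. payoff[i][j] = (row, column) payoffs.
payoff = [[(-1, -1), (-3,  0)],
          [( 0, -3), (-2, -2)]]

def pure_nash_equilibria(payoff):
    equilibria = []
    rows, cols = len(payoff), len(payoff[0])
    for i in range(rows):
        for j in range(cols):
            row_best = all(payoff[i][j][0] >= payoff[k][j][0] for k in range(rows))
            col_best = all(payoff[i][j][1] >= payoff[i][k][1] for k in range(cols))
            if row_best and col_best:  # no player gains by deviating alone
                equilibria.append((i, j))
    return equilibria

print(pure_nash_equilibria(payoff))  # [(1, 1)]: mutual defection
```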

Small-world network. [Figure: a small-world network example, with hubs drawn bigger than other nodes: average vertex degree 1.917, average shortest path length 1.803, clustering coefficient 0.522; versus a random graph: average vertex degree 1.417, average shortest path length 2.109, clustering coefficient 0.167.] A small-world network is a graph in which most nodes are not neighbours of one another, yet most nodes can be reached from every other by a small number of steps. In the context of a social network, this results in the small-world phenomenon of strangers being linked by a mutual acquaintance. Properties of small-world networks. This property is often analyzed by considering the fraction of nodes in the network that have a particular number of connections going into them (the degree distribution of the network). Examples of small-world networks. Small-world properties are found in many real-world phenomena, including road maps, food chains, electric power grids, metabolite processing networks, networks of brain neurons, voter networks, telephone call graphs, and social influence networks.
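To see both defining properties at once, one can compare a Watts–Strogatz graph against a random graph of the same density. This sketch uses the networkx library; the sizes n, k and rewiring probability p are arbitrary choices:

```python
import networkx as nx

n, k, p = 1000, 10, 0.1
ws = nx.connected_watts_strogatz_graph(n, k, p)  # rewired ring lattice
er = nx.gnp_random_graph(n, k / (n - 1))         # random graph, same density
er = er.subgraph(max(nx.connected_components(er), key=len))  # largest component

for name, g in (("small-world", ws), ("random", er)):
    print(name,
          round(nx.average_clustering(g), 3),
          round(nx.average_shortest_path_length(g), 3))
# Expect similar short path lengths but much higher clustering for the
# small-world graph: the combination described above.
```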

Game theory. Game theory is the study of strategic decision making. Specifically, it is "the study of mathematical models of conflict and cooperation between intelligent rational decision-makers."[1] An alternative term suggested "as a more descriptive name for the discipline" is interactive decision theory.[2] Game theory is mainly used in economics, political science, and psychology, as well as logic, computer science, and biology. The subject first addressed zero-sum games, in which one person's gains exactly equal the net losses of the other participant or participants. Today, however, game theory applies to a wide range of behavioral relations, and has developed into an umbrella term for the logical side of decision science, including both humans and non-humans (e.g. computers, animals). Modern game theory began with the idea of mixed-strategy equilibria in two-person zero-sum games and the proof of their existence by John von Neumann.

Maximum flow problem. [Figure: an example of a flow network with a maximum flow; the source is s, the sink is t, and the numbers denote flow and capacity.] History. Over the years, various improved solutions to the maximum flow problem were discovered, notably the shortest augmenting path algorithm of Edmonds and Karp and independently of Dinitz; the blocking flow algorithm of Dinitz; the push-relabel algorithm of Goldberg and Tarjan; and the binary blocking flow algorithm of Goldberg and Rao. Definition. Let $N = (V, E)$ be a network with $s, t \in V$ being the source and the sink of $N$ respectively. The capacity of an edge is a mapping $c : E \to \mathbb{R}^+$, denoted by $c_{uv}$ or $c(u, v)$. A flow is a mapping $f : E \to \mathbb{R}^+$, denoted by $f_{uv}$ or $f(u, v)$, subject to the following two constraints: $f_{uv} \le c_{uv}$ for each $(u, v) \in E$ (capacity constraint: the flow of an edge cannot exceed its capacity), and $\sum_{u : (u, v) \in E} f_{uv} = \sum_{u : (v, u) \in E} f_{vu}$ for each $v \in V \setminus \{s, t\}$ (conservation of flows: the sum of the flows entering a node must equal the sum of the flows exiting a node, except for the source and the sink nodes). The value of flow is defined by $|f| = \sum_{v : (s, v) \in E} f_{sv}$, where $s$ is the source of $N$; it represents the amount of flow passing from the source to the sink. The maximum flow problem is to maximize $|f|$, that is, to route as much flow as possible from $s$ to $t$.
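A compact sketch of the Edmonds–Karp approach (shortest augmenting paths found by BFS, one of the improved algorithms named above). Capacities sit in a nested dict; the example network is made up for illustration:

```python
from collections import deque

def max_flow(capacity, s, t):
    """Return the maximum s-t flow value; capacity[u][v] is an edge capacity."""
    # Build residual capacities, including zero-capacity reverse edges.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # BFS for a shortest augmenting path with spare residual capacity.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow  # no augmenting path remains: flow is maximum
        # Find the bottleneck along the path, then push that much flow.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        flow += bottleneck

net = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}, "t": {}}
print(max_flow(net, "s", "t"))  # 5
```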
