大规模图处理
各种网络及其度
无/有向网络中节点的度
无向网络
Degree (or degree centrality) of a vertex: :
, of edges connected to it
eg. 下图中

有向网络
In-degree of a vertex : , of edges pointing to
Out-degree of a vertex : , of edges from
eg. 下图中

各网络及其度的定义
Simple network: If a network has neither self-edges nor multi-edges
Multi-edge (multigraph): If more than one edge between the same pair of vertices
Self-loop: If an edge connects vertex to itself
Direct graph (digraph): If each edge has a direction
Weighted graph: If a weight (a real number ) is associated with each edge

网络图结构及性质
Subgraph: A subset of the nodes and edges in a graph/network
Clique (complete graph): Every node is connected to every other
Singleton vs. dyad (two nodes and their relationship) vs. triad:

Ego-centric network: A net work pull out by selecting a node and all of its connections

度的分布

Degree sequence
The list of degrees of the nodes sorted in non-increasing order. Eg: in ,degree sequence:(4, 3, 2, 2, 1)
Degree frequency distribution
Let denote the of vertices with degree k, is max degree for a node
Eg: in ,degree frequency distribution: (0, 1, 2, 1, 1)
Degree distribution
Probability mass function for random variable :
Eg: in ,degree distribution: (0, 0.2, 0.4, 0.2, 0.2)
路径
Walk
In a graph between nodes and : ordered sequence of vertices, starting at and ending at , s.t. there is an edge between every pair of consecutive vertices
Hop
The length of the walk
Path
A walk with distinct vertices
Shortest Path (geodesic path, )
Geodesic paths are not necessarily unique: It is quite possible to have more than one path of equal length between a given pair of vertices
Diameter of a graph: the length of the longest geodesic path between any pair of vertices in the network for which a path actually exists
Average Path Length( )
Average of the shortest paths between all pairs of nodes:
Eulerian Path
A path that traverses each edge in a network exactly once
Hamilton Path
A path that visits each vertex in a network exactly once
Distance
The length of the shortest path

Total of path length 2 from to ,via any vertex in is:
So, generalizing to path of arbitrary length, we have:
When starting and ending at the same vertex , we have:
of loops can be expressed in terms of adjacency matrix:
Matrix written in the form of , is the orthogonal matrix of eigenvectors and is the diagonal matrix of eigenvalues (SVD):
, where is the -th eigenvalue of the adjacency matrix
Independent Paths
Edge-independent:
Two path connecting a pair of vertices are edge-independent if they share no edges
Vertex-independent:
Two path are vertex-independent if they share no vertices other than the starting and ending vertices
半径与直径

Eccentricity
The eccentricity of a node is the maximum distance from to any other nodes in graph
Eg.
Radius
Radius of a connected graph : the min eccentricity of any node in
Eg.
Diameter
Diameter of a connected graph : the max eccentricity of any node in
Eg.
弱/强连接&出/入元素(有向图)
Weakly/Strongly connected
Weakly: If the vertices are connected by 1 or more paths when one can go either way along any edge

In/Out-component
Out-component: Those reachable from vertex A

簇类系数
因为真实的网络可能十分稀疏,所以需要这么一个衡量系数去衡量一下密集度
Clustering coefficient of a node : A measure of the density of edges in the neighborhood of
Let be the subgraph induced by the neighbors of vertex , ( of neighbors of ), and ( of edges among the neighbors of )
For undirected network:
For directed network:
For a graph: averaging the local clustering coefficient of all the vertices
双边网络
Bipartite Network: two kinds of vertices, and edges linking only vertices of unlike types

Incidence matrix: if vertex links to group , otherwise
The projection to one-mode can be written in terms of the incidence matrix as follow:
The product of will be if and both belong to the sample group in the bi-partite network
协同引证与文献耦合
Co-citation
Co-citation of vertices and : , if and are both cited by

of vertices having outgoing edges pointing to both and :
Co-citation matrix: its a symmetric matrix
Diagonal matrix : total papers citing
Bibliographic coupling
Bibliographic coupling of vertices and : , if and both cite

Bibliographic coupling of and :
Bibliographic coupling matrix:
Diagonal matrix : total papers cited by
Co-citation&Bibliographic Coupling Comparison
Strong co-citation must have a lot of incoming edges
Strong bib-coupling if two papers have similar citations
图计算任务和算法
图计算的任务
1、Graph pattern matching: subgraph matching...
2、 Graph analytics: PageRank ...
图计算的算法
PageRank, Simrank, HITS, Label Propagation, Maximal-Independent-Set, Maximal-Node-Matching, Connected-Component, Shortest Distance ...
程序模型框架
不支持:N 支持:Y
Model
Data/Graph Integration
Both Graph Matching/Analytic
Explicit Loop Control
Systems
Vertex-Centric
N
N
N
Giraph, Pregel, WindCatch, GPS, Grace, Pregel+, TUX
Edge-Centric
N
N
N
PowerGraph, GraphChi, X-Stream, TurboGraph, FlashGraph, Powerlyra
Block-Centric
N
N
N
Giraph++, Blogel
Linear Algebra
N
N
N
PEGASUS, GraphTwist, LA3
Grape
N
Y
N
Grape
DSL
N
N
Y
Green-Marl
Datalog
N
Y
Y
Socialite, Distribute-Socialite, DeALS
Dataflow
Y
Y
Y
GraphX, GraphFrame, Pregelix
SQL-G
Y
Y
Y
SQL-G, EmptyHeaded
Last updated