Τίτλος 
Fast shortest path distance estimation in large graphs 
Έγγραφο 
Προβολή εγγράφου

Συγγραφέας 
Michalis Potamias 
Περιγραφή 
Michalis Potamias [22/3/2010] 
Fast shortest path distance estimation in large graphs
This talk will focus on approximate landmarkbased methods for
pointtopoint distance estimation in very large networks. These
methods involve selecting a subset of nodes as landmarks and computing
offline the distances from each node in the graph to those landmarks.
At runtime, when the distance between a pair of nodes is needed, it
can be estimated quickly by combining the precomputed distances. We
prove that selecting the optimal set of landmarks is an NPhard
problem, and thus heuristic solutions need to be employed. We
therefore explore theoretical insights to devise a variety of simple
methods that scale well in very large networks. The efficiency of the
suggested techniques is tested experimentally using five realworld
graphs having millions of edges.
While theoretical bounds support the claim that random landmarks work
well in practice, our extensive experimentation shows that smart
landmark selection can yield dramatically more accurate results: for a
given target accuracy, our methods require as much as 250 times less
space than selecting landmarks at random. In addition, we demonstrate
that at a very small accuracy loss our techniques are several orders
of magnitude faster than the stateoftheart exact methods. We also
apply our methods to the task of social search in large graphs.
Finally, I will briefly talk about the notion of distance in
probabilistic graphs. Complex networks, such as biological, social,
and communication networks, often entail uncertainty, and thus, can be
modeled as probabilistic graphs. Similar to the problem of similarity
search in standard graphs, a fundamental problem for probabilistic
graphs is to answer efficiently knearest neighbor queries (kNN),
which is the problem of computing the k closest nodes to some specific
node. We propose novel distance functions that extend wellknown graph
concepts, such as shortest paths. In order to compute them in
probabilistic graphs, we design algorithms based on sampling and novel
graphtransformation ideas. During kNN query processing we
efficiently prune the search space using new techniques. In our
experiments we demonstrate that our distance functions outperform
alternatives in identifying true neighbors in realworld data. We also
demonstrate that our algorithms scale for graphs with tens of millions
of edges.