DELIS IP-project. Evolution and structure of the Internet. Alessandro Vespignani (CNRS, LPT-Paris). Marc Barthelemy (CEA-Paris). Alain Barrat (CNRS, LPT-Paris). Romualdo Pastor-Satorras (UPC-Barcelona). Yamir Moreno (University of Zaragoza)
DELIS IP-project: Evolution and structure of the Internet
Alessandro Vespignani (CNRS, LPT-Paris) • Marc Barthelemy (CEA-Paris) • Alain Barrat (CNRS, LPT-Paris) • Romualdo Pastor-Satorras (UPC-Barcelona) • Yamir Moreno (University of Zaragoza) • Alexei Vazquez (University of Notre Dame) • Roberto Percacci (INFN) • Luca Dall’Asta (CNRS, LPT-Paris) • Ignacio Alvarez Hamelin (CNRS, LPT-Paris)
The Physical Internet • Satellites • Computers (routers) • Modems (??) • Phone cables • Optic fibers • EM waves Technological Heterogeneity
A network is a system that admits an abstract (mathematical) representation as a graph. Vertices (nodes) = elements of the system. Edges (links) = interactions/relations among the elements of the system.
Internet tomography Claffy et al. (1999). • Multi-probe reconstruction (router level) • Use of BGP tables for the Autonomous System level (domains) • CAIDA • NLANR • RIPE • IPM • PingER Topology and performance measurements. Graph representation at different granularities.
Small world properties Shortest path = minimum number of hops between two nodes. Regular lattice with $N = 10^4$: $d \sim 10^2$. Small world with $N = 10^4$: $d \sim \ln N$.
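The contrast between lattice-like and small-world path lengths can be illustrated with a minimal sketch. It assumes a one-dimensional ring as the regular lattice (chosen only for brevity; the slide's lattice is not specified) and adds random shortcut edges as a crude stand-in for a small-world construction. The function names `ring_lattice`, `add_shortcuts` and `average_path_length` are hypothetical.

```python
import random
from collections import deque

def ring_lattice(n):
    """Adjacency sets for a ring: each node links to its two nearest neighbours."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}

def add_shortcuts(adj, count, seed=0):
    """Return a copy of adj with `count` extra edges between random non-adjacent pairs."""
    rng = random.Random(seed)
    new = {v: set(nb) for v, nb in adj.items()}
    nodes = list(new)
    added = 0
    while added < count:
        a, b = rng.sample(nodes, 2)
        if b not in new[a]:
            new[a].add(b)
            new[b].add(a)
            added += 1
    return new

def average_path_length(adj):
    """Mean hop count over all ordered pairs of distinct, mutually reachable nodes."""
    total = pairs = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:  # breadth-first search from src
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

if __name__ == "__main__":
    lattice = ring_lattice(1000)
    print("ring:", average_path_length(lattice))
    print("ring + 50 shortcuts:", average_path_length(add_shortcuts(lattice, 50)))
```

A handful of shortcuts is enough to shorten typical paths considerably, which is the qualitative point of the slide.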
Small world properties Distribution of shortest paths (# hops) between two nodes. Average fraction of nodes within a shortest path of length d.
Haphazard set of points and lines Randomness This does not imply complexity!!
Poisson distribution Erdős-Rényi model (1960) With probability $p$ an edge is established between each pair of vertices. $\langle k \rangle = pN$
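A minimal sketch of the construction just described, assuming an undirected graph without self-loops; the names `erdos_renyi` and `mean_degree` are hypothetical. For large N the measured mean degree should be close to pN.

```python
import random

def erdos_renyi(n, p, seed=0):
    """Link every pair of the n vertices independently with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def mean_degree(adj):
    """Average number of links per vertex."""
    return sum(len(nb) for nb in adj.values()) / len(adj)

if __name__ == "__main__":
    g = erdos_renyi(1000, 0.01)
    print("measured <k>:", mean_degree(g), "expected about", 0.01 * 1000)
```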
(Late 90s: large network graphs and data become available) Where are “the complications”?
Clustering coefficient = connected peers will likely know each other (higher probability to be connected). For a node with neighbors 1, 2, …, n: $C = \dfrac{\#\text{ of links between the } n \text{ neighbors}}{n(n-1)/2}$
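The definition above translates directly into a short function. This is a sketch that assumes the graph is stored as a dictionary of neighbour sets and that nodes with fewer than two neighbours are assigned C = 0 (a common convention, not stated on the slide); `local_clustering` is a hypothetical name.

```python
def local_clustering(adj, v):
    """Fraction of the n(n-1)/2 possible links among v's neighbours that exist."""
    nbrs = list(adj[v])
    n = len(nbrs)
    if n < 2:
        return 0.0  # convention assumed here for nodes with fewer than two neighbours
    links = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if nbrs[j] in adj[nbrs[i]]
    )
    return links / (n * (n - 1) / 2)

if __name__ == "__main__":
    # Triangle 0-1-2 plus a pendant node 3 attached to node 0.
    graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
    for node in graph:
        print(node, local_clustering(graph, node))
```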
Connectivity distribution $P(k)$ = probability that a node has $k$ links • Router level & AS level: $P(k) \sim k^{-\gamma}$ ($2 < \gamma \le 3$) • $\langle k \rangle$ = const • $\langle k^2 \rangle \to \infty$ Diverging fluctuations. Scale-free properties. Faloutsos et al. 1999; Pastor-Satorras, Vazquez & Vespignani, PRL 87, 258701 (2001)
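The statement that the mean degree stays finite while the fluctuations diverge can be checked numerically. The sketch below assumes a pure power law truncated at a maximum degree `k_max` (the cutoff and the name `truncated_moments` are illustrative assumptions) and shows how the first two moments respond when the cutoff is raised.

```python
def truncated_moments(gamma, k_max, k_min=1):
    """Return (<k>, <k^2>) for P(k) proportional to k**-gamma on k_min..k_max."""
    ks = range(k_min, k_max + 1)
    weights = [k ** -gamma for k in ks]
    norm = sum(weights)
    mean_k = sum(k * w for k, w in zip(ks, weights)) / norm
    mean_k2 = sum(k * k * w for k, w in zip(ks, weights)) / norm
    return mean_k, mean_k2

if __name__ == "__main__":
    for cutoff in (10**3, 10**4, 10**5):
        mean_k, mean_k2 = truncated_moments(2.2, cutoff)
        print(cutoff, round(mean_k, 2), round(mean_k2, 1))
```

For an exponent between 2 and 3 the first column of moments changes little with the cutoff, while the second keeps growing, which is what "diverging fluctuations" refers to.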
Classical Internet topology generators • Waxman generator • Structural generators • Transit-stub • Tiers Exponentially bounded degree distributions. Scale-free topology generators: INET (Jin, Chen, Jamin), BRITE (Medina & Matta). Modeling of the Internet structure with ad-hoc algorithms tailored to the properties we consider most relevant.
The Internet growth • In 1999: 3410 new ASs, 1713 lost ASs • Number of ASs: 3112 (1997), 9107 (2000) Evolving system. Pastor-Satorras, Vazquez & Vespignani, PRL 87, 258701 (2001); Qian, Govindan et al. (2002)
Main Features of complex networks • Many interacting units • Dynamical evolution • Self-organization Supervising entity Project/blueprint Non-trivial architecture Unexpected emergent properties Cooperative phenomena Complexity
Statistical physics approach to network modeling Microscopic processes of the many component units Macroscopic statistical and dynamical properties of the system Cooperative phenomena Complex topology Natural outcome of the dynamical evolution
Shift in focus: Dynamical processes. Modeling starts from the understanding of the basic mechanisms underlying the networks’ growth. Complex topology is spontaneously generated in the models (as opposed to ad-hoc constructions). Richer understanding of the interplay among dynamics, traffic and economic requirements.
Preferential attachment mechanism Networks expand by the addition of new nodes. Examples: WWW: addition of new documents; Internet: connection of new routers. Nodes are wired with higher probability to highly connected nodes. Examples: WWW: links to well-known web pages; Internet: links to well-connected ISPs.
How to generate a scale-free graph: the BA model, by Barabasi & Albert (1999). Growth: at each time step a new node is added with $m$ links to be connected to previous nodes. Preferential attachment: the probability that a new link is connected to a given node is proportional to the number of that node’s links, $\Pi(k_i) = k_i / \sum_j k_j$. The generated connectivity distribution is $P(k) \sim k^{-3}$.
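The two rules (growth and preferential attachment) can be sketched in a few lines. The slide does not fix the initial graph, so this sketch assumes a complete core of m + 1 nodes; it also assumes each new node picks m distinct targets. The names `barabasi_albert` and `degrees` are hypothetical.

```python
import random

def barabasi_albert(n, m, seed=0):
    """Grow a graph to n nodes; each new node attaches to m distinct existing nodes
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Assumed seed graph: a complete core of m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)]
    # Every node appears in `stubs` once per incident edge, so a uniform draw
    # from this list is a degree-proportional draw.
    stubs = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

def degrees(edges):
    """Map each node to its number of links."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg

if __name__ == "__main__":
    deg = degrees(barabasi_albert(5000, 2))
    print("min degree:", min(deg.values()), "max degree:", max(deg.values()))
```

The stub list avoids recomputing the normalisation sum at every step; the early nodes end up with degrees far above the minimum m, the signature of a heavy-tailed distribution.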
Degree distribution BA network
Preferential Attachment in Internet Probability that a link connects to a node with connectivity $k$: $\mu(k) \sim k^{\alpha} P(k) \sim k^{\alpha - \gamma}$, with $\alpha - \gamma = -1.2$, so $\alpha \approx 1.0$. Linear preferential attachment. Pastor-Satorras, Vazquez & Vespignani, PRL 87, 258701 (2001); Qian, Govindan et al. (2002); Jeong, Neda and Barabasi (2003)
Shift of focus: static construction → dynamical evolution. Direct problem: evolution rules → emerging topology. Inverse problem: given topology → evolution rules.
More models • Generalized BA model (Redner et al. 2000; Mendes & Dorogovtsev 2000; Albert et al. 2000): non-linear preferential attachment, $\Pi(k) \sim k^{\alpha}$; initial attractiveness, $\Pi(k) \sim A + k$; rewiring • Highly clustered (Eguiluz & Klemm 2002) • Fitness model (Bianconi et al. 2001) • Multiplicative noise (Huberman & Adamic 1999)
Heuristically Optimized Trade-offs (HOT) Papadimitriou et al. (2002) New vertex $i$ connects to vertex $j$ by minimizing the function $Y(i,j) = \alpha\, d(i,j) + V(j)$, where $d$ = Euclidean distance and $V(j)$ = measure of centrality. Optimization of conflicting objectives.
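A minimal sketch of this trade-off rule, under assumptions the slide does not fix: nodes arrive at uniformly random positions in the unit square, each new node creates exactly one link, and the centrality measure V(j) is taken to be the hop distance from j to the first node. The name `hot_tree` is hypothetical.

```python
import math
import random

def hot_tree(n, alpha, seed=0):
    """Return parent[i] for each node i; node 0 is the root (parent None).
    Node i attaches to the earlier node j minimizing alpha * d(i, j) + V(j)."""
    rng = random.Random(seed)
    pos = [(rng.random(), rng.random())]
    hops = [0]  # V(j): hop distance from j to the root (assumed centrality measure)
    parent = [None]
    for i in range(1, n):
        p = (rng.random(), rng.random())
        best = min(range(i), key=lambda j: alpha * math.dist(p, pos[j]) + hops[j])
        pos.append(p)
        parent.append(best)
        hops.append(hops[best] + 1)
    return parent

if __name__ == "__main__":
    for alpha in (0.0, 4.0, 1000.0):
        parent = hot_tree(500, alpha)
        hub_links = sum(1 for p in parent if p == 0)
        print("alpha =", alpha, "links to root:", hub_links)
```

With alpha = 0 distance is free and every node links to the root (a star); with a very large alpha each node simply links to its nearest predecessor. Intermediate values balance the two conflicting objectives.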
What else…… • Hierarchies and correlations (architecture) • Robustness and resilience • Spreading phenomena • Routing and database updating
The Hierarchy of the Internet • Stub AS : has only one connection to another AS • Multi-homed AS: two or more connections to other ASs but does not carry transit traffic • Transit AS: Two or more connections to other ASs and carries transit traffic HIERARCHICAL DECOMPOSITION • Four level hierarchy (linear scale) (Govindan and Reddy 1994) • Three-tier hierarchy (log scale) (Chang et al. 98) • Jellyfish hierarchy (connectivity shells) (Tauro et al. 2001)
Connectivity correlations Average nearest neighbors degree (degree correlation function): $\langle k_{nn}(k) \rangle = \sum_{k'} k'\, p(k'|k)$ Pastor-Satorras, Vazquez & Vespignani, PRL 87, 258701 (2001)
Connectivity Hierarchy Average nearest neighbors degree (degree correlation function): $\langle k_{nn}(k) \rangle = \sum_{k'} k'\, p(k'|k)$ High-degree ASs connect to low-degree ASs. Low-degree ASs connect to high-degree ASs. No hierarchy for the router map.
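The degree correlation function can be estimated from a graph by averaging, over all nodes of degree k, the mean degree of their neighbours. The sketch below assumes a dictionary-of-sets graph; `knn_by_degree` is a hypothetical name.

```python
from collections import defaultdict

def knn_by_degree(adj):
    """Map each degree k to the average nearest-neighbour degree of nodes with degree k."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k == 0:
            continue  # isolated nodes have no neighbours to average over
        sums[k] += sum(len(adj[u]) for u in nbrs) / k
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}

if __name__ == "__main__":
    # Star graph: one hub (node 0) with four leaves.
    star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
    print(knn_by_degree(star))
```

In the star, the high-degree hub sees only degree-1 neighbours and the leaves see only the hub: a decreasing $\langle k_{nn}(k) \rangle$, the same qualitative pattern the slide reports for the AS map.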
Clustering Hierarchy Clustering coefficient as a function of the vertex degree. High-degree ASs bridge otherwise unconnected regions of the Internet. Low-degree ASs have links with highly interconnected regions of the Internet. No hierarchy at the router level.
Scale-free hierarchy • Continuum of levels Modular construction Small groups of networks organized in larger groups which act as the modules at the next level and so on “ad libitum”
Models validation (part II) Degree hierarchy Clustering hierarchy
Scale-free connectivity Density of infected individuals • Absence of any epidemic threshold (critical point) • Active state for any value of $\lambda$ • The infection pervades the system whatever the spreading rate • In infinite systems the infection is infinitely persistent (indefinite stationary state) Pastor-Satorras & Vespignani, PRL 86, 3200 (2001)
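One way to see why the threshold vanishes is the heterogeneous mean-field estimate for SIS-type spreading on uncorrelated networks, $\lambda_c = \langle k \rangle / \langle k^2 \rangle$. This expression is not written on the slide; it is brought in here as the standard result from the same line of work. The sketch evaluates it for a degree sequence; `epidemic_threshold` is a hypothetical name.

```python
def epidemic_threshold(degrees):
    """Heterogeneous mean-field estimate lambda_c = <k> / <k^2> for a degree sequence."""
    n = len(degrees)
    mean_k = sum(degrees) / n
    mean_k2 = sum(k * k for k in degrees) / n
    return mean_k / mean_k2

if __name__ == "__main__":
    homogeneous = [4] * 100
    one_big_hub = [1] * 99 + [99]
    print("homogeneous:", epidemic_threshold(homogeneous))
    print("with a hub:", epidemic_threshold(one_big_hub))
```

A single large hub already pushes the estimate far below the homogeneous value; as $\langle k^2 \rangle$ diverges in a scale-free network, the estimate goes to zero.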
Rationalization of computer virus data • Lack of healthy phase = standard immunization cannot drive the system below threshold!!!
Distributed database updating Broadcast = each element passes the update to its neighbors. Epidemics = the update is spread as an infective agent. $E$ = efficiency = (# updated databases) / (# of messages sent) Warning => not deterministic: not all elements are contacted!! Trade-offs between efficiency and reliability. Moreno, Nekovee, Vespignani (03)
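The efficiency/reliability trade-off can be illustrated with a toy simulation. This is not the model of Moreno, Nekovee and Vespignani; it assumes a simpler rule in which every updated element forwards the update to each neighbour independently with probability `forward_prob`, so that `forward_prob = 1` corresponds to a broadcast. The names `spread_update` and `efficiency` are hypothetical.

```python
import random
from collections import deque

def spread_update(adj, source, forward_prob, seed=0):
    """Return (number of updated elements, number of messages sent)."""
    rng = random.Random(seed)
    updated = {source}
    queue = deque([source])
    messages = 0
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if rng.random() < forward_prob:
                messages += 1  # a message is sent even if v already has the update
                if v not in updated:
                    updated.add(v)
                    queue.append(v)
    return len(updated), messages

def efficiency(updated, messages):
    """E = (# updated databases) / (# messages sent); 0.0 if nothing was sent."""
    return updated / messages if messages else 0.0

if __name__ == "__main__":
    ring = {i: {(i - 1) % 200, (i + 1) % 200} for i in range(200)}
    for prob in (1.0, 0.8, 0.5):
        reached, sent = spread_update(ring, 0, prob)
        print(prob, "reached:", reached, "sent:", sent, "E:", efficiency(reached, sent))
```

With probabilistic forwarding fewer messages are sent, but the update may die out before reaching everyone, which is the reliability cost noted on the slide.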
The Internet is ever changing at all levels. Is the attempt to have a dynamical theory of the Internet at the large scale too ambitious? The lesson of statistical physics and cooperative phenomena: basic symmetries and principles win over the microscopic details when we look at emergent properties.
Internet mapping • Deployment of measurement tools • Active • Passive Netscan (traceroute-based tool) maps the paths to selected IP addresses from a testing host (single probe). • One path to each node = directed graph, spanning tree • NO cross-paths Burch & Cheswick (1999)
Interconnected level maps • Heuristic methods (Govindan et al.) • Router level maps Very effective for intranetwork
Measurement infrastructures Merging partial spanning trees from multiple sources
Sampling is incomplete Lateral connectivity is missed (edges are underestimated) Finite size sample Govindan et al 2002
Introduction of biases Vertices and edges are best sampled in the proximity of sources. Number of sources and targets (total traceroute probes). Statistical properties of the sampled graph can be sharply different from those of the original one. Crovella et al. 2002; Clauset & Moore 2004; De Los Rios & Petermann 2004
Mean-Field Theory of traceroute-like exploration $N_s$ = # sources, $N_t$ = # targets ($\rho_t = N_t/N$ = density of targets), $\epsilon = N_s N_t / N = \rho_t N_s$. Edge detection probability: $p_{ij} = 1 - \exp(-\epsilon\, b_{ij})$. Vertex detection probability: $p_i = 1 - (1 - \rho_t)\exp(-\epsilon\, b_i)$. Effective degree observed: $k^*_i = 2\epsilon + 2\epsilon\, b_i$. $b_i$, $b_{ij}$ = betweenness.
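The mean-field expressions on this slide are simple enough to evaluate directly. The sketch below just transcribes them; the function names are hypothetical, and the betweenness values passed in are assumed to use whatever normalisation the slide's formulas presuppose.

```python
import math

def exploration_density(n_sources, n_targets, n_nodes):
    """epsilon = N_s * N_t / N, equivalently rho_t * N_s."""
    return n_sources * n_targets / n_nodes

def edge_detection_prob(eps, b_ij):
    """p_ij = 1 - exp(-epsilon * b_ij) for an edge with betweenness b_ij."""
    return 1.0 - math.exp(-eps * b_ij)

def vertex_detection_prob(eps, rho_t, b_i):
    """p_i = 1 - (1 - rho_t) * exp(-epsilon * b_i) for a vertex with betweenness b_i."""
    return 1.0 - (1.0 - rho_t) * math.exp(-eps * b_i)

if __name__ == "__main__":
    eps = exploration_density(n_sources=2, n_targets=100, n_nodes=1000)
    for b in (0.0, 1.0, 10.0, 100.0):
        print(b, edge_detection_prob(eps, b), vertex_detection_prob(eps, 0.1, b))
```

Both probabilities rise quickly with betweenness, so high-betweenness vertices and edges are detected almost surely even at modest exploration density; a vertex with zero betweenness is still seen with probability $\rho_t$, the chance that it is itself a target.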
Betweenness centrality = # of shortest paths traversing a vertex or edge (flow of information) if each individual sends a message to all other individuals
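Vertex betweenness can be computed with Brandes' algorithm, sketched here for unweighted, undirected graphs. Two conventions are assumed that the slide does not fix: when several shortest paths join a pair, each carries an equal fraction, and each unordered pair is counted once (endpoints excluded). The name `betweenness` is hypothetical.

```python
from collections import deque

def betweenness(adj):
    """Vertex betweenness via Brandes' algorithm (unweighted, undirected graph)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        sigma = {v: 0 for v in adj}  # number of shortest paths from s to v
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}  # predecessors of v on shortest paths from s
        order = []                    # vertices in order of non-decreasing distance
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        delta = {v: 0.0 for v in adj}  # dependency of s on each vertex
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was accumulated from both of its endpoints.
    return {v: b / 2 for v, b in bc.items()}

if __name__ == "__main__":
    star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
    print(betweenness(star))
```

If every individual sends a message to every other, as in the slide's picture, the ordered-pair count is simply twice the value returned here.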
Scale-free graphs are better discriminated. The tail is sampled very effectively.
Homogeneous graphs give rise to spurious effects. Average connectivity always dominates.
Heavy-tailed properties are a genuine feature of the Internet; however, quantitative analysis might be strongly biased. What else…. • Router level: very limited maps • Optimized strategies • Massive deployment: traceroute@home
The dark side of the moon……Traffic and weights • The Internet is a weighted network • bandwidth, traffic, efficiency, router capacity • Data are scarce and on a limited scale • Interaction between topology and traffic • Traffic and routing