Graph Algorithms for Irregular, Unstructured Data

John Feo, Center for Adaptive Supercomputing Software, Pacific Northwest National Laboratory. July 2010.


Presentation Transcript


  1. Graph Algorithms for Irregular, Unstructured Data. John Feo, Center for Adaptive Supercomputing Software, Pacific Northwest National Laboratory. July 2010.

  2. Analytic methods and applications
     Figure labels: Semantic Web; Facebook (300 M users); Train, Anthrax, Bus, Money, Endo, Hayashi, Zaire; People, Places, & Actions; Community Activities; Security; National Security; SmartGrid; Blog Analysis; Anomaly detection; Connect-the-dots; N-x contingency analysis; Community thought leaders

  3. Data analytics
     • 1000x growth in 3 years!
     • Facebook has more than 300 million active users
     • Sample queries:
         • Allegiance switching: identify entities that switch communities
         • Community structure: identify the genesis and dissipation of communities
         • Phase change: identify significant change in the network structure
     • Traditional graph partitioning often fails:
         • Topology: the interaction graph is low-diameter and has no good separators
         • Irregularity: communities are not uniform in size
         • Overlap: individuals are members of one or more communities
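
The "allegiance switching" query on this slide can be illustrated with a minimal sketch. It assumes two snapshots of the network, each given as a mapping from entity to a single community label. That format, the function name `allegiance_switchers`, and the toy data are all hypothetical, and the single-label assumption ignores the overlap issue the slide raises.

```python
def allegiance_switchers(before, after):
    """Return entities present in both snapshots whose community label changed.

    `before` and `after` map entity -> community label (assumed format).
    """
    return sorted(
        entity
        for entity, community in before.items()
        if entity in after and after[entity] != community
    )


# Toy snapshots; names and labels are made up for illustration.
snapshot_t0 = {"ann": "A", "bob": "A", "cy": "B", "dee": "B"}
snapshot_t1 = {"ann": "A", "bob": "B", "cy": "B", "eve": "A"}

switchers = allegiance_switchers(snapshot_t0, snapshot_t1)
print(switchers)  # prints ['bob']
```

Entities that appear in only one snapshot ("dee", "eve") are treated as arrivals or departures rather than switches. The hard part in practice is computing the community labels themselves, which this sketch does not attempt.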

  4. Graphs are not grids
     Scientific Grids:
     • Static or slowly evolving
     • Planar
     • Nearest neighbor communication
     • Work performed per cell or node
     • Work modifies local data
     Graphs for Data Informatics:
     • Dynamic
     • Non-planar
     • Communications are non-local and dynamic
     • Work performed by crawlers or autonomous agents
     • Work modifies data in many places
     Graphs arising in informatics are very different from the grids used in scientific computing.

  5. Small-world and scale-free
     Figure labels: “Six degrees of separation”; large hubs are in grey
     • In scale-free graphs
         • difficult to partition
         • work concentrates in a few nodes
     • In low diameter graphs
         • work explodes
         • difficult to partition
         • high percentage of nodes are visited
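
To make the two claims on this slide concrete, here is a rough sketch that grows a small graph by preferential attachment (one common way to obtain a skewed degree distribution, not necessarily the model behind the slide's figure) and then measures two things: how many edge endpoints the top 1% of nodes hold, and how many nodes a breadth-first search reaches at each depth. The function names, graph size, and seed are assumptions chosen for illustration.

```python
import random


def preferential_attachment(n, m, seed=0):
    """Grow an undirected graph: each new node links to m existing nodes
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    endpoints = []  # each node appears once per incident edge
    # Start from a clique on m + 1 nodes so every node has an edge.
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            adj[u].add(v)
            adj[v].add(u)
            endpoints += [u, v]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints += [new, t]
    return adj


def bfs_level_sizes(adj, source):
    """Return the number of nodes first reached at each BFS depth."""
    seen = {source}
    frontier = [source]
    sizes = []
    while frontier:
        sizes.append(len(frontier))
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return sizes


graph = preferential_attachment(n=2000, m=2, seed=1)
degrees = sorted((len(nbrs) for nbrs in graph.values()), reverse=True)
top = max(1, len(degrees) // 100)
hub_share = sum(degrees[:top]) / sum(degrees)
levels = bfs_level_sizes(graph, source=0)

print(f"top 1% of nodes hold {hub_share:.0%} of edge endpoints")
print("BFS frontier sizes by depth:", levels)
```

On a graph like this one would expect the frontier to cover most of the nodes within a few levels, which is the "work explodes" effect, while the hub share shows why per-node work is uneven.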

  6. Graph methods
     Influential factors:
     • Degree distribution: normal or scale-free
     • Planar or non-planar
     • Static or dynamic
     • Weighted or unweighted
     • Weight distribution
     • Typed or untyped edges
     Callouts: load imbalance; non-planar, difficult to partition; concurrent inserts and deletions
     Methods:
     • Paths: shortest path, betweenness, min/max flow
     • Structures: spanning trees, connected components, graph isomorphism
     • Groups: matching/coloring, partitioning, equivalence
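
As one small instance of the "Structures" family listed on this slide, the sketch below finds connected components with a union-find structure. It is a sequential illustration only; the talk does not say which algorithm was used, and the function name and edge-list format are assumptions.

```python
def connected_components(num_nodes, edges):
    """Group nodes 0..num_nodes-1 into connected components using union-find."""
    parent = list(range(num_nodes))

    def find(x):
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            # The smaller id becomes the root, so results are deterministic.
            parent[max(root_u, root_v)] = min(root_u, root_v)

    groups = {}
    for v in range(num_nodes):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())


edges = [(0, 1), (1, 2), (3, 4)]
components = connected_components(6, edges)
print(components)  # prints [[0, 1, 2], [3, 4], [5]]
```

Every `find` and union touches a single word at an unpredictable address, which is the kind of access pattern the deck argues is hard to partition and hard to cache.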

  7. Challenges
     • Problem size
         • Ton of bytes, not ton of flops
     • Little data locality
         • Have only parallelism to tolerate latencies
     • Low computation to communication ratio
         • Single word access
         • Threads limited by loads and stores
     • Frequent synchronization
         • Node, edge, record
     • Work tends to be dynamic and imbalanced
         • Let any processor execute any thread
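
The last point, letting any processor execute any thread, can be sketched in software as a single shared work queue that idle workers pull from, so one expensive "hub" task does not stall the many cheap ones. This is only an analogy for the scheduling pattern: the function name, the task format `(node, cost)`, and the worker count are hypothetical, and CPython threads do not run this arithmetic in parallel.

```python
import queue
import threading


def process_with_shared_queue(tasks, num_workers=4):
    """Run (node, cost) tasks on worker threads that pull from one shared queue."""
    work = queue.Queue()
    for task in tasks:
        work.put(task)

    results = {}      # node -> computed value
    handled_by = {}   # node -> id of the worker that ran it
    lock = threading.Lock()

    def worker(worker_id):
        while True:
            try:
                node, cost = work.get_nowait()
            except queue.Empty:
                return  # no work left for this worker
            value = sum(range(cost))  # stand-in for work proportional to cost
            with lock:
                results[node] = value
                handled_by[node] = worker_id

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, handled_by


# One heavy task and many light ones, mimicking a skewed degree distribution.
tasks = [("hub", 200_000)] + [(f"leaf{i}", 100) for i in range(50)]
results, handled_by = process_with_shared_queue(tasks)
print(len(results), "tasks completed")
```

A static split of the task list across workers would leave whichever worker received the hub as the bottleneck. With a shared queue the assignment is decided at run time instead.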

  8. Grids, Uniform, and Scale-Free Graphs
     Figure labels: USA Roadmap; METIS Partitioner; Scale-Free; Uniform

  9. System requirements (Cray XMT)
     • Global shared memory
         • No simple data partitions
         • Local storage for thread private data
     • Network support for single word accesses
         • Transfer multiple words when locality exists
     • Multi-threaded processors
         • Hide latency with parallelism
         • Single cycle context switching
         • Multiple outstanding loads and stores per thread
     • Full-and-empty bits
         • Efficient synchronization
         • Wait in memory
     • Message driven operations
         • Dynamic work queues
         • Hardware support for thread migration
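
The full-and-empty bit idea can be approximated in software to show its semantics: a write waits until the word is empty and leaves it full, and a read waits until it is full and leaves it empty. The sketch below emulates this with a condition variable. The class and method names are hypothetical labels for this emulation, not the machine's interface, and on the hardware described the waiting happens in memory rather than in a lock.

```python
import threading


class FullEmptyCell:
    """Software emulation of one memory word tagged with a full/empty bit."""

    def __init__(self):
        self._cond = threading.Condition()
        self._full = False
        self._value = None

    def write_when_empty(self, value):
        """Wait until the cell is empty, store the value, and mark it full."""
        with self._cond:
            while self._full:
                self._cond.wait()
            self._value = value
            self._full = True
            self._cond.notify_all()

    def read_when_full(self):
        """Wait until the cell is full, take the value, and mark it empty."""
        with self._cond:
            while not self._full:
                self._cond.wait()
            value = self._value
            self._full = False
            self._cond.notify_all()
            return value


cell = FullEmptyCell()
received = []


def consumer():
    for _ in range(5):
        received.append(cell.read_when_full())


reader = threading.Thread(target=consumer)
reader.start()
for i in range(5):
    cell.write_when_empty(i)  # blocks until the previous value was consumed
reader.join()
print(received)  # prints [0, 1, 2, 3, 4]
```

Because each write blocks until the previous value has been read, the cell acts as a one-word handoff between threads with no separate lock object visible to the caller. This is the fine-grained, per-word synchronization the slide lists as a requirement.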

  10. Center for Adaptive Supercomputing Software. Sponsored by DOD. Driving development of next-generation massively multithreaded architectures.

  11. Summary
     • The new HPC is irregular and sparse
     • There are commercial and consumer applications
     • If the applications are important enough, machines will be built
     • HPC is too large and too diverse for “one size fits all”
     • We need to build the right machines for the problems we have to solve
