Listing Dense Subgraphs in Small Memory

Pinto P.; Cruces N.; Hernandez C.

Keywords: web graphs, Graph Pattern Listing, Streaming Algorithms, External Memory Algorithms

Abstract

Listing relevant patterns from graphs is becoming increasingly challenging as Web and social graphs are growing in size at a great rate. This scenario requires to process information more efficiently, including the need of processing data that cannot fit in main memory. Typical approaches for processing data using limited main memory include the streaming and external memory models. This paper addresses the problem of listing dense subgraphs from Web and social graphs using little memory. We propose an external memory algorithm based on K-way merge-sort for clustering and reordering input graphs. We also propose mining heuristics that work well with different stream orders such as URL, BFS, and cluster-based. Our experimental evaluation shows that on Web graphs, in comparison with the in-memory algorithm, the streaming mining heuristic is able to find between 70 and 96% of edges participating in dense subgraphs, uses only between 17 and 25% of the memory, and running times are between 34 and 65%. We further consider an application that uses these dense subgraphs for compressing Web graphs with a representation that enables querying the collection of subgraphs for pattern recovery and basic statistics without decompression.

Más información

Título según WOS: Listing Dense Subgraphs in Small Memory
Título según SCOPUS: Listing dense subgraphs in small memory
Título de la Revista: 2014 9TH LATIN AMERICAN WEB CONGRESS (LA-WEB)
Editorial: IEEE
Fecha de publicación: 2014
Página de inicio: 33
Página final: 41
Idioma: English
DOI:

10.1109/LAWeb.2014.16

Notas: ISI, SCOPUS