Raphtory: Streaming analysis of distributed temporal graphs

Submitted by richard on Sun, 01/16/2022 - 11:39
Benjamin Steer, Felix Cuadrado, Richard G. Clegg
Future Generation Computer Systems
Temporal graphs capture the development of relationships within data throughout time. This model fits naturally within a streaming architecture, where new events can be inserted directly into the graph upon arrival from a data source and be compared to related entities or historical state. However, the vast majority of graph processing systems only consider traditional graph analysis on static data, with some outliers supporting batched updating and temporal analysis across graph snapshots. In this work we define a temporal graph model which can be updated via event streams and discuss the challenges of distribution and graph maintenance. To solve these challenges, we introduce Raphtory, a distributed temporal graph management system which maintains the full graph history in memory, leveraging this to insert streamed events directly into the model without batching or centralised ordering. Raphtory additionally provides an API to perform both approximative analysis on the most up-to-date version of the graph and temporal analysis throughout its full history, executed in parallel with ingestion.
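As a rough illustration of the idea in the abstract, the sketch below keeps a time-sorted history per edge so that events can be inserted in any arrival order and the graph can be viewed as it stood at any past time. It is a minimal single-process toy, not Raphtory's actual API or data model: the class `TemporalGraph` and its methods `add_edge`, `remove_edge` and `edges_at` are hypothetical names chosen for this example, and the rule that an addition wins over a deletion with the same timestamp is an assumption of the sketch.

```python
import bisect
from collections import defaultdict


class TemporalGraph:
    """Toy in-memory temporal graph (hypothetical, not Raphtory's API).

    Every edge keeps its full history as a time-sorted list of events.
    """

    def __init__(self):
        # (src, dst) -> sorted list of (timestamp, alive) events
        self._edge_history = defaultdict(list)

    def add_edge(self, time, src, dst):
        self._record(time, src, dst, True)

    def remove_edge(self, time, src, dst):
        self._record(time, src, dst, False)

    def _record(self, time, src, dst, alive):
        # insort places the event by timestamp, so events may arrive
        # out of order without any central ordering step.
        bisect.insort(self._edge_history[(src, dst)], (time, alive))

    def edges_at(self, time):
        """Return the set of edges alive at `time`."""
        alive_edges = set()
        for edge, history in self._edge_history.items():
            # Position just past every event with timestamp <= time.
            # Assumption: at equal timestamps an addition sorts last and wins.
            idx = bisect.bisect_right(history, (time, True))
            if idx > 0 and history[idx - 1][1]:
                alive_edges.add(edge)
        return alive_edges


# Usage: the deletion at t=5 arrives before the addition at t=1.
g = TemporalGraph()
g.remove_edge(5, "a", "b")
g.add_edge(1, "a", "b")
g.add_edge(3, "b", "c")
print(sorted(g.edges_at(4)))
```

Because each history is ordered by event time rather than arrival time, the late-arriving addition still produces the correct view at every point in the timeline. A distributed system would additionally have to partition these histories across machines, which this sketch does not attempt.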
This is the first journal paper describing the Raphtory distributed streaming system. Although the system design has moved on since the paper was written, much of the overall architecture description remains relevant.
author = {Steer, Benjamin and Cuadrado, Felix and Clegg, Richard G.},
journal = {Future Generation Computer Systems},
volume = {101},
pages = {453--464},
year = {2020}