WWW 2022 Best Student Paper: An Architecture Search System for GNNs

Paper link: https://dl.acm.org/doi/10.1145/3485447.3511986

arXiv: https://arxiv.org/pdf/2203.00638.pdf

System link: https://github.com/PKU-DAIR/SGL (SGL: A scalable graph learning toolkit for extremely large graph datasets; WWW'22 🏆 Best Student Paper Award)

This paper proposes PaSca, a new paradigm and system that offers a principled approach to systemically construct and explore the design space for scalable GNNs

In short: PaSca is a new paradigm and system that offers a principled, systematic way to construct and explore the design space of scalable GNNs.

we implement an auto-search engine that can automatically search well-performing and scalable GNN architectures

In short: it automatically searches for GNN architectures that are both well-performing and scalable.

The searched architectures beat JK-Net by up to 0.4 percentage points in accuracy, with a clear speedup in training as well.

Most GNNs have to perform a recursive neighborhood expansion to gather neural messages repeatedly. This expansion causes resource consumption that grows exponentially with the number of layers, which is one of the main bottlenecks in large-scale GNN computation today.
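To get a feel for the growth, here is a back-of-the-envelope sketch (the average degree below is a made-up number, not from the paper): with average degree d, an L-layer GNN that recursively expands neighborhoods touches on the order of d^L supporting nodes per target node.

```python
# Toy illustration (hypothetical numbers, not from the paper): the receptive field of
# recursive neighborhood expansion grows roughly as avg_degree ** num_layers.
avg_degree = 15
for num_layers in range(1, 6):
    touched = avg_degree ** num_layers
    print(f"{num_layers}-layer GNN -> ~{touched:,} supporting nodes per target node")
```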

To demonstrate this issue, we utilize distributed training functions provided by DGL [2] to execute the training pipeline of GraphSAGE.

(speedup/scaling results; see the figure in the paper)
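For context, here is a minimal single-process sketch of mini-batch GraphSAGE training with DGL's neighbor-sampling API. It is my own simplification: the random graph, fan-outs, and hyperparameters are placeholders, and the paper's experiment uses DGL's distributed training across partitioned machines, which is not shown here.

```python
import dgl
import dgl.nn as dglnn
import torch
import torch.nn as nn
import torch.nn.functional as F

# Synthetic placeholder graph: 10k nodes, 100k random edges, 32-dim features, 5 classes.
g = dgl.rand_graph(10_000, 100_000)
g.ndata["feat"] = torch.randn(g.num_nodes(), 32)
g.ndata["label"] = torch.randint(0, 5, (g.num_nodes(),))
train_nids = torch.arange(2_000)

class SAGE(nn.Module):
    def __init__(self, in_feats, hidden, n_classes):
        super().__init__()
        self.layers = nn.ModuleList([
            dglnn.SAGEConv(in_feats, hidden, "mean"),
            dglnn.SAGEConv(hidden, n_classes, "mean"),
        ])

    def forward(self, blocks, x):
        h = x
        for i, (layer, block) in enumerate(zip(self.layers, blocks)):
            h = layer(block, h)
            if i < len(self.layers) - 1:
                h = F.relu(h)
        return h

# Each layer samples a fixed fan-out of neighbors; even so, the sampled subgraph
# per mini-batch grows multiplicatively with the number of layers.
sampler = dgl.dataloading.NeighborSampler([10, 25])
loader = dgl.dataloading.DataLoader(g, train_nids, sampler,
                                    batch_size=1024, shuffle=True, drop_last=False)

model = SAGE(32, 64, 5)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for input_nodes, output_nodes, blocks in loader:
    x = blocks[0].srcdata["feat"]
    y = blocks[-1].dstdata["label"]
    loss = F.cross_entropy(model(blocks, x), y)
    opt.zero_grad(); loss.backward(); opt.step()
```

Even with fixed fan-outs, the per-batch subgraph grows multiplicatively with depth; this neighborhood-expansion cost is exactly what PaSca's decoupled paradigm is designed to avoid.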

This leads to the proposed system, PaSca, with the following highlights:

  • Scalable Paradigm.
  • Auto-search Engines.

Concretely, the representatives (i.e., PaSca-V2 and PaSca-V3) outperform the state-of-the-art JK-Net by 0.2% and 0.4% in predictive accuracy on our industry dataset, while achieving up to 56.6× and 28.3× training speedups, respectively.

  • Relevance to Web.

Key designs inside the system:

(I am not strong at math; see the paper for the exact formulas.)

  • GNN Pipelines.
  • Scalable GNN Instances.
  • Graph Neural Architecture Search.

The paper proposes SGAP, the Scalable Graph Neural Architecture Paradigm:

It differs from the previous NMP and DNMP frameworks in terms of message type, message scale, and pipeline. SGAP decomposes the computation into three stages (a toy sketch follows this list):

  • Pre-processing.
  • Model-training.
  • Post-processing.
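As a concrete, heavily simplified illustration of the three stages, here is a self-contained toy sketch: parameter-free feature propagation in pre-processing, a plain MLP in model-training, and label propagation in post-processing. The toy graph, the "mean" aggregator, and all hyperparameters are my own placeholder choices, not the PaSca/SGL implementation.

```python
import torch
import torch.nn.functional as F

def sym_norm_adj(adj):
    """Symmetrically normalized adjacency with self-loops: D^-1/2 (A + I) D^-1/2."""
    adj = adj + torch.eye(adj.size(0))
    deg_inv_sqrt = adj.sum(dim=1).pow(-0.5)
    return deg_inv_sqrt.unsqueeze(1) * adj * deg_inv_sqrt.unsqueeze(0)

# Synthetic toy graph: 200 nodes, ~5% edge density, 16-dim features, 4 classes.
torch.manual_seed(0)
n, d, c, k = 200, 16, 4, 3
adj = (torch.rand(n, n) < 0.05).float()
adj = ((adj + adj.t()) > 0).float()
x = torch.randn(n, d)
y = torch.randint(0, c, (n,))
train_mask = torch.rand(n) < 0.6

a_hat = sym_norm_adj(adj)

# 1) Pre-processing: parameter-free graph aggregation, executed once.
#    Collect k propagated versions of the features and combine them (mean here).
msgs = [x]
for _ in range(k):
    msgs.append(a_hat @ msgs[-1])
x_agg = torch.stack(msgs).mean(dim=0)

# 2) Model-training: an ordinary MLP on the pre-aggregated features.
#    No graph access is needed, so mini-batching behaves like tabular training.
mlp = torch.nn.Sequential(torch.nn.Linear(d, 64), torch.nn.ReLU(), torch.nn.Linear(64, c))
opt = torch.optim.Adam(mlp.parameters(), lr=1e-2)
for _ in range(200):
    loss = F.cross_entropy(mlp(x_agg[train_mask]), y[train_mask])
    opt.zero_grad(); loss.backward(); opt.step()

# 3) Post-processing: propagate the soft predictions over the graph (parameter-free again).
with torch.no_grad():
    soft = F.softmax(mlp(x_agg), dim=1)
    for _ in range(k):
        soft = a_hat @ soft
    acc = (soft.argmax(dim=1)[train_mask] == y[train_mask]).float().mean().item()
print(f"train accuracy after post-processing: {acc:.3f}")
```

The point of the paradigm: only the model-training stage has learnable parameters, while pre- and post-processing are parameter-free graph aggregations executed outside the training loop, which is what makes it scale.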

Search Engine:

The overall structure of the search engine (figure omitted; see the paper):

Evaluation Engine, with two components (a toy sketch of how they interact in the search loop follows the list):

  • Graph Data Aggregator.
  • Neural Architecture Trainer.
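To make the division of labor concrete, here is a toy sketch: a plain random search stands in for the paper's actual search strategy, the Graph Data Aggregator is modeled as a cache of parameter-free aggregations keyed by the pre-processing configuration, and the Neural Architecture Trainer fits a small MLP per candidate. The graph, the search space, and all hyperparameters are placeholder assumptions, not SGL's actual API.

```python
import random
import torch
import torch.nn.functional as F

# Synthetic toy data (placeholders, as in the SGAP sketch above).
torch.manual_seed(0)
n, d, c = 300, 16, 4
adj = (torch.rand(n, n) < 0.05).float()
adj = ((adj + adj.t()) > 0).float() + torch.eye(n)
deg_inv_sqrt = adj.sum(dim=1).pow(-0.5)
a_hat = deg_inv_sqrt.unsqueeze(1) * adj * deg_inv_sqrt.unsqueeze(0)
x, y = torch.randn(n, d), torch.randint(0, c, (n,))
train_mask = torch.rand(n) < 0.5
val_mask = ~train_mask

# "Graph Data Aggregator": parameter-free aggregations, cached per configuration so
# different candidate architectures can reuse the same pre-processed features.
agg_cache = {}
def aggregate(steps, aggregator):
    key = (steps, aggregator)
    if key not in agg_cache:
        msgs = [x]
        for _ in range(steps):
            msgs.append(a_hat @ msgs[-1])
        stacked = torch.stack(msgs)
        agg_cache[key] = stacked.mean(dim=0) if aggregator == "mean" else stacked[-1]
    return agg_cache[key]

# "Neural Architecture Trainer": fits a cheap MLP on the aggregated features and
# reports validation accuracy for the candidate configuration.
def train_and_eval(feats, hidden):
    mlp = torch.nn.Sequential(torch.nn.Linear(d, hidden), torch.nn.ReLU(),
                              torch.nn.Linear(hidden, c))
    opt = torch.optim.Adam(mlp.parameters(), lr=1e-2)
    for _ in range(100):
        loss = F.cross_entropy(mlp(feats[train_mask]), y[train_mask])
        opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        return (mlp(feats[val_mask]).argmax(dim=1) == y[val_mask]).float().mean().item()

# Search loop over a tiny SGAP-style design space (random search here, for illustration only).
best = None
for _ in range(10):
    cfg = {"steps": random.choice([1, 2, 3, 5]),
           "aggregator": random.choice(["mean", "last"]),
           "hidden": random.choice([32, 64])}
    acc = train_and_eval(aggregate(cfg["steps"], cfg["aggregator"]), cfg["hidden"])
    if best is None or acc > best[0]:
        best = (acc, cfg)
print("best candidate:", best)
```

In this sketch, caching the aggregations is what keeps evaluating many candidates cheap: the expensive graph pass is shared across candidates, and each one only needs a fast model fit.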

Finally, the experimental results (figures and tables omitted; see the paper).