GEEM: Active Learning for Graphs

Some nodes are more informative than others.

Semi-supervised node classification is the problem of inferring the node labels $\mathbf{y}$ over the whole graph, given the graph structure $\mathcal{G}$, the node attributes $\mathbf{X}$, and the labels $\mathbf{y}_{\mathcal{L}}$ of a subset of the nodes.

Using active learning to choose carefully which nodes are labeled, that is, which labels make up the labeled set $Y_{\mathcal{L}}$, can greatly reduce the number of labels needed to reach a given accuracy.

Our GEEM algorithm uses graph-cognizant logistic regression, which is equivalent to a linearized graph convolutional neural network (GCN), in the prediction phase. In the query phase, it selects the node whose label maximizes the expected error reduction.
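To make the two phases concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes that the linearized GCN can be approximated by propagating the features through a symmetrically normalized adjacency matrix with self-loops and then fitting a softmax regression, and that the expected error is estimated as the mean of one minus the largest predicted class probability. The function names (`propagate`, `fit_predict`, `expected_error`, `select_query`), the hyperparameters, and the toy graph are all illustrative assumptions. The sketch retrains the model for every candidate node and every possible label, which is only practical on very small graphs.

```python
import numpy as np

def propagate(A, X, K=2):
    """Return S^K X, with S the symmetrically normalized adjacency plus self-loops
    (an assumed stand-in for the linearized GCN features)."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    S = A_hat / np.sqrt(np.outer(d, d))  # D^{-1/2} (A + I) D^{-1/2}
    Z = X.copy()
    for _ in range(K):
        Z = S @ Z
    return Z

def softmax(logits):
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)

def fit_predict(Z, idx, labels, n_classes, steps=200, lr=0.5, l2=1e-2):
    """Fit softmax regression on the labeled rows Z[idx] by gradient descent,
    then return class probabilities for every node."""
    W = np.zeros((Z.shape[1], n_classes))
    Y = np.eye(n_classes)[labels]  # one-hot targets
    for _ in range(steps):
        P = softmax(Z[idx] @ W)
        W -= lr * (Z[idx].T @ (P - Y) / len(idx) + l2 * W)
    return softmax(Z @ W)

def expected_error(P, pool):
    """Estimated zero-one risk on `pool`: mean of 1 - max_k p(k | node)."""
    return float(np.mean(1.0 - P[pool].max(axis=1)))

def select_query(Z, labeled, labels, unlabeled, n_classes):
    """Return the unlabeled node with the lowest expected error after labeling it.
    The expectation over its unknown label uses the current model's predictions."""
    P = fit_predict(Z, labeled, labels, n_classes)
    best_node, best_risk = None, np.inf
    for q in unlabeled:
        rest = [u for u in unlabeled if u != q]
        risk = 0.0
        for k in range(n_classes):
            # Hypothetically label q with class k, retrain, and score the rest.
            P_plus = fit_predict(Z, labeled + [q], labels + [k], n_classes)
            risk += P[q, k] * expected_error(P_plus, rest)
        if risk < best_risk:
            best_node, best_risk = q, risk
    return best_node, best_risk

# Toy graph (assumed for illustration): triangles (0,1,2) and (3,4,5) joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
X = np.eye(6)  # one-hot node attributes
Z = propagate(A, X, K=2)

labeled, labels, unlabeled = [0, 5], [0, 1], [1, 2, 3, 4]
P = fit_predict(Z, labeled, labels, n_classes=2)
query, risk = select_query(Z, labeled, labels, unlabeled, n_classes=2)
print("next node to label:", query, "| expected error afterwards:", round(risk, 3))
```

Minimizing the expected error after a query selects the same node as maximizing the expected error reduction, because the current error is the same for every candidate. A random baseline would instead draw `query` uniformly from `unlabeled`.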

You can view our GEEM algorithm in action against a random baseline:

[Figure: GEEM compared against a random baseline]

Authors: Florence Regol, Soumyasundar Pal, Yingxue Zhang, Mark Coates

Citation

This project was published at ICML 2020.

@inproceedings{regol2020geem,
title = {Active Learning on Attributed Graphs via Graph Cognizant Logistic Regression and Preemptive Query Generation},
author = {Regol, Florence and Pal, Soumyasundar and Zhang, Yingxue and Coates, Mark},
booktitle = {International Conference on Machine Learning (ICML)},
pages = {8041--8050},
year = {2020},
month = {Jul}
}

ArXiv link