Graph Neural Networks in Practice: Node Classification (Part 1)
This article uses the Deep Graph Library (DGL) to perform node classification with graph neural networks. The graphs covered in this installment are homogeneous graphs.
1. Introduction to DGL

DGL (Deep Graph Library) is an open-source Python framework for deep learning on graphs. It provides message-passing primitives, ready-made graph convolution layers such as GraphConv and SAGEConv, and a collection of built-in benchmark datasets, and it runs on top of backends such as PyTorch.
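As a quick taste of the API, here is a minimal sketch of my own (not from the original article) that builds a small directed graph from source/destination node ID tensors and attaches a feature matrix to its nodes; the graph structure and feature size are made up for illustration:

```python
import dgl
import torch

# Build a graph with 4 nodes and 4 directed edges: 0->1, 1->2, 2->3, 3->0.
src = torch.tensor([0, 1, 2, 3])
dst = torch.tensor([1, 2, 3, 0])
g = dgl.graph((src, dst))

# Attach a random 5-dimensional feature vector to every node.
g.ndata['feat'] = torch.randn(g.num_nodes(), 5)

print(g.num_nodes(), g.num_edges())  # 4 4
print(g.ndata['feat'].shape)         # torch.Size([4, 5])
```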
2. Node Classification in Practice
2.1 Loading the Dataset
This article uses the Cora dataset that ships with DGL. Cora is a citation dataset: its nodes are papers and its edges are citation relationships between papers, and the task is to classify each paper using both its own features and its citation links. The papers fall into seven classes: Case Based, Genetic Algorithms, Neural Networks, Probabilistic Methods, Reinforcement Learning, Rule Learning, and Theory.

The following code loads the Cora dataset and prints its basic information:

```python
import dgl.data
import torch
import torch.nn as nn
import torch.nn.functional as F
from dgl.nn import GraphConv
from dgl.nn.pytorch.conv import SAGEConv

dataset = dgl.data.CoraGraphDataset()
print('Number of categories:', dataset.num_classes)

g = dataset[0]
print("Node data:", g.ndata)
print("Edge data:", g.edata)
```

Running this code loads the Cora dataset and shows its basic statistics: the graph contains 2708 nodes and 10556 edges.
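The Cora graph also carries the standard train/validation/test split as boolean node masks, which Section 2.4 relies on. A small sketch, assuming the dataset has been loaded as above, to inspect the feature dimension and split sizes:

```python
# The splits are stored as boolean masks over the 2708 nodes.
print('feature dim:', g.ndata['feat'].shape[1])  # 1433 for Cora
print('train nodes:', int(g.ndata['train_mask'].sum()))
print('val nodes:  ', int(g.ndata['val_mask'].sum()))
print('test nodes: ', int(g.ndata['test_mask'].sum()))
```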
2.2 Defining the Graph Neural Network Modules
Building a simple GCN:

The following code builds a two-layer GCN, where each layer computes new node representations by aggregating information from a node's neighbors. If you want a deeper GCN, you can simply stack more GraphConv modules, which inherit from nn.Module:

```python
class GCN(nn.Module):
    def __init__(self, in_feats, h_feats, num_classes):
        super(GCN, self).__init__()
        self.conv1 = GraphConv(in_feats, h_feats)
        self.conv2 = GraphConv(h_feats, num_classes)

    def forward(self, g, in_feat):
        h = self.conv1(g, in_feat)
        h = F.relu(h)
        h = self.conv2(g, h)
        return h
```

Building GraphSAGE:

GraphSAGE is a classic graph neural network model built around sampling and aggregation (Sample and Aggregate): it first samples neighbors using the connectivity between nodes, and then repeatedly fuses the information of neighboring nodes through multiple layers of aggregation functions. The implementation below follows the GraphSAGE example in DGL:

```python
class GraphSAGE(nn.Module):
    def __init__(self, in_feats, n_hidden, n_classes, n_layers,
                 activation, dropout, aggregator_type):
        super(GraphSAGE, self).__init__()
        self.layers = nn.ModuleList()
        self.dropout = nn.Dropout(dropout)
        self.activation = activation
        # input layer
        self.layers.append(SAGEConv(in_feats, n_hidden, aggregator_type))
        # hidden layers
        for i in range(n_layers - 1):
            self.layers.append(SAGEConv(n_hidden, n_hidden, aggregator_type))
        # output layer (no activation)
        self.layers.append(SAGEConv(n_hidden, n_classes, aggregator_type))

    def forward(self, graph, inputs):
        h = self.dropout(inputs)
        for l, layer in enumerate(self.layers):
            h = layer(graph, h)
            if l != len(self.layers) - 1:
                h = self.activation(h)
                h = self.dropout(h)
        return h
```
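Before training, it can be useful to sanity-check the output shapes with a single untrained forward pass. A minimal sketch of my own, assuming the dataset and the two classes above have been defined:

```python
# One forward pass through each untrained model to verify output shapes.
gcn = GCN(g.ndata['feat'].shape[1], 16, dataset.num_classes)
sage = GraphSAGE(g.ndata['feat'].shape[1], 16, dataset.num_classes,
                 2, F.relu, 0.5, "gcn")

with torch.no_grad():
    print(gcn(g, g.ndata['feat']).shape)   # torch.Size([2708, 7])
    print(sage(g, g.ndata['feat']).shape)  # torch.Size([2708, 7])
```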
2.3 The Evaluation Function

The helper below runs the model on the whole graph, restricts the predictions to the given node IDs, and returns the classification accuracy on those nodes:

```python
def evaluate(model, graph, features, labels, nid):
    model.eval()
    with torch.no_grad():
        logits = model(graph, features)
        logits = logits[nid]
        labels = labels[nid]
        _, indices = torch.max(logits, dim=1)
        correct = torch.sum(indices == labels)
        return correct.item() * 1.0 / len(labels)
```
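As a quick usage check (my own example, not from the original article), evaluating an untrained model should give an accuracy close to random guessing, i.e. roughly 1/7 ≈ 0.14 for Cora's seven classes:

```python
# An untrained model should score close to chance level (~0.14 for 7 classes).
untrained = GCN(g.ndata['feat'].shape[1], 16, dataset.num_classes)
print(evaluate(untrained, g, g.ndata['feat'], g.ndata['label'],
               g.ndata['test_mask'].nonzero().squeeze()))
```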
2.4 Training the Graph Neural Network

Training on the full graph (using the features of all nodes and edges) only requires a forward pass through the model defined above; the loss is computed by comparing the predictions against the ground-truth labels on the training nodes, and backpropagation follows from there.

The node features and labels are stored on the graph itself, and the training/validation/test splits are likewise stored on the graph as boolean masks:

```python
features = g.ndata['feat']
labels = g.ndata['label']
train_mask = g.ndata['train_mask']
val_mask = g.ndata['val_mask']
test_mask = g.ndata['test_mask']
train_nid = train_mask.nonzero().squeeze()
val_nid = val_mask.nonzero().squeeze()
test_nid = test_mask.nonzero().squeeze()

def train(g, model):
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
    for e in range(100):
        # Forward
        logits = model(g, features)
        # Compute the loss only on the nodes in the training set.
        loss = F.cross_entropy(logits[train_mask], labels[train_mask])
        # Backward
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        acc = evaluate(model, g, features, labels, val_nid)
        print("Epoch {:05d} | Loss {:.4f} | Accuracy {:.4f} |".format(
            e, loss.item(), acc))
```
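A common refinement (a sketch of my own; the helper name train_with_tracking is hypothetical) is to track the best validation accuracy during training and report the test accuracy from that same epoch, rather than only the final model:

```python
def train_with_tracking(g, model):
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
    best_val_acc, best_test_acc = 0.0, 0.0
    for e in range(100):
        logits = model(g, features)
        loss = F.cross_entropy(logits[train_mask], labels[train_mask])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        val_acc = evaluate(model, g, features, labels, val_nid)
        # Remember the test accuracy at the epoch with the best validation score.
        if val_acc > best_val_acc:
            best_val_acc = val_acc
            best_test_acc = evaluate(model, g, features, labels, test_nid)
    print("Best val acc {:.4f} | test acc at that epoch {:.4f}".format(
        best_val_acc, best_test_acc))
```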
Training the two-layer GCN:

```python
model = GCN(g.ndata['feat'].shape[1], 16, dataset.num_classes)
train(g, model)

acc = evaluate(model, g, features, labels, test_nid)
print("Test Accuracy {:.4f}".format(acc))
```
Training GraphSAGE:

```python
model_sage = GraphSAGE(g.ndata['feat'].shape[1], 16, dataset.num_classes,
                       2, F.relu, 0.5, "gcn")
train(g, model_sage)

acc = evaluate(model_sage, g, features, labels, test_nid)
print("Test Accuracy {:.4f}".format(acc))
```
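The example above uses the "gcn" aggregator; DGL's SAGEConv also accepts "mean", "pool", and "lstm" as the aggregator type, so it is easy to compare them. A sketch of my own (results will vary from run to run):

```python
# Try a different neighborhood aggregator; SAGEConv also accepts
# "mean", "pool", and "lstm" besides "gcn".
model_mean = GraphSAGE(g.ndata['feat'].shape[1], 16, dataset.num_classes,
                       2, F.relu, 0.5, "mean")
train(g, model_mean)
print("Test Accuracy {:.4f}".format(
    evaluate(model_mean, g, features, labels, test_nid)))
```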
Running the code above produces the classification results. In general, GraphSAGE performs slightly better than the two-layer GCN, but the gap is not large.
3. Summary
This article walked through a simple node classification task on a homogeneous graph dataset that ships with the DGL package. Future installments will attempt node classification on other kinds of datasets (heterogeneous graphs / knowledge graphs).