.. _supported_models-GAT: GAT ========= ----------------- Introduction ----------------- `\[paper\] `_ **Title:** Graph Attention Networks **Authors:** Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, Yoshua Bengio **Abstract:** We present graph attention networks (GATs), novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations. By stacking layers in which nodes are able to attend over their neighborhoods' features, we enable (implicitly) specifying different weights to different nodes in a neighborhood, without requiring any kind of costly matrix operation (such as inversion) or depending on knowing the graph structure upfront. In this way, we address several key challenges of spectral-based graph neural networks simultaneously, and make our model readily applicable to inductive as well as transductive problems. Our GAT models have achieved or matched state-of-the-art results across four established transductive and inductive graph benchmarks: the Cora, Citeseer and Pubmed citation network datasets, as well as a protein-protein interaction dataset (wherein test graphs remain unseen during training). ---------------------- Running with XGCN ---------------------- forward_mode: 'full_graph' -------------------------- When using the 'full_graph' forward_mode, embeddings of all the nodes are inferred in each training batch. **Configuration template:** .. code:: yaml # config/GraphSAGE-full_graph-config.yaml # Dataset/Results root data_root: "" results_root: "" # Trainer configuration epochs: 200 use_validation_for_early_stop: 1 val_freq: 1 key_score_metric: r100 convergence_threshold: 20 val_method: "" val_batch_size: 256 file_val_set: "" # Testing configuration (optional) test_method: "" test_batch_size: 256 file_test_set: "" # DataLoader configuration Dataset_type: NodeListDataset num_workers: 0 NodeListDataset_type: LinkDataset pos_sampler: ObservedEdges_Sampler neg_sampler: RandomNeg_Sampler num_neg: 1 BatchSampleIndicesGenerator_type: SampleIndicesWithReplacement train_batch_size: 1024 str_num_total_samples: num_edges epoch_sample_ratio: 0.1 # Model configuration model: GAT seed: 1999 graph_device: "cuda:0" emb_table_device: "cuda:0" gnn_device: "cuda:0" out_emb_table_device: "cuda:0" forward_mode: full_graph from_pretrained: 0 file_pretrained_emb: "" freeze_emb: 0 use_sparse: 0 emb_dim: 64 emb_init_std: 0.1 emb_lr: 0.005 gnn_arch: "[{'in_feats': 64, 'out_feats': 64, 'num_heads': 4, 'activation': torch.tanh}, {'in_feats': 64, 'out_feats': 64, 'num_heads': 4}]" gnn_lr: 0.001 loss_type: bpr L2_reg_weight: 0.0 **Run from command line:** .. code:: bash # script/examples/facebook/run_GAT-full_graph.sh # set to your own path: all_data_root='/home/sxr/code/XGCN_and_data/XGCN_data' config_file_root='/home/sxr/code/XGCN_and_data/XGCN_library/config' dataset=facebook model=GAT seed=0 device="cuda:1" graph_device=$device emb_table_device=$device gnn_device=$device out_emb_table_device=$device data_root=$all_data_root/dataset/instance_$dataset results_root=$all_data_root/model_output/$dataset/$model/[seed$seed] # file_pretrained_emb=$all_data_root/model_output/$dataset/Node2vec/[seed$seed]/out_emb_table.pt python -m XGCN.main.run_model --seed $seed \ --config_file $config_file_root/$model-full_graph-config.yaml \ --data_root $data_root --results_root $results_root \ --val_method one_pos_k_neg \ --file_val_set $data_root/val-one_pos_k_neg.pkl \ --key_score_metric r20 \ --test_method multi_pos_whole_graph \ --file_test_set $data_root/test-multi_pos_whole_graph.pkl \ --graph_device $graph_device --emb_table_device $emb_table_device \ --gnn_device $gnn_device --out_emb_table_device $out_emb_table_device \ # --from_pretrained 1 --file_pretrained_emb $file_pretrained_emb \ forward_mode: 'sample' -------------------------- When using the 'sample' forward_mode, DGL's neighbor sampler is used to generate "blocks" (please refer to `DGL docs: Chapter 6: Stochastic Training on Large Graphs `_ for more information). **Configuration template:** .. code:: yaml # config/GAT-block-config.yaml # Dataset/Results root data_root: "" results_root: "" # Trainer configuration epochs: 200 use_validation_for_early_stop: 1 val_freq: 1 key_score_metric: r100 convergence_threshold: 20 val_method: "" val_batch_size: 256 file_val_set: "" # Testing configuration test_method: "" test_batch_size: 256 file_test_set: "" # DataLoader configuration Dataset_type: BlockDataset num_workers: 0 num_gcn_layers: 2 train_num_layer_sample: "[10, 20]" NodeListDataset_type: LinkDataset pos_sampler: ObservedEdges_Sampler neg_sampler: RandomNeg_Sampler num_neg: 1 BatchSampleIndicesGenerator_type: SampleIndicesWithReplacement train_batch_size: 1024 str_num_total_samples: num_edges epoch_sample_ratio: 0.1 # Model configuration model: GAT seed: 1999 graph_device: "cuda:0" emb_table_device: "cuda:0" gnn_device: "cuda:0" out_emb_table_device: "cuda:0" forward_mode: sample infer_num_layer_sample: "[10, 20]" from_pretrained: 0 file_pretrained_emb: "" freeze_emb: 0 use_sparse: 0 emb_dim: 64 emb_init_std: 0.1 emb_lr: 0.005 gnn_arch: "[{'in_feats': 64, 'out_feats': 64, 'num_heads': 4, 'activation': torch.tanh}, {'in_feats': 64, 'out_feats': 64, 'num_heads': 4}]" gnn_lr: 0.001 loss_type: bpr L2_reg_weight: 0.0 **Run from command line:** .. code:: bash # script/examples/facebook/run_GAT-block.sh # set to your own path: all_data_root='/home/sxr/code/XGCN_and_data/XGCN_data' config_file_root='/home/sxr/code/XGCN_and_data/XGCN_library/config' dataset=facebook model=GAT seed=0 device="cuda:1" graph_device=$device emb_table_device=$device gnn_device=$device out_emb_table_device=$device data_root=$all_data_root/dataset/instance_$dataset results_root=$all_data_root/model_output/$dataset/$model/[seed$seed] # file_pretrained_emb=$all_data_root/model_output/$dataset/Node2vec/[seed$seed]/out_emb_table.pt python -m XGCN.main.run_model --seed $seed \ --config_file $config_file_root/$model-block-config.yaml \ --data_root $data_root --results_root $results_root \ --val_method one_pos_k_neg \ --file_val_set $data_root/val-one_pos_k_neg.pkl \ --key_score_metric r20 \ --test_method multi_pos_whole_graph \ --file_test_set $data_root/test-multi_pos_whole_graph.pkl \ --graph_device $graph_device --emb_table_device $emb_table_device \ --gnn_device $gnn_device --out_emb_table_device $out_emb_table_device \ # --from_pretrained 1 --file_pretrained_emb $file_pretrained_emb \