TGN: Temporal Graph Networks [arXiv, YouTube, Blog Post]

[Figures: Dynamic Graph; TGN]

Introduction

Despite the many different models for deep learning on graphs, few approaches have been proposed so far for graphs that are dynamic in nature (e.g. with features or connectivity that evolve over time).

In this paper, we present Temporal Graph Networks (TGNs), a generic, efficient framework for deep learning on dynamic graphs represented as sequences of timed events. Thanks to a novel combination of memory modules and graph-based operators, TGNs significantly outperform previous approaches while also being more computationally efficient.
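To make "sequences of timed events" concrete, here is a minimal sketch of one way such a graph could be held in memory and fed to a model in time order. The class name, field names and batching helper are illustrative assumptions, not the data structures used in this repository.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class InteractionEvent:
    """One timed edge event (all field names are illustrative)."""
    source: int       # id of the source node
    destination: int  # id of the destination node
    timestamp: float  # time at which the interaction happened


def chronological_batches(
    events: List[InteractionEvent], batch_size: int
) -> Iterator[List[InteractionEvent]]:
    """Yield the events sorted by timestamp, batch_size at a time."""
    ordered = sorted(events, key=lambda event: event.timestamp)
    for start in range(0, len(ordered), batch_size):
        yield ordered[start:start + batch_size]


# Three interactions given out of order; batching restores time order.
events = [
    InteractionEvent(0, 3, 12.0),
    InteractionEvent(1, 3, 5.0),
    InteractionEvent(0, 4, 9.5),
]
batches = list(chronological_batches(events, batch_size=2))
```

Processing batches chronologically means that any per-node state only ever reflects events that happened before the ones currently being predicted.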

We furthermore show that several previous models for learning on dynamic graphs can be cast as specific instances of our framework. We perform a detailed ablation study of the different components of our framework and devise the best configuration, which achieves state-of-the-art performance on several transductive and inductive prediction tasks for dynamic graphs.

Paper link: Temporal Graph Networks for Deep Learning on Dynamic Graphs

Running the experiments

Requirements

Dependencies (with python >= 3.7):

 pandas==1.1.0
torch==1.6.0
scikit_learn==0.23.1

Dataset and Preprocessing

Download the public data

Download the sample datasets (e.g. wikipedia and reddit) from here and store their csv files in a folder named data/.

Preprocess the data

We use the dense npy format to save the features in binary format. If edge features or node features are absent, they are replaced by a vector of zeros.

 python utils/preprocess_data.py --data wikipedia --bipartite
python utils/preprocess_data.py --data reddit --bipartite
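The zero-filling rule described above can be sketched as follows. This is an illustration only, not the code in utils/preprocess_data.py: the helper name, the array sizes and the file name are assumptions. Only np.save and np.load are real NumPy calls.

```python
import os
import tempfile
from typing import Optional

import numpy as np


def features_or_zeros(features: Optional[np.ndarray], n_rows: int, dim: int) -> np.ndarray:
    """Return the given features, or an all-zero matrix when they are absent."""
    if features is None:
        return np.zeros((n_rows, dim), dtype=np.float32)
    return np.asarray(features, dtype=np.float32)


# Hypothetical case: 5 nodes and no node features in the raw data.
node_features = features_or_zeros(None, n_rows=5, dim=4)

# Save in the dense binary .npy format and read it back.
path = os.path.join(tempfile.mkdtemp(), "example_node.npy")
np.save(path, node_features)
loaded = np.load(path)
```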

Model Training

Self-supervised learning using the link prediction task:

 # TGN-attn: Self-supervised learning on the wikipedia dataset
python train_self_supervised.py --use_memory --prefix tgn-attn --n_runs 10

# TGN-attn-reddit: Self-supervised learning on the reddit dataset
python train_self_supervised.py -d reddit --use_memory --prefix tgn-attn-reddit --n_runs 10

Supervised learning on dynamic node classification (this requires a trained model from the self-supervised task, e.g. obtained by running the commands above):

 # TGN-attn: Supervised learning on the wikipedia dataset
python train_supervised.py --use_memory --prefix tgn-attn --n_runs 10

# TGN-attn-reddit: Supervised learning on the reddit dataset
python train_supervised.py -d reddit --use_memory --prefix tgn-attn-reddit --n_runs 10
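Because the supervised stage depends on the self-supervised one, it can help to generate both commands together. The sketch below only assembles argument lists from flags shown in this README. The helper names are hypothetical, and reusing the same --prefix in both stages simply mirrors the commands above (that the prefix is how the trained model is located is an assumption).

```python
from typing import List


def build_command(script: str, prefix: str, data: str = "wikipedia", n_runs: int = 10) -> List[str]:
    """Assemble an argument list using only flags shown in this README."""
    cmd = ["python", script]
    if data != "wikipedia":  # the wikipedia commands above omit -d
        cmd += ["-d", data]
    cmd += ["--use_memory", "--prefix", prefix, "--n_runs", str(n_runs)]
    return cmd


def two_stage_pipeline(prefix: str, data: str = "wikipedia") -> List[List[str]]:
    """Self-supervised training first, then supervised training with the same prefix."""
    return [
        build_command("train_self_supervised.py", prefix, data),
        build_command("train_supervised.py", prefix, data),
    ]


# Print the two commands in the order they should be run.
for cmd in two_stage_pipeline("tgn-attn-reddit", data="reddit"):
    print(" ".join(cmd))
```

To execute the stages rather than print them, each list can be passed to subprocess.run(cmd, check=True), so that a failure in the first stage stops the second.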

Baselines

 ### Wikipedia Self-supervised

# Jodie
python train_self_supervised.py --use_memory --memory_updater rnn --embedding_module time --prefix jodie_rnn --n_runs 10

# DyRep
python train_self_supervised.py --use_memory --memory_updater rnn --dyrep --use_destination_embedding_in_message --prefix dyrep_rnn --n_runs 10


### Reddit Self-supervised

# Jodie
python train_self_supervised.py -d reddit --use_memory --memory_updater rnn --embedding_module time --prefix jodie_rnn_reddit --n_runs 10

# DyRep
python train_self_supervised.py -d reddit --use_memory --memory_updater rnn --dyrep --use_destination_embedding_in_message --prefix dyrep_rnn_reddit --n_runs 10


### Wikipedia Supervised

# Jodie
python train_supervised.py --use_memory --memory_updater rnn --embedding_module time --prefix jodie_rnn --n_runs 10

# DyRep
python train_supervised.py --use_memory --memory_updater rnn --dyrep --use_destination_embedding_in_message --prefix dyrep_rnn --n_runs 10


### Reddit Supervised

# Jodie
python train_supervised.py -d reddit --use_memory --memory_updater rnn --embedding_module time --prefix jodie_rnn_reddit --n_runs 10

# DyRep
python train_supervised.py -d reddit --use_memory --memory_updater rnn --dyrep --use_destination_embedding_in_message --prefix dyrep_rnn_reddit --n_runs 10

Ablation Study

Commands to replicate all results in the ablation study over different modules:

 # TGN-2l
python train_self_supervised.py --use_memory --n_layer 2 --prefix tgn-2l --n_runs 10 

# TGN-no-mem
python train_self_supervised.py --prefix tgn-no-mem --n_runs 10 

# TGN-time
python train_self_supervised.py --use_memory --embedding_module time --prefix tgn-time --n_runs 10 

# TGN-id
python train_self_supervised.py --use_memory --embedding_module identity --prefix tgn-id --n_runs 10

# TGN-sum
python train_self_supervised.py --use_memory --embedding_module graph_sum --prefix tgn-sum --n_runs 10

# TGN-mean
python train_self_supervised.py --use_memory --aggregator mean --prefix tgn-mean --n_runs 10
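The TGN-mean variant swaps the message aggregator. This README does not spell out what the aggregator computes, so the sketch below rests on an assumption: when a node receives several messages in one batch, they are reduced to a single vector either by averaging ("mean") or by keeping the most recent one (called "last" here). The function and type names are illustrative, not the repository's API.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Message = Tuple[int, float, List[float]]  # (node id, timestamp, message vector)


def aggregate(messages: List[Message], mode: str = "last") -> Dict[int, List[float]]:
    """Reduce all messages addressed to the same node to a single vector."""
    per_node: Dict[int, List[Tuple[float, List[float]]]] = defaultdict(list)
    for node, timestamp, vector in messages:
        per_node[node].append((timestamp, vector))

    result: Dict[int, List[float]] = {}
    for node, items in per_node.items():
        if mode == "last":
            # Keep only the message with the largest timestamp.
            result[node] = max(items, key=lambda item: item[0])[1]
        elif mode == "mean":
            # Element-wise average over all messages for this node.
            vectors = [vector for _, vector in items]
            result[node] = [sum(column) / len(vectors) for column in zip(*vectors)]
        else:
            raise ValueError(f"unknown aggregator: {mode}")
    return result


# Node 7 receives two messages in the batch, node 9 receives one.
batch = [(7, 1.0, [1.0, 0.0]), (7, 2.0, [3.0, 4.0]), (9, 1.5, [2.0, 2.0])]
last = aggregate(batch, "last")
mean = aggregate(batch, "mean")
```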

General flags

 optional arguments:
  -d DATA, --data DATA         Data sources to use (wikipedia or reddit)
  --bs BS                      Batch size
  --prefix PREFIX              Prefix to name checkpoints and results
  --n_degree N_DEGREE          Number of neighbors to sample at each layer
  --n_head N_HEAD              Number of heads used in the attention layer
  --n_epoch N_EPOCH            Number of epochs
  --n_layer N_LAYER            Number of graph attention layers
  --lr LR                      Learning rate
  --patience                   Patience of the early stopping strategy
  --n_runs                     Number of runs (compute mean and std of results)
  --drop_out DROP_OUT          Dropout probability
  --gpu GPU                    Idx for the gpu to use
  --node_dim NODE_DIM          Dimensions of the node embedding
  --time_dim TIME_DIM          Dimensions of the time embedding
  --use_memory                 Whether to use a memory for the nodes
  --embedding_module           Type of the embedding module
  --message_function           Type of the message function
  --memory_updater             Type of the memory updater
  --aggregator                 Type of the message aggregator
  --memory_update_at_the_end   Whether to update the memory at the end or at the start of the batch
  --message_dim                Dimension of the messages
  --memory_dim                 Dimension of the memory
  --backprop_every             Number of batches to process before performing backpropagation
  --different_new_nodes        Whether to use different unseen nodes for validation and testing
  --uniform                    Whether to sample the temporal neighbors uniformly (or instead take the most recent ones)
  --randomize_features         Whether to randomize node features
  --dyrep                      Whether to run the model as DyRep
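The --n_degree and --uniform flags control how temporal neighbors are chosen. A minimal sketch of the two strategies named in the flag description, assuming each node keeps a list of (neighbor id, timestamp) pairs and that only interactions strictly before the query time are eligible. The function name and signature are hypothetical, not the repository's neighbor finder.

```python
import random
from typing import List, Optional, Tuple

Interaction = Tuple[int, float]  # (neighbor id, timestamp)


def sample_temporal_neighbors(
    history: List[Interaction],
    cut_time: float,
    n_degree: int,
    uniform: bool = False,
    rng: Optional[random.Random] = None,
) -> List[Interaction]:
    """Pick up to n_degree neighbors that interacted strictly before cut_time."""
    past = sorted((item for item in history if item[1] < cut_time), key=lambda item: item[1])
    if len(past) <= n_degree:
        return past
    if uniform:
        # Uniform sampling without replacement over the whole past.
        rng = rng or random.Random()
        return sorted(rng.sample(past, n_degree), key=lambda item: item[1])
    # Default: take the most recent interactions.
    return past[-n_degree:]


history = [(1, 1.0), (2, 2.0), (3, 3.0), (4, 4.0)]
recent = sample_temporal_neighbors(history, cut_time=3.5, n_degree=2)
sampled = sample_temporal_neighbors(history, cut_time=3.5, n_degree=2, uniform=True, rng=random.Random(0))
```

Filtering on the query time keeps future interactions out of the neighborhood, which matters when events are processed in chronological order.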

TODOs

Make the code memory efficient: for the sake of simplicity, the memory module of the TGN model is implemented as a parameter (so that it is stored and loaded together with the model). However, this does not need to be the case, and more efficient implementations which treat the memory as just tensors (in the same way as the input features) would be more amenable to large graphs.

Cite us

 @inproceedings{tgn_icml_grl2020,
    title={Temporal Graph Networks for Deep Learning on Dynamic Graphs},
    author={Emanuele Rossi and Ben Chamberlain and Fabrizio Frasca and Davide Eynard and Federico Monti and Michael Bronstein},
    booktitle={ICML 2020 Workshop on Graph Representation Learning},
    year={2020}
}