Parameter settings
The maximum sentence length is set to 128 and the batch size to 64. SGD
is used to optimize all trainable parameters. The GCN hidden layers have
200 units, the learning rate is 0.001, and the parameter λ is set to 0.3,
a value chosen experimentally.
The BERT learning rate is set to 1 × 10⁻⁵ and the
dropout value is set to 0.5. The experiments in this article were
conducted on a high-performance computer with an Nvidia T1 graphics card
and 32 GB of RAM, using the PyTorch 1.5.0 framework and Python 3.6.
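
The settings above can be collected in one configuration object, as in the minimal sketch below. It is not the authors' code. The names `TrainConfig`, `build_param_groups`, and the `"bert."` name prefix are hypothetical, and the sketch assumes that the 0.001 learning rate applies to every parameter outside BERT, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as reported in the text (field names are illustrative)."""
    max_seq_len: int = 128       # maximum sentence length
    batch_size: int = 64
    gcn_hidden_units: int = 200  # hidden units per GCN layer
    base_lr: float = 1e-3        # assumed to cover all non-BERT parameters
    bert_lr: float = 1e-5        # separate learning rate for BERT
    dropout: float = 0.5
    lam: float = 0.3             # the parameter λ


def build_param_groups(named_params, cfg, bert_prefix="bert."):
    """Split (name, parameter) pairs into BERT and non-BERT groups.

    Returns a list of dicts in the per-group format that PyTorch
    optimizers accept. The prefix used to recognize BERT parameters
    is an assumption about how the model's submodules are named.
    """
    bert_params, other_params = [], []
    for name, param in named_params:
        if name.startswith(bert_prefix):
            bert_params.append(param)
        else:
            other_params.append(param)
    return [
        {"params": bert_params, "lr": cfg.bert_lr},
        {"params": other_params, "lr": cfg.base_lr},
    ]
```

With PyTorch, the resulting groups could be passed to the optimizer as `torch.optim.SGD(build_param_groups(model.named_parameters(), cfg), lr=cfg.base_lr)`, so that a single SGD instance updates BERT and the remaining layers at their respective learning rates.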