4 initialization positions to FI configuration and yield better performance.

Merging Different Phases with Gaussian Composition

So far, we have only explained how our DQN transition retrieval model segments a single phase. To generalize this, we run an independently trained DQN transition retrieval model for each phase. Taking the raw estimates of these phase transitions inevitably produces overlapping phases, or time intervals with no phase allocated, due to estimation errors. To address this, we perform a Gaussian composition of the predicted phases. For each predicted pair of transitions $(f_n^b, f_n^e)$, we draw a Gaussian distribution centered at $\frac{f_n^b + f_n^e}{2}$, with standard deviation $\lvert f_n^b - f_n^e \rvert$. For each video clip, the final multi-class prediction corresponds to the phase whose distribution takes the maximum value at that clip (a sketch of this composition is given at the end of this section).

Training Details

The DQN model is trained in a multi-agent mode, where $W^b$ and $W^e$ for a single phase are trained together. The inputs of the individual DQNs in each agent share a common state, concatenated from the content of both search windows, which allows each agent to be aware of the other's information. The procedure for training the DQN is shown as pseudocode in Algorithm 1. In each episode, the videos are processed one by one, and the maximum number of steps an agent can explore in a video is 200, with no early stopping. For every step the agents take, the transition $(s_k, s_{k+1}, a_k, r_k)$ is stored in the corresponding replay memory and sampled with a batch size of 128 to compute the Huber loss [210]. This loss is optimized with a gradient descent algorithm, $W_{k+1} = W_k - \alpha \nabla_{W_k} L_k$, where $\alpha$ is the learning rate and $\nabla_{W_k} L_k$ is the gradient of the loss with respect to the network parameters (a sketch of this update step also follows at the end of this section).

Experiment Setup and Dataset Description

The proposed network is implemented in PyTorch using a single Tesla V100-DGXS-32GB GPU of an NVIDIA DGX station. For the ResNet-50 part, the PyTorch default ImageNet pretrained parameters are used for initialization.
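As a brief illustration of this setup, the ImageNet-pretrained ResNet-50 backbone can be loaded in PyTorch roughly as follows; the torchvision weights-enum call and the removal of the final classification layer are assumptions made for this sketch, not details given in the text.

```python
import torch
import torchvision.models as models

# Load ResNet-50 with torchvision's default ImageNet-pretrained parameters.
# (Older torchvision versions use resnet50(pretrained=True) instead of the
#  weights enum; which call was used here is not stated in the text.)
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()  # assumed: keep the 2048-d features, drop the classifier
backbone.eval()                    # feature extraction mode
```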
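The Gaussian composition of per-phase transition estimates described earlier could be implemented along the following lines; the function name, the frame-index convention, and the handling of degenerate intervals are assumptions made for illustration, not the authors' code.

```python
import numpy as np
from scipy.stats import norm

def gaussian_composition(transitions, num_frames):
    """Merge per-phase transition estimates into one phase label per frame.

    transitions: list of (f_b, f_e) pairs, one per phase, giving the
                 predicted begin/end frame index of that phase.
    Returns an array of length num_frames with the winning phase index.
    """
    frames = np.arange(num_frames)
    scores = np.zeros((len(transitions), num_frames))
    for phase, (f_b, f_e) in enumerate(transitions):
        mu = (f_b + f_e) / 2.0             # Gaussian centered at the interval midpoint
        sigma = max(abs(f_e - f_b), 1e-6)  # spread set by the interval length (guard against zero)
        scores[phase] = norm.pdf(frames, loc=mu, scale=sigma)
    # Each frame/clip is assigned the phase whose Gaussian is highest there,
    # which resolves overlaps and fills unallocated intervals.
    return scores.argmax(axis=0)

# Example: three phases with overlapping raw transition estimates.
labels = gaussian_composition([(0, 40), (35, 80), (78, 120)], num_frames=120)
```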
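Likewise, one replay-memory update with the Huber loss, matching the training details above in spirit, might look like the minimal sketch below; the network architecture, state dimensionality, discount factor, and use of plain SGD are assumptions, and the authors' Algorithm 1 may differ.

```python
import random
from collections import deque

import torch
import torch.nn as nn

BATCH_SIZE = 128   # batch size used when sampling the replay memory
GAMMA = 0.9        # discount factor (assumed value, not given in the text)
ALPHA = 1e-4       # learning rate alpha in the gradient descent update

# Placeholder Q-network: 64-d state -> 4 discrete actions (both assumed).
q_net = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 4))
optimizer = torch.optim.SGD(q_net.parameters(), lr=ALPHA)
huber = nn.SmoothL1Loss()             # Huber loss
replay_memory = deque(maxlen=10000)   # stores (s_k, s_k+1, a_k, r_k) tensor tuples

def train_step():
    """Sample a batch from the replay memory and take one gradient step."""
    if len(replay_memory) < BATCH_SIZE:
        return
    batch = random.sample(replay_memory, BATCH_SIZE)
    # Stack per-field: states are float tensors, actions scalar long tensors,
    # rewards scalar float tensors (storage format is an assumption).
    s, s_next, a, r = map(torch.stack, zip(*batch))
    # Q-value of the action actually taken, and its bootstrapped target.
    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + GAMMA * q_net(s_next).max(dim=1).values
    loss = huber(q, target)            # Huber loss between prediction and target
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                   # W_{k+1} = W_k - alpha * grad_W L_k
```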