3. Towards a task-based computational evaluation benchmark

Figure 3.8: Results of the input perturbation analysis. To evaluate which visual information is most relevant for the actions of the model, the input frame was systematically occluded at different locations with a local-average circular mask. Colors indicate the change in the Q-value prediction for the 'forward step' action under local masking of the input frame. Left panel: positive changes in the Q-value predictions. Right panel: negative changes in the Q-value predictions for forward steps.

3.3.2. Edge detection thresholds

The results of training models with different edge detection thresholds are displayed in Figure 3.9. The performance follows an inverted-U shape, with optimal performance achieved at a threshold of 100. Note that even at the optimal threshold, the models were not able to avoid all obstacles using the phosphene vision input (as opposed to the collision-free baseline performance; see Figure 3.6). The negative total reward obtained with the highest threshold indicates that the model performed more negatively-valued actions (e.g., side-stepping, or actions that resulted in a collision) than positively-valued actions (i.e., collision-free forward steps).

Figure 3.9: Test performance for training virtual implant users (n = 3) with different edge detection thresholds. Left: the number of box collisions in the fixed test environment; right: the total obtained test reward.

3.3.3. Phosphene resolution

The results of training the model with different phosphene resolutions are visualized in Figure 3.10. In general, higher phosphene resolutions resulted in larger rewards and fewer collisions. The performance gains diminish at higher phosphene resolutions. We found no significant differences between the plain and complex environments.
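The occlusion-based perturbation analysis behind Figure 3.8 can be sketched as follows. This is a minimal illustration, not the thesis implementation: the mask radius, grid stride, and the `q_fn` interface (a callable returning a vector of Q-values for an input frame) are assumptions made for the example.

```python
import numpy as np

def occlusion_sensitivity(frame, q_fn, action, radius=8, stride=8):
    """Slide a local-average circular mask over the frame and record the
    change in the Q-value for `action` at each mask location.

    frame  : 2-D numpy array (grayscale input frame)
    q_fn   : callable mapping a frame to a vector of Q-values (assumed API)
    action : index of the action of interest (e.g., the 'forward step' action)
    Returns a coarse grid of Q-value changes; positive entries mean the
    occlusion increased the predicted Q-value, negative entries decreased it.
    """
    h, w = frame.shape
    baseline = q_fn(frame)[action]
    ys, xs = np.mgrid[0:h, 0:w]
    delta = np.zeros((h // stride, w // stride))
    for i, cy in enumerate(range(stride // 2, h, stride)):
        for j, cx in enumerate(range(stride // 2, w, stride)):
            mask = (ys - cy) ** 2 + (xs - cx) ** 2 <= radius ** 2
            occluded = frame.copy()
            # Replace the masked region with its local average intensity
            occluded[mask] = frame[mask].mean()
            delta[i, j] = q_fn(occluded)[action] - baseline
    return delta
```

Splitting the resulting grid into its positive and negative parts (e.g., `np.clip(delta, 0, None)` and `np.clip(delta, None, 0)`) yields the two panels shown in Figure 3.8.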