simulated phosphene representation. In Figure 5.8 it can be observed that the model has successfully learned to preserve correspondences between the phosphene representation and the input image in all of the training conditions. However, as a more formal analysis was outside the scope of this study, we did not further quantify the subjective interpretability. In our model, subjective interpretability was promoted through the regularization loss between the simulated phosphenes and the input image. Similarly, a recent study adapted an auto-encoder architecture designed to directly maximize the perceptual correspondence between a target representation and the simulated percept in a retinal prosthesis, using basic stimuli (Granley et al., 2022a). The preservation of subjective interpretability in an automated optimization pipeline remains a non-trivial challenge, especially when using natural stimuli. This may be even more important for cortical prostheses, as the distinct punctate phosphenes are by nature very dissimilar from natural images, possibly hampering perceptual similarity metrics that rely on low-level feature correspondence. Eventually, the functional quality of the artificial vision will depend not only on the correspondence between the visual environment and the phosphene encoding, but also on the implant recipient's ability to extract that information into a usable percept. The functional quality of end-to-end generated phosphene encodings in daily-life tasks will need to be evaluated in future experiments. Regardless of the implementation, it will always be important to include human observers (both sighted experimental subjects and actual prosthetic implant users) in the optimization cycle to ensure subjective interpretability for the end user (Beyeler & Sanchez-Garcia, 2022; Fauvel & Chalk, 2022).

5.4.3. General limitations and future directions

Performance and hardware

There are some remaining practical limitations and challenges for future research. We identify three considerations related to the performance of our model and the required hardware for implementation in an experimental setup.

Firstly, although our model runs in real time and is faster than the state-of-the-art realistic simulation for retinal prostheses (Beyeler et al., 2017), there is a trade-off between speed and memory demand. Therefore, for higher resolutions and larger numbers of phosphenes, future experimental research may need to adopt a simplified version of our model, although most of the simulation conditions can easily be run on common graphics cards. While several simulators exist for cortical prostheses that run in real time without requiring a dedicated graphics card (Fehervari et al., 2010; Li, 2013), none of these incorporate current spread-based models of phosphene size, realistic stimulation parameter ranges, or temporal dynamics. An effort can be made to balance more lightweight hardware requirements with more realistic phosphene characteristics.

Secondly, a suggestion for follow-up research is to combine our simulator with the latest developments in extended reality (XR) to enable immersive simulation in virtual environments. More specifically, a convenient direction would be the implementation of our simulator in the Cg shader programming language for graphics processing, which is used in 3D game engines such as Unreal Engine or Unity 3D, as previously demonstrated for epiretinal simulations by Thorn et al. (2020). The per-pixel structure of such a shader-style implementation is sketched below.
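To illustrate why the simulation maps naturally onto per-pixel shader code, the following is a minimal Python/NumPy sketch that renders phosphenes as Gaussian blobs, with each pixel summing the contributions of nearby phosphenes. This is an assumption-laden simplification for illustration only; the parameter names and values are hypothetical and do not reflect the simulator's actual interface or its current-spread and temporal-dynamics models.

```python
# Minimal sketch of per-pixel Gaussian phosphene rendering (illustrative only).
# The same per-pixel sum translates directly into a fragment shader, where
# each fragment evaluates the contributions of the phosphenes around it.
import numpy as np

def render_phosphenes(centers, sigmas, brightness, resolution=256):
    """centers: (N, 2) pixel coordinates; sigmas: (N,) Gaussian widths
    (e.g., derived from a current-spread model); brightness: (N,) in [0, 1]."""
    ys, xs = np.mgrid[0:resolution, 0:resolution].astype(np.float32)
    frame = np.zeros((resolution, resolution), dtype=np.float32)
    for (cx, cy), s, b in zip(centers, sigmas, brightness):
        d2 = (xs - cx) ** 2 + (ys - cy) ** 2
        frame += b * np.exp(-d2 / (2.0 * s ** 2))  # one Gaussian blob per phosphene
    return np.clip(frame, 0.0, 1.0)

# Example: 100 phosphenes at random locations (hypothetical values).
rng = np.random.default_rng(0)
frame = render_phosphenes(rng.uniform(0, 256, (100, 2)),
                          rng.uniform(1.0, 4.0, 100),
                          rng.uniform(0.2, 1.0, 100))
```

Because each pixel is computed independently, this formulation parallelizes trivially on the GPU, which is what makes a game-engine shader implementation attractive for immersive XR experiments.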
Thirdly and lastly, future studies could explore the effects of using eye-tracking technology with our simulation software. Even after loss of vision, the brain integrates eye movements into visual perception, and phosphenes elicited by cortical stimulation are expected to shift with gaze. Gaze-contingent image processing, as sketched below, could account for this in simulation experiments.
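As a rough indication of how eye-tracking input could be coupled to the simulation, the following is a minimal sketch of gaze-contingent sampling of the camera frame. The gaze coordinates are assumed to arrive from an external eye tracker as normalized (x, y) values in [0, 1]; the function name and window size are hypothetical and not part of our simulator's API.

```python
# Minimal sketch of gaze-contingent input sampling (illustrative only).
# A window centered on the current point of gaze is cropped from the
# camera frame before phosphene encoding, so that the rendered percept
# stays locked to where the observer is looking.
import numpy as np

def gaze_contingent_crop(frame, gaze_xy, fov_px=256):
    """frame: (H, W) camera image; gaze_xy: normalized (x, y) in [0, 1];
    fov_px: side length in pixels of the gaze-centered window."""
    h, w = frame.shape[:2]
    cx = int(gaze_xy[0] * w)
    cy = int(gaze_xy[1] * h)
    half = fov_px // 2
    # Clamp the window so it stays within the image borders.
    x0 = min(max(cx - half, 0), w - fov_px)
    y0 = min(max(cy - half, 0), h - fov_px)
    return frame[y0:y0 + fov_px, x0:x0 + fov_px]

# Example: gaze at the center-right of a 480x640 camera frame.
crop = gaze_contingent_crop(np.zeros((480, 640), dtype=np.float32),
                            gaze_xy=(0.7, 0.5))
```

The cropped window would then be passed to the encoder in place of the full frame, updating on every gaze sample so that the simulated phosphene field follows the observer's eye movements.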
RkJQdWJsaXNoZXIy MTk4NDMw