Chapter 4.2.4

Fig. 3. Color-coded ribbon illustration for two complete surgical videos from (a) Cholec80 and (b) Sacrocolpopexy, processed by Trans-SV and TRN models.

Additionally, TRN21/41 FI also achieve substantially better event-based metrics than frame-based methods due to their formulation. An operation may not contain a complete set of phases. For the Cholec80 test videos, missing phases have little impact, as shown in the supplementary document. The RL agent makes the begin and end labels of a missing phase converge towards the same or consecutive timestamps. Sometimes, residual frames are still erroneously predicted as the missing phase. These errors are counted in the reported statistics, resulting in imperfect event ratios.

For sacrocolpopexy, we display a case where our partial-coverage models (TRN21/81 FI) are at their best in terms of computational efficiency. These are very long procedures, and we are interested only in the suturing phases as an example of clinical interest [191]. Therefore, a huge proportion of the video can be ignored for a full segmentation. Our models slightly underperform all baselines in frame-based metrics, but achieve this result by looking at under 20% of the videos on average.

CONCLUSION

In this work we propose a new formulation for surgical workflow analysis based on phase transition retrieval (instead of frame-based classification), and a new solution to this problem based on multi-agent reinforcement learning. This offers a number of advantages over conventional frame-based methods. Firstly, we avoid any frame-level noise in predictions, strictly enforcing phases to be continuous blocks. This can be useful in practice if, for example, we are interested in time-stamping phase transitions or in detecting unusual surgical workflows (phases occurring in a non-standard order), both of which are challenging to obtain from noisy frame-based classifications.
In addition, our models with partial coverage (TRN21/41/81 FI) are able to significantly reduce the number of frames necessary to produce a complete segmentation result.
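The two properties claimed above can be illustrated with a minimal sketch. It is not the TRN implementation: the function names, the inclusive (begin, end) convention, the "none" background label, and the rule that a phase whose end precedes its begin is treated as missing are all assumptions made for illustration. The sketch expands retrieved transition timestamps into per-frame labels, counts contiguous segments, and computes the fraction of frames a model visited.

```python
from typing import Dict, List, Sequence, Tuple


def labels_from_transitions(
    transitions: Dict[str, Tuple[int, int]],
    num_frames: int,
    background: str = "none",  # hypothetical label for frames outside any phase
) -> List[str]:
    """Expand per-phase (begin, end) frame indices, both inclusive, into per-frame labels.

    Every phase becomes one contiguous block by construction.
    If two phases overlap, the one listed later overwrites the earlier one.
    """
    labels = [background] * num_frames
    for phase, (begin, end) in transitions.items():
        if end < begin:
            # Assumption: begin and end that crossed each other mark a missing phase.
            continue
        for t in range(max(begin, 0), min(end, num_frames - 1) + 1):
            labels[t] = phase
    return labels


def count_segments(labels: Sequence[str]) -> int:
    """Count maximal runs of identical consecutive labels."""
    return sum(1 for i, lab in enumerate(labels) if i == 0 or lab != labels[i - 1])


def coverage(visited_frames: Sequence[int], num_frames: int) -> float:
    """Fraction of distinct frames the model looked at."""
    return len(set(visited_frames)) / num_frames


# Transition-based output: two phases, two segments.
block_labels = labels_from_transitions({"A": (0, 3), "B": (4, 9)}, num_frames=10)

# A noisy frame-based prediction of the same video fragments into more segments.
noisy_labels = list("AABAABBBBB")

print(count_segments(block_labels), count_segments(noisy_labels))
```

With transition retrieval, the number of segments is bounded by the number of phases, whereas a single misclassified frame in a frame-based prediction adds spurious segments that then have to be filtered before transitions can be time-stamped.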