Hylke Salverda

Chapter 5

Discussion

In this study we found little to no difference between descriptive statistics of one-per-minute data and one-per-second data from the same source. This included clinically relevant outcomes such as the proportion of time within the oxygen saturation target range, hypoxia and days of supplemental oxygen. Subanalyses of recordings under 100 or 200 hours also showed no difference. The results suggest that routinely collected data recordings of comparable length or longer could be used for retrospective studies.

Although using routinely collected vital parameters for big data analysis and machine learning is increasingly popular, to our knowledge no literature describes the minimum data sampling frequency required for our purpose. From the field of signal processing, the Nyquist-Shannon sampling theorem³ provides a guideline for a sufficient sampling rate, but it is aimed at reproducing the original signal, not at the summary statistics we often require for retrospective studies.

One could argue that taking a sample every minute from continuous vital signs monitoring is somewhat analogous to research in general. It is uneconomical to study an entire population, so we take a representative sample. When the chance of being sampled is related to the outcome, the results may be biased. Our sample is not random: the value is always extracted in the first second of the minute. However, it is unlikely that a vital parameter such as heart rate is systematically lower in the first second of the minute, or in other words, that the moment of sampling is related to the outcome. There may be a detectable circadian trend in the average heart rate, but the instantaneous heart rate should not be related to a certain second within a minute.
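The comparison discussed above can be illustrated with a minimal sketch. It is not the analysis code used in this study. The function names, the synthetic SpO2 trace and the 91-95% target range are assumptions made purely for illustration. The sketch keeps the first sample of every minute from a one-per-second series and compares the proportion of time within the target range at both sampling rates.

```python
import random


def downsample_first_second(samples_1hz, offset=0):
    """Keep one value per minute: the sample `offset` seconds into each minute."""
    return samples_1hz[offset::60]


def proportion_in_range(values, low, high):
    """Fraction of samples with low <= value <= high."""
    if not values:
        raise ValueError("no samples")
    return sum(low <= v <= high for v in values) / len(values)


def synthetic_spo2(n_seconds, seed=0):
    """Hypothetical 1 Hz SpO2 trace: a noisy series pulled back towards 93%."""
    rng = random.Random(seed)
    value = 93.0
    trace = []
    for _ in range(n_seconds):
        value += 0.01 * (93.0 - value) + rng.gauss(0, 0.3)
        value = min(100.0, max(70.0, value))
        trace.append(round(value))
    return trace


# 100 hours of synthetic one-per-second data (illustrative only)
spo2 = synthetic_spo2(100 * 3600)

# Assumed target range of 91-95%; substitute the range used locally
per_second = proportion_in_range(spo2, 91, 95)
per_minute = proportion_in_range(downsample_first_second(spo2), 91, 95)

print(f"time in range, one-per-second: {per_second:.3f}")
print(f"time in range, one-per-minute: {per_minute:.3f}")
```

Changing `offset` to any value from 0 to 59 selects a different second within the minute, which gives a simple way to check that the choice of second does not systematically shift the summary statistic.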
Limitations of our study are that one-per-minute data cannot be used to calculate the length of vital sign episodes, for example the duration of a hypoxic episode, or other more elaborate outcomes that rely on complex signal processing techniques. We have only investigated descriptive statistics of SpO2 and FiO2, and most of the data recordings had a duration of at least 100 hours. It should also be noted that intra-recording differences were present, but these are averaged out over the entire set. Finally, to prevent synchronization issues we did not compare our PDMS data directly with our higher frequency data, but down-sampled the latter. However, because no filtering, anti-aliasing or other processing was applied in either case, the two are comparable.
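The first limitation can be made concrete with a small, hypothetical example. The trace, the 80% hypoxia threshold and the helper `episode_durations` are assumptions for illustration and are not taken from the study data. The sketch measures the duration of runs below the threshold at both sampling rates.

```python
from itertools import groupby


def episode_durations(values, threshold, seconds_per_sample):
    """Durations in seconds of consecutive runs with value < threshold."""
    durations = []
    for below, run in groupby(values, key=lambda v: v < threshold):
        if below:
            durations.append(sum(1 for _ in run) * seconds_per_sample)
    return durations


# Hypothetical 5-minute trace at 1 Hz: baseline 93%, with a 20 s dip
# starting at 70 s and a 90 s dip starting at 170 s
trace = [93] * 300
trace[70:90] = [75] * 20
trace[170:260] = [75] * 90

per_second = episode_durations(trace, 80, 1)
per_minute = episode_durations(trace[::60], 80, 60)

print(per_second)  # [20, 90]
print(per_minute)  # [120]
```

In this constructed case the 20-second episode falls entirely between two one-per-minute samples and is missed, while the 90-second episode is covered by two consecutive samples and is reported as 120 seconds. Summary statistics over long recordings are far less sensitive to this effect than episode durations are.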