The paucity of physiological time-series data collected from low-resource clinical settings limits the capabilities of modern machine learning algorithms in achieving high performance. Such performance is further hindered by class imbalance; datasets where a diagnosis is much more common than others. To overcome these two issues at low-cost while preserving privacy, data augmentation methods can be employed. In the time domain, the traditional method of time-warping could alter the underlying data distribution with detrimental consequences. This is prominent when dealing with physiological conditions that influence the frequency components of data. In this paper, we propose PlethAugment; three different conditional generative adversarial networks (CGANs) with an adapted diversity term for the generation of pathological photoplethysmogram (PPG) signals in order to boost medical classification performance. To evaluate and compare the GANs, we introduce a novel metric-agnostic method; the synthetic generalization curve. We validate this approach on two proprietary and two public datasets representing a diverse set of medical conditions. Compared to training on non-augmented class-balanced datasets, training on augmented datasets leads to an improvement of the AUROC by up to 29% when using cross validation. This illustrates the potential of the proposed CGANs to significantly improve classification performance.
IEEE journal of biomedical and health informatics