In the era of data-driven machine learning algorithms, data is the new oil. For the best results, datasets
should be large, heterogeneous and, crucially, correctly labeled. However, data collection and labeling are time-consuming and labor-intensive processes. For the segmentation of medical devices present during minimally
invasive surgery, this leads to a lack of informative data. Motivated by this drawback, we developed an algorithm
generating semi-synthetic images based on real ones. The concept of this algorithm is to place a randomly shaped
catheter in an empty heart cavity, where the shape of the catheter is generated by forward kinematics of continuum robots. Having implemented the proposed algorithm, we generated new images of heart cavities with
various artificial catheters. We compared deep neural networks trained purely on real datasets with
networks trained on both real and semi-synthetic datasets, and found that semi-synthetic data improves catheter segmentation accuracy. A modified U-Net trained on the combined datasets performed the segmentation with a Dice similarity coefficient of 92.6 ± 2.2%, while the same model trained only on real images
achieved a Dice similarity coefficient of 86.5 ± 3.6%. Therefore, using semi-synthetic data narrows the
spread in accuracy, improves model generalization, reduces subjectivity, shortens the labeling routine,
increases the number of samples, and improves dataset heterogeneity.
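
To make the two technical ingredients named above more concrete, the following is a minimal sketch rather than the implementation used in this work. It assumes a single planar constant-curvature segment as the continuum-robot forward-kinematics model (a common simplification; the model actually used is not specified here) and computes the Dice similarity coefficient on binary masks stored as nested lists. The function names `constant_curvature_centerline` and `dice_coefficient` and all parameters are hypothetical.

```python
import math


def constant_curvature_centerline(curvature, length, n_points=50):
    """Sample the planar centerline of one constant-curvature segment.

    Assumption: the segment starts at the origin and initially points
    along +x. `curvature` is in 1/length units; 0 gives a straight line.
    """
    points = []
    for i in range(n_points):
        s = length * i / (n_points - 1)  # arc length at this sample
        if abs(curvature) < 1e-9:
            x, y = s, 0.0  # straight-line limit
        else:
            x = math.sin(curvature * s) / curvature
            y = (1.0 - math.cos(curvature * s)) / curvature
        points.append((x, y))
    return points


def dice_coefficient(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two equally sized 0/1 masks."""
    intersection = sum(
        p & t
        for pred_row, truth_row in zip(pred, truth)
        for p, t in zip(pred_row, truth_row)
    )
    total = sum(map(sum, pred)) + sum(map(sum, truth))
    # Convention (an assumption): two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total
```

In a pipeline of the kind described, varying `curvature` and `length` at random would yield differently shaped centerlines that could then be rendered into an empty heart-cavity image, with the rendered mask serving as the label; that rendering step is not shown here.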