IMAGE SEGMENTATION USING TRANSFER LEARNING WITH DEEPLABV3 TO FACILITATE PHOTOGRAMMETRIC LIMB SCANNING

Abstract

In this paper, we explore the use of deep learning (DL) in conjunction with photogrammetry for scanning amputated limbs. Combining these two technologies can expand the scope of prosthetic telemedicine by facilitating low-cost limb scanning using cell phones. Previous research identified image segmentation as one of the main limitations of using photogrammetry for limb scanning. Based on those limitations, this work sought to answer two main research questions: (1) Can a neural network be trained to identify and segment an amputated limb automatically? (2) Will segmenting 2D limb images using neural networks impact the accuracy of 3D models generated via photogrammetry? To answer the first question, transfer learning was applied to a neural network with the DeepLabv3 architecture. After training, the model successfully identified and segmented limb images with an Intersection over Union (IoU) of 79.9%. To answer the second question, the fine-tuned DL model was applied to a dataset of 22 scans comprising 6312 limb images, and 3D models were then rendered using Agisoft Metashape. The Mean Absolute Error (MAE) of models rendered from DL-segmented images was 0.57 mm ± 0.63 mm when compared to models rendered from ground-truth images. These results are important because segmentation with DL makes photogrammetry for limb scanning feasible at a large clinical scale. Future work should focus on generalizing the segmentation model to different types of amputations and imaging conditions.

1. INTRODUCTION

Rehabilitative care for persons with limb loss is rapidly evolving due to advances in digital healthcare technologies. Novel digital workflows are empowering clinicians with tools for visualizing patient anatomy and physiology, designing custom-fitting prostheses via computer-aided design (CAD), building assistive devices with computer-aided manufacturing (CAM), and tracking patient response in environments such as virtual reality (VR) Cabrera et al. (2021). We hypothesize that automating image segmentation via DeepLabv3 and then rendering the segmented images using photogrammetry could create an efficient processing pipeline for scanning amputated limbs with smartphones. Automatic segmentation of limb photographs would allow more photographs to be taken during the scanning procedure, thus increasing the sampling density. With these additional photographs, it would be possible to improve the coverage and accuracy of 3D models. Finally, segmentation could help correct for motion of the limb during the scanning procedure. These potential benefits would allow photogrammetric limb scanning to be used on a larger scale to reach more patients in the clinic and remotely via telemedicine.

2.1. PHOTOGRAMMETRY FOR MEDICAL IMAGING

In its simplest form, photogrammetry is the science of measuring the size and shape of objects using images Fryer (1996). In this context, photogrammetry has been used extensively since the 1950s for medical imaging and measurement Newton & Mitchell (1996). The development of digital cameras and the accompanying transition to digital photogrammetry has led to technologies for the reconstruction of 3D models using photogrammetric algorithms Linder (2016). Digital photogrammetry has been used successfully to reconstruct patient anatomy in many medical contexts, such as cranial deformation scanning Barbero-García et al. (2017; 2020; 2021). Two values for accuracy are commonly reported for photogrammetric models: Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). RMSE values are always greater than or equal to MAE values and are more susceptible to outliers, but RMSE is recommended when errors are unbiased and follow a normal distribution Chai & Draxler (2014). Close-range photogrammetric approaches have been shown to have accuracy comparable to clinical gold-standard technologies Ross et al. (2018). Using an Artec Spider structured-light scanner as a clinical "gold standard" reference, Nightingale et al. (2020) found that photogrammetric reconstructions of facial scans using 80 captured images had RMSE accuracy values of 1.3 mm ± 0.3 mm. In a similar study, Nightingale et al. (2021) achieved RMSE accuracy values of 1.5 mm ± 0.4 mm with 30 photographs on reconstructions of the external ear. Using spherical objects of known geometry as a reference, Barbero-García et al. (2018) achieved an MAE accuracy of 0.3 mm ± 0.2 mm using 95 images with tie-point aids. In later research, the same authors achieved similar accuracy while scanning infant skulls, with MAE accuracy values of 0.5 mm ± 0.4 mm using 200 images Barbero-García et al. (2020).
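The relationship between these two error metrics can be made concrete with a short sketch. The deviation values below are illustrative only (not taken from the studies cited above); the sketch shows why a single outlier inflates RMSE more than MAE, and why RMSE can never fall below MAE.

```python
import math

# Hypothetical per-point deviations (mm) between a photogrammetric
# model and a reference scan; one outlier is included deliberately.
errors_mm = [0.2, -0.3, 0.1, 0.4, -0.2, 2.0]

mae = sum(abs(e) for e in errors_mm) / len(errors_mm)
rmse = math.sqrt(sum(e * e for e in errors_mm) / len(errors_mm))

print(f"MAE  = {mae:.3f} mm")   # MAE  = 0.533 mm
print(f"RMSE = {rmse:.3f} mm")  # RMSE = 0.850 mm, inflated by the 2.0 mm outlier
assert rmse >= mae              # holds for any error distribution
```

Squaring before averaging weights large deviations more heavily, which is why RMSE is the more conservative figure to report when outliers matter.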
While the accuracy of photogrammetry for anatomical scanning is very good, workflows involving photogrammetry require a great deal of human input, often taking hours Barbero-García et al. (2017); Cabrera et al. (2021). For this reason, recent research has focused on various methods for automating photogrammetric workflows for anatomical scanning Barbero-García et al. (2020); Cabrera et al. (2020). Photogrammetric models are rendered after acquisition (not in real time); thus, errors in the image acquisition stage may not become evident until after a patient has been scanned and is no longer present. Automated approaches have focused their attention on this acquisition stage to ensure completeness of the results Nocerino et al. (2017), with recent advances incorporating machine learning for landmark detection Barbero-García et al. (2021). Still, automation of photogrammetric image processing (specifically image segmentation) remains a significant problem Cabrera et al. (2021). Automating this image segmentation step could dramatically increase the speed of photogrammetric workflows for medical imaging, improving their clinical viability.
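As a minimal sketch of what the automated segmentation step looks like inside such a pipeline, the snippet below applies a predicted binary mask to an image so that background pixels are zeroed before the photos are handed to the photogrammetry software. The arrays here are tiny synthetic stand-ins: in practice the mask would come from a segmentation network and the masked image would be exported for a tool such as Agisoft Metashape.

```python
import numpy as np

def apply_mask(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Zero out background pixels; `mask` is 1 for limb, 0 for background."""
    # Add a trailing axis so the (H, W) mask broadcasts over RGB channels.
    return image * mask[..., np.newaxis]

# Tiny synthetic 2x2 RGB image and a mask keeping only the left column.
image = np.full((2, 2, 3), 200, dtype=np.uint8)
mask = np.array([[1, 0],
                 [1, 0]], dtype=np.uint8)

segmented = apply_mask(image, mask)
print(segmented[:, :, 0])  # left column keeps 200, right column is now 0
```

Removing the background in this way means the structure-from-motion stage only matches features on the limb itself, which is the effect manual masking currently achieves by hand.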

Medical imaging technologies are fundamental to every digital workflow because they inform clinicians of limb geometry, surface and/or sub-surface features, and pathology of amputated limbs Paxton et al. (2022). Systematic reviews by Cabrera et al. (2021) and Paxton et al. (2022) identified photogrammetry as a promising technology for capturing patient surface anatomy. The main advantage of photogrammetric scanning is that models can be rendered using photographs captured via smartphones Cabrera et al. (2020); Barbero-García et al. (2018); De Vivo Nicoloso et al. (2021); R. B. Taqriban et al. (2019); Ismail et al. (2020); Barbero-García et al. (2020; 2021). Scanning with smartphones is significantly cheaper than other medical imaging modalities Cabrera et al. (2021); Paxton et al. (2022) and results in reliable and robust surface accuracy on par with existing clinical gold-standard technologies Nightingale et al. (2020; 2021). Unfortunately, photogrammetry workflows often require extensive image segmentation, at the expense of human operator time and effort, in order to render 3D models Cabrera et al. (2021). Segmentation is an important problem in medical imaging and involves separating regions of interest (ROIs) from the rest of an acquired image. Convolutional neural networks (CNNs) are regarded as the dominant state-of-the-art approach for medical image segmentation in applications requiring high accuracy Kumar et al. (2020); Wang et al. (2022). Deep convolutional neural networks (DCNNs), such as DeepLabv3, are able to achieve high Intersection over Union (IoU) performance when classifying pixels and outperform other CNN architectures Chen et al. (2017b). Using transfer learning, it is possible to fine-tune pre-trained deep neural networks with instances from the target domain Zhuang et al. (2020). Transfer learning is crucial for medical imaging because in many cases it is not possible to collect sufficient training data Kumar et al. (2020); Wang et al. (2022); Zhuang et al. (2020).
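The IoU metric used to evaluate such segmentation models has a simple definition: for binary masks, it is the number of overlapping foreground pixels divided by the number of foreground pixels in the union of prediction and ground truth. The sketch below computes it on small synthetic masks (not data from this study).

```python
import numpy as np

def iou(pred: np.ndarray, target: np.ndarray) -> float:
    """Intersection over Union for binary (0/1) segmentation masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return float(intersection / union) if union else 1.0  # both empty: perfect

# Synthetic 4x4 masks: the prediction recovers 3 of the 5 target pixels
# and adds 1 false-positive pixel, so IoU = 3 / (5 + 4 - 3) = 0.5.
target = np.array([[1, 1, 0, 0],
                   [1, 1, 0, 0],
                   [1, 0, 0, 0],
                   [0, 0, 0, 0]])
pred   = np.array([[1, 1, 0, 0],
                   [1, 0, 0, 0],
                   [0, 0, 0, 0],
                   [0, 0, 1, 0]])
print(f"IoU = {iou(pred, target):.3f}")  # IoU = 0.500
```

Because the union term penalizes both missed foreground and false positives, IoU is a stricter score than per-pixel accuracy, which is why it is the standard metric for segmentation quality.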

Beyond cranial scanning, photogrammetry has also been applied to facial scanning Ross et al. (2018); Nightingale et al. (2020; 2021) and amputated limb scanning R. B. Taqriban et al. (2019); Cabrera et al. (2020); Ismail et al. (2020); De Vivo Nicoloso et al. (2021).

