Accuracy of triangular meshes of stone models created from DICOM cone beam CT data

Background The aim of this study was to assess the theory that CBCT scanners can be used for a subsequent triangular mesh generation which accurately represents the actual stone model. Ten, recently acquired stone models, were used in the present study. The stone models were initially scanned with the Dental Wings 7Series dental scanner. Each stone model was then scanned using a 150-μm voxel resolution in a Planmeca Mid CBCT device with 2 sets of exposure parameters and in a Newtom VG device. The DICOM files were initially imported in Blue Sky Plan implant surgery software, segmented and then imported for computational manipulation in CloudCompare, a dedicated mesh handling software. Results For all CBCTs and for all exposure parameters, the mean (SD) difference was 0.052 (0.011) mm ranging from 0.032 to 0.070 mm with a 95% CI for the population mean of 0.052 ± 0.004 mm. Specifically, the mean (SD) difference for each device/exposure parameter tested was (1) Newtom VG = 0.040 (0.006) mm, (2) Planmeca Mid 90 = 0.057 (0.0066) mm, and (3) Planmeca Mid 80 = 0.059 (0.0063) mm. Conclusions There are differences amongst the CBCT models, whilst different exposure parameters of the same model do not seem to offer a significant advantage. The interaction between the threshold value and the imaging modality as far as the errors are concerned necessitates the careful selection of the right threshold value for the triangular mesh creation.


Background
The need for digital dental models is rising along with the increased usage of computer-aided designed and computer-aided manufactured (CAD/CAM) implant surgical guides, orthodontic clear aligners, the practice of 3D orthognathic surgery including the applications of the so-called virtual patient, and finally as a space-effective mean of model storage [1][2][3][4][5][6][7].
Scanning the stone model with either a 3D laser or a white light desktop scanner remains the gold standard for the digitisation procedure since this technique is widely accepted as being the best available method [8,9]. However, these desktop scanners are usually situated in dental laboratories and not in dental offices.
Cone beam CT devices for dental usage are becoming increasingly common in dental practices, offering advanced diagnostic capabilities to the general and to the specialised practitioner with a reduced radiation dose for the patient, compared to medical multi-slice CT scanners [10][11][12].
In view of the increased costs related to the acquisition of a highly accurate desktop scanning device and the fact that CBCT devices are becoming increasingly common either in dental practices or in specialised dental radiological facilities, it is our theory that these CBCT scanners can be used for a subsequent triangular mesh generation that accurately represents the actual stone model. The expected effect size (differences between the two modalities, CBCT and desktop scanner) should not prohibit the effective usage of this mesh model in certain clinical situations.
In this study, three (3) null hypotheses were tested concerning the triangular meshes produced by the transformation of CBCT data in the form of the digital imaging and communications in medicine standard (DICOM) and the triangular meshes produced by a desktop scanner: (a) There is no significant difference amongst the tested CBCT scanners in relation to accuracy. (b) There is no significant difference between different exposure factors on the same model of the CBCT scanner in relation to accuracy. (c) There is no interaction of the CBCT imaging modality with the threshold value used for the conversion of DICOM data to STL meshes in relation to accuracy.

Methods and methods
Ten (10), recently acquired for orthodontic reasons from fully dentate adult patients' stone models, were used in the present study. The stone models (five maxillae, five mandibles) were casted from type IV dental stone (Hera Moldastone CN, Kulzer) using common laboratory procedure and were scanned with the different modalities (laser scanner and CBCTs).
The Dental Wings 7series laser desktop scanner (Dental Wings Inc., Canada) was used as a reference scanner against which the meshes coming from CBCT devices were compared. This device is a commercial scanner equipped with a blue illumination laser beam used in dental labs and commonly employed for the digitisation of stone models in order to design and manufacture CAD/CAM dental prostheses. Its accuracy, as reported by the manufacturer, is 15 μm. The resultant triangular meshes of the stone models (stereolithography-STL files) were used as the gold standard ( Fig. 1).
In order to estimate the repeatability of the reference scanner, the first stone model (case 1, maxilla) was scanned with the laser scanner ten times. The ten meshes were simultaneously cropped and were finely registered with each other, following the same procedure as later described for the main study. This resulted in 90 pairs of meshes whilst each of the meshes acquired was sequentially used as a reference. The average standard deviation of the differences of the meshes was used as a measure of repeatability for the desktop scanner.
Each stone model was then scanned using a 150-μm voxel resolution: It should be noted that the scanning sequence started within 2 days after the desktop laser scanning, initially by using the Planmeca Mid (80 KVp, 12.5 mAs) device which was immediately followed by x-ray scanning with the same Planmeca Mid but with different exposure parameters (90 KVp, 14 mAs). The Newtom VG x-ray scanning commenced approximately 15 days later since this device is situated in another facility.
The data were exported in DICOM format by the proprietary software of each scanner resulting in three DICOM data folders for each stone model.
The DICOM files were then imported into the Blue Sky Plan implant surgery software (Blue Sky Plan, Blue Sky Bio, USA) and were segmented using the segmentation tool for models, incorporated in the software. The window level (L) was initially set at the value of 1425 and increased up to the value of 2625 in 100 unit steps in a Hounsfield (HU) scale with a scale range from − 1000 to 7000 units. The window width (W) is automatically set by the software to 2500 units. This manipulation resulted in 13 different segmentations for each stone model. Triangular meshes were calculated and exported as STL files for each of these segmentation values. As a result, 39 STL stone model representations for each stone model and for all radiation scanning modalities were produced. For each stone model, the 40 (reference standard and CBCTs) meshes were imported for computational manipulation in a dedicated mesh and point cloud handling software (CloudCompare, http://www.cloudcompare. org/). The triangular mesh which derived from laser scanning was used as a reference, and no other manipulation was permitted. The 39 CBCT-originated meshes were then initially roughly registered together using a minimum (3)(4)(5) number of points and then were again finely registered with each other using the iterative closest point (ICP) algorithm, calculated on a sample of 50,000 pairs of points. This resulted in the 39 meshes for each stone model overlapping one another. The meshes were then simultaneously cropped, thus leaving only the teeth and 3-5 mm of the gingiva. The end result was 39 triangular meshes representing the same stone model, with clinically relevant remaining anatomy, almost identical for each mesh. Finally, each of these meshes was again separately, roughly, and finely registered to the reference laser model (gold standard). This resulted in 39 registered to the gold standard different meshes for each stone model.
For each registered to the gold standard CBCT mesh, the absolute distance of each and every face of the mesh to a point on the surface of the reference (gold) standard was computed indicating the difference that exists between this mesh and the gold standard. The median value of the differences and the interquartile range (IQR) for each pair was noted. The product of the median value and the IQR was computed and was used as an index for the best matching mesh pair between the 13 pairs that constituted the group of segmentations for each stone model per scanning modality. We named this product the Dissimilarity Index (DI): The Dissimilarity Index (DI) ranges from zero to infinity with the values closer to zero representing smaller overall differences between the mesh coming from the CBCT and the gold standard.
In order to estimate the accuracy and the precision of our registration software (CloudCompare), five laser-scanned meshes were used. Each mesh was imported in the software and cloned; the clone was moved in a random way and then roughly and finely registered with the original. The mean difference between the meshes was used as a measure of the software accuracy whilst the standard deviation of the differences was used as a measure of the registration precision of the software.
Descriptive statistics were calculated, and inferences were drawn using (a) repeated measures one-way ANOVA with fixed factor 'X-ray Modality' (i.e. Planmeca80, Planmeca90, Newtom VG) and dependent variable 'the lowest (= best) value of the Dissimilarity Index per stone case and per x-ray modality (30 values)' and (b) two-way ANOVA with fixed factors 'X-ray Modality' and 'Threshold value' and dependent variable 'the Dissimilarity Index' , with Greenhouse-Geisser correction if appropriate (390 values). Results were considered significant for a Dunn-Sidak corrected p < 0.05. SPSS version 20 was used for the analysis. All values were rounded to 2 significant digits.

Results
The results for the repeatability study of the reference scanner and those for the accuracy and repeatability of the mesh handling software are presented in Table 1.
For the pair of meshes with the lowest (best) Dissimilarity Index for every stone model case and for every x-ray modality, the threshold value, the value below which the 95% of the differences between all the points are included, the median value of the error, and the IQR and the DI value are presented in Table 2. For these cases, for all CBCTs and for all exposure parameters (30 measurements), the mean (SD) difference was 0.052 (0.011) mm ranging from 0.032 to 0.070 mm with a 95% CI for the population mean of 0.052 ± 0.004 mm.
Two-way repeated measures ANOVA with fixed factors 'Xray Modality' (3 levels) and 'Threshold value' (13 levels) was run. There was a statistically significant interaction between the factors (F(1.3, 11.5) = 18, p = 0.01), revealing that the effect of the imaging modality (Planmeca80, Plan-meca90, Newtom VG) on the differences between the meshes (CBCT and gold standard) depends on the threshold value (Fig. 2). Therefore, simple main effects were run. For the simple main effect of x-ray modality on error, the differences between the Plamneca80 and Planmeca90 were significant for all the threshold values except the values 2025-2525 HU with the difference between the meshes (CBCT and gold standard) always positive (error Planmeca80 > Planmeca90). For the differences between the Planmeca80 and Newtom VG, and Planmeca90 and Newtom VG, the error was statistically significant for all the threshold values except the values 2525-2625 HU, with the Newtom VG exhibiting smaller error in all the threshold values (error Planmeca80 > Newtom VG and error Planmeca90 > Newtom VG) ( Table 3).
Various descriptive statistics are presented in Table 4.

Discussion
There is an increasing need for indirect digitisation of dental arches for simulation and treatment purposes. One method of indirect digitisation is scanning the cast model with a CBCT device to create triangular meshes which can then be used for the fabrication of certain dental prostheses. It was the main purpose of this study to estimate how accurately a CBCT scanner can replicate the details of the gypsum model, thus providing support to our theory coming from personal experience that CBCT scanners are capable of the task. For the estimation of the repeatability of the reference scanner, the standard deviation of the differences between identical meshes was employed, a method also used by other studies [13,14]. Our mean standard deviation between the 90 pairs of meshes was 4.1 μm, and the repeatability could be considered excellent. The mean value of 16 μm for the differences between the meshes is the average deviation from the truth (0 mm); it can be considered a measure of accuracy, and it is similar to the value provided by the manufacturer of the laser scanner (15 μm).
Instead of a commercial alternative, CloudCompare (Version 2.9 Omnia, http://www.cloudcompare.org/), an independent open source project and free software under the GNU General Public License, was chosen as our mesh handling and comparison software. The accuracy and the repeatability of the software as expressed by the standard deviation and the mean values of differences between identical meshes can be considered adequate for the needs of the present study (Table 1).
Concerning Blue Sky Plan (Blue Sky Bio, USA), the software used to calculate and export the triangular meshes from DICOM data, it is a commercial software used for the design and fabrication of 3D surgical implant guides. The segmentation is accomplished with the use of a Hounsfield calibrated scale, and the software allowed the easy export of multiple meshes of the same stone model in different and discrete threshold values. The range of threshold values was decided based on our experience of using the value of 1750 HU for the production of meshes in our practices. As the main statistic for our analysis, the Dissimilarity Index was used. This index can be considered as an unstandardized measure of effect and was devised when it became necessary to combine the measure of central tendency with the measure of the dispersion of the errors. Since the differences between the CBCT meshes and the gold standard were always skewed and away from normality, the median and the interquartile range were the appropriate statistics. It was invariably noted that the IQR values follow the tendency of the median (Fig. 3). However, when the median was approaching its lowest value, similar median values for different neighbour segmentations were computed. The product of the median and the IQR gives a greater resolution on the threshold value which best describes the pair of meshes with the lowest errors, taking into account not only the median value of the error but also the dispersion of the errors, too. The value of the DI can be easily traced back to its constituents (median and interquartile range), when necessary.
The average mean (SD) difference of the median errors for all the CBCT modalities was 0.052 mm (0.011) resulting in a 95% CI for the errors of 0.048 to 0.056 mm. To our knowledge, only one study evaluated the accuracy of stone model meshes originating from CBCT data [15]. The authors examined 8 different CBCT devices and found a mean difference of 0.064 ± 0.005 mm against 5 different extraoral digitisers used as a gold standard. However, in that study, the threshold value used for the DICOM to mesh conversion was at the discretion of the investigator. In our study, we found a significant interaction between the error and the threshold value for the different imaging modalities indicating that an appropriate threshold value must be computed for each device in order to minimise errors. An average value of 0.052 mm for the median error must be evaluated taking into account the overall expected errors of other modalities that are used in order to digitise the tooth reality. The recommended in vitro benchmark of ± 20 μm for the replication of the tooth morphology establishes the upper limit of accuracy for the manufacturing of any successful prosthesis in the area of prosthodontics [16]. The in vitro accuracy of conventional impression and stone model pouring method has been found to be 20.4 ± 2.2 μm, and this error can be considered as the limit for the ability of the classical method to replicate reality [17]. The standard procedure for the indirect digitisation is the use of a lab desktop laser or white light scanner for the facilitation of CAD/CAM prostheses. The accuracy of these commercial lab scanners has been proven by a number of studies and ranges from 6 to 33 μm with the majority of the scanners in the under 20 μm category [18,19]. A final and newer method for the direct replication of tooth anatomy is with the use of intraoral scanners that can entirely bypass the impression and stone model pouring technique. A number of studies show that the full-arch accuracy of these devices ranges in vitro from 11.5 to 332.9 μm. [17,[20][21][22].
The value below which the errors are situated when 95% of the points for each pair of meshes (CBCT and gold standard) is considered was also calculated. Even though values of central tendency and dispersion are useful, this 95% value gives an idea for the majority of the absolute errors that are expected in the totality of the stone model, without any averaging. This value being in any situation below 210 μm is conservative and, as it can be seen from inspection of the deviation maps, usually reflects the error from the soft tissue or is the result of calculations in a very small number of points. We removed the extreme 5% of the errors between points since it is expected that outliers can inflate our range of differences. It should be noted that in studies where differences between meshes are computed, 60-80% of the pairs of points are used [22]. In our study, Considering the different imaging modalities, the Newtom VG device was significantly more accurate than the Planmeca Mid for both sets of exposure parameters.   Even though the stone models were scanned with the Newtom device 2 weeks after the stone pouring and suffered from handling due to transportation in another facility, the average error for the Newtom was 40 μm, ranging from a minimum value of 32 to a maximum value of 49 μm. The Newtom VG device operates in higher kilovoltage peak than the Planmeca, uses a rotating anode with a very small focal spot, and has a high total filtration value, which differences possibly partially explain its better performance. For the Planmeca Mid device, the difference between the two sets of exposure parameters was not significant, possibly implying that it is not the exposure parameters per se, but other software and hardware issues that increased the error compared to the Newtom VG. (Fig. 3). A range of values was used in the Blue Sky Plan software in order to find the threshold that would produce the triangular mesh with the minimum error. The right threshold value is of importance especially for the Planmeca device since the error could be large for low threshold values. It was seen that with the low threshold value of 1425 HU, the error was maximum and the error reached a minimum at the area of 1825 to 2225 HU value after which it started increasing again, albeit very slowly. The mean best threshold value was almost the same for both Planmeca settings (2155 vs 2145 HU) as it was the range of the values for the smaller error (1925-2425 vs 1825-2425 HU) indicating that for the Planmeca device in general, any threshold value in the range of 1925-2425 HU can be used with relative safety. For the Newtom VG device, the errors in every threshold value were significantly less than the Planmeca and they reached their minimum at the values of 1825 to 2225 HU with a mean value of 1975 HU, indicating again that for Newtom VG device, any value in the range 1825-2225 HU can be used with relative safety. Combining the results of both CBCT models and of every exposure parameter, we find a mean (SD) value of 2092 (184) HU and a 95% CI for the mean of 2023-2161 HU. This range includes part of the best values of every device and every exposure setting and can be considered safe for the thresholding of stone models in all the CBCT devices of our study.
The use of CBCT for the scanning of stone models introduces artefacts to the final image. The divergent nature of the x-ray beam means that only the object lying in the midplane will be accurately reproduced [23]. Depending on the cone beam angle of the midplane, inaccuracy is to be expected due to data inconsistency. In the present study, the stone models were placed in the centre of the field of view and with the arch of the teeth parallel to the midplane minimising the image degradation. Beam hardening artefacts resulting from the polychromatic nature of the x-ray beam showing as streaks and shadows in the reconstructed images should also be expected, especially when the kilovoltage peak value is low [24]. In our cases, no such image degradation was optically noted for any given combination of exposure parameters, even when the 80 KVp tube voltage was used. Other possible limitations of the CBCT scanning method include the size of the focal spot and penumbra effects, the limited spatial resolution of the flat panel, the electronic and statistical x-ray noise, and the partial volume averaging effects [25]. In addition to the errors due to the physical process of x-ray exposure and the inherent limitations of the CBCT devices, there is always a possibility of having defects such as inaccuracy due to errors in the STL file, following the DICOM to STL conversion process [26].
With an average error of 0.052 mm, a number of applications seem feasible. In orthodontic applications, an error in the models of 200 μm seems acceptable [27], whilst in the cases of surgical guides, the knowledge of this error could be incorporated in the design, if necessary. For model storage, this value is more than adequate.
The main limitation of our study was the use of a commercial desktop dental scanner as our gold standard. The use of an industrial scanner could offer greater accuracy and repeatability. However, the repeatability of our scanner could be considered excellent, whilst the estimation of its accuracy was in the range defined by the manufacturer. In addition, the main purpose of our study was to estimate the performance of the x-ray scanners in comparison with a commonly used and universally available method.
In conclusion, the results of this study provide support to our theory that CBCT scanners can be used for the clinically relevant accurate digitization of stone models.
However, there are significant differences between the two CBCT models used; therefore, hypothesis 'a' was rejected. Different exposure parameters of the same CBCT model do not seem to offer a significant advantage, and, therefore hypothesis 'b' could not be rejected. Finally, the interaction between the threshold value and the exposure modality as far as the errors are concerned mandates the careful selection of the right threshold value for the triangular mesh creation. Tested hypothesis 'c' was therefore also rejected.