A deep learning model to predict recurrence of atrial fibrillation after pulmonary vein isolation

The efficacy of radiofrequency catheter ablation (RFCA) in atrial fibrillation (AF) is well established, and the standard approach to RFCA in AF is pulmonary vein isolation (PVI). However, a large proportion of patients experience recurrence of atrial tachyarrhythmia. The purpose of this study was to determine whether an artificial intelligence (AI) model can predict AF recurrence in patients who underwent PVI. This retrospective cohort study enrolled consecutive patients who underwent PVI by catheter ablation for symptomatic, drug-refractory AF. We developed an AI algorithm to predict recurrence of AF after PVI using patient demographics and three-dimensional (3D) reconstructed left atrium (LA) images. We included 527 consecutive patients in the study. The overall mean LA diameter was 42.0 ± 6.8 mm, and the mean LA volume calculated using 3D reconstructed images was 151.1 ± 46.7 ml. During the follow-up period, atrial tachyarrhythmia recurred in 158 patients. The area under the curve (AUC) of the AI model based on a convolutional neural network (including 3D reconstruction images) was 0.61 (95% confidence interval [CI] 0.53–0.74) using the test dataset. The total test accuracy was 66.3% (57.0–75.6), and the sensitivity was 53.3% (34.8–71.9). The specificity was 73.2% (51.8–75.0), and the F1 score was 52.5% (34.5–66.7). In this study, we developed an AI algorithm to predict recurrence of AF after PVI by catheter ablation using individual reconstructed LA images. The AI model did not predict recurrence of AF with high accuracy; therefore, further large-scale study is needed.


Introduction
The efficacy of radiofrequency catheter ablation (RFCA) in atrial fibrillation (AF) is well established [1]. Maintaining a normal sinus rhythm decreases the risk of stroke and heart failure [2,3]. The standard approach for RFCA of AF is pulmonary vein isolation (PVI) for both paroxysmal and persistent AF because most triggers arise in the pulmonary veins [4,5]. Nevertheless, many patients experience recurrence of atrial tachyarrhythmia and require repeat ablation. Various strategies for adjuvant substrate modification have been proposed to improve ablation outcomes [6,7]. Still, in a previous study of persistent AF, the substrate modification group did not show results superior to those of the PVI-only group [5]. Therefore, patient selection is important in determining whether to perform PVI alone or to add substrate modification.

International Journal of Arrhythmia (Open Access)

This study used a deep learning model to predict recurrence of AF in patients who underwent PVI alone. Its purpose was to determine whether an AI model can assess the risk of AF recurrence in these patients.

Study population
This retrospective cohort study enrolled consecutive patients who underwent PVI only by catheter ablation for symptomatic, drug-refractory AF from August 2013 to December 2016. We developed an AI algorithm to predict recurrence of AF after PVI with or without cavotricuspid isthmus (CTI) ablation. Therefore, we excluded patients who underwent other ablation procedures, including linear ablation or complex fractionated electrogram ablation. Patients with any type of AF, either paroxysmal or persistent, were included. This study was approved by the Institutional Review Board of the Catholic Medical Center, South Korea.

Data collection
The study cohort was classified into two groups. The recurrence group comprised patients with atrial tachyarrhythmia lasting more than 30 s that was detected by a 12-lead electrocardiogram (ECG) or 24-h Holter monitor after a three-month blanking period following the catheter ablation procedure. The non-recurrence group comprised patients with no atrial tachyarrhythmia on ECG or 24-h Holter monitoring during follow-up after catheter ablation.
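The grouping rule above can be written as a small labeling function. The following is a minimal sketch, not the authors' code: the `Episode` record type and the function name are hypothetical, and the three-month blanking period is approximated as 90 days, which is an assumption.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: the three-month blanking period is approximated as 90 days.
BLANKING_PERIOD = timedelta(days=90)
# Episodes must last MORE than 30 s to count as recurrence.
MIN_DURATION_S = 30


@dataclass
class Episode:
    """One atrial tachyarrhythmia episode documented on ECG or Holter (hypothetical record type)."""
    detected_on: date
    duration_s: float


def is_recurrence(ablation_date: date, episodes: list[Episode]) -> bool:
    """Return True if any qualifying episode occurs after the blanking period."""
    blanking_end = ablation_date + BLANKING_PERIOD
    return any(
        e.detected_on > blanking_end and e.duration_s > MIN_DURATION_S
        for e in episodes
    )
```

Episodes that fall inside the blanking period, or that last 30 s or less, do not move a patient into the recurrence group under this rule.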
We used patient demographics and three-dimensional (3D) reconstructed images of the left atrium (LA) as predictive variables to develop the algorithm. The demographic information comprised age, sex, body mass index, LA size, and underlying diseases (congestive heart failure, hypertension, diabetes, history of stroke, vascular disease, chronic obstructive pulmonary disease and thyroid disease). The LA size was assessed in two ways: echocardiography and 3D computed tomography (CT) imaging. We measured LA volume, including the LA appendage, during the diastolic and systolic phases and calculated the LA ejection fraction. In addition, we obtained 3D reconstructed LA images from a 3D mapping system (Ensite NavX, Abbott) from an anterior-posterior view. Patients lacking 3D reconstructed cardiac images or any of the variables were excluded. All patients were randomly assigned at an 8:2 ratio to either a training group or a test group. The training dataset was used to develop the algorithm, and the test dataset, which was not used to train the network, was used to assess the accuracy of the algorithm.
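A random 8:2 assignment of this kind can be sketched as follows. This is an illustrative example only: the function name, the seed and the use of integer patient identifiers are assumptions, and the paper does not describe the exact procedure. The 486 patients used here are the 400 training and 86 test cases reported in the results, so the realized split in the study was not exactly 80:20.

```python
import random


def split_train_test(patient_ids, test_fraction=0.2, seed=0):
    """Randomly split patient identifiers into training and test lists.

    The seed is fixed only to make the illustration reproducible.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    # The first n_test shuffled identifiers form the held-out test set.
    return ids[n_test:], ids[:n_test]


# Hypothetical identifiers 0..485 standing in for the analyzed patients.
train_ids, test_ids = split_train_test(range(486))
print(len(train_ids), len(test_ids))
```

Splitting by patient identifier, rather than by image, keeps both views of one patient on the same side of the split, so the test set stays unseen during training.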

Electrophysiological study and ablation procedure
All patients underwent a cardiac CT scan before the procedure. Intracardiac electrograms were filtered at 30–500 Hz with an amplifier in the Prucka Cardio Lab System (GE Healthcare, Milwaukee, WI, USA). Detailed electroanatomical data were obtained from the Ensite NavX (Abbott, St. Paul, MN, USA) 3D mapping system. The circular mapping catheter (Optima, Abbott) and the ablation catheter were advanced through a double transseptal puncture. All ablation procedures were performed using RF energy with a 4-mm, open, irrigated catheter (Coolflex, Abbott). All four PVs (including the carina lines) were circumferentially ablated for PVI with RF power of 25–35 W.

Outcomes
The primary outcome was the performance of an artificial intelligence (AI) model in predicting recurrence of atrial tachyarrhythmia in participants who underwent PVI alone. Performance was assessed using the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, together with accuracy, sensitivity, specificity and F1 score.
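For readers unfamiliar with the AUC, it can be computed directly as the probability that a randomly chosen patient with recurrence receives a higher predicted score than a randomly chosen patient without recurrence, with ties counted as one half. The sketch below illustrates that definition in plain Python; the function name and the example scores are hypothetical and are not data from this study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for recurrence, 0 for non-recurrence.
    scores: predicted probabilities or any monotone risk score.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # correctly ordered pair
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical scores for four patients (not study data).
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.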

Overview of the AI model
We developed two learning and inference models to determine the effect of 3D reconstructed images on the prediction of recurrence. The first was a multimodal deep learning model in which demographic data were used along with 3D reconstructed images (Fig. 1). The other used only demographic data. Both models shared a deep neural network (DNN) module with four fully connected hidden layers for processing the demographic data. The hidden layers collectively consisted of 1024 nodes, and the input layer of demographic data was directly connected to these layers. For the model that used images, a convolutional neural network (CNN) module based on the VGG16 model was used in the form of transfer learning: the weights of the VGG16 model were adopted into our CNN module for faster learning with higher prediction accuracy [8]. The 3D images of the LA reconstructed using the Ensite NavX mapping system were input into separate VGG16 models for the anterior and posterior aspects. Another fully connected hidden layer with 1024 nodes was used in our CNN module for ensemble learning of the flattened outputs of the VGG16 models. Finally, a module of two hidden layers with batch normalization was included in both models. In the model that included both types of data, the outputs of the DNN and CNN modules were merged in this module. Our models were implemented using the Keras framework with a TensorFlow backend.
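The architecture described above could be assembled with the Keras functional API roughly as follows. This is a hedged sketch, not the authors' implementation: the function name, the 4 x 256 split of the 1024 DNN nodes, the 224 x 224 x 3 image size, the sizes of the two final hidden layers, the ReLU and sigmoid activations, the optimizer and the loss are all assumptions that the paper does not specify. Passing `vgg_weights="imagenet"` would load pretrained weights for transfer learning; the default `None` avoids a download.

```python
def build_model(n_demographic, use_images=True,
                image_shape=(224, 224, 3), vgg_weights=None):
    """Build the demographics-only or the multimodal model (illustrative sketch)."""
    # Imported here so that defining this function does not itself require TensorFlow.
    from tensorflow import keras
    from tensorflow.keras import layers

    # DNN module: four fully connected hidden layers on the demographic input.
    # Assumption: 4 x 256 nodes, one reading of "collectively 1024 nodes".
    demo_in = keras.Input(shape=(n_demographic,), name="demographics")
    x = demo_in
    for i in range(4):
        x = layers.Dense(256, activation="relu", name=f"dnn_fc{i + 1}")(x)
    inputs, merged = [demo_in], x

    if use_images:
        features = []
        for view in ("anterior", "posterior"):
            img_in = keras.Input(shape=image_shape, name=f"{view}_image")
            base = keras.applications.VGG16(
                weights=vgg_weights, include_top=False, input_shape=image_shape)
            # Re-wrap so the two VGG16 backbones have distinct names in one model.
            backbone = keras.Model(base.input, base.output, name=f"vgg16_{view}")
            features.append(layers.Flatten(name=f"{view}_flat")(backbone(img_in)))
            inputs.append(img_in)
        # CNN module: one 1024-node layer over the flattened outputs of both views.
        cnn = layers.Concatenate(name="views")(features)
        cnn = layers.Dense(1024, activation="relu", name="cnn_fc")(cnn)
        merged = layers.Concatenate(name="merge")([x, cnn])

    # Final module: two hidden layers with batch normalization (sizes assumed).
    for i, units in enumerate((256, 64)):
        merged = layers.Dense(units, activation="relu", name=f"head_fc{i + 1}")(merged)
        merged = layers.BatchNormalization(name=f"head_bn{i + 1}")(merged)
    out = layers.Dense(1, activation="sigmoid", name="recurrence")(merged)

    model = keras.Model(inputs, out)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=[keras.metrics.AUC(name="auc")])
    return model
```

Calling `build_model(n, use_images=False)` gives the demographics-only variant, so both models share the same DNN and final modules and differ only in the image branch.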

Statistical analysis
Statistical analysis was performed using Statistical Package for the Social Sciences (SPSS), version 18.0 (SPSS, Inc., Chicago, IL, USA). Continuous variables were compared using unpaired t test or Wilcoxon rank-sum test, while categorical variables were compared using Chi-squared test or Fisher's exact test, as appropriate. We assessed the AUC using the ROC curve. A p-value < 0.05 was considered statistically significant.

Baseline characteristics
In total, 527 consecutive patients were included in the study, and the mean follow-up duration was 21.5 ± 10.2 months. Among these, 41 patients with missing data were excluded. The overall mean LA diameter was 42.0 ± 6.8 mm, and the mean LA volume calculated using the 3D reconstructed image was 151.1 ± 46.7 ml. During the follow-up period, atrial tachyarrhythmia recurred in 158 patients. As shown in Table 1, the baseline demographic data differed significantly between the recurrence and non-recurrence groups. Patients with recurrence had a significantly larger LA size, which was observed consistently across measurement methods, including LA dimensions obtained by echocardiography and LA volume determined using the 3D system and CT images. The remaining baseline characteristics are summarized in Table 1.

AI model
A deep learning predictive model was developed with 400 cases, and the performance test was conducted on 86 randomly selected patients. The AUC of the AI model based on CNN learning including 3D reconstruction images was 0.61 (95% CI 0.53–0.74) using the test dataset (Fig. 2). The results indicate that the CNN-based model outperformed the DNN using only demographic data (Fig. 3).
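The accuracy, sensitivity, specificity and F1 score reported for the test dataset (66.3%, 53.3%, 73.2% and 52.5%) all follow from a single confusion matrix. The sketch below shows the arithmetic. The counts used (16 true positives, 14 false negatives, 41 true negatives, 15 false positives) are not reported in the paper; they are one set of counts, inferred here for illustration, that sums to the 86 test patients and reproduces the reported percentages.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, specificity and F1 score from confusion-matrix counts."""
    total = tp + fn + tn + fp
    sensitivity = tp / (tp + fn)   # recall among patients with recurrence
    specificity = tn / (tn + fp)   # recall among patients without recurrence
    precision = tp / (tp + fp)     # positive predictive value
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Inferred counts (an assumption, NOT reported in the paper): 30 patients with
# recurrence and 56 without recurrence among the 86 test patients.
metrics = classification_metrics(tp=16, fn=14, tn=41, fp=15)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

Under these counts, 14 of the 30 patients with recurrence in the test set would have been missed by the model, which is what a sensitivity of 53.3% means in practice.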

Discussion
In this study, we developed an AI algorithm to predict recurrence of AF after catheter ablation with PVI alone. The performance of the AI model using convolution layers with reconstructed LA images was superior to that of the AI model that used only demographic data, including LA diameter and volume.
Catheter ablation is the most effective therapy for rhythm control of AF. However, it remains challenging because some patients experience recurrence, and many attempts are being made to improve outcomes. It is well known that the pulmonary veins are an important trigger of paroxysmal AF [4]. Therefore, current guidelines recommend electrical isolation of the PVs as a routine procedure for catheter ablation [6,7,9]. To maintain a normal sinus rhythm after the procedure, PVI durability is critical. Several methods have been shown to help achieve a durable PVI, such as confirming bidirectional block and testing for dormant conduction [10,11]. Cryoballoon ablation is currently an alternative method to achieve PVI [12]. In particular, the efficacy of cryoablation with PVI alone is strongest in select patients, such as those with paroxysmal AF or younger patients with no structural heart disease [13]. Other patients require substrate modification in addition to PVI. Therefore, patient selection is vital in deciding whether to perform PVI alone or in conjunction with an additional procedure, both to reduce total procedure time and to improve the outcome. Sanhoury et al. suggested using the CAAP-AF risk scoring system to predict AF recurrence after balloon cryoablation [13]. A CAAP-AF score ≥ 5 had an AUC of 0.71, a sensitivity of 64% and a specificity of 68%. In this study, the AUC of the AI model was 0.61, the total test accuracy was 66.3%, the sensitivity was 53.3%, and the specificity was 73.2%. These results are broadly comparable to those of other studies, although the AUC and sensitivity were lower than those reported for the CAAP-AF score. Many variables were involved because of the individual characteristics of the study participants, which limited the power of the result.
Fig. 2 The ROC curve for the convolutional neural networks in the testing dataset. AUC = area under the curve
In addition, the purpose of the study was not to make predictions limited to objective measurements, such as diameter and volume, but to study the morphology of the LA itself and the location of the PVs. An AI model that learns from reconstructed images may better predict whether the trigger should be targeted or further substrate modification is needed. We therefore hypothesized that machine learning would perform better than conventional statistical analysis. The AI model using 3D images performed better than the model using only demographic data. This hypothesis may also be supported by variability in atrial fiber architecture. A study of the myofiber architecture of the human atria using high-resolution 3D diffusion tensor magnetic resonance techniques revealed heterogeneity of transmural fibers and variability in the pattern of atrial architecture. This structural variability may also contribute to atrial rhythm and pump function [14].
Fig. 3 The ROC curve for the deep neural networks in the testing dataset. AUC = area under the curve
Several limitations were present in our study. First, deep learning relies on large datasets, whereas we included only a small, single-center population. A small study population is not ideal for model development, and because we could not conduct external validation, overfitting cannot be excluded. Second, the study population was not randomly selected, and the decision whether to perform PVI alone or in conjunction with another ablation procedure was made at the physician's discretion. Therefore, there were several opportunities for selection bias, such as toward patients with a smaller LA size or younger age. Despite these limitations, the AI model showed modest predictive performance, and further large-scale study is needed to confirm our results.

Conclusion
An AI algorithm was developed from AF catheter ablation data, including reconstructed individual LA images, and it showed potential for predicting the need for additional procedures after PVI. However, this AI model did not outperform other methods in predicting recurrence of AF, so further large-scale studies are needed.