Experience of neural network diagnosis and prediction of peptic ulcer disease based on the results of risk factor analysis

Aim. To develop and verify a method for the diagnosis of peptic ulcer based on neural network analysis of data on patients' risk factors. Materials and methods. This article presents the results of a study based on data on the risk factors of 488 patients. The data were analyzed using an internally developed artificial neural network (Certificate of State Registration of Program for Computers (RU) no. 2017613090). Results. During clinical testing, the proposed approach demonstrated a sensitivity of 74.4% (m = 4.3) and a specificity of 93.3% (m = 2.46). In predicting the age of probable hospitalization, the Mean Absolute Error (MAE) of the prognosis was 1.8 years (m = 0.11) in the training set and 1.9 years (m = 0.15) in the clinical testing set. The maximum absolute prognosis error in the clinical testing set did not exceed 2.2 years at p = 0.95 and 2.3 years at p = 0.99. Conclusion. A new method is proposed for the diagnosis of peptic ulcer based on neural network analysis of data on patients' risk factors. During clinical testing of the model, this approach demonstrated an Area Under the Curve (AUC) of 0.943. The artificial neural network also made it possible to predict the age of probable hospitalization. Its use demonstrated additional advantages: non-invasiveness, no need to prepare the patient for the study, and the possibility of obtaining results immediately after the onset of the disease without a time delay for sample processing. The results of training and clinical testing of the “System of Intellectual Analysis and Diagnosis of Diseases” in the diagnosis of peptic ulcer.


INTRODUCTION
The problem of peptic ulcer remains relevant due to its high incidence, reaching 6-16% of the population, and the risk of dangerous complications [1]. A common approach to the diagnosis of peptic ulcer is based on assessment of the clinical, laboratory and instrumental data obtained during the examination of the patient. Such an approach comes too late for developing a primary personalized preventive strategy [2,3], since it deals with a disease that has already developed, and therefore does not meet modern requirements for making competent management decisions [4-6].
At the same time, the risk factors for peptic ulcer are well known. They include sex and age [7-10], heredity [11,12], social and professional hazards [13,14], unhealthy diet [7,14,15], stress [8], as well as factors associated with unhealthy habits [16] and the intake of ulcerogenic drugs [17]. All these factors act simultaneously before the onset of the disease and can, in principle, be detected in advance. However, taken together they form a data set that is difficult to structure: they act jointly and show implicit interrelations and signs of cobweb causality [18]. Data of this kind are difficult to process and require the involvement of a wide range of IT and automated control systems [19,20].
A series of attempts has been made to develop specialized information systems that assess risk factors using conventional biostatistical methods. However, these systems were unable to fully solve the problem of developing a personalized preventive strategy because of their inherent fundamental mathematical limitations [21-23]. In contrast, artificial neural networks (ANNs) have shown promise in analyzing complex data [24,25], but their potential in this area of medicine has yet to be studied.
For the reasons mentioned above, the objective of this study was to develop and verify a method for diagnosis of peptic ulcer based on neural network analysis of data on patients' risk factors.

MATERIALS AND METHODS
The study included 488 patients with pathologies of the hepatopancreatoduodenal zone (including 117 men and 59 women with peptic ulcer) undergoing inpatient treatment in the city of Kursk. The mean age of patients with peptic ulcer was 48.1 (m = 1.23) years. In the course of the study, we developed in-house the "System of Intellectual Analysis and Diagnosis of Diseases", based on the principle of a multilayer perceptron with the hyperbolic tangent used as the activation function [26].
Patients were divided into two subgroups: those who received treatment before 01 January 2011 (n1 = 385) and those treated after this date (n2 = 103). The first subgroup comprised 133 patients with peptic ulcer and 252 patients in whom this diagnosis was excluded. It was used to train the perceptron. Training was carried out with two types of outputs: qualitative and quantitative. The qualitative output corresponded to the presence or absence of the diagnosis (as well as a possible vague outcome), and the quantitative output represented the age at which patients were hospitalized for hepatopancreatoduodenal diseases. The second subgroup comprised 43 patients with peptic ulcer and 60 patients with other pathologies (pancreatitis, cholecystitis). It was used to test the "System" in practical healthcare.
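The division by treatment date described above can be illustrated with a minimal sketch. The `Record` type, its field names and the demo records are hypothetical and are not taken from the study; assigning the cut-off date itself to the testing subgroup is also an assumption, since the text only says "before" and "after" 01 January 2011.

```python
from dataclasses import dataclass
from datetime import date

CUTOFF = date(2011, 1, 1)  # split date stated in the text


@dataclass
class Record:
    treated_on: date   # hypothetical field: date of inpatient treatment
    has_ulcer: bool    # True for peptic ulcer, False for another pathology


def temporal_split(records, cutoff=CUTOFF):
    """Return (training, testing): treated before the cutoff vs. on or after it."""
    training = [r for r in records if r.treated_on < cutoff]
    testing = [r for r in records if r.treated_on >= cutoff]
    return training, testing


# Synthetic illustration only; these are not the study's records.
demo = [
    Record(date(2009, 5, 1), True),
    Record(date(2010, 12, 31), False),
    Record(date(2011, 1, 1), True),
    Record(date(2012, 3, 15), False),
]
train, test = temporal_split(demo)
```

Splitting by date rather than at random means the testing subgroup consists of patients seen later than every training patient, which is closer to how such a system would be used in practice.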
The following input parameters of the perceptron were selected: the age and sex of patients and the risk factors they had, namely the form and extent of bad habits, the presence or absence of stress, employment, eating habits and the diet regime [27]. The following perceptron settings were used: the number of hidden layers was 3 and the number of neurons in the hidden layer was 10.
The performance of the ANN was evaluated using the methods of descriptive and inferential statistics, calculation of sensitivity, specificity and prognosis errors, and Receiver Operating Characteristic (ROC) analysis.
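A perceptron with these settings can be sketched as follows. This is only an illustration and not the registered "System": it assumes 10 neurons in each of the 3 hidden layers, eight numerically encoded inputs, a single tanh output, random untrained weights, and a sign convention in which positive outputs mean peptic ulcer. The function names and the threshold parameter `y_b` (the half-width of the "vague" interval) are introduced here for illustration.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def build_network(n_inputs, n_hidden_layers=3, n_hidden=10, n_outputs=1, seed=0):
    """Layer sizes: inputs, then n_hidden_layers layers of n_hidden, then outputs."""
    rng = random.Random(seed)
    sizes = [n_inputs] + [n_hidden] * n_hidden_layers + [n_outputs]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(network, x):
    """Propagate one input vector; tanh is applied at every layer."""
    for weights, biases in network:
        x = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x


def interpret(y, y_b=0.0):
    """Map a qualitative output in [-1, 1] to a label (assumed sign convention)."""
    if abs(y) < y_b:
        return "vague"
    return "ulcer" if y > 0 else "no ulcer"


# Hypothetical encoding: 8 risk-factor inputs scaled to [0, 1].
network = build_network(n_inputs=8)
output = forward(network, [0.5] * 8)
label = interpret(output[0])
```

With `y_b = 0` the vague interval is empty, so every output is read as either presence or absence of the diagnosis.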
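The listed quality measures can be computed as in the sketch below. The confusion-matrix counts are not given in the text; 32 of 43 and 56 of 60 are assumed here because they reproduce the reported 74.4% and 93.3% for the clinical testing subgroup. The article does not state how m was calculated; the standard error of a proportion taken over all 103 tested patients happens to match the reported values, and that reading is likewise an assumption. The demo scores passed to `auc` are synthetic.

```python
import math


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


def proportion_se(p, n):
    """Standard error of a proportion p estimated from n observations."""
    return math.sqrt(p * (1.0 - p) / n)


def auc(scores_pos, scores_neg):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Assumed counts for the clinical testing subgroup (43 ulcer, 60 other).
sens, spec = sensitivity_specificity(tp=32, fn=11, tn=56, fp=4)
n_total = 103  # assumption: m is taken over the whole testing subgroup

sens_pct = round(100 * sens, 1)                           # 74.4
sens_m = round(100 * proportion_se(sens, n_total), 1)     # 4.3
spec_pct = round(100 * spec, 1)                           # 93.3
spec_m = round(100 * proportion_se(spec, n_total), 2)     # 2.46

# Synthetic network outputs, for illustration of the AUC calculation only.
demo_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.1])          # 8 of 9 pairs ordered correctly
```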

RESULTS AND DISCUSSION
Differences in the sets of risk factors between patients with peptic ulcer on the one hand and those with cholecystitis and pancreatitis on the other made it possible to train the network successfully to recognize data vectors. The result of the ROC analysis is shown in Fig. 1. The threshold value y_B, defining the interval (-y_B; y_B) within which the output (unadapted) value of the network was interpreted as undefined (vague), was set to 0. That is, all output values of the network were treated as true or false, meaning the presence or absence of peptic ulcer [28].

Lazarenko V.A., Antonov A.E., Markapuram V.K., Awad K. Experience of neural network diagnosis and prediction of peptic ulcer disease

The AUC of the presented curve for the training set was 0.966. For the clinical testing group, the result of the ROC analysis is shown in Fig. 2. The AUC for this group was 0.943. The decrease in the AUC can be explained by variations in the influence of risk factors on the development of peptic ulcer and indicates the need to update the information on risk factors dynamically. Nevertheless, the quality of the mathematical model for recognizing the data vectors remains high in both the training set and the clinical testing group.

As can be seen from the data presented in Table 1, the network successfully coped with the task of distinguishing the disease based on an analysis of the spectrum of risk factors. The sensitivity and specificity were at the level of conventional diagnostic methods. In general, the results of clinical testing are, as expected, slightly inferior to those obtained on the training set. This can also be explained by the dynamics of risk factors over time. The results show no signs of overtraining of the network.

The results of the quality assessment of predicting the age of probable hospitalization are presented in Table 2. The MAE of the predicted age was calculated as the mean of the absolute differences between the estimated and empirical values.
This indicator did not exceed 1.9 years for the clinical testing group. For the training set it was lower.
Table 2. The results of training and clinical testing of the multilayer perceptron in predicting the age of probable hospitalization

The percentiles p95 and p99 were determined for the maximum prediction error. As with the diagnostic model, the forecast of the quantitative indicator was more accurate for the training set. Nevertheless, during clinical testing, the ANN demonstrated an absolute error in the predicted age of hospitalization not exceeding 2.3 years at a probability level of p = 0.99.
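The error measures discussed here can be illustrated with a short sketch. The article does not give the formula behind the bounds at p = 0.95 and p = 0.99; one reading that reproduces the reported 2.2 and 2.3 years is the upper limit MAE + z * m of a two-sided normal interval around the clinical testing MAE of 1.9 years (m = 0.15). That reading is an assumption, and the predicted and observed ages in the demo are synthetic.

```python
from statistics import NormalDist


def mean_absolute_error(predicted, observed):
    """Mean of the absolute differences between predicted and observed values."""
    errors = [abs(p - o) for p, o in zip(predicted, observed)]
    return sum(errors) / len(errors)


def upper_bound(mae, m, confidence):
    """Upper limit of a two-sided normal interval: mae + z * m (assumed formula)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return mae + z * m


# Synthetic ages in years, for illustration only.
demo_mae = mean_absolute_error([50.0, 42.5, 61.0], [48.0, 44.0, 60.0])  # 1.5

# Reported clinical testing values: MAE = 1.9 years, m = 0.15.
bound_95 = round(upper_bound(1.9, 0.15, 0.95), 1)  # 2.2
bound_99 = round(upper_bound(1.9, 0.15, 0.99), 1)  # 2.3
```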
In addition, it should be noted that the neural network can be run on non-specialized personal computers. The application has low hardware requirements at the stage of practical use. The "System of Intellectual Analysis and Diagnosis of Diseases" can be operated by a health care provider without special additional training. The use of the ANN also demonstrated additional advantages: no need to prepare the patient for the study, the possibility of obtaining results immediately after the onset of the disease (and, in principle, before it), and the absence of a time delay for processing the material. The investigation was non-invasive.

2. The prediction of the age of probable hospitalization of patients with peptic ulcer using neural network analysis demonstrated an MAE of the forecast equal to 1.8 years (m = 0.11) for the training set and 1.9 years (m = 0.15) for the clinical testing group. The absolute error of the forecast in the clinical testing group did not exceed 2.2 years at p = 0.95 and 2.3 years at p = 0.99.

CONFLICT OF INTEREST
The authors declare the absence of obvious and potential conflicts of interest related to the publication of this article.

SOURCE OF FINANCING
The authors state that there is no funding.

COMPLIANCE WITH THE PRINCIPLES OF ETHICS
The research corresponds to ethical standards developed in accordance with the World Medical Association Declaration