Molecular characteristics of anaplastic astrocytomas and isolation of molecular subgroups of their IDH1 mutant forms using in silico analysis

Aim. Anaplastic astrocytomas remain a pressing clinical problem. The WHO classification distinguishes IDH1/IDH2-mutant anaplastic astrocytomas, anaplastic astrocytomas without IDH1/IDH2 mutations, and anaplastic astrocytomas not otherwise specified. The aim of this work was to cluster IDH1-mutant anaplastic astrocytomas by their cytogenetic profile in order to identify prognostically significant molecular subgroups of both clinical and fundamental scientific value. Materials and methods. We performed a cluster analysis of anaplastic astrocytomas according to their cytogenetic profiles, using available tumor genetic databases and large cohort studies, and compared Kaplan-Meier survival curves across the molecular subgroups. Results. We studied the main genetic features of the inter-tumor heterogeneity of anaplastic astrocytomas and distinguished seven molecular subgroups based on the cytogenetic profile: embryonic-like, inflammatory-like, deletion, matrix, cyclin, GATA3-dependent, and tyrosine kinase. Each of these subgroups has not only distinctive molecular characteristics but also important clinical features. Conclusion. A detailed study of the molecular properties of anaplastic astrocytomas will not only improve the prediction of treatment outcomes but also support the development of new targeted therapies within the framework of personalized medicine.


INTRODUCTION
Anaplastic gliomas remain a pressing clinical problem. In the WHO classification of tumors of the central nervous system (CNS), 4th revision (2016), anaplastic gliomas are represented by anaplastic astrocytoma and anaplastic oligodendroglioma [1]. According to statistics, anaplastic astrocytoma accounts on average for 1.7% of all central nervous system tumors, and its prevalence is 1,307 cases per 100,000 people [2,3].
[...] mitochondrial form. Both enzymes are involved in the oxidative decarboxylation of isocitrate to 2-oxoglutarate [5]. IDH1 and IDH2 mutations have been shown to be the most accurate prognostic factors for astrocytic tumors; the presence of a mutation in these genes is associated with better patient survival [6]. Cells carrying an IDH1 or IDH2 mutation also overproduce the oncometabolite 2-hydroxyglutarate (2HG), which leads to significant rearrangements in the epigenetic landscape of the tumor genome [7]. Moreover, the absence of the mutation results in a significant increase in the proliferative potential of astrocytic glioma cells [8]. A number of studies have revealed both activating and inactivating effects of IDH1 and IDH2 mutations on different proto-oncogenes, such as PIK3CA, KRAS, AKT, N-MYC, and others [9]. Some of the effects of these mutations are realized through metabolic pathways, primarily through modification of lipid metabolism [10].
In modern oncology, the molecular heterogeneity of tumors plays a significant role. The genetic, epigenetic, and proteomic features of the pathological process can vary considerably from one tumor to another. Only by taking these features into account is it possible to form a truly personalized approach to diagnosis and treatment, in which modern high-tech methods allow the main properties of the tumor process to be assessed precisely and, on that basis, effective targeted and individualized therapy to be implemented. All of the above is extremely relevant for anaplastic gliomas, since the existing diagnostic and therapeutic approaches are not effective enough [11].
In this work, we analyzed and reviewed data from several large studies of the genetic characteristics of anaplastic astrocytomas, with the aim of identifying patterns of both clinical and fundamental scientific value.

MATERIALS AND METHODS
An analytical study was conducted in accordance with the international principles of observational studies in epidemiology (MOOSE) [12]. To assess the genetic heterogeneity and related features of the clinical aspects of anaplastic astrocytomas, data from large multicenter studies were analyzed, including data from The Cancer Genome Atlas.

Search and selection of literature
We conducted a thorough literature search using the PubMed, Medline, Scopus, Embase, and Cochrane Library databases. The search keywords used were "astrocytoma OR anaplastic astrocytoma OR diffuse anaplastic astrocytoma OR astrocytoma Grade III" (all fields) AND "genomic data OR genome-wide analysis OR mutations OR multi-omics OR genome sequencing" (all fields). To avoid missing relevant work, the reference lists of the full-text articles were also checked in full.
The inclusion criteria were: availability of whole-genome sequencing data, with mutational events and cytogenetic rearrangements determined for at least part of the studied group of patients; availability of clinical data for the same patients, including overall survival (OS) and relapse-free survival (RFS); an identified mutation in the IDH1 gene; and sufficient data to assess the risk ratio (RR) and 95% confidence intervals. Articles were excluded if they were reviews, abstracts, letters to the editor, or experimental work on animals; if more than one study was conducted in the same group of patients, only the most recent or most complete study was included. The full text of each article accepted for analysis was studied carefully for a comprehensive assessment.
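As an illustration of the last inclusion criterion, the sketch below shows one standard way to compute an unadjusted risk ratio with a 95% confidence interval from event counts in two groups (log-scale Wald interval). The function name and the counts are hypothetical and are not taken from any of the analyzed studies; the adjusted ratios used later in the analysis would require a regression model instead.

```python
import math

def risk_ratio(events_a, n_a, events_b, n_b, z=1.96):
    """Unadjusted risk ratio of group A vs. group B with a Wald CI on the log scale."""
    rr = (events_a / n_a) / (events_b / n_b)
    # standard error of ln(RR) for two independent binomial samples
    se = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper

# Hypothetical counts: 30 events among 100 patients vs. 15 events among 100
rr, lower, upper = risk_ratio(30, 100, 15, 100)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

A study reporting only percentages without group sizes would not allow the standard error term to be computed, which is why the criterion asks for sufficient data.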

Data extraction and quality assessment
Two authors extracted the data independently. The information extracted included the name of the first author, year of publication, country of origin of the article, histological type of tumor, follow-up time, methodological features of whole-genome sequencing, and OS and RFS indices. The study cohort included only patients with an established histological diagnosis of anaplastic astrocytoma, WHO Grade III, carrying the IDH1 or IDH2 gene mutation. The quality of each study was assessed independently by two researchers using the Newcastle-Ottawa Quality Assessment Scale [13]. Data from the patient cohorts The Cancer Genome Atlas [14,15], Glioma (MSK) [16], Low-Grade Gliomas (UCSF) [17], and Merged Cohort of LGG and GBM (TCGA) [18] were extracted using the cBioPortal tool (Memorial Sloan Kettering Cancer Center, USA). The analysis also included data from another 9 large genome-wide studies of gliomas of various degrees of malignancy [19-27].

Statistical Data Analysis
A cluster analysis of all cases of anaplastic astrocytoma included in the study was performed using the k-means method, based on data on cytogenetic rearrangements in tumor samples. To determine the number of clusters, a hierarchical analysis was performed first; it indicated that separation into 6, 7, or 8 clusters was the most reliable. k-means clustering was then run with 5, 6, 7, 8, and 9 clusters, and the highest accuracy was observed with 7 clusters (molecular subgroups). In each cluster (molecular subgroup), the frequency of cytogenetic modifications and mutations was evaluated separately. OS and RFS were evaluated separately for each molecular subgroup using Kaplan-Meier curves, and survival was compared using the log-rank criterion (LRC), the Cox-Mantel criterion (CMC), and the Gehan-Wilcoxon (GW) criterion. Adjusted risk ratios with 95% confidence intervals were used. A p value below 0.05 was considered statistically significant.
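The clustering step can be sketched as follows. This is a minimal illustration, not the procedure actually run in the study: the profiles are hypothetical toy data (12 tumors, 9 made-up binary alterations), the farthest-point initialization is chosen only to make the example deterministic, and the mean silhouette width is an assumed stand-in for the "accuracy" criterion, which the text does not define.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=100):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [list(points[0])]
    while len(centers) < k:
        # next seed: the sample farthest from its nearest existing center
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:      # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:               # keep the old center if a cluster empties
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def silhouette(points, labels):
    """Mean silhouette width; singleton clusters contribute 0."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        mean_d = {}
        for c in clusters:
            d = [math.dist(p, q) for j, q in enumerate(points)
                 if labels[j] == c and j != i]
            mean_d[c] = sum(d) / len(d) if d else None
        a = mean_d[labels[i]]
        if a is None:
            scores.append(0.0)
            continue
        b = min(v for c, v in mean_d.items() if c != labels[i])
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)

def block(offset):
    """Four hypothetical tumors sharing a block of three alterations."""
    base = [(1, 1, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1)]
    return [tuple([0] * offset + list(b) + [0] * (6 - offset)) for b in base]

# rows = tumors, columns = alterations (1 = present); three planted groups
profiles = block(0) + block(3) + block(6)
scores = {k: silhouette(profiles, kmeans(profiles, k)) for k in (2, 3, 4, 5)}
best_k = max(scores, key=scores.get)
best_labels = kmeans(profiles, best_k)
print(best_k, best_labels)
```

Scanning a range of cluster counts and keeping the best-scoring one mirrors the comparison of 5 to 9 clusters described above, with the candidate range narrowed beforehand by a hierarchical analysis.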
Statistical analysis was performed using SPSS Statistics 23 software (IBM, USA).
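For readers without access to SPSS, the survival comparison can be illustrated with a short pure-Python sketch of the Kaplan-Meier estimator and the two-group log-rank test. The follow-up times are hypothetical, the function names are illustrative, and only the log-rank criterion is shown; the Cox-Mantel and Gehan-Wilcoxon criteria weight the same observed-minus-expected terms differently and are omitted.

```python
import math

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each time with at least one observed event."""
    curve, surv = [], 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
    return curve

def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square with 1 df, p value)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    o_minus_e, var = 0.0, 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)   # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)   # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        o_minus_e += d_a - d * n_a / n            # observed minus expected in A
        if n > 1:
            var += n_a * n_b * d * (n - d) / (n * n * (n - 1))
    chi2 = o_minus_e ** 2 / var
    p = math.erfc(math.sqrt(chi2 / 2))            # chi-square tail, 1 df
    return chi2, p

# Hypothetical follow-up in years (1 = event observed, 0 = censored)
t_a, e_a = [1, 2, 3, 4, 5], [1, 1, 1, 1, 1]
t_b, e_b = [6, 7, 8, 9, 10], [1, 1, 1, 1, 1]
chi2, p = logrank(t_a, e_a, t_b, e_b)
print(kaplan_meier(t_a, e_a), chi2, p)
```

Censored patients leave the risk set without lowering the survival estimate, which is why both functions count the patients still at risk at each event time rather than using raw proportions.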

General characteristics of the cohort
The cohort of patients who met all inclusion criteria, including the presence of an IDH1 gene mutation, comprised 886 patients, or 69.66% of all considered cases of anaplastic astrocytoma. The mean age of the patients was 36.72 ± 4.58 years. Men accounted for 58.24% of the cohort and women for 41.76%. The mean OS was 9.18 ± 0.24 years, and the mean RFS was 2.34 ± 0.18 years.

Brief molecular characterization
According to our analysis, the most frequent concomitant mutation in IDH1-mutant anaplastic astrocytomas was a point mutation of the TP53 gene, detected in 96.15% of cases. Tumors carrying mutations in both the IDH1 and TP53 genes were characterized by a high frequency of ATRX gene mutation, which occurred in 64% of cases. In addition, mutations of the SMARCA4, APOB, and FLG genes were detected relatively often, each in 10% of cases. Anaplastic astrocytomas with an ATRX mutation showed a higher mutation rate in the CDKN2A gene (8.57%) than tumors without this mutation (3.85%). The epidermal growth factor receptor (EGFR) gene was mutated in 4.26% of cases; however, it was never found in combination with the ATRX mutation. Among the cytogenetic events that accompanied the ATRX gene mutation, the most frequent was amplification of the EXT1 gene, which occurred in 21.43% of cases.

Molecular subgroups of anaplastic astrocytomas, IDH1-mutant, and their characteristics
The first subgroup that can be distinguished among IDH1-mutant anaplastic astrocytomas on the basis of the cytogenetic profile comprises tumors carrying amplifications of the EXT1 and MYC genes. The EXT1 gene encodes exostosin glycosyltransferase 1, which is required for the exosomal release of SDCBP, CD63, and syndecan and plays a role in early tissue development and tumor progression [28]. The participation of MYC in genetic proliferative programs is widely known, as is its role in the pathogenesis of tumors of different localization [29]. Among CNS tumors, MYC activation is most often found in tumors and in individual tumor cell populations with embryonic properties, in particular medulloblastomas [30,31]. Moreover, as in anaplastic astrocytomas, this gene is most often activated through amplification. This molecular subgroup can therefore be provisionally called the embryonic-like subgroup. It occurs in 21.67% of anaplastic astrocytomas. Notably, PTK2 gene amplification is found in 86.5% of cases in this subgroup. The product of this gene, protein tyrosine kinase 2, is also associated with cells exhibiting embryonic properties to varying degrees and is actively involved in the proliferation and stabilization of neuronal and glial elements in the early stages of central nervous system development [32].
Clinically, this subgroup, like medulloblastomas, is characterized by a worse prognosis for OS and RFS. Both OS (LRC: p = 0.0086; CMC: p = 0.00051; GW: p = 0.00038) and RFS (LRC: p = 0.00776; CMC: p = 0.00138; GW: p = 0.00368) are significantly lower in the embryonic-like subgroup than in the rest of the analyzed cohort (Fig. 1).
The second subgroup includes tumors carrying amplification of the ERC1 gene. With respect to carcinogenesis, the participation of the ERC1 protein in the activation of the transcription factor NF-κB is extremely important; increased activity of this mechanism has been identified as one of the key molecular events in breast tumors [33,34]. Under physiological conditions, NF-κB is a key factor in inflammatory programs and the cell stress response [35]. The role of this transcription factor in the carcinogenesis of different tumors was shown subsequently [36]. This molecular subgroup can therefore be called inflammatory-like; it is detected in 18.51% of cases. The subgroup is characterized by a high frequency of cyclin D2 gene (CCND2) amplification, found in 94.36% of cases. Cyclin D2 is the regulatory component of the cyclin D2-CDK4 complex, which inhibits members of the retinoblastoma (RB) protein family, including RB1, and drives the cell's transition to the S phase of the cell cycle, thereby increasing proliferative activity [37]. In addition, genes of the fibroblast growth factor family, fibroblast growth factor 6 (FGF6) and 23 (FGF23), were amplified at a high frequency. These factors activate the tyrosine kinase cascade, with a significant increase in mitotic activity and cell survival [38]. Clinically, OS and RFS in this subgroup are similar to those in the embryonic-like subgroup (Fig. 1). This observation is supported by the statistical criteria, which reveal no significant differences in either RFS (LRC: p = 0.0878; CMC: p = 0.0615; GW: p = 0.05895) or OS (LRC: p = 0.0781; CMC: p = 0.05845; GW: p = 0.0568).
The third subgroup includes tumors carrying deletions of the BRSK1, ZNF331, TFPT, and U2AF2 genes. The BRSK1 gene encodes brain-specific serine/threonine protein kinase 1, which phosphorylates and activates a number of secondary messengers and acts as a key regulator of the polarization of cortical neurons and of the formation of the gliocyte cytoskeleton. It also has a number of other functions; in particular, it can act as a negative regulator of the cell cycle, halting its progression when DNA is damaged and simultaneously facilitating rapid repair [39]. The TFPT gene promotes apoptosis regardless of the mutational status of the TP53 gene, which is extremely important for anaplastic astrocytomas, 96.15% of which carry a TP53 mutation [40]. The U2AF2 gene encodes U2 small nuclear RNA auxiliary factor 2, which plays an important role in the splicing of a number of genes, including those associated with the cell's proliferative potential [41]. Since this set of combined deletions is the most characteristic genetic feature of this type, the subgroup can be provisionally called the deletion subgroup. It is detected in 17.49% of cases.
Such cytogenetic changes lead to better clinical outcomes (Fig. 1). The OS in this subgroup is not only higher than the average level for all molecular subgroups, but also significantly higher than in the embryonic-like [...]
The following group contains cases with amplification of the MSN gene, combined with amplification of the AMER1 gene in 83.33% of cases. The MSN gene encodes moesin, a protein that links the components of the cytoskeleton to the cytoplasmic membrane. It can also participate in regulating the contact inhibition of proliferation and the motility of certain cells [42]. The protein encoded by the AMER1 gene is one of the key regulators of the Wnt/beta-catenin cascade, capable of both increasing and decreasing its activity. Elements of this cascade, like the product of the MSN gene, are involved in cell contacts with the intercellular matrix and with other cells; the cascade also takes part in matrix- and contact-dependent regulation of cell proliferation [43]. On the basis of these features, this subgroup can be designated the matrix subgroup. It occurs in 14.79% of cases. [...]
[...] and CDKN2B genes were combined. The CDKN2A gene encodes several transcript variants that differ in the composition of their first exons and act as regulators of the cell cycle, inhibiting the transition of the cell to mitosis. In addition, the proteins encoded by the CDKN2A gene stabilize and activate the p53 protein [44]. The product of the CDKN2B gene is a cyclin-dependent kinase inhibitor that forms a complex with CDK4 or CDK6 and prevents their activation; the encoded protein therefore also functions as a cell growth regulator that slows progression through the G1 phase of the cell cycle [45]. The MTAP gene encodes methylthioadenosine phosphorylase, which plays an important role in polyamine metabolism. A decrease in the activity of this gene is observed in many tumors owing to frequent co-deletion with the CDKN2A and CDKN2B genes. Because proteins regulating the cyclin system play a significant role in the pathogenesis of this subgroup of tumors, "cyclin subgroup" seems the most suitable name. This subgroup was present in 14.33% of cases. Interestingly, the ATRX gene mutation occurs in 100% of tumors of this subgroup. The OS of patients in this subgroup is extremely low, and both OS and RFS are worse than in all the subgroups described above: embryonic-like (LRC: [...]. Thus, belonging to the cyclin subgroup is an extremely unfavorable prognostic factor (Fig. 1). These findings are consistent with a recent study by Shirahata et al., which showed that deletion of CDKN2A and CDKN2B is an unfavorable prognostic molecular event for anaplastic astrocytomas [46].
A small subgroup of patients had amplification of the GATA3 gene. The product of this gene is a transcriptional activator that binds to an enhancer of T-cell receptor genes. The pro-carcinogenic effects of GATA3 may be associated with deregulation of three genes, BCL2, DACH1, and THSD4, which are involved in cell differentiation [47]. The GATA3-dependent subgroup is small, found in 11.63% of cases, but, like the cyclin subgroup, it is characterized by an extremely unfavorable prognosis. Nevertheless, the OS prognosis is somewhat better than in the cyclin subgroup ([...]
The last subgroup was characterized by combined amplification of the FIP1L1, CHIC2, PDGFRA, KIT, and KDR genes. The FIP1L1 gene encodes a protein that takes part in polyadenylation of the 3′ end of pre-mRNA [48]. PDGFRA encodes platelet-derived growth factor receptor alpha, a tyrosine kinase cell membrane receptor that has pronounced mitogenic effects through activation of the RAS/RAF/MAPK cascade [49]. The product of the KIT gene, the c-kit protein, is a tyrosine protein kinase that acts as a cell surface receptor for the cytokine KITLG/SCF and plays a significant role in the regulation of cell survival and proliferation, hematopoiesis, and the maintenance and migration of stem cells. Like PDGFRA, c-kit realizes a significant part of its effects through activation of the RAS/RAF/MAPK cascade [50]. This subgroup can therefore be provisionally called the tyrosine kinase subgroup. It was detected in 1.58% of cases. The tyrosine kinase subgroup is similar to the GATA3-dependent subgroup in both OS and RFS, and the statistical criteria reveal no significant differences between the two subgroups in these indicators (LRC: p = 0.37488; CMC: p = 0.28652; GW: p = 0.25913 and LRC: p = 0.34975; CMC: p = 0.27728; GW: p = 0.25105, respectively).

DISCUSSION
Thus, our analysis identified 7 molecular subgroups of anaplastic astrocytomas that differ in cytogenetic profile, prognosis, and mutational features. The embryonic-like and inflammatory-like subgroups are the most frequent, occurring in 21.67% and 18.51% of cases, respectively. The deletion subgroup accounts for 17.49% of cases, followed by the matrix subgroup (14.79%), the cyclin subgroup (14.33%), and the GATA3-dependent subgroup (11.63%). The rarest is the tyrosine kinase subgroup, detected in only 1.58% of cases. The worst prognosis is observed in the cyclin subgroup; a relatively poor prognosis in the GATA3-dependent and tyrosine kinase subgroups; an intermediate prognosis in the embryonic-like, inflammatory-like, and matrix subgroups; and the best prognosis in the deletion subgroup (Fig. 2).
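The reported prevalences can be cross-checked with a few lines of arithmetic. The sketch below uses only the percentages given above and the cohort size of 886; the per-subgroup patient counts it prints are inferred by rounding and are not reported in the source.

```python
# Subgroup prevalence (% of the IDH1-mutant cohort), as reported in the text
subgroups = {
    "embryonic-like": 21.67,
    "inflammatory-like": 18.51,
    "deletion": 17.49,
    "matrix": 14.79,
    "cyclin": 14.33,
    "GATA3-dependent": 11.63,
    "tyrosine kinase": 1.58,
}
cohort_size = 886  # patients meeting all inclusion criteria

total_pct = round(sum(subgroups.values()), 2)
# Inferred counts (assumption: each percentage is count / 886, rounded)
counts = {name: round(cohort_size * pct / 100) for name, pct in subgroups.items()}

print(f"total prevalence: {total_pct}%")
for name, n in counts.items():
    print(f"{name:18s} ~{n} patients")
print("sum of inferred counts:", sum(counts.values()))
```

The percentages add up to 100% and the rounded counts add up to the full cohort, which suggests that the seven subgroups partition the cohort with no unassigned cases.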
What can be responsible for such inter-tumor heterogeneity within the framework of one nosological unit? We can try to trace the potential pathways for the occurrence of genetic diversity of anaplastic astrocytomas by examining the currently available literature data on the carcinogenesis of gliomas and by sketching a possible pathway for the progression of these tumors.
The earliest and most important mutational event in anaplastic astrocytomas is the mutation of the IDH1 gene [51]. This event primarily affects neural stem cells, which may be the primary tissue source of the tumor [52]. However, tumor cells can follow different paths under the influence of many factors, including genetic and epigenetic constitutional features and stochastic effects in gene expression, leading to not only inter-tumor but also intratumoral heterogeneity. Various types of cells arise; in particular, three principal cell populations appear in anaplastic astrocytoma: astrocyte-like cells, oligodendro-like cells, and progenitor cells with stem properties [19]. Notably, amplification of the PDGFRA gene is a marker genetic event for oligodendro-like cells, whose content is extremely low in anaplastic astrocytomas. In the cohort analyzed in this study, the tyrosine kinase subgroup, for which this amplification also serves as one of the marker events, is extremely rare. In our study, the worst prognosis was observed in patients whose tumors belong to the cyclin subgroup, which is characterized primarily by deregulation of cell cycle proteins. Notably, single-cell technologies reveal similar changes in glioma cells termed mesenchymal-like cells. This type of cell is practically absent from classical anaplastic astrocytomas and is more characteristic of glioblastomas, which typically have a higher degree of malignancy.
[Figure caption: Tumor aggressiveness in each subgroup increases with the subgroup's height along the ordinate axis. The prevalence of the molecular subgroups is shown on the abscissa as a percentage of the total number of cases analyzed.]
Thus, the tumor cell subclones within the same tumor create the mosaic picture that is assembled into a single molecular tumor pattern, which served as the basis for the selection of subgroups in our study. Differences in the details of this mosaic can produce different molecular subtypes of anaplastic astrocytomas. Each subtype will have different biological properties and different aggressiveness, which is reflected in the prognostic aspects.

CONCLUSION
Diffuse glial tumors are a difficult problem both in fundamental and in clinical terms. The significant heterogeneity of the molecular properties of anaplastic astrocytomas affects not only the rate of progression of the pathological process, but also the effectiveness of different types of treatment. Moreover, such heterogeneity is a reflection of the individual characteristics of the tumors in each individual patient. Consideration of such features is extremely important for the development of truly effective personalized approaches to the diagnosis and treatment of such patients. Molecular clustering of tumors will not only optimize the prognosis of treatment outcomes, but also create innovative formats for targeted, precise therapy within the framework of the concept of personalized medicine.