International conference on bioinformatics of genome regulation
Key words: very long intergenic non-coding RNA, systems biology, function annotation, gene ontology, database, web-service

Motivation and Aim: A vast number of lncRNA species have been discovered in mammalian genomes in recent years and made available on the web. However, there is a striking imbalance between the share of the human genome occupied by non-coding RNA (at least 10%) and the relatively few facts known about its function. Our results suggest that vlincRNAs represent a hitherto hidden layer of regulation involved in critical biological processes and diseases. Many of them are highly expressed in cancers, and some are expressed in stem cells and appear to be regulated by transcription factors involved in stem cell differentiation [1]. Here we present vlincDB, a web tool for functional annotation and analysis of human very long intergenic non-coding RNAs.

Methods and Algorithms: The vlincDB web site was created with PHP (v.5) and the Bootstrap framework (v.3.3.6); all annotation tables are stored in a MySQL database (v.14.14).

Results and Conclusion: The vlincRNA database provides both a list of 5151 putative very long intergenic non-coding RNA transcripts and a list of 1542 separate vlincRNA genes. Integrated analysis of genomic features, supported by transcription level data measured in 833 tissues and cell lines by the FANTOM5 consortium, allowed us to predict possible functions of vlincRNA genes [1]. The database provides annotation and a search tool covering gene ontology categories, SNP traits, chromatin modification states, ChIP-seq signals in vlincRNA gene promoter regions, overlapping known lncRNAs, nearby genes and other features. Advanced tools support overlapping vlincRNA genes with user-defined genomic intervals (provided in BED format), GO term enrichment analysis, and classification into cancer/normal/stem-cell categories according to FANTOM5 gene expression data. All annotation data can be downloaded.
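The interval-overlap scenario mentioned above can be illustrated with a minimal sketch. It is not the vlincDB implementation (the site itself is written in PHP); the gene names and coordinates are hypothetical, and the only thing taken from the abstract is that the user supplies intervals in BED format, which is 0-based and half-open.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end) in BED convention: 0-based, half-open
Interval = Tuple[str, int, int]


def parse_bed(text: str) -> List[Interval]:
    """Parse the first three columns of BED-formatted text."""
    intervals = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the optional BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        intervals.append((fields[0], int(fields[1]), int(fields[2])))
    return intervals


def overlaps(a: Interval, b: Interval) -> bool:
    """Two half-open intervals overlap if they share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def overlapping_genes(genes: Dict[str, Interval], query: List[Interval]) -> List[str]:
    """Return the names of genes that overlap any query interval."""
    return sorted(
        name for name, gene in genes.items()
        if any(overlaps(gene, q) for q in query)
    )


# Hypothetical gene table standing in for the vlincRNA gene annotation.
example_genes = {
    "vlincA": ("chr1", 100, 500),
    "vlincB": ("chr1", 600, 900),
    "vlincC": ("chr2", 100, 500),
}
example_query = parse_bed("chr1\t450\t620\nchr2\t500\t700\n")
print(overlapping_genes(example_genes, example_query))
```

The linear scan is adequate for a few thousand genes; a sorted index or interval tree would be the usual choice for larger annotation sets.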
Availability: http://office.nprog.ru:8081/table.php

References:
1. G. St. Laurent et al. (2016) Functional annotation of the vlinc class of non-coding RNAs using systems // Nucl. Acid Res., doi: 10.1093/nar/gkw162.

THE TENTH INTERNATIONAL CONFERENCE ON BIOINFORMATICS OF GENOME REGULATION AND STRUCTURE\SYSTEMS BIOLOGY

POXVIRAL CHEMOKINE-BINDING PROTEINS: THEORETICAL STUDY OF STRUCTURE AND FUNCTION EVOLUTION
D.V. Antonets*1, K.V. Gunbin2, T.S. Nepomnyashchikh1
1 State Research Center of Virology and Biotechnology “Vector”, Novosibirsk, Russia
2 Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
* Corresponding author: antonec@yandex.ru

Key words: poxviruses, immunomodulatory proteins, chemokines, GIF (GM-CSF/IL-2-binding protein), glycosaminoglycans, molecular modeling, phylogenetic analysis

Motivation and Aim: Poxviruses are large enveloped dsDNA viruses with complex genomes containing about 200 genes, half of which code for immunomodulatory proteins that subvert the antiviral responses of the host. One of the most interesting groups of poxviral immunomodulatory proteins is the chemokine-binding proteins. Despite sharing low sequence identity, the members of this protein family possess remarkably similar tertiary structures. Using phylogenetic analysis and molecular modeling, we sought insights into the evolution of the proteins of this family and into changes of their molecular functions.

Methods and Algorithms: The multiple protein alignment was built with PROMALS3D. Secondary structures were predicted with SCRATCH-1D. We generated 300 sub-alignments using a 50% alignment jackknife. For each sub-alignment, a phylogenetic tree was reconstructed with RAxML 8 under the GTR model. Phylogenetic analysis for each sub-alignment was based on 3 data sources: (1) amino acid sequences, and protein secondary structures encoded using (2) 3 or (3) 8 structure types. Consensus phylogenetic trees were reconstructed with Dendroscope 3.4.0.
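The jackknife step can be sketched as follows. This is an illustration under an assumption: "50% alignment jackknife" is read here as drawing half of the alignment columns at random without replacement, keeping their original order. The abstract does not give the exact resampling procedure, and the function names are hypothetical.

```python
import random
from typing import Dict, List, Optional


def jackknife_alignment(
    alignment: Dict[str, str],
    fraction: float = 0.5,
    rng: Optional[random.Random] = None,
) -> Dict[str, str]:
    """Keep a random subset of alignment columns (same columns for every sequence)."""
    rng = rng or random.Random()
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_columns = lengths.pop()
    n_keep = int(n_columns * fraction)
    # Sample without replacement, then restore the original column order.
    kept = sorted(rng.sample(range(n_columns), n_keep))
    return {name: "".join(seq[i] for i in kept) for name, seq in alignment.items()}


def make_subalignments(
    alignment: Dict[str, str],
    n_replicates: int = 300,
    fraction: float = 0.5,
    seed: Optional[int] = None,
) -> List[Dict[str, str]]:
    """Generate independent jackknife replicates (300 in the abstract)."""
    rng = random.Random(seed)
    return [jackknife_alignment(alignment, fraction, rng) for _ in range(n_replicates)]
```

Each replicate would then be passed to the tree-building program separately, and the resulting trees summarized as a consensus.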
The last common ancestors (LCA) of 12 clusters of proteins (corresponding to statistically significant phylogenetic clades) were reconstructed using RAxML 8 and the GTR model. Protein 3D structures were modelled with the I-TASSER and RaptorX web services. Structure refinement was done using FG-MD, ModRefiner and GalaxyRefine. Electrostatic surface potentials were calculated with the DelPhi software.

Results and Conclusion: For each protein cluster at least one 3D protein structure was modelled. Using these modelled structures, we distinguished two structural types among the investigated proteins: (A) proteins composed of one chemokine-binding domain only (1CQ3/2VGA/4P5I), and (B) proteins composed of two domains, a chemokine-binding SECRET domain and a TNFR2-like domain (CrmB and CrmD proteins, 3ON9). These two protein types form two statistically significant subtrees composed of 7 and 5 clades, respectively. Similarity analysis of the GTR matrices of amino acid substitution rates for the 12 clusters, as well as analysis of secondary structure evolution, showed that subtree A contains the LCA of poxviral chemokine-binding proteins. Two clusters are located close to the LCA: the cluster containing secreted chemokine-binding proteins from Vaccinia and Cowpox viruses, and the cluster of such proteins from Orf virus. The more ancestral subtree of GM-CSF/IL2-binding (GIF) proteins was predicted not only to share high structural similarity with A41 but also to bear a prominent positive electrostatic charge on the surface formed by the second β-sheet, as was shown for A41. Thus GIF proteins might bind glycosaminoglycans (GAGs), interfering with chemokine binding to GAGs and with chemotactic gradient formation.

Acknowledgements: This work was supported by RFBR grant #15-04-08956-a.

TRANSCRIPTOME WIDE PREDICTION OF LNCRNA-RNA INTERACTIONS BY A THERMODYNAMICS ALGORITHM
I.V. Antonov*, M.A. Zamkova, A.V.
Marakhonov, M.Y. Skoblov, Y.A. Medvedeva
Research Center of Biotechnology RAS, Moscow, Russia
* Corresponding author: ivan.antonov@gatech.edu

Key words: antisense interaction, long non-coding RNA, post-transcriptional regulation

Motivation and Aim: Long noncoding RNAs (lncRNAs) are a large and diverse class of transcribed RNA molecules longer than 200 nucleotides that do not encode proteins. The discovery of thousands of lncRNAs in mammals raised the question of their functionality. Because of their functional diversity, the role and/or molecular mechanism of only a few hundred lncRNAs have been determined to date. In particular, it has been shown that some of them function post-transcriptionally via formation of intermolecular RNA-RNA duplexes. The primary aim of this study is to address novel lncRNA functions bioinformatically by predicting RNA-RNA interactions transcriptome-wide.

Methods and Algorithms: To search for potential antisense partners of a given non-coding RNA, existing large-scale studies used sequence alignment tools (such as BLASTn) without taking into account RNA secondary structure and interaction energy, both of which are crucial for RNA binding. To compensate for this disadvantage, the two RNAs (the query lncRNA versus each RNA in the transcriptome) can be co-folded into a minimal free energy structure using thermodynamics-based methods (e.g. bifold). Unfortunately, this task is not computationally feasible at the transcriptome-wide level. In this work we developed a new pipeline, called ASSA ("AntiSense Search Approach"), which reduces running time by quickly identifying putative antisense sites with the sequence alignment tool BLASTn and then verifying each potential interaction with bifold. In our pipeline we automated selection of the initial set of putative antisense sites (i.e.
optimized thresholds for the BLASTn search), estimated the statistical significance (E-value) of the antisense interaction energy, and determined the length of the sequences flanking each putative site for validation by bifold.

Results: ASSA was capable of predicting 26 out of the 29 known functional RNA-RNA interactions (both cis and trans) in human and mouse transcriptomes. Comparison of ASSA with other tools showed that it produces some of the strongest predictions in terms of Sensitivity, Accuracy and AUC. We also applied ASSA to publicly available data from knockdown experiments on 49 murine lncRNAs. We identified four lncRNAs with a statistically significant overlap between the ASSA predictions and the differentially expressed genes observed in the experiment, suggesting a possible molecular mechanism for these long noncoding RNAs.

Conclusion: We have developed a new computational approach for transcriptome-wide prediction of lncRNA-RNA interactions. We believe that ASSA will be a useful tool for both bioinformaticians and wet-lab researchers to study lncRNA mechanisms and to select potential antisense partners for an RNA of interest.

DIFFERENTIAL ALTERNATIVE SPLICING IN RATS BRAIN TISSUES SELECTED BY AGGRESSIVE BEHAVIOUR
V.N. Babenko*, A.O. Bragin, I.V. Chadaeva, Y.L. Orlov
Novosibirsk State University, Novosibirsk, Russia
Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
* Corresponding author: bob@bionet.nsc.ru

Key words: alternative splicing, RNA profiling, aggressive behavior, synaptic conduction genes, glutamate receptor NMDA, Grin1 gene

Motivation and Aim: Alternative splicing is an important basis of gene functioning and differentiation in neuronal tissues of higher eukaryotes [1]. Cell specialization is a multi-level process that includes replication, transcription and splicing, as well as miRNA regulation of splicing factors.
Earlier, several neurospecific splicing enhancers regulating the mRNA structure of a large number of target genes were identified (NOVA1/2, FOX1/2, nSR100/SRRM4), as well as the silencers PTB1/2 [1, 2]. Analysis of such molecular mechanisms is of great fundamental importance in biomedicine and the neurosciences. In this paper we considered differential alternative splicing of genes by analysing RNA-Seq data from aggressive and tame rat lines selected at ICG SB RAS [3].

Methods and Algorithms: We analyzed tissue samples from several brain areas of laboratory animals, including the hypothalamus. It was previously shown that these brain tissues are associated with aggressive behavior in rats.

Results: Using RNA profiling, we identified the main class of neuronal genes subject to alternative splicing, namely genes of synaptic specializations, and highlighted profiles of differentially spliced isoforms among them. We studied in detail the differences in Grin1 isoforms. There was a significant difference between the aggressive and tame rats in the proportions of alternative transcripts for a number of synapse genes. Such differences may determine the corresponding behavior.

Conclusion: The deviations in the proportions of synapse transcripts may be due to changes in the expression of neurospecific RNA-binding splicing proteins such as SLM1, NOVA, PTB2 and others. Overall, we present alternative splicing as a molecular mechanism affecting gene expression isoforms and behavior patterns of laboratory rats.

Availability: Software is available from the author upon request.

References:
1. Venables J.P. et al. (2012) Tissue-specific alternative splicing is conserved in deuterostomes. Mol Biol Evol. 29 (1): 261-9.
2. Raj B., Blencowe B.J. (2015) Alternative Splicing in the Mammalian Nervous System: Recent Insights into Mechanisms and Functional Roles. Neuron. 87 (1): 14-27.
3. Spitsina A.M. et al. (2015) Supercomputer analysis of genomics and transcriptomics data revealed by high-throughput DNA sequencing.
Program systems: theory and applications. 6:1(23): 157-174. (In Russian)

MOLECULAR MODELING OF INFLUENZA VIRUS H1N1 HEMAGGLUTININ INHIBITION BY CAMPHOR IMINES
D.S. Baev*, A.S. Sokolova, O.I. Yarovaya, T.G. Tolstikova, V.V. Zarubaev
N.N. Vorozhtsov Novosibirsk Institute of Organic Chemistry SB RAS, Novosibirsk, Russia
Influenza Research Institute, St. Petersburg, Russia
* Corresponding author: baev@nioch.nsc.ru

Key words: influenza, H1N1, hemagglutinin, camphor imines

Motivation and Aim: Clinical use of antiviral M2 and neuraminidase inhibitors is limited due to widespread drug resistance [1]. This drives the search for new anti-influenza drugs with novel targets and mechanisms of activity. The influenza surface glycoprotein hemagglutinin (HA) is a potential target for antiviral drugs because of its key roles in the initial stages of infection: receptor binding and the fusion of virus and cell membranes [2].

Methods and Algorithms: The docking analysis of molecules was carried out using AutoDock Vina. The structural coordinates of HA from the 2009 human pandemic influenza virus (A/California/04/2009, PDB ID 3UBE) were obtained from the Protein Data Bank. The 3UBE model was superimposed on the 3EYK HA model by sequence alignment to locate the binding site of the membrane fusion inhibitor tert-butylhydroquinone (TBHQ).

Results: The spatial characteristics of the HA A/California (3UBE) structure lead to the formation of two possible cavities in the TBHQ-binding region. When the molecular docking grid covers both of these sites, the aliphatic imines tend to embed in a specific hydrophobic site, while the conserved site binds TBHQ through hydrogen bonds.
Conclusion: Computer simulation of the interaction of camphor imines with viral HA suggests that the probable mechanism of their action is inhibition of HA activity by binding to a hydrophobic site on its molecular surface.

References:
1. F.G. Hayden et al. (2011) Emerging influenza antiviral resistance threats, J. Infect. Dis. 203: 6-10.
2. R.J. Russell et al. (2008) Structure of influenza hemagglutinin in complex with an inhibitor of membrane fusion, Proc. Natl Acad. Sci. USA. 105(46): 17736-41.

THE USE OF DISCRIMINANT ANALYSIS AND ARTIFICIAL NEURONAL NETWORK IN BREAST CANCER DETECTION
U.S. Bagina*, L.V. Shchegoleva, T.O. Volkova
Petrozavodsk State University, Petrozavodsk, Russia
* Corresponding author: uliana.bagina@gmail.com

Key words: artificial neuronal networks, discriminant analysis, diagnosis, breast cancer

Motivation and Aim: Tumorigenesis is accompanied by changes in different systems of the organism. Many factors are involved, and they affect each other. The combined effect of two factors often exceeds the sum of their separate effects, amplifying the effect of each. An important requirement for tumor development is the ability of cancer cells to cause the death of lymphocytes. Therefore, in establishing an algorithm for the diagnosis of breast cancer, we used a multi-parameter approach based on the activity of caspases and the ratio of lymphocyte subsets.

Methods and Algorithms: 108 peripheral blood samples were obtained from the Oncologic clinic of the Republic of Karelia: 15 from patients with benign breast disease, 20 from patients with stage I breast cancer, 30 with stage II and 43 with stage III. A group of 30 healthy controls was studied in the same way to establish normal ranges and means. The activity of caspase-3, -6, -8 and -9 was assayed in peripheral blood lymphocytes.
The ratio of T-cell subsets such as CD3, CD4, CD8, CD16, CD20, CD25 and CD95 was estimated. Statistical analysis was performed using the Statgraphics Plus 5.0 software. An artificial neural network (multilayer perceptron) was developed with a 3-layer design, including one hidden layer.

Results: Eleven criteria were tested as diagnostic biomarkers: the activity of caspase-3, -6, -8 and -9, and the rates of the CD3, CD4, CD8, CD16, CD20, CD25 and CD95 T-cell subsets in peripheral blood. Discriminant analysis of 138 cases showed that the best separation into groups is obtained when all 11 biomarkers are used. Discriminant analysis correctly assigned 98.9% of patients to their groups. The second approach to classifying the cases was based on an artificial neural network with 11 neurons in the input layer, 6 in the single hidden layer and 5 in the output layer. The network was tested on 138 observations. All of them were correctly classified, giving an error rate of zero. Based on both discriminant analysis and neural networks, and using the free R statistical environment, we developed software that automatically assigns blood samples to the groups of benign breast disease, stage I, stage II or stage III breast cancer, or the group without pathology. Discriminant analysis reports the assignment as a probability (%), and the neural network as a binary "yes / no" (1/0) answer.

Conclusion: Using discriminant analysis and a neural network makes it possible to develop a high-precision computational tool for noninvasive differential diagnosis of breast pathologies based on peripheral blood biomarkers. Some of these biomarkers (T-cell subsets) are already widely used in laboratory and clinical practice, while the other (caspase activity) does not require laborious methods of analysis or significant costs, making them available for routine determination.
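To make the 11-6-5 network layout concrete, the forward pass of such a multilayer perceptron can be sketched as below. Only the layer sizes come from the abstract. The weights here are random placeholders rather than trained values, and the sigmoid hidden activation and softmax output are assumptions, since the abstract does not state which activations or training procedure were used.

```python
import math
import random
from typing import List

N_INPUT, N_HIDDEN, N_OUTPUT = 11, 6, 5  # layer sizes reported in the abstract


def init_layer(n_in: int, n_out: int, rng: random.Random) -> List[List[float]]:
    """One weight row per neuron; the last element of each row is the bias."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in + 1)] for _ in range(n_out)]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values: List[float]) -> List[float]:
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(features: List[float],
            hidden_w: List[List[float]],
            output_w: List[List[float]]) -> List[float]:
    """Map 11 biomarker values to 5 class scores that sum to 1."""
    if len(features) != N_INPUT:
        raise ValueError(f"expected {N_INPUT} features, got {len(features)}")
    # zip stops at the shorter sequence, so the trailing bias is added separately.
    hidden = [sigmoid(sum(w * x for w, x in zip(row, features)) + row[-1])
              for row in hidden_w]
    scores = [sum(w * h for w, h in zip(row, hidden)) + row[-1]
              for row in output_w]
    return softmax(scores)


rng = random.Random(0)
hidden_weights = init_layer(N_INPUT, N_HIDDEN, rng)   # placeholder, untrained
output_weights = init_layer(N_HIDDEN, N_OUTPUT, rng)  # placeholder, untrained
probabilities = forward([0.5] * N_INPUT, hidden_weights, output_weights)
predicted_group = probabilities.index(max(probabilities))
```

The five outputs would correspond to the five groups named in the abstract (no pathology, benign disease, stages I to III); taking the largest output gives the single "yes" class of the binary answer.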
Acknowledgements: The project was supported by grant No. 2014/154 of the Ministry of Education and Science of Russia (scientific research number 1713) and by a grant of the Petrozavodsk State University program of strategic development.

DE NOVO SEQUENCING AND COMPARATIVE ANALYSIS OF CHLOROPLAST GENOMES FOR FOUR FERNS OF DRYOPTERIS AND ADIANTUM GENERA
M.S. Belenikin*1,2, A.A. Krinitsina1, S.V. Kuptsov1, M.D. Logacheva1, A.S. Speranskaya1
1 Lomonosov Moscow State University, Moscow, Russia
2 Pirogov Russian National Research Medical University, Moscow, Russia
* Corresponding author: genetics.npcmpd@gmail.com

Key words: fern, NGS, Illumina, de novo assembly, chloroplast genome

Motivation and Aim: Chloroplast (cp) genomes provide a variety of genetic information for evolutionary and functional studies in plants. However, only a limited number of fern cp-genomes are available, and for some genera data on cp-genomes are absent. Here, we present next-generation sequencing, de novo assembly and comparative analysis of cp-genomes for species of the genus Dryopteris (D. villarii, D. filix-mas, D. blanfordii) and the genus Adiantum (A. hispidulum).

Methods and Algorithms: Paired-end libraries were constructed using the TruSeq or Nextera protocols. Sequencing was performed on a MiSeq (Illumina), producing PE (2x300 bp) read datasets. After read trimming, at the first stage we filtered for target read pairs using fern relatives with known chloroplast genome sequences. At the second stage we performed de novo assembly using a number of de novo assemblers (Velvet, MIRA, SPAdes, Newbler) and in-house scripts. The cp-genomes were circularized, and a few gaps were closed by Sanger sequencing. Protein-coding genes were annotated with DOGMA (Wyman et al. 2004).
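The first-stage filtering of target read pairs can be sketched as below. The abstract does not say how the filtering was done, so this is a stand-in technique chosen for illustration: a read pair is kept if either mate shares at least one exact k-mer with a reference chloroplast sequence on either strand. The function names and the value of k are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> Set[str]:
    """All distinct substrings of length k (empty if the sequence is shorter)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_reference_index(references: Iterable[str], k: int) -> Set[str]:
    """Collect k-mers from both strands of every reference sequence."""
    index: Set[str] = set()
    for ref in references:
        ref = ref.upper()
        index |= kmers(ref, k)
        index |= kmers(reverse_complement(ref), k)
    return index


def filter_read_pairs(
    pairs: Iterable[Tuple[str, str]], index: Set[str], k: int
) -> List[Tuple[str, str]]:
    """Keep a pair if either mate shares a k-mer with the reference index."""
    kept = []
    for mate1, mate2 in pairs:
        if kmers(mate1.upper(), k) & index or kmers(mate2.upper(), k) & index:
            kept.append((mate1, mate2))
    return kept
```

In practice read mappers are commonly used for this kind of enrichment; the exact-match version here only shows the idea of selecting chloroplast-derived pairs before assembly.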
Results: Comparative analysis of the independently de novo assembled complete cp-genomes showed high identity and conserved gene order within both Dryopteris and Adiantum. Within each genus, the main differences between the cp-genomes of species are short indels located in intergenic or intronic regions. However, comparison of the cp-genomes of two species of Adiantum (A. hispidulum and A. capillus-veneris [AY178864]) and three species of Dryopteris (D. villarii, D. filix-mas, D. blanfordii) showed the loss of a tRNA-coding gene in the inverted repeat regions of Dryopteris.

Conclusion: Complete cp-genome sequences were obtained for the genus Dryopteris for the first time. Comparative analysis of the fern genera Dryopteris and Adiantum showed that D. villarii, D. filix-mas and D. blanfordii have lost a gene coding for one of the tRNAs. A similar loss of tRNA-encoding genes was found in the cp-genome of the tree fern Alsophila spinulosa (Gao et al., 2009).

Acknowledgements: This work was partially supported by RFBR grant 14-04-01852a and RSF grant 14-50-00029.

References:
1. S.K. Wyman et al. (2004) Automatic annotation of organellar genomes with DOGMA, Bioinformatics, 20(17): 3252-5.
2. L. Gao et al. (2009) Complete chloroplast genome sequence of a tree fern Alsophila spinulosa: insights into evolutionary changes in fern chloroplast genomes, BMC Evolutionary Biology, 9: 130.

EVOLUTION OF RESTRICTION-MODIFICATION SYSTEMS IN LARGE SCALE
O.I. Bezsudnova1, I.S. Rusinov1,2, A.S. Ershova2,3,4, A.S. Karyagina2,3,4, S.A. Spirin1,2,5, A.V.
Alexeevski1,2,5*
1 Faculty of Bioengineering and Bioinformatics
2 Belozersky Institute of Physico-Chemical Biology, Moscow State University, Russia
3 Gamaleya Center of Epidemiology and Microbiology, Moscow, Russia
4 Institute of Agricultural Biotechnology RAS, Moscow, Russia
5 Scientific Research Institute for System Studies, RAS, Moscow, Russia
* Corresponding author: aba@belozersky.msu.ru