Plant mitochondrial genomes (mitogenomes) have diverse and complex structures. They are difficult to annotate completely and accurately because they contain multiple trans-spliced genes, mitochondrial plastid DNAs (MTPTs), tRNA genes of different origins, and extensive RNA editing modifications. To address these challenges, we developed a computational pipeline, the Plant Mitochondrial Genome Annotator (PMGA).
a. PMGA contains a high-quality database that includes: (1) 1147 gene sequences from 29 angiosperm plants, corrected with RNA-seq data; (2) 11,092 gene sequences from 319 mitogenomes, based on bioinformatic methods; and (3) origin information for 423 tRNA genes from 12 species.
b. PMGA comprises three algorithms: (1) 'Analysis of Upstream Extended Sequences' (AUES) for nad5 small exon annotation; (2) 'Assembling Exons with Weighted Direct Graph' (AEWDG) for assembling transcripts of trans-spliced genes; and (3) 'Multiple Dimension Annotation of tRNA Genes' (MDA-tRNA) to determine the origins of tRNA genes.
c. PMGA provides utility modules for the simultaneous annotation of multi-chromosome mitogenomes.
d. PMGA incorporates the CPGAVAS2 [1] algorithm to identify mitochondrial plastid genes.
e. PMGA integrates MISA [2], Tandem Repeats Finder [3], and Vmatch (REPuter) [4] for identifying repeat sequences.
f. PMGA integrates Deepred-Mt [5] for identifying RNA editing sites in CDS.
g. PMGA uses tRNAscan-SE 2.0 [6] and ARAGORN [7] for annotating tRNA genes.
h. PMGA draws circular mitogenome maps using OGDRAW version 1.3.1 [8].
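To make the notion of an RNA editing site in a CDS (item f) more concrete, the following minimal Python sketch applies C-to-U edits at given positions and shows the resulting amino acid change. It is only an illustration and not PMGA or Deepred-Mt code: the function names `apply_c_to_u_edits` and `translate`, the 1-based position convention, and the toy sequence are assumptions made here. The edited base is written as T so that the result stays a DNA-alphabet sequence, and the standard genetic code is used.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def apply_c_to_u_edits(cds, positions):
    """Return the CDS with C-to-U edits applied at 1-based positions.

    Hypothetical helper: the edited base is written as T (DNA alphabet).
    """
    bases = list(cds.upper())
    for pos in positions:
        if bases[pos - 1] != "C":
            raise ValueError(f"position {pos} is {bases[pos - 1]}, not C")
        bases[pos - 1] = "T"
    return "".join(bases)


def translate(cds):
    """Translate a CDS codon by codon; trailing partial codons are ignored."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


# Toy example (made-up sequence): editing the C at position 5
# turns the codon TCA (Ser) into TTA (Leu).
genomic = "ATGTCATAA"
edited = apply_c_to_u_edits(genomic, [5])
print(translate(genomic), "->", translate(edited))
```

Comparing the two translations is a quick way to see whether a predicted site changes the encoded protein.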
a. Annotate - This module annotates a mitogenome when the user provides a mitochondrial sequence in FASTA format.
b. GetORFs - This module identifies open reading frames (ORFs) by calling getorf from EMBOSS (version 6.6.0.0) [9] with the parameter: length >= 300 bp. (This module is still being tested.)
c. CleanSeq - This module helps users quickly remove degenerate bases from mitogenome sequences. However, we recommend correcting degenerate bases manually based on the read-mapping results.
d. RNA Editing - This module can be used to identify RNA editing sites of mitochondrial protein-coding genes.
e. Help - This module provides a concise PDF document that details the use of PMGA and the viewing and interpretation of the results.
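As a rough illustration of what removing degenerate bases (the CleanSeq module above) involves, here is a minimal Python sketch. It is not the PMGA implementation: the function name `clean_sequence` is hypothetical, and it assumes that every character other than A, C, G, or T (including N) counts as degenerate and is simply deleted. The positions of the removed bases are returned so that they can be checked against read-mapping results, as recommended above.

```python
import re

# Assumption: any character other than A, C, G, T is treated as degenerate.
DEGENERATE = re.compile(r"[^ACGT]")


def clean_sequence(seq):
    """Return (cleaned_sequence, positions_of_removed_bases).

    Positions are 1-based and refer to the original sequence.
    """
    seq = seq.upper()
    positions = [m.start() + 1 for m in DEGENERATE.finditer(seq)]
    return DEGENERATE.sub("", seq), positions


# Toy example with three degenerate bases (N, R, Y).
cleaned, removed = clean_sequence("ACGTNRYacgt")
print(cleaned, removed)
```

Note that deleting bases shifts all downstream coordinates, which is one reason manual correction is preferable when read data are available.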
PMGA: A Plant Mitochondrial Genome Annotator.
Li J, Ni Y, Lu Q, Chen H, Liu C. PMGA: A Plant Mitochondrial Genome Annotator. Plant Commun. 2024 Nov 9:101191.
URL: https://doi.org/10.1016/j.xplc.2024.101191
The other third-party tools include:
[1]. Shi L, Chen H, Jiang M, Wang L, Wu X, Huang L, Liu C: CPGAVAS2, an integrated plastome sequence annotator and analyzer. Nucleic Acids Res 2019, 47(W1):W65-W73.
[2]. Beier S, Thiel T, Munch T, Scholz U, Mascher M: MISA-web: a web server for microsatellite prediction. Bioinformatics 2017, 33(16):2583-2585.
[3]. Benson G: Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res 1999, 27(2):573-580.
[4]. Kurtz S, Choudhuri JV, Ohlebusch E, Schleiermacher C, Stoye J, Giegerich R: REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res 2001, 29(22):4633-4642.
[5]. Edera AA, Small I, Milone DH, Sanchez-Puerta MV: Deepred-Mt: Deep representation learning for predicting C-to-U RNA editing in plant mitochondria. Comput Biol Med 2021, 136:104682.
[6]. Chan PP, Lin BY, Mak AJ, Lowe TM: tRNAscan-SE 2.0: improved detection and functional classification of transfer RNA genes. Nucleic Acids Res 2021, 49(16):9077-9096.
[7]. Laslett D, Canback B: ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids Res 2004, 32(1):11-16.
[8]. Greiner S, Lehwark P, Bock R: OrganellarGenomeDRAW (OGDRAW) version 1.3.1: expanded toolkit for the graphical visualization of organellar genomes. Nucleic Acids Res 2019, 47(W1):W59-W64.
[9]. Rice P, Longden I, Bleasby A: EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet 2000, 16(6):276-277.
This website is freely accessible to all users, including commercial users, allowing them to browse, view, and use the content available herein for personal, educational, or commercial purposes. No fees or charges are associated with accessing the materials provided on this website.
For questions and comments, please send an email to cliu@implad.ac.cn or Lijingling1997@163.com.
Center for Bioinformatics
Institute of Medicinal Plant Development
Peking Union Medical College
Chinese Academy of Medical Sciences
Address: No. 151, Malianwa North Road, Haidian District, Beijing 100093, P.R. China
Last updated: June 11th, 2024.