Vanno: A Visualization‐Aided Variant Annotation Tool

Po-Jung Huang,1,2 ‡ Chi-Ching Lee,1 ‡ Bertrand Chin-Ming Tan,3 Yuan-Ming Yeh,4 Kuo-Yang Huang,5 Ruei-Chi Gan,1 Ting-Wen Chen,1 Cheng-Yang Lee,1 Sheng-Ting Yang,6 Chung-Shou Liao,6 Hsuan Liu,2,7 † and Petrus Tang1,2,5

Next‐generation sequencing (NGS) technologies have revolutionized the field of genetics and are moving toward clinical diagnostics. Exome and targeted sequencing in a disease context represent a major clinical application of NGS, given their utility and cost‐effectiveness. With the ongoing discovery of disease‐associated genes, various gene panels have been launched for both basic research and diagnostic tests. However, fundamental inconsistencies among the diverse annotation sources, software packages, and data formats have complicated the subsequent analysis. To manage disease‐associated NGS data, we developed Vanno, a Web‐based application for in‐depth analysis and rapid evaluation of disease‐causative genome sequence alterations. Vanno integrates information from biomedical databases, functional predictions from available evaluation models, and mutation landscapes from TCGA cancer types. A highly integrated framework that incorporates filtering, sorting, clustering, and visual analytic modules is provided to facilitate exploration of oncogenomics datasets at different levels, such as gene, variant, protein domain, or three‐dimensional structure. Such a design is crucial for extracting knowledge from sequence alterations and translating biological insights into clinical applications. Taken together, Vanno supports almost all disease‐associated gene tests and exome sequencing panels designed for NGS, providing a complete solution for targeted and exome sequencing analysis. Vanno is freely available at http://cgts.cgu.edu.tw/vanno.


Figure 2. Output of Vanno.

A: The numbers of samples, identified mutations, and altered genes are summarized in the upper-left block of the output page. The alteration frequency and mutant composition of each gene in the cohort are summarized in a concentric histogram, with bar height determined by the frequency of altered samples. An interactive heatmap with mutual-exclusivity sorting is also provided to help distinguish driver mutations from passenger mutations. Download links for the detailed annotation table and the related mutant cDNA and protein sequences are provided in this block. B: A cascade of filters based on items such as chromosome, gene symbol, coverage, mutation frequency, and molecular consequence is provided for selecting variants of interest. The identified variants are collapsed at the gene level and subsequently aggregated by functional classifications such as pathway, disease, and protein domain, providing an alternative way to identify variants of interest. C: After the “compare with TCGA” switch in the upper-left block is turned on, an additional block colored in red is displayed at the upper-right corner, allowing an identified mutation to be compared with the mutation spectrum of a specific cancer type deposited in TCGA. D: The distribution of variants is summarized in dynamic pie charts based on items such as chromosome, gene symbol, mutation type, sample name, and protein domain. Alteration events are summarized at the gene level and amplicon level, and rendered as Circos ideograms. Base variant information and a mutation spectrum on the relevant protein domain are displayed in pop-up windows when the mouse cursor hovers over the Circos plots. The protein tertiary structure, with the mutation site highlighted, can also be displayed using JSMol when the structure is available. The full content of the annotation table can be found on our tutorial page (http://cgts.cgu.edu.tw/vanno/Tutorial/index.php#Table).
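The filter cascade and gene-level collapse described in panel B can be illustrated with a minimal Python sketch. This is not Vanno's implementation: the record fields (`sample`, `chrom`, `gene`, `coverage`, `freq`, `consequence`), the thresholds, the example variants, and the helper names `cascade` and `collapse_by_gene` are all assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical variant records; field names and values are illustrative,
# not Vanno's actual annotation schema.
variants = [
    {"sample": "S1", "chrom": "chr17", "gene": "TP53", "coverage": 120, "freq": 0.42, "consequence": "missense"},
    {"sample": "S2", "chrom": "chr17", "gene": "TP53", "coverage": 15, "freq": 0.08, "consequence": "synonymous"},
    {"sample": "S2", "chrom": "chr12", "gene": "KRAS", "coverage": 300, "freq": 0.35, "consequence": "missense"},
    {"sample": "S3", "chrom": "chr12", "gene": "KRAS", "coverage": 210, "freq": 0.51, "consequence": "missense"},
    {"sample": "S3", "chrom": "chr7", "gene": "EGFR", "coverage": 95, "freq": 0.27, "consequence": "frameshift"},
]


def cascade(records, filters):
    """Apply predicates in order; each stage sees only the survivors of the previous one."""
    for keep in filters:
        records = [r for r in records if keep(r)]
    return records


def collapse_by_gene(records, n_samples):
    """Group variants by gene and return the fraction of samples altered per gene."""
    samples_by_gene = defaultdict(set)
    for r in records:
        samples_by_gene[r["gene"]].add(r["sample"])
    return {gene: len(samples) / n_samples for gene, samples in samples_by_gene.items()}


# Assumed thresholds, chosen only for the example.
filters = [
    lambda r: r["coverage"] >= 30,               # minimum read coverage
    lambda r: r["freq"] >= 0.10,                 # minimum mutation frequency
    lambda r: r["consequence"] != "synonymous",  # drop silent changes
]

kept = cascade(variants, filters)
gene_freq = collapse_by_gene(kept, n_samples=3)

for gene, fraction in sorted(gene_freq.items(), key=lambda kv: -kv[1]):
    print(f"{gene}\t{fraction:.2f}")
```

In this sketch the per-gene fractions play the role of the alteration frequencies that set bar heights in the panel A histogram; aggregating by pathway, disease, or protein domain would follow the same grouping pattern with a different key.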