Nearly half of the known pathogenic genetic variants are single-nucleotide variants (SNVs), and most observed SNVs lack a clinical interpretation. Recently, the clustered regularly interspaced short palindromic repeats (CRISPR) system has been adapted into powerful base editors (BEs) that can correct SNVs with high efficiency or introduce SNVs to investigate how human genetic variation affects human health. CRISPR base editing is also used to improve specific traits in the breeding of economically important animals and plants. CRISPR/Cas-mediated base editing is a promising genome editing tool because (i) it does not require a DNA template to introduce mutations and (ii) it avoids creating DNA double-strand breaks, which can lead to unintended chromosomal alterations or elicit an unwanted DNA damage response. Numerous studies have now reported large amounts of data on various CRISPR base editing systems. However, this information is scattered across individual publications, which makes it hard to search and use. Hence, an integrated resource and analysis platform is urgently needed for comprehensive annotation and comparative analysis of the off-target effects and on-target efficiency of the various CRISPR-mediated BE systems.
Here, we developed CRISPRbase, a novel platform that integrates 1 252 935 records of base editing outcomes from 17 species, covering more than 54 cell types. CRISPRbase reports the putative editing precision of different BE systems by incorporating multiple annotations, including “organisms”, “genomebuild”, “chromosome”, “position”, “sgRNA sequences”, “PAM sequences”, “strand direction”, “mutant”, “editing systems”, “cell/tissue name”, “CRISPR type”, “Cas variant” and “mutation frequency”. CRISPRbase employs ANNOVAR and OncoBase to predict mutational effects and to evaluate the consequences of off-target editing. CRISPRbase is therefore an integrated database for systematically prioritizing the small guide RNA sequences (sgRNAs), protospacer adjacent motif sequences (PAMs), base editor systems, mutation sites and editing efficiency for target genes. Currently, we have analyzed base editing data from nearly 200 published studies, and we will regularly update the collection with base editing data from other projects and publications. To our knowledge, CRISPRbase is the first integrated platform designed to present mutation information and support sgRNA design for various CRISPR-mediated base editing systems in multiple organisms. We believe CRISPRbase will help users design suitable sgRNAs for different target genes in various organisms.
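To illustrate how records carrying the annotations listed above could be represented and prioritized, the following minimal Python sketch may help. It is not the CRISPRbase implementation: the class name `EditingRecord`, the field names, the helper `rank_by_frequency` and the assumption that mutation frequency is stored as a fraction between 0 and 1 are all hypothetical, and the example records in the usage note are invented placeholders rather than database entries.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class EditingRecord:
    """One base editing outcome; field names are hypothetical, not the database schema."""
    organism: str
    genome_build: str
    chromosome: str
    position: int
    sgrna_sequence: str
    pam_sequence: str
    strand: str               # "+" or "-"
    mutant: str               # e.g. "C>T"
    editing_system: str
    cell_or_tissue: str
    crispr_type: str
    cas_variant: str
    mutation_frequency: float  # assumed to be a fraction in [0, 1]


def rank_by_frequency(records: Iterable[EditingRecord],
                      organism: str,
                      editing_system: str) -> List[EditingRecord]:
    """Keep records for one organism and editing system, highest frequency first."""
    selected = [r for r in records
                if r.organism == organism and r.editing_system == editing_system]
    return sorted(selected, key=lambda r: r.mutation_frequency, reverse=True)
```

A call such as `rank_by_frequency(records, "Homo sapiens", "BE3")` would return the matching records ordered from the highest to the lowest mutation frequency, which is one simple way to compare candidate sgRNAs for the same editing system.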
CRISPRbase is a comprehensive database that curates the outcomes and evaluates the off-target effects of base editors across various cell types and tissues in 17 species. We collected more than 1.2 million records of base editing outcomes and annotated them with the functional and oncogenic effects of the mutations using ANNOVAR and OncoBase, respectively. Furthermore, the database includes a search tool that retrieves deposited sgRNA sequences with up to four mismatches to the query sequence. Academic users can freely search, browse and download the data and obtain analytical results through the web interface.
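The mismatch-tolerant search described above can be sketched as follows. This is only an illustration under stated assumptions, not the search tool's actual algorithm: it assumes that mismatches are counted position by position between sequences of equal length (a Hamming distance) and that insertions and deletions are not considered. The function names `count_mismatches` and `search_sgrnas` are hypothetical.

```python
from typing import Iterable, List, Tuple


def count_mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def search_sgrnas(query: str,
                  deposited: Iterable[str],
                  max_mismatches: int = 4) -> List[Tuple[str, int]]:
    """Return (sgRNA, mismatch count) pairs within the threshold, best match first."""
    hits = []
    for sgrna in deposited:
        if len(sgrna) != len(query):
            continue  # assumption: only same-length sequences are compared
        distance = count_mismatches(query, sgrna)
        if distance <= max_mismatches:
            hits.append((sgrna, distance))
    return sorted(hits, key=lambda hit: hit[1])
```

A linear scan like this is adequate for a small list of sequences; for a collection of more than a million records, an indexed or alignment-based approach would probably be preferable.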