The CyanoGenes Database

A web database for exploring the Synechocystis sp. PCC 6803 genome

Overview

Synechocystis sp. PCC 6803 is a well-studied cyanobacterium and a valuable model organism for research in photosynthesis and biotechnology. However, long intermittent shutdowns of the CyanoBase portal have left the community without a reliable, user-friendly web database for exploring its genome. To address this, we developed CyanoGenes, a web server for interactive exploration of the Synechocystis genome.


Genome Functional Annotation and Webpage Functionalities

To annotate the Synechocystis genome, we used the RefSeq annotation of assembly GCF_000009725.1, sequenced by Kazusa, which contains 3,741 genes and 3,622 CDS. We also incorporated functional information from the CyanoBase and UniProt databases. CyanoGenes lets users search by locus tag or gene symbol and annotate gene lists. It also provides an easy way to inspect the genomic context of each gene and to download the DNA and protein sequences of each ORF. In addition, a genomic position selector allows any DNA fragment to be downloaded.
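As an illustration of what selecting a fragment by genomic position involves, here is a minimal Python sketch. It is not the CyanoGenes implementation: the function names, the 1-based inclusive coordinates, and the wrap-around behaviour for a circular chromosome are all assumptions made for this example, and the toy sequence is invented.

```python
# Hypothetical helpers for illustration only; not part of CyanoGenes.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_fragment(genome: str, start: int, end: int, strand: str = "+") -> str:
    """Extract positions start..end (1-based, inclusive) from a circular genome.

    Assumption: if start > end, the fragment is taken to span the origin.
    For strand "-", the reverse complement of the fragment is returned.
    """
    n = len(genome)
    if not (1 <= start <= n and 1 <= end <= n):
        raise ValueError("coordinates must lie within 1..len(genome)")
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if start <= end:
        fragment = genome[start - 1:end]
    else:
        # Fragment wraps around the origin of the circular chromosome.
        fragment = genome[start - 1:] + genome[:end]
    return reverse_complement(fragment) if strand == "-" else fragment


toy_genome = "ATGCCGTAAGCT"  # invented 12-nt sequence
print(extract_fragment(toy_genome, 1, 3))       # ATG
print(extract_fragment(toy_genome, 11, 2))      # CTAT (spans the origin)
print(extract_fragment(toy_genome, 1, 3, "-"))  # CAT
```

Treating start > end as a fragment that crosses the origin is one possible convention for a circular chromosome; a tool could equally reject such input.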


Protein Clustering Analysis by Structural Homology

CyanoGenes includes the first structural homology analysis of the Synechocystis genome. To determine how many groups of structural homologues the genome contains, we clustered the 3,622 CDS using Foldseek with a coverage threshold of 80%. This produced 469 clusters containing at least two proteins and 1,620 singletons. The 469 clusters together contain 2,002 protein sequences, so 55% of Synechocystis proteins fall into a cluster with at least one other protein. The cluster of each protein can be explored in CyanoGenes at the bottom of its gene profile. Additionally, all members of each structural cluster have been functionally reannotated with GO functional categories using eggNOG5 and are linked to their AlphaFold structural predictions. We hope this information will assist in the search for remote homologues in the functional analysis of this cyanobacterium.
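The cluster statistics above can be reproduced from a cluster membership table. The sketch below is illustrative only: it assumes the clustering result is available as (representative, member) pairs, one pair per protein, which is the two-column layout we understand Foldseek's cluster TSV output to use. The function name and the toy identifiers are hypothetical.

```python
from collections import defaultdict


def summarize_clusters(pairs):
    """Summarize clusters given (representative, member) pairs.

    Assumption: every protein appears exactly once as a member, and a
    singleton is a cluster whose only member is its own representative.
    """
    clusters = defaultdict(set)
    for representative, member in pairs:
        clusters[representative].add(member)
    sizes = [len(members) for members in clusters.values()]
    total = sum(sizes)
    clustered = sum(size for size in sizes if size >= 2)
    return {
        "multi_member_clusters": sum(1 for size in sizes if size >= 2),
        "singletons": sum(1 for size in sizes if size == 1),
        "clustered_proteins": clustered,
        "clustered_fraction": clustered / total if total else 0.0,
    }


# Toy input with invented identifiers: clusters {A, B}, {C}, {D, E, F}.
toy_pairs = [("A", "A"), ("A", "B"), ("C", "C"),
             ("D", "D"), ("D", "E"), ("D", "F")]
toy_summary = summarize_clusters(toy_pairs)
print(toy_summary)

# Consistency check of the figures reported in the text:
# 3,622 CDS minus 1,620 singletons leaves 2,002 clustered proteins.
reported_clustered = 3622 - 1620
reported_percent = round(100 * reported_clustered / 3622)
print(reported_clustered, reported_percent)  # 2002 55
```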


Gene Transcriptional Profiles

CyanoGenes incorporates the 4,091 transcriptional units (TUs) delineated by Kopf et al., together with their transcriptional patterns across 10 distinct experimental conditions (15°C, 42°C, low light, high light, phosphate deprivation, nitrogen deprivation, CO₂ deprivation, darkness, iron deprivation, and stationary and exponential growth phases). This functionality lets users investigate the influence of different TUs on their associated genes and the impact of diverse environmental conditions on gene expression profiles.
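One simple way to associate TUs with a gene is by coordinate overlap on the same strand. The sketch below shows that idea under stated assumptions; it is not how CyanoGenes or Kopf et al. define the association. The Feature type, the feature names, and all coordinates are invented, and the example ignores antisense TUs and features that span the origin of the circular chromosome.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    """A genomic interval with 1-based inclusive coordinates (hypothetical type)."""
    name: str
    start: int
    end: int
    strand: str


def overlapping_tus(gene: Feature, tus: List[Feature]) -> List[str]:
    """Return names of TUs on the gene's strand whose interval overlaps the gene."""
    return [
        tu.name
        for tu in tus
        if tu.strand == gene.strand
        and tu.start <= gene.end
        and gene.start <= tu.end
    ]


# Invented coordinates for illustration.
gene = Feature("geneX", 1000, 1900, "+")
tus = [
    Feature("TU1", 900, 2000, "+"),   # fully covers the gene
    Feature("TU2", 1500, 1600, "-"),  # opposite strand, ignored here
    Feature("TU3", 1850, 2500, "+"),  # overlaps the 3' end of the gene
    Feature("TU4", 2000, 2400, "+"),  # downstream, no overlap
]
hits = overlapping_tus(gene, tus)
print(hits)  # ['TU1', 'TU3']
```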