Downloading and processing of poplar expression profile data

Almost all of the sequencing datasets in the GEPSdb database were downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/); the remainder were taken from publications. All datasets were then normalized for expression. The raw data are classified by experiment type: biotic stress (disease and insects) and abiotic stress (drought, salt, freezing, heat, ozone, hypoxia, nitrogen, mechanical wounding, hormone, metal and radiation). Datasets covering different stress treatments were consolidated, and within each dataset the samples that received the same treatment were averaged. Differentially expressed genes were identified by screening the results downloaded through the "Analyze with GEO2R" option of the GEO online database (P value < 0.05, |logFC| > 2). We then performed a chromosome position analysis (Circos) on all poplar genes transferred into the ontology under stress conditions.
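The screening and averaging steps described above can be sketched as follows. This is a minimal illustration, not the pipeline used to build GEPSdb: the column names `P.Value` and `logFC`, the helper names `filter_deg` and `average_by_treatment`, and the example rows are all assumptions made for the sketch. Only the two cutoffs come from the text.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

P_CUTOFF = 0.05      # P value < 0.05, as stated in the text
LOGFC_CUTOFF = 2.0   # |logFC| > 2, as stated in the text


def filter_deg(rows, p_col="P.Value", fc_col="logFC"):
    """Keep rows that pass both cutoffs (column names are assumptions)."""
    kept = []
    for row in rows:
        try:
            p_value = float(row[p_col])
            log_fc = float(row[fc_col])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or non-numeric values
        if p_value < P_CUTOFF and abs(log_fc) > LOGFC_CUTOFF:
            kept.append(row)
    return kept


def average_by_treatment(values, treatments):
    """Average one gene's sample values within each treatment group."""
    groups = defaultdict(list)
    for value, treatment in zip(values, treatments):
        groups[treatment].append(value)
    return {treatment: mean(vals) for treatment, vals in groups.items()}


# Hypothetical tab-separated table standing in for a downloaded result file.
EXAMPLE_TABLE = (
    "ID\tP.Value\tlogFC\n"
    "geneA\t0.001\t3.2\n"    # passes both cutoffs
    "geneB\t0.20\t4.0\n"     # fails the P value cutoff
    "geneC\t0.01\t-2.5\n"    # passes: strongly down-regulated
    "geneD\t0.01\t1.5\n"     # fails the fold-change cutoff
)

example_rows = list(csv.DictReader(io.StringIO(EXAMPLE_TABLE), delimiter="\t"))
deg_ids = [row["ID"] for row in filter_deg(example_rows)]

# One gene measured in three samples of a single hypothetical dataset.
gene_means = average_by_treatment(
    [1.0, 3.0, 10.0], ["drought", "drought", "control"]
)
```

Taking the absolute value of logFC keeps both up-regulated and down-regulated genes, which is how the |logFC| > 2 criterion is read here.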

Annotation of poplar genes in GEPSdb

(i) Basic gene characteristics

We first obtained basic information about the poplar genes from Phytozome[1] through its BioMart interface (https://phytozome.jgi.doe.gov/pz/portal.html), including gene sequence, CDS sequence, protein sequence, NCBI gene accession, chromosome position, protein length, predicted subcellular location, and Pfam and GO annotation. We then used the Compute pI/Mw tool in ExPASy to obtain the molecular weight (MW) and isoelectric point (pI) of each predicted protein sequence[2].
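To show what a molecular weight calculation involves, here is a minimal sketch that sums average residue masses and adds one water molecule for the free termini. It is not the ExPASy implementation, and its results may differ slightly from that tool. The function name `molecular_weight` and the mass table are supplied for this example only, and the isoelectric point is not covered.

```python
# Average (not monoisotopic) residue masses in daltons; standard textbook
# values supplied for this sketch.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01524  # one H2O for the free N- and C-termini


def molecular_weight(protein):
    """Return the average molecular weight (Da) of a protein sequence."""
    # Drop surrounding whitespace and a trailing stop-codon marker, if any.
    sequence = protein.strip().upper().rstrip("*")
    if not sequence:
        raise ValueError("empty protein sequence")
    unknown = sorted(set(sequence) - set(AVERAGE_RESIDUE_MASS))
    if unknown:
        raise ValueError(f"unsupported residue codes: {unknown}")
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS
```

Ambiguous residue codes such as X or B are rejected rather than guessed, since any substitution would silently change the reported weight.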

(ii) Evolutionary analysis

We downloaded the poplar paralogs annotated in Ensembl using BioMart[3]. Orthologs across the poplar, Arabidopsis, Brachypodium, Gossypium, rice, sorghum, grape and corn genomes were also downloaded using BioMart. We then calculated Ka and Ks values to compare the evolutionary relationships among the homologous genes in the database.
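Once Ka and Ks are available for a gene pair, they are commonly compared through their ratio. The sketch below applies the conventional reading of Ka/Ks (below 1 suggests purifying selection, above 1 suggests positive selection). It assumes Ka and Ks have already been estimated by a separate tool, and the function name `classify_ka_ks` and its labels are hypothetical, not part of GEPSdb.

```python
import math


def classify_ka_ks(ka, ks):
    """Return (ratio, label) for one homologous gene pair.

    ka: nonsynonymous substitution rate, ks: synonymous substitution rate.
    The ratio is None when ks is zero, because Ka/Ks is then undefined.
    """
    if ka < 0 or ks < 0:
        raise ValueError("Ka and Ks must be non-negative")
    if ks == 0:
        return None, "undefined"
    ratio = ka / ks
    if math.isclose(ratio, 1.0):
        label = "neutral"
    elif ratio < 1.0:
        label = "purifying"
    else:
        label = "positive"
    return ratio, label


# Hypothetical values for two gene pairs.
conserved_pair = classify_ka_ks(0.1, 0.5)
diverged_pair = classify_ka_ks(0.6, 0.3)
```

Pairs with Ks equal to zero are reported as undefined instead of being assigned an infinite ratio, so they can be filtered out explicitly.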

(iii) Expression patterns of poplar genes in GEPSdb

We obtained 11 high-quality RNA-seq experiments and 51 microarray libraries from the NCBI GEO database to display the expression patterns of poplar genes. For each gene, a line graph shows its expression level under different stress treatments, or under the same treatment across different datasets.

References

[1] Rhee, S.Y., Beavis, W., Berardini, T.Z., et al. (2003) The Arabidopsis Information Resource (TAIR): a model organism database providing a centralized, curated gateway to Arabidopsis biology, research materials and community. Nucleic Acids Research, 31, 224.
[2] Artimo, P., Jonnalagedda, M., Arnold, K., et al. (2012) ExPASy: SIB bioinformatics resource portal. Nucleic Acids Research, 40, W597.
[3] Herrero, J., Muffato, M., Beal, K., et al. (2016) Ensembl comparative genomics resources. Database: The Journal of Biological Databases and Curation, 2016, bav096.