About GeneDecks

GeneDecks is a novel analysis tool which provides a similarity metric by highlighting shared descriptors between genes, based on the rich annotation within the GeneCards compendium of human genes. GeneDecks features Partner Hunter and Set Distiller.

In Partner Hunter, users supply a query gene, and the system finds putative functional paralogs, namely genes that are similar to the query gene based on combinatorial similarity of attribute annotations.
In Set Distiller, users supply a set of genes, and the system ranks descriptors by their degree of sharing within the given gene set. GeneDecks enables the elucidation of unsuspected putative functional paralogs, and a refined scrutiny of various gene-sets (e.g. from high-throughput experiments) for discovering relevant biological patterns.

Partner Hunter Algorithm

Partner Hunter calculates similarity scores between each query gene and all remaining candidate genes in the GeneCards database for the 10 attributes that appear in Table 1.

For all attributes except Gene Ontology and sequence paralogy, the similarity score between a query gene and a candidate gene is calculated in the following manner: each descriptor score (DS) is the result of dividing its rank by Log10 of its frequency in the database. Descriptor ranks are each assigned the value of 1, except for those associated with the Gene Ontology (GO) attribute, which are assigned a rank based on the descriptor's evidence code (Buza et al. 2008); for example, Inferred from Direct Assay (IDA) will receive a descriptor score of 5. The attribute score (AS) is the sum of the descriptor scores for those descriptors shared by both the query gene and the candidate gene, divided by the sum of the descriptor scores for all descriptors associated with the query gene.

For the sequence paralogy attribute, if a partner candidate is also identified as a sequence paralog (SP), then it is assigned a value of 1 for this attribute, and 0 otherwise.

Gene expression data was mined from BioGPS. The similarity score is the mean Pearson correlation (P.Corr) between all expression vectors for the query gene and the candidate gene. This improves GeneDecks matching of expression patterns, since it looks for vector correlations rather than exact matches of binary expression patterns, and is therefore less stringent.
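The descriptor and attribute scores described above can be sketched in Python as follows. This is a minimal illustration, not the GeneDecks implementation: the function names, the example descriptors and their frequencies are hypothetical, and "frequency" is assumed to mean the number of genes in the database that carry the descriptor. Because Log10 of 1 is 0, the sketch rejects descriptors with a frequency below 2; the source does not say how such descriptors are handled.

```python
import math


def descriptor_score(frequency, rank=1.0):
    """DS = rank / log10(frequency of the descriptor in the database)."""
    if frequency < 2:
        # Assumption: log10(1) == 0 would cause a division by zero, and the
        # source does not specify the behaviour, so this sketch raises.
        raise ValueError("frequency must be at least 2")
    return rank / math.log10(frequency)


def attribute_score(query_descriptors, candidate_descriptors, frequencies, ranks=None):
    """AS = sum(DS of shared descriptors) / sum(DS of all query descriptors)."""
    ranks = ranks or {}  # descriptors without an explicit rank default to 1
    scores = {
        d: descriptor_score(frequencies[d], ranks.get(d, 1.0))
        for d in query_descriptors
    }
    total = sum(scores.values())
    if total == 0:
        return 0.0  # query gene has no descriptors for this attribute
    shared = sum(s for d, s in scores.items() if d in candidate_descriptors)
    return shared / total


# Hypothetical domain descriptors and database frequencies.
frequencies = {"kinase domain": 100, "SH2 domain": 10}
query = {"kinase domain", "SH2 domain"}
candidate = {"SH2 domain"}
domain_as = attribute_score(query, candidate, frequencies)
```

Note that rarer descriptors contribute more: in this example the shared "SH2 domain" descriptor (frequency 10) carries twice the score of the unshared "kinase domain" descriptor (frequency 100).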

Each attribute score is then multiplied by the weight given for the attribute, and the weighted attribute scores are summed to give the Partner Hunter score (PHS).
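The weighted sum can be sketched as follows. The attribute names, scores and weights in the example are hypothetical and chosen only for illustration; the weights used by GeneDecks are not given in this text.

```python
def partner_hunter_score(attribute_scores, weights):
    """PHS = sum over attributes of (attribute weight * attribute score)."""
    return sum(weights[name] * score for name, score in attribute_scores.items())


# Hypothetical attribute scores for one candidate gene. Sequence paralogy is
# 1 or 0; the expression score stands in for a mean Pearson correlation.
example_scores = {"domains": 0.5, "sequence_paralogy": 1.0, "expression": 0.25}
# Hypothetical weights, one per attribute.
example_weights = {"domains": 2.0, "sequence_paralogy": 1.0, "expression": 4.0}

phs = partner_hunter_score(example_scores, example_weights)
```

Candidate genes would then be ranked by decreasing PHS for the query gene.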

Set Distiller Algorithm

Set Distiller employs descriptors from 9 out of the possible 10 attributes that appear in Table 1, for user-defined query gene sets. For each descriptor, a P-value is calculated from the binomial distribution, testing the null hypothesis that the frequency of the descriptor in the query set is not significantly different from what is expected with a random sampling of genes, given the frequency of the descriptor in the set of all genes. Expression level vectors were calculated as binary values, thereby assigning a binary expression vector to each gene as previously described (Yanai et al. 2005). Only tissues in which expression was observed were used, and these were treated like all other descriptors. Descriptors are sorted by increasing P-value and then by decreasing occurrence counts within the gene set. Bonferroni correction was used to correct for multiple testing, and only descriptors with a corrected P-value < 0.05 are displayed.
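The ranking procedure above can be sketched in Python as follows. This is an illustration under stated assumptions, not the GeneDecks code: it assumes a one-sided (enrichment) binomial test, assumes the Bonferroni factor is the number of descriptors tested, and assumes that the displayed descriptors are those whose corrected P-value falls below the 0.05 threshold. The function names and example data are hypothetical.

```python
import math


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1)
    )


def distill(counts_in_set, set_size, background_freq, alpha=0.05):
    """Rank descriptors by corrected P-value, then by decreasing count.

    counts_in_set:   descriptor -> number of genes in the query set with it
    background_freq: descriptor -> fraction of all genes with it
    """
    n_tests = len(counts_in_set)  # assumption: one test per descriptor
    rows = []
    for descriptor, count in counts_in_set.items():
        p_value = binomial_upper_tail(count, set_size, background_freq[descriptor])
        corrected = min(1.0, p_value * n_tests)  # Bonferroni correction
        if corrected < alpha:
            rows.append((descriptor, count, corrected))
    # Increasing P-value first, then decreasing occurrence count.
    rows.sort(key=lambda row: (row[2], -row[1]))
    return rows


# Hypothetical query set of 10 genes.
counts = {"apoptosis": 8, "membrane": 3}
background = {"apoptosis": 0.05, "membrane": 0.30}
ranked = distill(counts, set_size=10, background_freq=background)
```

In this example "apoptosis" is strongly over-represented relative to its background frequency and is kept, while "membrane" occurs about as often as expected by chance and is filtered out.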

Table 1

The attributes used in GeneDecks algorithms with their contributing data sources. Attribute inclusion in the Partner Hunter or Set Distiller algorithms is marked.

Attribute Partner Hunter Set Distiller Data Source
Sequence paralogy +  
  • Ensembl
  • HomoloGene
Domains + +
  • InterPro (Ensembl)
Super Pathways + +
  • GeneCards
Expression patterns + +
  • BioGPS
Phenotypes + +
  • Mouse Genome Informatics (MGI)
Compounds + +
  • Tocris Bioscience
  • Human Metabolome Database (HMDB)
  • BitterDB
  • DrugBank
  • Novoseek (formerly Alma Knowledge Server)
  • PharmGKB
Disorders + +
  • MalaCards
  • Online Mendelian Inheritance in Man (OMIM)
  • UniProtKB
  • University of Copenhagen DISEASES
  • Novoseek (formerly Alma Knowledge Server)
  • Genatlas
  • GeneTests (formerly GeneClinics)
  • The Breast Cancer Gene Database (BCGD)
Gene Ontology + +
  • Entrez Gene (National Center for Biotechnology Information - NCBI)
  • Ensembl

Developed at the Crown Human Genome Center, Department of Molecular Genetics, the Weizmann Institute of Science
This site does not provide medical advice and is for research use only
Version: 3.12.396 26 May 2015