What's in a GeneCard?


This page provides information about the various GeneCards sections and tables.

General Comments

GeneCards Categories

GeneCard Header

This section provides the gene's symbol, category, GIFtS score (see below), and GCid in the box on the left-hand side.
Each gene category has its own distinct color: protein-coding, pseudogene, RNA gene, gene cluster, genetic locus, and uncategorized.
The gene's symbol and GCid are shown in the color of the gene's category.
The background color of the box that contains the gene's symbol and GCid indicates which database the symbol is from: HGNC Approved Genes, EntrezGene Database, Ensembl Gene Database, or GeneCards Generated Genes.
The header also contains a short description of the gene, and states whether the gene symbol is approved by the HUGO Gene Nomenclature Committee (HGNC).


GeneCards Inferred Functionality Scores (GIFtS)

The GIFtS algorithm uses the wealth of GeneCards annotations to produce scores aimed at predicting the degree of a gene's functionality. Since the degree of known functionality is correlated with the amount of research done on a particular gene or its product, we use these annotations in a scoring system aimed at inferring functionality. Note that while the accumulation of data for a specific gene in certain databases is merely correlated with functionality, many GeneCards sources, like the Gene Ontology (GO) Consortium and Genatlas, provide definitive information about functionality.

Our goal is to use these two types of annotations to measure the functionality of GeneCards genes. Our first step was to produce, for each gene, a binary vector of 67 elements indicating the presence or absence of data in each relevant source. The GIFtS score of a particular gene is a percentage derived from the sum of these binary values divided by the number of sources (the vector length).
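The basic score described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name gifts_score and the example vector are hypothetical and not part of GeneCards.

```python
def gifts_score(presence):
    """Return the basic GIFtS percentage for one gene.

    `presence` holds one 0/1 flag per annotation source
    (1 = the source has data for the gene, 0 = it does not).
    """
    if not presence:
        raise ValueError("presence vector must not be empty")
    hits = sum(1 for flag in presence if flag)
    return 100.0 * hits / len(presence)


# Hypothetical gene with data in 20 of the 67 sources.
example_vector = [1] * 20 + [0] * 47
example_score = gifts_score(example_vector)
```

With this example vector the score is 20/67 expressed as a percentage, roughly 29.9.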

Improved GIFtS includes experimenting with increased resolution by sub-sectioning data sources and adjusting scores based on the presence or absence of detailed annotations within a source (currently SwissProt). In addition, we have introduced weights related to the quantitative aspects of annotation items, enabling better evaluation of the data relevant to annotation levels (currently orthologs and publications). In order to enrich GIFtS with respect to protein data, we selected the pivotal bioinformatics source for such data, namely SwissProt, and dissected it into 6 sub-sources: protein subunit, subcellular location, post-translational modification, function, catalytic activity, and other. Each of these subfields received a binary score as described above, thereby increasing the GIFtS vector size by 5. To weight proteins effectively in the new vectors, the sum of the binary data was still divided by the original number of sources (with SwissProt treated as 1 source for this denominator, in spite of its sub-sources' contributions to the numerator). To enrich GIFtS with orthologs or publications data, we define a new score for each of those components, which is then added to the default GIFtS. Specifically, the orthologs and publications scores for each gene are calculated as round(log_x(sum(i))), where x equals 3 for orthologs and 5 for publications, and sum(i) is the number of relevant orthologs or publications. Genes with no orthologs or publications receive a score of zero for the relevant component(s); scores rounded down to 0 (for low counts) are normalized to 1.
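A minimal sketch of the orthologs and publications component scores follows. The function and constant names are hypothetical. Python's built-in round() is assumed as the rounding rule, since the text does not specify how ties are handled. How the component is combined with the default GIFtS is not shown.

```python
import math

ORTHOLOG_BASE = 3      # x for orthologs, as stated in the text
PUBLICATION_BASE = 5   # x for publications, as stated in the text


def component_score(count, base):
    """Return round(log_base(count)) with the two special cases from the text.

    - a gene with no items scores 0
    - a non-zero count whose rounded log is 0 is normalized to 1
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    if count == 0:
        return 0
    score = round(math.log(count, base))
    return max(score, 1)


# Hypothetical counts: 9 orthologs and 2 publications.
ortholog_score = component_score(9, ORTHOLOG_BASE)
publication_score = component_score(2, PUBLICATION_BASE)
```

Here 9 orthologs give round(log_3(9)) = 2, while 2 publications give a rounded logarithm of 0, which is normalized to 1.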

Overlapping RNA Genes with Unified Location (ORGUL)

RNA genes from fRNAdb and other sources are grouped into ORGULs. We strove to overcome the problem of having many ncRNA entries, originating from fRNAdb and other sources, that map to appreciably overlapping positions. To cope with such redundant entries, and to unite presumed parallel versions of the same gene, a clustering algorithm was applied to join entries with overlaps greater than 70% of the genomic territory of the smaller partner, when occurring on the same strand. Only entries belonging to the same RNA class were unified (unless a unique class was not assigned to a given entry, in which case it could be unified to more than one class). The above procedure allowed us to define Overlapping RNA Genes with Unified Location (ORGUL) clusters [see Belinky, F., Bahir, I., Stelzer, G., Zimmerman, S., Rosen, N., Nativ, N., Dalah, I., Iny Stein, T., Rappaport, N., Mituyama, T., Safran, M., and Lancet, D. Non-redundant compendium of human ncRNA genes in GeneCards. Bioinformatics 15;29(2):255-61 (2013)]. Below is the full legend for a graphical representation of the overlapping entries available in the genomic locations section.
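The joining rule can be illustrated with a small sketch. The text does not name the clustering algorithm, so single-linkage grouping with a union-find structure is assumed here. Entries without a class are simply treated as compatible with any class, which simplifies the "more than one class" case. All names and coordinates below are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RnaEntry:
    name: str
    start: int                       # inclusive genomic coordinate
    end: int                         # inclusive genomic coordinate
    strand: str                      # "+" or "-"
    rna_class: Optional[str] = None  # None = no unique class assigned


def overlap_fraction(a, b):
    """Overlap length divided by the length of the shorter entry."""
    overlap = min(a.end, b.end) - max(a.start, b.start) + 1
    if overlap <= 0:
        return 0.0
    shorter = min(a.end - a.start + 1, b.end - b.start + 1)
    return overlap / shorter


def should_join(a, b, threshold=0.70):
    """Same strand, compatible class, overlap above 70% of the smaller entry."""
    if a.strand != b.strand:
        return False
    if a.rna_class and b.rna_class and a.rna_class != b.rna_class:
        return False
    return overlap_fraction(a, b) > threshold


def cluster_entries(entries):
    """Group entries by single linkage (assumed) using union-find."""
    parent = list(range(len(entries)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(entries)):
        for j in range(i + 1, len(entries)):
            if should_join(entries[i], entries[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i, entry in enumerate(entries):
        groups.setdefault(find(i), []).append(entry.name)
    return sorted(sorted(names) for names in groups.values())


example_entries = [
    RnaEntry("a", 100, 199, "+", "miRNA"),
    RnaEntry("b", 110, 209, "+", "miRNA"),  # 90% overlap with "a"
    RnaEntry("c", 100, 199, "-", "miRNA"),  # opposite strand
    RnaEntry("d", 180, 400, "+", "miRNA"),  # overlap too small
]
clusters = cluster_entries(example_entries)
```

In this example only "a" and "b" are joined; "c" lies on the other strand and "d" overlaps the others by at most 30% of the smaller entry.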


Quality score

The score indicates how many RNA databases have information about this gene, and whether the RNA gene is expressed or is known to be functional. An RNA gene that is known to be expressed will have a score of at least 5, and an RNA gene that is known to be functional will have a score of at least 10. The quality score Q is computed as the sum Q = 10 S_F + 5 S_E + 0.2 S_P + 0.5 S_N, where S_i denotes the count of data sources of the following kind: S_F, showing functional annotation; S_E, showing expression; S_P, reporting prediction; S_N, none of the above. In this respect, GeneCards does not simply unify information about ncRNAs from other resources, but also attempts to convey evidence parameters.
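As a worked illustration of the formula, the sketch below computes Q from the four source counts. The function and parameter names are hypothetical; the weights are the ones given in the text.

```python
def quality_score(functional, expression, prediction, other):
    """Return Q = 10*S_F + 5*S_E + 0.2*S_P + 0.5*S_N.

    Each argument is the number of data sources of that kind.
    """
    return 10 * functional + 5 * expression + 0.2 * prediction + 0.5 * other


# Hypothetical RNA gene: one expression source and two prediction sources.
example_q = quality_score(functional=0, expression=1, prediction=2, other=0)
```

For this example Q is 5 + 0.4 = 5.4, consistent with the statement that an expressed RNA gene scores at least 5.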

GeneCards Sections

Aliases

This section displays synonyms and aliases for the relevant GeneCards gene, as extracted from OMIM, HGNC, Entrez Gene, UniProtKB (Swiss-Prot/TrEMBL), GeneLoc, Ensembl, DME, miRBase, NONCODE, and/or RNAdb. Also shown are accessions from HGNC, EntrezGene, UniProtKB, OMIM, Ensembl, fRNAdb, H-InvDB and/or Rfam, and previous GC identifiers where relevant (in cases where GeneLoc deems it necessary to assign a new identifier to a gene based on updated information about its chromosomal location). Such GC ids will always remain with their original genes and will not be reused with other symbols. The subcategory for genes with category 'RNA gene' is taken from Ensembl's biotype, Entrez Gene's gene type, HGNC's locus type, fRNAdb sequence ontology, and descriptors from Rfam and H-InvDB.

Summaries

This section displays descriptions of a gene's function, cellular localization, and effect on phenotype for the relevant GeneCards gene, as extracted from Entrez Gene, UniProtKB (UniProtKB/Swiss-Prot, UniProtKB/TrEMBL), Tocris Bioscience, PharmGKB, and Gene Wiki. The GeneCards-generated summary compiles significant annotations for the gene (such as aliases, diseases, paralogs, and pathways) into a descriptive text.

Genomic Views

This section displays the chromosome, cytogenetic band and map location of the GeneCards gene as extracted from GeneLoc, HGNC, Entrez Gene, Nature (405, 311-319) and miRBase, as well as genomic views from UCSC and Ensembl, and links to promoters, transcription factor binding sites, and Pyrosequencing assays for human and/or mouse/rat orthologs at Qiagen, and/or SwitchGear Genomics. The GeneLoc integrated location is shown in red on the image. If this differs from the location provided by Entrez Gene and/or Ensembl, their locations are shown on the image in green and/or blue respectively. Also provided are links to the GeneLoc gene density information for this gene's chromosome, which shows the number of genes in each 1 Mb interval along the chromosome, and to detailed exon information as provided by GeneLoc.

Whenever a gene consists of a multi-membered ORGUL or is clustered with one, a figure showing the locations of these overlapping members is presented. See GeneCards ORGULs.

Proteins

This section provides annotated information on the proteins encoded by GeneCards genes according to UniProtKB, HORDE, neXtProt, Ensembl, and/or Reactome, the capability to view phosphorylation sites using PhosphoSitePlus, Specific Peptides from DME, a link to the Protein Expression image from SPIRE MOPED, reference sequences (RefSeq) according to NCBI, links for ordering antibodies from EMD Millipore, Cell Signaling Technology, OriGene, Novus Biologicals, R&D Systems, Abcam, Thermo Fisher Scientific, LSBio, Cloud-Clone Corp., and/or others, recombinant proteins from EMD Millipore, R&D Systems, Enzo Life Sciences, Novus Biologicals, OriGene, GenScript, Sino Biological, ProSpec, and/or Cloud-Clone Corp., and assays from EMD Millipore, Cell Signaling Technology, R&D Systems, OriGene, GenScript, Enzo Life Sciences, and/or Cloud-Clone Corp. Direct links to three-dimensional visualizations of PDB structures are provided by the OCA browser and Proteopedia. Visualizations are also available via the (3D) hyperlink for the OCA browser or the Proteopedia symbol shown next to each PDB identifier.
Genes with similar ontologies can be seen using GeneDecks Partner Hunter (more information)

Post-translational modifications

This subsection provides annotated information on post-translational modifications according to UniProtKB and neXtProt, and the capability to view phosphorylation sites using PhosphoSitePlus. The specific amino acid identity and position of glycosylation and ubiquitination modifications are mined from neXtProt. Amino acid position refers to the sequence of isoform #1 as defined in neXtProt.


Protein Domains/Families

This section provides annotated information about protein domains and families according to HGNC, IUPHAR, InterPro, ProtoNet, UniProtKB and Blocks.
Genes with similar domains can be seen using GeneDecks Partner Hunter (more information)

Function

This section provides annotated information about gene function according to MGI, UniProtKB, IUBMB, DME, Genatlas, and LifeMap Discovery™, including: Human phenotypes from GenomeRNAi, transcription factor targeting from Qiagen and/or HOMER, shRNA for human and/or mouse/rat from OriGene, siRNAs from OriGene and for human and/or mouse/rat from Qiagen, miRNA Gene Targets from miRTarBase, microRNA for human and/or mouse/rat orthologs from Qiagen, SwitchGear Genomics, Gene Editing from DNA2.0, Clones from GenScript, Sino Biological, DNA2.0, and for human and/or mouse/rat from OriGene and Vector BioLabs, Cell Lines from GenScript, ESI BIO, Animal models from inGenious Targeting Laboratory (iTL), genOway, in situ hybridization assays from Advanced Cell Diagnostics, Inc. (ACD), as well as molecular function ontologies visualized by the Gene Ontology Consortium (more information).
Genes with similar ontologies can be seen using GeneDecks Partner Hunter (more information).
Information from MGI includes links to mouse knock-outs, phenotypes for mouse orthologs, and a popup table with information on phenotypic alleles of the orthologs. This table presents the following columns:

Genes with similar phenotypes can be seen using GeneDecks Partner Hunter (more information)

Localization

This section provides information about gene localization according to UniProtKB and COMPARTMENTS Subcellular localization database, as well as cellular component ontologies visualized by the Gene Ontology Consortium (more information).

Subcellular locations from COMPARTMENTS:

COMPARTMENTS localization data is integrated from literature manual curation, high-throughput microscopy-based screens, predictions from primary sequence, and automatic text mining (see COMPARTMENTS: unification and visualization of protein subcellular localization evidence). Unified confidence scores of the localization evidence are assigned based on evidence type and source, and visualized both in a table and in the schematic cell image. Confidence scale is color coded, ranging from light green (1) for low confidence to dark green (5) for high confidence. White (0) indicates an absence of localization evidence.

Pathways & Interactions

This section provides SuperPaths from PathCards, links to pathways and interactions according to information extracted from Kyoto Encyclopedia of Genes and Genomes (KEGG), Cell Signaling Technology, R&D Systems, GeneGo (Thomson Reuters), Reactome, BioSystems, Sino Biological, Tocris Bioscience, PharmGKB, Qiagen, UniProtKB, I2D, STRING and MINT, as well as biological process ontologies visualized by the Gene Ontology Consortium (more information).
Genes with similar ontologies and those in the same pathways can be seen using GeneDecks Partner Hunter (more information)
Links to the Qiagen GeneGlobe Interaction Network and the STRING Interaction Network for the relevant gene are also provided.

SuperPaths: unified GeneCards pathways

This table provides links to pathways in a unified view. All pathways from the sources listed above were clustered into SuperPaths for a better understanding of how the different pathways relate to one another. The left column contains a name representing the SuperPath, based on the most connected pathway in the SuperPath (this name giving pathway may or may not contain the gene to which the GeneCard belongs). SuperPaths are linked to PathCards, an integrated database of human pathways and their annotations. Human pathways were clustered into SuperPaths based on gene content similarity. Each PathCard provides information on one SuperPath, which represents one or more human pathways. The right column contains all current gene's pathways that belong to this SuperPath. Each of the contained pathways (in the right column) is followed by a score which is the Jaccard similarity score (0-1) to the most similar pathway. The SuperPaths are sorted by abundance of sources and then by number of gene-related pathways in the SuperPaths.
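The Jaccard similarity score mentioned above is the size of the intersection of two gene sets divided by the size of their union. A minimal sketch, with hypothetical pathway gene sets chosen only for illustration:

```python
def jaccard(genes_a, genes_b):
    """Jaccard similarity (0-1) between two collections of gene symbols."""
    a, b = set(genes_a), set(genes_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


# Hypothetical gene content of two pathways.
pathway_1 = {"TP53", "MDM2", "CDKN1A"}
pathway_2 = {"TP53", "MDM2", "ATM", "CHEK2"}
similarity = jaccard(pathway_1, pathway_2)
```

The two example sets share 2 genes out of 5 distinct genes, giving a score of 0.4.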

Interacting proteins

Each line in this table represents one interacting protein, according to EBI-IntAct, MINT, I2D, and/or String. The following columns are presented:

Drugs & Chemical Compounds

This section provides relationships between GeneCards genes and chemical compounds, ligands, and drugs, as well as links to drugs and compounds for ordering at EMD Millipore, Enzo Life Sciences, Tocris Bioscience, and ApexBio. Chemical compound relationships are from HMDB, BitterDB, and Novoseek. Drug compound relationships are from DrugBank and PharmGKB. Ligand relationships are from IUPHAR. Pharmaceutical uses are provided by UniProtKB.

Tocris compounds and pharmacological data.

This table presents the following columns:

ApexBio compounds and pharmacological data.

This table presents the following columns:

HMDB chemical compound relationships.

This table presents the following columns:

BitterDB Bitter Compounds [Wiener, A., Shudler, M., Levit, A. and Niv, M. Y. BitterDB: a database of bitter compounds, Nucleic Acids Res., 40: D413-D419 (2011)].

This table presents the following columns:

DrugBank drug compound relationships.

This table presents the following columns:

IUPHAR ligands relationships.

This table presents the following columns:

Novoseek chemical compound relationships.

This table presents the following columns:

PharmGKB related drug/compound annotations.

This table presents the following columns:

Genes with similar drug and compound relationships can be seen using GeneDecks Partner Hunter (more information)

Transcripts

This section contains associated Unigene clusters and representative sequences, REFSEQ mRNAs, RNA secondary structures from fRNAdb, siRNAs from OriGene and Qiagen, shRNA for human and/or mouse/rat from OriGene and microRNA for human and/or mouse/rat orthologs from Qiagen, SwitchGear Genomics, clones from GenScript, Sino Biological, DNA2.0, and for human and/or mouse/rat from OriGene and Vector BioLabs, primers for human and/or mouse/rat orthologs from OriGene, and/or Qiagen, assemblies (sorted by a scoring scheme that gives preferences to mRNAs over EST associations) from DOTS, transcript and alignment information from AceView, additional gene/cDNA sequences from GenBank, exon structure information from GeneLoc, alternative splicing information, and transcript links to Ensembl.

Secondary structures

This subsection contains RNA secondary structures according to fRNAdb.

Alternative Splicing

This subsection contains alternative splicing information according to ASD followed by alternative splicing isoforms from ECgene. Exons with alternative splice sites in different isoforms were broken into Exonic Units (ExUns). The letters indicate the order of the ExUns in the exon. The symbol ' ^ ' between ExUns indicates an intron, while ' ' indicates the junction of two ExUns. Mouseovers on the dark blue squares show the ExUn's genomic coordinates, while mouseovers on the light blue squares show its transcript coordinates. When showing ASD's splice variants, GeneCards subtracts the 3000 bp flank that ASD adds to the transcript coordinates.
Note: We currently do not have any links to ASD, as their data has been frozen and their site taken down. We plan to upgrade this subsection.

Expression

RNA expression data (presence/absence) for RNA genes is according to H-InvDB, NONCODE, miRBase, and RNAdb.

This section contains expression images based on data from BioGPS, Illumina Human BodyMap, and SAGE, with SAGE tags from CGAP, followed by a table with expression data from LifeMap Discovery™, Protein Expression data from SPIRE MOPED, PaxDb, and MaxQB, links to SOURCE, tissue specificity data from UniProtKB, expression via Pathway & Disease-focused RT2 Profiler PCR Arrays for human and/or mouse/rat from Qiagen, primers for human and/or mouse/rat orthologs from OriGene and/or Qiagen, and in situ hybridization assays from Advanced Cell Diagnostics, Inc. (ACD).

BioGPS
Measurements were obtained for 76 normal human tissues and compartments hybridized against HG-U133A. The Affymetrix MAS5 algorithm was used for array processing and probesets were averaged per gene.

Illumina body map
RNA obtained from 16 normal human tissues was sequenced and mapped to genes via their transcripts. Fragments Per Kilobase of exon per Million fragments mapped (FPKM) were calculated using the Cufflinks program and thereupon rescaled by multiplying FPKM by 100 and then calculating the root.

CGAP: SAGE Normal
Serial Analysis of Gene Expression: For 19 normal human tissues, CGAP datasets Hs.frequencies and Hs.libraries are mined for information about the number of SAGE tags per tissue. Tags are reassigned to a Unigene cluster and after that to a particular gene by mining Hs.best_gene, Hs.best_tag and Hs_GeneData. The expression level of a particular gene in a particular tissue was calculated as the number of appearances of the corresponding tag divided by the total number of tags in libraries derived from that tissue. These fractions were then rescaled by making the geometric mean of all tissues equal. Please note: Currently, only associations with minimal ambiguity participate in the analysis.
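The per-tissue expression level described above is a simple fraction, sketched below. The function name and the counts are hypothetical, and the subsequent geometric-mean rescaling step is not shown because the text does not give enough detail to reproduce it.

```python
def sage_expression(tag_counts, total_tags):
    """Return the expression level per tissue.

    tag_counts: tissue -> number of appearances of the gene's tag
    total_tags: tissue -> total number of tags in that tissue's libraries
    """
    return {
        tissue: tag_counts.get(tissue, 0) / total
        for tissue, total in total_tags.items()
        if total > 0
    }


# Hypothetical counts for one gene in three tissues.
example_counts = {"brain": 30, "liver": 5}
example_totals = {"brain": 100000, "liver": 50000, "heart": 20000}
example_levels = sage_expression(example_counts, example_totals)
```

A tissue in which the tag was never observed, such as "heart" in this example, gets a level of 0.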

Tissues and anatomical compartments are colored according to 6 categories - Immune (red), Nervous (green), Muscle (yellow), Internal (blue), Secretory (violet) and Reproductive (turquoise).

Normalized intensities are drawn on a root scale, which is an intermediate between log and linear scales. Values are not comparable between datasets (i.e. Microarray, RNAseq and SAGE).

Genes with similar binary patterns can be seen using GeneDecks Partner Hunter (more information)

LifeMap Discovery™ Table
This table provides links to developmental and in vitro expression information in LifeMap Discovery™, the Embryonic Development and Stem Cells Database. Linked in vivo cells or anatomical compartments where the gene is expressed also provide the tissue/organ of origin (using arrows). Links to stem cell differentiation are noted as "in vitro cells" or as "protocol derived cells". Additionally, there are links to datasets from external sources comprising high-throughput experiments, such as microarray and RNA sequencing. The expression level (selective marker (cell-identifying gene), positive, or negative) is also presented for each of the gene expression links. The table is grouped by tissue and sorted by number of hits, so tissues with more information are shown first.

Protein Expression Images

Presentation of protein expression images for 49 tissues, fluids and cells. Data sources:
  1. MOPED - Eugene Kolker, Bioinformatics & High-throughput Analysis Lab, Seattle Children's Research Institute.
  2. PaxDb - Christian von Mering, Bioinformatics Group, Institute of Molecular Life Sciences, University of Zurich.
  3. MaxQB - Matthias Mann, Department of Proteomics and Signal Transduction, Max-Planck Institute of Biochemistry, Germany.
The data was normalized as follows:
  1. For each sample, ppm protein values were calculated, if not already provided by the data sources. For each sample from MaxQB, iBAQ expression values were divided by the sum of the values in that sample, and multiplied by 1,000,000. iBAQ, intensity-based absolute quantification, is a proxy for protein abundance levels (see http://www.nature.com/nature/journal/v473/n7347/full/nature10098.html#supplementary-information). For all samples, data was aggregated gene-centrically by summing the expression values of all isoforms for each gene.
  2. For better visualization of graphs, expression values are drawn on a root scale, which is an intermediate between log and linear scales as used for our mRNA expression graphs [see Safran, M., Chalifa-Caspi, V., Shmueli, O., Olender, T., Lapidot, M., Rosen, N., Shmoish, M., Peter, Y., Glusman, G., Feldmesser, E., Adato, A., Peter, I., Khen, M., Atarot, T., Groner, Y., and Lancet, D. Human Gene-Centric Databases at the Weizmann Institute of Science: GeneCards, UDB, CroW 21 and HORDE. Nucleic Acids Research 31,1:142-146 (2003)].
List of samples and their sources:
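The ppm conversion and the gene-centric aggregation in step 1 can be sketched as follows. This is an illustration only: the function names, isoform identifiers, and values are hypothetical.

```python
def to_ppm(ibaq_values):
    """Convert one sample's iBAQ values to parts per million.

    Each value is divided by the sample total and multiplied by 1,000,000.
    """
    total = sum(ibaq_values.values())
    if total <= 0:
        raise ValueError("sample total must be positive")
    return {key: value / total * 1_000_000 for key, value in ibaq_values.items()}


def aggregate_by_gene(isoform_values, isoform_to_gene):
    """Sum the values of all isoforms that belong to the same gene."""
    gene_values = {}
    for isoform, value in isoform_values.items():
        gene = isoform_to_gene[isoform]
        gene_values[gene] = gene_values.get(gene, 0.0) + value
    return gene_values


# Hypothetical sample with two isoforms of GENE_A and one of GENE_B.
sample_ibaq = {"A-iso1": 300.0, "A-iso2": 100.0, "B-iso1": 600.0}
isoform_map = {"A-iso1": "GENE_A", "A-iso2": "GENE_A", "B-iso1": "GENE_B"}
gene_ppm = aggregate_by_gene(to_ppm(sample_ibaq), isoform_map)
```

In this example GENE_A receives 400,000 ppm and GENE_B 600,000 ppm, so the values of a sample always sum to one million.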

Sample    MOPED    PaxDb    MaxQB
Serum
Plasma
Monocyte
Neutrophil
B-lymphocyte
T-lymphocyte
Platelet
T-lymph Jurkat cancer
Myeloid K562 leukemia
Lymphobl. leukemia, CCRF-CEM
Kidney HEK-293
Adipocyte
Nasal respiratory epithelium
Liver
Liver secretome
Liver HuH-7 cancer
Liver HepG2 cancer
Lung Alveolar lavage
Lung
Lung NSC cancer, NCI-H460
Lung A549 cancer
Kidney RXF393 cancer
Colon RKO cancer
Colon Colo205 cancer
Urine
Heart
Bone
Bone U2OS cancer
Brain
Brain U251 cancer
Brain GAMG cancer
Cerebrospinal fluid
Ovary
Ovarian SKOV3 cancer
Prostate PC3 cancer
Prostate LnCap cancer
Cervical epithelium
Cervical mucus
Cervix HeLa S3 cancer
Cervix HeLa cancer
Saliva
Skin
M14 Melanoma
Breast
Breast LCC2 cancer
Breast MCF7 cancer
Pancreas
Pancreatic juice
Pancreatic cancer cell line

Orthologs

This section contains Orthologs from HomoloGene, Ensembl pan taxonomic compara, euGenes, SGD, and MGI, with possible further links to Flybase and WormBase.

*Ensembl pan taxonomic compara doesn't have its own pages on the Ensembl site.

The table presents the following columns:

The species presented from Ensembl pan taxonomic compara were chosen to constitute a diverse collection of taxa including model organisms and species of interest. Currently, all available species from the HomoloGene database (old and new) are included.

Superscripts represent the source from which this data was extracted. Data from HomoloGene can have one of two superscripts. If the second one is cited, it means that data for this species exists only in the older version of HomoloGene, which used unfinished genomes and where the homologs found might not be true orthologs.

Following the table are links to Ensembl and TreeFam gene trees.

Paralogs

This section contains Paralogs from HomoloGene, Ensembl (similarities shown on mouseover), and SIMAP, and Pseudogenes from Pseudogene.org. Genes with similar paralogs can be seen using GeneDecks Partner Hunter (more information). Paralogs obtained from SIMAP were chosen according to a fixed similarity score, shown on mouseover, to allow an average of 30 paralogs per protein-coding gene.

Genomic Variants

This section contains SNPs/Variants from the NCBI SNP Database, Ensembl, PupaSUITE and DNA2.0, with descriptions from UniProtKB, Linkage Disequilibrium images from HapMap, Structural Variations (CNVs/InDels/Inversions) from the Database of Genomic Variants, Mutations from HGMD, The Human Cytochrome P450 Allele Nomenclature Database, the Human Genome Variation Society's Locus Specific Mutation Databases (LSDB), and BGMUT, PCR Resequencing Primers for human and/or mouse/rat orthologs from Qiagen, and Cancer Mutation PCR Arrays and Assays and Copy Number PCR Arrays from Qiagen.

SNPs

SNP information is currently extracted from dbSNP XML and UniProt's Human polymorphisms and disease mutations files. Filtering is done to include only those that are not artifacts, not connected to gene duplication, not withdrawn by NCBI, fully specified, without ambiguous locations or low map quality, and having single Entrez Gene and contig ids. The order of a gene's displayed SNPs can be determined by the user. By default, SNPs are sorted first (shown in the select box as 1st) by validation status (validated before non-validated), then, within these groups, by ordered clinical significance (in the following order: drug-response, histocompatibility, non-pathogenic, pathogenic, probable-non-pathogenic, probable-pathogenic, untested, unknown, other, and none listed) as the secondary (2nd) nested criterion, and finally by location type (first coding non-synonymous, then coding synonymous, followed by coding, splice site, mRNA-UTR, intron, locus, reference, and/or exception). The user can change this default sort order and define up to three hierarchical sorting priorities from fields available as select boxes above the relevant columns on the section's button line as follows: rs-numbers (sorted in ascending order), validation status, clinical significance, position on the chromosome (ascending order), location type, allele frequencies (existing info before non-existing), population types (alphabetical order), and total sample size (largest to smallest). Each displayed line includes genomic, expression, and allele frequency data sections. Only the summary is shown for the expression and allele frequency sections, with a link to the detailed information (via the magnifying glass icon).
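The default three-level sort order can be expressed as a tuple sort key, as in the sketch below. The record layout (dictionary keys) and the placeholder SNP identifiers are hypothetical; the two orderings are copied from the text above.

```python
CLINICAL_ORDER = [
    "drug-response", "histocompatibility", "non-pathogenic", "pathogenic",
    "probable-non-pathogenic", "probable-pathogenic", "untested", "unknown",
    "other", "none listed",
]
LOCATION_ORDER = [
    "coding non-synonymous", "coding synonymous", "coding", "splice site",
    "mRNA-UTR", "intron", "locus", "reference", "exception",
]


def default_sort_key(snp):
    """1st: validated before non-validated; 2nd: clinical significance;
    3rd: location type."""
    return (
        0 if snp["validated"] else 1,
        CLINICAL_ORDER.index(snp["clinical"]),
        LOCATION_ORDER.index(snp["location"]),
    )


# Hypothetical SNP records with placeholder identifiers.
example_snps = [
    {"id": "rsC", "validated": False, "clinical": "pathogenic", "location": "intron"},
    {"id": "rsA", "validated": True, "clinical": "unknown", "location": "coding"},
    {"id": "rsB", "validated": True, "clinical": "pathogenic", "location": "intron"},
    {"id": "rsD", "validated": True, "clinical": "pathogenic", "location": "coding synonymous"},
]
sorted_ids = [snp["id"] for snp in sorted(example_snps, key=default_sort_key)]
```

A user-defined sort order would correspond to choosing different fields, in a different priority, for the elements of the tuple.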

This table presents the following columns:

  • More - View individual records
  • Allele freq - Average frequency of the alleles for all populations, displayed as a pie-chart (only if 2 alleles). Alleles are in the same orientation and color as the displayed SNP sequence. Numeric info about the frequencies is available using the mouseover.
  • Pop - population type:
    • CA: CENTRAL ASIA
    • CSA: CENTRAL/SOUTH AFRICA
    • CSAM: CENTRAL/SOUTH AMERICA
    • EA: EAST ASIA
    • EU: EUROPE
    • MN: MULTI-NATIONAL
    • NA: NORTH AMERICA
    • NEA_ME: NORTH/EAST AFRICA & MIDDLE EAST
    • PA: PACIFIC
    • WA: WEST AFRICA
    • NS: NOT SPECIFIED
    • UNK: UNKNOWN
  • Total sample - total data sample size (number of chromosomes)
  • Additional columns in Expression data popup:

    Additional columns in Allele Frequency data popup:

    This section also provides Linkage Disequilibrium (LD) information from HapMap.

    Structural Variation Table
    Information from the Database of Genomic Variants (DGV) is provided, containing each variant ID with its type (CNV or OTHER), its subtype (deletion, duplication, insertion, loss, gain, inversion, gain+loss, CNV, or complex), and a PubMed ID.

    Disorders / Diseases

    This section contains Disorders in which GeneCards genes are involved, according to MalaCards, OMIM, UniProtKB, the University of Copenhagen DISEASES database, Novoseek, Genatlas, GeneTests, GAD, HuGENavigator, and/or TGDB. When possible, disorders are sorted by their relevance to the gene, with scores presented either explicitly in a table, or via mouseovers on disease names.

    Novoseek disease relationships

    This table presents the following columns:

    Genes with similar disease relationships can be seen using GeneDecks Partner Hunter (more information)

    Publications

    This section provides titles of and links to research articles in PubMed, as associated via Novoseek, HGNC, Entrez Gene, UniProtKB, PharmGKB, GAD, HMDB, and/or DrugBank.

    The articles are ranked first by the number of GeneCards sources that associate the article with this gene, then by date of publication, and finally by the Novoseek score for this article/gene relationship. The year of publication appears in parentheses after the title of each article. Lower ranked articles may also appear in initial results if their titles or authors contain your search term.

    External Searches

    This section allows the user to search PubMed, OMIM, or NCBI Bookshelf. The current gene's aliases and disorders are provided, as well as the search string that led to the gene, to be used as search fodder. The user can also add new search terms.

    How To Search: The search box allows the user to search for aliases and/or free text in PubMed, OMIM, or NCBI Bookshelf. To search for several aliases, select each alias while holding down the Control key. This type of search matches any of the selected aliases; to require all of them, go to the free text box (next to the search button) and manually change every OR to AND. You may also enter free text and combine it with the selected aliases using AND or OR (use the radio buttons to the left of the box to choose). Again, to find only documents that contain all of the selected aliases, change every OR to AND in the Query String box.

    Databases

    These sections provide links to the GeneCards genes in other databases:

    Intellectual Property

    This section features Patent information from GeneIP and technologies that are available for licensing. Institutions currently featured include the Weizmann Institute of Science, the Salk Institute for Biological Studies, and Tufts University. Also included in this section is IP news from LifeMap Sciences, Inc.

    Products

    This section provides links to reagents available from EMD Millipore, and/or R&D Systems, proteins, lysates, and/or antibodies available from Cell Signaling Technology, EMD Millipore, R&D Systems, OriGene, GenScript, Novus Biologicals, Sino Biological, Enzo Life Sciences, Abcam, ProSpec, Thermo Fisher Scientific, LSBio, Cloud-Clone Corp., and/or others, drugs and compounds available from EMD Millipore, Tocris Bioscience, ApexBio, and/or Enzo Life Sciences, Gene Editing from DNA2.0, clones and/or primers available from OriGene, Qiagen, DNA2.0, Vector BioLabs, SwitchGear Genomics, GenScript, and/or Sino Biological, Cell Lines from GenScript and/or ESI BIO, GPCR/Kinase Profiling, Assay development, GPCR & ELISA assays available from GenScript, R&D Systems and/or Cloud-Clone Corp., Animal models from inGenious Targeting Laboratory (iTL), genOway, in situ hybridization kits from ACD, and links to reagents for mouse/rat orthologs from Qiagen.


    Gene Ontology (GO) Tables

    The Gene Ontology sections in Function, Localization, and Pathways & Interactions display a table with the following columns:

    GeneDecks Partner Hunter

    GeneDecks Partner Hunter is available for ontologies, phenotypes, drugs and compounds, sequence-based paralogs, disorders, pathways, and domains. By clicking on the GeneDecks Partner Hunter button for a particular section, one arrives at the GeneDecks home page, where the gene name has been entered and the appropriate fields selected from the attribute list. From this page, changes can be made to the data requested. Submitting this form brings up a result page containing a list of genes similar to the chosen gene and their descriptions.

    Selected Algorithms

    Novoseek Scoring Algorithm

    The relevance scores of elements related to genes (chemical substances and diseases) are based on the analysis of co-occurrences of two elements in Medline documents. The observed number of documents where both elements appear together and the number of documents where both appear independently are compared to an expected value based on a hypergeometric distribution. The more co-occurrences are observed relative to the number expected, the less likely it is that this happened by chance, and the higher the score. The absolute numbers are not meaningful in themselves and only give an order of importance (i.e. in the list of chemicals related to a gene, the order is meaningful and the first chemicals in the list are statistically more strongly related to the gene than the following ones, but the absolute values of the scores may change from one release to another).
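The exact Novoseek formula is not given here, so the following is only a sketch of the general idea: a hypergeometric upper-tail probability for the observed number of co-occurrences, turned into a score by taking its negative logarithm. The function names, the use of -log10, and the document counts are all assumptions made for illustration.

```python
import math


def hypergeom_upper_tail(total_docs, docs_a, docs_b, both):
    """P(X >= both) under a hypergeometric model.

    total_docs: all documents; docs_a: documents mentioning element A;
    docs_b: documents mentioning element B; both: observed co-occurrences.
    """
    max_k = min(docs_a, docs_b)
    denominator = math.comb(total_docs, docs_b)
    numerator = sum(
        math.comb(docs_a, k) * math.comb(total_docs - docs_a, docs_b - k)
        for k in range(both, max_k + 1)
    )
    return numerator / denominator


def co_occurrence_score(total_docs, docs_a, docs_b, both):
    """Higher when co-occurrence is less likely to be due to chance (assumed form)."""
    return -math.log10(hypergeom_upper_tail(total_docs, docs_a, docs_b, both))


# Hypothetical counts: about 2 co-occurrences would be expected by chance.
score_expected = co_occurrence_score(1000, 50, 40, 2)
score_enriched = co_occurrence_score(1000, 50, 40, 10)
```

As the text notes for the real scores, only the ordering of such values is informative: the enriched pair scores higher than the pair observed at the chance level.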

    Developed at the Crown Human Genome Center, Department of Molecular Genetics, the Weizmann Institute of Science

    Version: 3.12.166 28 Aug 2014