GeneCards Version 3 Search Guide
GeneCards Version 3 has an improved search that is faster and provides more accurate results than GeneCards 2.xx. Version 3.0 also provides paging capabilities and more readable minicards. This release of 3.0 is a hybrid system: the database, search, and MiniCards have been improved, but the actual cards remain the same as those in 2.xx, with the addition of a search bar at the top of the page. As a result, there may be slight discrepancies between the search results and the information displayed in the card.
Differences between GeneCards Versions 2.xx and 3.0
V3 uses stemming in all of its searches so that similar words will be found rather than just exact matches as in 2.xx.
A search for multiple words (e.g. zona pellucida) behaves as an AND at the gene level in 3.0 (i.e. each of the words must exist in at least one of the sections of the GeneCard), while 2.xx regards them as a single phrase. To get the 2.xx behavior in 3.0, simply enclose your search in double quotes.
Parentheses should be used in complex boolean searches to indicate precedence; otherwise AND operations take precedence over OR operations. See the example in Multiword Search. In version 2.xx, complex boolean expressions were processed from right to left.
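The difference between the 3.0 multiword behavior and a phrase search can be illustrated with a minimal Python sketch. It is not the GeneCards implementation: the `card` dictionary, its section names and its text are hypothetical, and stemming is left out for brevity.

```python
import re

# Hypothetical GeneCard: section name -> free text (not real GeneCards data).
card = {
    "summaries": "Component of the zona matrix surrounding the oocyte.",
    "function": "Binds the pellucida layer during fertilization.",
}

def words(text):
    """Return the lower-case word tokens of a text."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matches_all_words(card, query):
    """3.0-style: every query word occurs in at least one section."""
    seen = {w for text in card.values() for w in words(text)}
    return all(w in seen for w in words(query))

def matches_phrase(card, query):
    """Phrase-style: the words appear adjacent and in order within one section."""
    q = words(query)
    for text in card.values():
        t = words(text)
        if any(t[i:i + len(q)] == q for i in range(len(t) - len(q) + 1)):
            return True
    return False

print(matches_all_words(card, "zona pellucida"))  # True
print(matches_phrase(card, "zona pellucida"))     # False
```

Here the two words occur in different sections, so the AND-style search finds the card while the phrase search does not.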
What Do I Get?
There are 2 types of cards in GeneCards:
Minicard - The search results are first displayed closed, showing the gene symbol, description, category, GCID and a relevance score. To open the minicard click on the plus to the left of the gene symbol on the appropriate minicard. All fields of the GeneCard in which your search term(s) were found will be displayed. All of the keywords entered in your search, including any variants found due to stemming, will be highlighted in the minicards.
The minicard list is sorted by relevance (determined by the relevance scoring method).
GeneCard - A detailed display of all the data concerning a specific gene, including relevant links to other important websites.
Superscripts indicating sources for information throughout the card link directly to the information on the specific gene you are viewing at the given source's site.
More on: The GeneCard
Simple Search
Enter an expression into the search field on the GeneCards homepage and choose one of these to search by:
- Symbol only - Brings up the GeneCard for the specified symbol.
This is done by selecting the Symbol only button and typing the requested gene symbol in the search box.
- Symbol/Alias - Searches the database for a gene name, or its alias.
This is done by selecting the Symbol/alias button and typing the requested symbol or alias for the gene in the search box.
- GCid - The Id given to the gene by GeneCards (determined by the GeneLoc Algorithm).
A gene can be searched by its GCid by selecting the GCid button and typing the requested Id in the search box.
- External Id - A gene can also be searched by an External Id (accessions from HGNC, EntrezGene, UniProt, and Ensembl).
This is done by selecting the External Id button and typing the requested Id in the search box.
- Keywords - Searches the database in a full free text search. This is done by selecting the (default) keywords button and typing any kind of text relevant to the search in the search box.
* Using a space in the search string automatically results in a Keywords search (see Search Examples for more on multiword searches).
* The GeneCards search is case insensitive.
* All variants of your search terms that are found due to stemming will be highlighted in the search results.
* The * character serves as a wildcard, which matches all possible character strings.
Note that this is different from stemming, which matches strings that are considered to be related to the specified keyword.
* When search terms are enclosed in double quotes, an exact match is made. Exact match also lets you set the distance between two or more terms. The exact match ignores trivial words such as "a", "and", "then", etc. For example, searching "heart brain" would also retrieve the string "heart and a brain".
You can also use a tilde (~) to set the maximum distance between the terms (excluding trivial words). For example,
"heart brain"~50 searches for heart and brain within a distance of at most 50 words, excluding trivial words.
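The quoted search with a distance can be pictured with the following Python sketch. It is only an illustration under stated assumptions: the `TRIVIAL` word list is a guess (the actual list of trivial words is not given here), the helper names are hypothetical, and the sketch requires the terms to appear in the order entered, which may differ from the real engine.

```python
import re

# Assumed list of trivial words; the real list is not documented here.
TRIVIAL = {"a", "an", "and", "the", "then", "of"}

def content_words(text):
    """Lower-case tokens with trivial words removed."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in TRIVIAL]

def within_distance(text, first, second, max_distance=0):
    """True if `second` follows `first` with at most `max_distance`
    non-trivial words between them."""
    tokens = content_words(text)
    starts = [i for i, w in enumerate(tokens) if w == first]
    ends = [i for i, w in enumerate(tokens) if w == second]
    return any(0 < j - i <= max_distance + 1 for i in starts for j in ends)

print(within_distance("heart and a brain", "heart", "brain"))             # True
print(within_distance("heart muscle tissue brain", "heart", "brain"))     # False
print(within_distance("heart muscle tissue brain", "heart", "brain", 50)) # True
```

With `max_distance=0` the function behaves like the plain quoted search, and a larger value plays the role of the number after the tilde.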
Advanced Search
Advanced search enables you to browse GeneCards for more specific results.
A broader variety of search options is offered in order to focus the search.
There are two ways to use the Advanced search:
- Click the Advanced Search link next to the search box at the GeneCards homepage
First choose which type of search you wish to perform:
Keywords - only genes that contain the search term in the specified sections will be displayed.
Symbol/Alias - only genes with the symbols/aliases typed in this box will be displayed.
Symbol Only - only genes with the symbols typed in this box will be displayed.
Now type the search string in the search field. To search in more than one field, click the + button next to the search box to add another field; this lets you add more terms to your search or search multiple fields within a GeneCard. Terms entered in separate fields are combined with AND (your first term AND your second term, and so on).
- Perform a simple search, then refine it after getting the results: click on the Show Advanced Search link in the top left corner above the search results.
The advanced search will appear on the top of the page containing your search results.
To further refine your results, you can choose a specific category of genes to look for (e.g. protein-coding, genetic loci, RNA genes), the source of the gene symbol (HGNC, EntrezGene and/or RNA genes), and a specific range of GIFtS values (all, high, medium or low) to restrict the genes returned.
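The way advanced search criteria combine can be sketched in Python as follows. This is a simplified model, not the GeneCards code: the gene records, section names and the functions `term_found` and `advanced_search` are hypothetical, and matching is plain substring search without stemming.

```python
# Hypothetical records; symbols, sections and text are illustrative only.
genes = [
    {"symbol": "GENE1", "category": "protein-coding",
     "sections": {"summaries": "role in angiogenesis", "disorders": "breast cancer"}},
    {"symbol": "GENE2", "category": "RNA gene",
     "sections": {"summaries": "angiogenesis regulator", "disorders": ""}},
]

def term_found(gene, section, term):
    """Look for `term` in one section, or anywhere if `section` is None."""
    sections = gene["sections"]
    texts = sections.values() if section is None else [sections.get(section, "")]
    return any(term.lower() in text.lower() for text in texts)

def advanced_search(genes, criteria, category=None):
    """Return symbols of genes that satisfy every (section, term) pair (AND),
    optionally restricted to one gene category."""
    return [
        gene["symbol"]
        for gene in genes
        if (category is None or gene["category"] == category)
        and all(term_found(gene, s, t) for s, t in criteria)
    ]

print(advanced_search(genes, [("summaries", "angiogenesis"), (None, "cancer")]))    # ['GENE1']
print(advanced_search(genes, [("summaries", "angiogenesis")], category="RNA gene")) # ['GENE2']
```

Each extra field added with the + button corresponds to one more pair in `criteria`, and the category filter narrows the result set further.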
Search Examples:
* A Keyword Search results in a list of MiniCards if more than one gene matches your search. If only one gene matches your search, its GeneCard is displayed.
This type of search will usually return a large number of results.
I. Simple Search:
Search String: brca1
Matching Words: brca1, BRCA1
Search Description: Exact word match (case insensitive).*

Search String: x84746
Matching Words: x84746
Search Description: GenBank accession number
A Symbol Only Search would result in the GeneCard itself right away!
II. Wild Card Search (*):
* In version 3.0 the search string "live" is equivalent to "live*".
Search String: live
Matching Words: live, lives, liver, lively, lived, living
Search Description: Any object that begins with the string "live" or some derivative of "live" determined by the search engine's stemming algorithm. A search for live* will return the same results.
* Please note that wild card searches using preceding asterisks, for example *gammaglob*, are not supported in version 3.0 searches.
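The difference between wildcard matching and stemming can be shown with a small Python sketch. Both functions are illustrative assumptions: `toy_stem` is a crude suffix stripper, not the stemming algorithm the search engine actually uses, and `wildcard_match` only models trailing asterisks.

```python
# Assumed suffix list for a toy stemmer; the real stemming algorithm differs.
SUFFIXES = ("ing", "ed", "es", "s", "e")

def toy_stem(word):
    """Strip one common suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word

def wildcard_match(pattern, word):
    """Case-insensitive match where a trailing * matches any ending.
    Leading asterisks are rejected, mirroring the 3.0 restriction."""
    pattern, word = pattern.lower(), word.lower()
    if pattern.startswith("*"):
        raise ValueError("leading wildcards are not supported")
    if pattern.endswith("*"):
        return word.startswith(pattern[:-1])
    return word == pattern

candidates = ["live", "lives", "liver", "lively", "lived", "living"]
by_wildcard = [w for w in candidates if wildcard_match("live*", w)]
by_stem = [w for w in candidates if toy_stem(w) == toy_stem("live")]
print(by_wildcard)  # ['live', 'lives', 'liver', 'lively', 'lived']
print(by_stem)      # ['live', 'lives', 'lived', 'living']
```

In this sketch "living" is reached only through stemming and "liver" only through the wildcard, which is why a search combining both can return all six words.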
III. Multiword Search:
* In the search (neurodegenerative OR senile) AND Alzheimer the use of parentheses is important. Without parentheses, AND takes precedence, so the search is interpreted as neurodegenerative OR (senile AND Alzheimer).
Search String: obesity diabetes
Matching Words: diabetes, with hyperproinsulinemia... for obesity in association with Ig; action profile in obese non-diabetic subjects.
Search Description: Search behaves as if an AND was used (see the AND search below!).

Search String: obesity AND diabetes
Matching Words: preponderance of insulin resistance, diabetes mellitus ... genetic predisposition to obesity; action profile in obese non-diabetic subjects.
Search Description: All strings or variants of strings must exist in the GeneCard.

Search String: obesity OR diabetes
Matching Words: obesity, obese, diabetes, diabetic
Search Description: At least one of the strings or its variants must exist in the GeneCard (notice the difference from the AND search!).

Search String: (neurodegenerative OR senile) AND Alzheimer
Matching Words: ...senile dementia of the Alzheimer type: relation with the cognitive state and with quantitative studies of senile plaques...
Search Description: Finds all instances of either neurodegenerative AND alzheimer or senile AND alzheimer.

Search String: "macular degeneration"
Matching Words: associated with age-related macular degeneration type 4...
Search Description: Exact phrase search. All words in the order that they are entered must be in the GeneCard.
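The precedence rules used in these multiword searches can be modelled with a short Python sketch. It is an assumption-laden illustration rather than the actual query parser: `evaluate` is a hypothetical helper that checks a query against a set of words, treats adjacent terms as an implicit AND, ignores stemming and quoted phrases, and does not validate malformed queries.

```python
import re

def tokenize(query):
    """Split a query into parentheses, AND/OR operators and lower-cased terms."""
    tokens = []
    for tok in re.findall(r"\(|\)|[^\s()]+", query):
        tokens.append(tok.upper() if tok.upper() in ("AND", "OR") else tok.lower())
    return tokens

def evaluate(query, card_words):
    """Evaluate a boolean query where AND binds tighter than OR."""
    tokens = tokenize(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def parse_or():
        nonlocal pos
        value = parse_and()
        while peek() == "OR":
            pos += 1
            rhs = parse_and()      # parse first, then combine
            value = value or rhs
        return value

    def parse_and():
        nonlocal pos
        value = parse_atom()
        while peek() not in (None, "OR", ")"):
            if peek() == "AND":    # explicit AND; otherwise implicit AND
                pos += 1
            rhs = parse_atom()
            value = value and rhs
        return value

    def parse_atom():
        nonlocal pos
        tok = peek()
        pos += 1
        if tok == "(":
            value = parse_or()
            pos += 1               # skip the closing parenthesis
            return value
        return tok in card_words

    return parse_or()

card_words = {"neurodegenerative", "disease"}
print(evaluate("neurodegenerative OR senile AND alzheimer", card_words))    # True
print(evaluate("(neurodegenerative OR senile) AND alzheimer", card_words))  # False
```

The same card matches the query without parentheses but not the parenthesized one, because only the latter requires the word alzheimer in every case.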
IV. Advanced Search:
Example 1: Search by section. Finds all instances where the word angiogenesis is found in summaries and the word cancer is found anywhere in the entry.
Example 2: Search within the disorders section for alzheimers or dementia or within the literature section for those same keywords. Also, choose to view only those genes with a high GIFtS score.
Relevance Scores
The search platform used is Solr, which is built on Apache's Lucene text search API.
When a term is searched, Lucene returns a set of scored hits.
A "hit" represents a document (in our case a GeneCard), whose fields (actual annotations) were previously indexed by Lucene.
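The score is calculated by a Lucene-defined formula (see Lucene's Similarity class; the factors in this formula are described in the Solr Relevancy FAQ):
$$\mathrm{score}(q,d) = \mathrm{coord}(q,d) \cdot \mathrm{queryNorm}(q) \cdot \sum_{t \text{ in } q} \Big( \mathrm{tf}(t \text{ in } d) \cdot \mathrm{idf}(t)^{2} \cdot t.\mathrm{getBoost}() \cdot \mathrm{norm}(t,d) \Big)$$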
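As a rough illustration of how these factors combine, the Python sketch below follows the definitions of Lucene's classic TF-IDF similarity, in which the idf factor enters squared. It is a simplification under stated assumptions: the function `classic_score` and its parameters are hypothetical, a document is a single field given as a list of words, the query is assumed to be non-empty, and Lucene's lossy encoding of norms is ignored.

```python
import math

def classic_score(query_terms, doc_terms, doc_freq, num_docs,
                  term_boosts=None, field_boost=1.0):
    """Approximate score(q,d) for a one-field document (a list of words)."""
    term_boosts = term_boosts or {}
    # idf(t) = 1 + ln(numDocs / (docFreq + 1))
    idf = {t: 1.0 + math.log(num_docs / (doc_freq.get(t, 0) + 1))
           for t in query_terms}
    # queryNorm(q) = 1 / sqrt(sum of squared term weights)
    sum_sq = sum((idf[t] * term_boosts.get(t, 1.0)) ** 2 for t in query_terms)
    query_norm = 1.0 / math.sqrt(sum_sq)
    # norm(t,d): field boost multiplied by length normalization
    norm = field_boost / math.sqrt(len(doc_terms))
    matched = [t for t in query_terms if t in doc_terms]
    # coord(q,d): fraction of query terms found in the document
    coord = len(matched) / len(query_terms)
    total = sum(
        math.sqrt(doc_terms.count(t))      # tf(t in d) = sqrt(frequency)
        * idf[t] ** 2
        * term_boosts.get(t, 1.0)          # t.getBoost()
        * norm
        for t in matched
    )
    return coord * query_norm * total

doc_freq = {"obesity": 10, "diabetes": 20}   # made-up document frequencies
query = ["obesity", "diabetes"]
both = classic_score(query, ["obesity", "diabetes", "risk"], doc_freq, 100)
one = classic_score(query, ["obesity", "insulin", "risk"], doc_freq, 100)
print(both > one)
```

A document matching both query terms scores higher than one of the same length matching a single term, and raising `field_boost` scales the score proportionally.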
Each field can be "boosted", which means increasing the weight of a specific field at search time.
In GeneCards, we "boost" a few fields:
You can read more about Lucene's scoring mechanism here:
Apache Lucene - Scoring