A Database of Drosophila Genes & Genomes

FB2008_06, released July 3, 2008
 

QueryBuilder Help

QueryBuilder (QB) provides one-stop shopping for information in FlyBase. QB presents a simple user interface that supports powerful searches by offering access to every DataSet|Field pair (for example, Genes|CV:GO:Molecular Function) in FlyBase, along with the ability to include any combination of datasets in the same search. QB automatically creates sets of records that are cross-referenced to the records that match your query, providing links to all related records in FlyBase from a single page. Both simple and complex queries can be built in a few steps.

The Getting Started section outlines the basic search strategy.

How to QueryBuild -- Getting Started

To query FlyBase, you must build one or more segments. Please note that building a segment using the Controlled Vocabulary (CV) hierarchy as your DataSet is slightly different from building a segment with any other data class.

To start building a query, click the yellow box.

Build A Segment

STEP 1: Select a data class to search from the DataSet menu.

STEP 2: Select a field to search, or use "Any field" to search full records.

STEP 3: Enter the text string to search for. The search algorithm will identify data fields that contain the text string you have entered. You may opt for case sensitivity if desired. To see a list of the most frequently found entries for that field, click the "index dictionary". You have the option to select a term from that list.

STEP 4: Click the "Done" button.

STEP 5 (optional): To add more segments, click the "+" button.
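
As a rough illustration of what the steps above assemble, a segment can be pictured as a DataSet|Field pair plus a search string. The sketch below is only a mental model: the `Segment` class and its attribute names are hypothetical and are not part of QB or of any FlyBase interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical model of one query segment (not a FlyBase API)."""
    dataset: str                   # STEP 1: data class chosen from the DataSet menu
    text: str                      # STEP 3: text string to search for
    field: str = "Any field"       # STEP 2: a specific field, or full records
    case_sensitive: bool = False   # STEP 3: optional case sensitivity

    def label(self) -> str:
        """Render the segment in the DataSet|Field notation used on this page."""
        return f"{self.dataset}|{self.field} {self.text}"


# A query is an ordered list of segments; the "+" button appends another one.
query = [Segment("Stocks", "mam*", field="Symbol")]
query.append(Segment("Genes", "tissue culture cells", field="Text:Other information"))

for segment in query:
    print(segment.label())
```

The two segments use search strings taken from the examples further down this page.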

Build A Segment using the Controlled Vocabulary Hierarchy

STEP 1: Select "CV Hierarchy (GO/etc.)" from the DataSet menu. Note that the appearance of the window will change to include four tabs instead of three.

STEP 2: To identify a CV term, either select the second tab, "Search CV terms", or the third tab, "Browse CV terms". Choose the "Search" option if you have a good idea of what you're looking for. Otherwise, select "Browse".

STEP 3: Once you've chosen your term, you will be shown the fourth tab, "Enter CV term ID", with the CV term ID and name entered for you automatically. By default, your search will cover CV terms from the whole subtree of the term you've chosen. If you wish to search only for the exact CV term you have chosen, select "This CV term only" from the drop-down menu. (Hint: you'll retrieve more results by searching the whole subtree.)

STEP 4: Click the "Done" button.

STEP 5 (optional): To add more segments, click the "+" button.
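
The difference between searching the whole subtree and "This CV term only" can be sketched with a toy hierarchy. Everything below is illustrative: the term IDs, the record names, and the functions `subtree` and `search_cv` are made up for this example and do not reflect how QB stores or queries its CV data.

```python
# Toy CV hierarchy: each term ID maps to its direct children (hypothetical IDs).
CHILDREN = {
    "TERM:1": ["TERM:2", "TERM:3"],
    "TERM:2": ["TERM:4"],
    "TERM:3": [],
    "TERM:4": [],
}

# Toy records: record name -> the CV term annotated on that record.
RECORDS = {"rec_a": "TERM:1", "rec_b": "TERM:2", "rec_c": "TERM:4", "rec_d": "TERM:3"}


def subtree(term):
    """Return the term plus every term below it in the hierarchy."""
    found = {term}
    stack = [term]
    while stack:
        for child in CHILDREN.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def search_cv(term, whole_subtree=True):
    """Find records annotated with the term, or with any term in its subtree."""
    wanted = subtree(term) if whole_subtree else {term}
    return sorted(name for name, annotated in RECORDS.items() if annotated in wanted)


print(search_cv("TERM:2"))                       # default: whole subtree
print(search_cv("TERM:2", whole_subtree=False))  # "This CV term only"
```

In this toy data the subtree search returns two records and the exact-term search returns one, which is the effect the hint in STEP 3 describes.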

Prepare, Check, and Run Query

STEP 1: Check boolean operators (if query consists of more than one segment). Default is "AND". Change to "OR" or "BUT NOT" if desired.

STEP 2: Check that the query segments are correct. Segments can be modified by clicking on them, or deleted by clicking the "X" in the top right-hand corner of the segment box.

STEP 3: Select output options. Default is to show related genes, to provide cross-references to other datasets, and to search Dmel data only. Change if desired.

STEP 4: Click the "Run query" button.
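
One way to picture how the boolean operators in STEP 1 join segments is as set operations on the records each segment matches, applied from left to right. This is a minimal sketch under that assumption. The `combine` function and the record IDs are hypothetical, and QB's actual evaluation may differ in detail.

```python
def combine(first, *rest):
    """Apply (operator, record_ids) pairs to the first segment's records, left to right."""
    result = set(first)
    for operator, ids in rest:
        if operator == "AND":
            result &= set(ids)      # keep records matched by both sides
        elif operator == "OR":
            result |= set(ids)      # keep records matched by either side
        elif operator == "BUT NOT":
            result -= set(ids)      # drop records matched by the new segment
        else:
            raise ValueError(f"unknown operator: {operator!r}")
    return result


# Made-up record IDs matched by three segments.
haltere = {1, 2, 3}
wing = {2, 3, 4}
leg = {5}

# haltere AND wing OR leg, grouped as (haltere AND wing) OR (leg)
print(sorted(combine(haltere, ("AND", wing), ("OR", leg))))
print(sorted(combine(haltere, ("BUT NOT", wing))))
```

Because the operators are applied in order, changing the order of the segments can change the result.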

Features
  • Calculations
    • Calculations can be incorporated into searches of fields that contain numbers.
    • The options are greater than (>), less than (<), plus or minus (+/-) and range (-).
  • Any value, no value
    • Search for the presence or absence of information in a field, rather than a specific value.
    • The options are IS NULL and IS NOT NULL (these reserved phrases are case sensitive).
  • Logical operators
    • Combine multiple query legs with logical operators.
    • The options are AND, OR, and BUT NOT.
  • Phrases
    • Multiple words are treated as a phrase.
    • Only records that include the search words in the order you specify will be matched.
  • Batch queries
    • Upload a list of FlyBase IDs and search for all related records.
    • Standard Batch download is also available for query results.
  • Hierarchical CV queries
    • Full support for GO and Anatomy/Development term relationships.
    • Searches of CV fields within standard data classes (e.g., Genes) find only records that contain the individual term you specify. The GO/Anatomy CV database associates each term in these CVs with all of the terms below it in the hierarchy, allowing a single search to find records that contain a term or any child of that term.
  • Field type tags
    • Five field type tags help organize and identify search options.
      • CV - Controlled Vocabulary, terms are consistent across records
      • Flag - Marks records that have links of the specified type (any search of a Flag field is performed as "IS NOT NULL", ignoring the user-supplied text)
      • Map - Genetic, cytogenetic, or genomic map data
      • Symbol - Symbols are the only, or predominant, datatype
      • Text - Data is free text, usage may not be consistent from record to record
  • Field content dictionaries
    • Preview the information in a field, or select dictionary entries to use in a search.
    • The field dictionary lists up to 100 of the most commonly used symbols, terms, numbers or words from the data in the selected field.
  • Alternative results
    • Related records in other FlyBase data classes are a click away via the green buttons.
    • QB creates a set of cross-references for the records that match your search criteria. An itemized results list (of Genes records, for example) is displayed for the data class that is selected when a search is run. A series of green buttons at the top of the results page provides links to related records in other data classes (Insertions, for example). With QB, you do not need to open each report and click through layers of links to find related information. This feature can also be used to find information that may be difficult to search for directly because of unfamiliar nomenclature (such as Insertion Symbols). Only References are excluded from automatic generation of alternative results (because of the large size of this dataset).
  • Linkouts
    • Related information from other databases is a click away via the yellow buttons.
    • If the records identified by your search include links to external databases, these links are available from the yellow button or buttons in the Linkout section of the results page.
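
The calculation and "any value, no value" options listed above can be read as tests applied to a field's value. The sketch below shows one possible reading. The function `parse_criterion` is hypothetical, the exact syntax accepted for +/- and range on plain numbers is an assumption, and treating the range endpoints as inclusive is also an assumption.

```python
import re

NUM = r"(\d+(?:\.\d+)?)"  # a plain non-negative number such as 2 or 49.5


def parse_criterion(text):
    """Turn a criterion string into a test function for a field value (None = empty)."""
    text = text.strip()
    # Reserved phrases are case sensitive, so "is null" is not accepted here.
    if text == "IS NULL":
        return lambda v: v is None
    if text == "IS NOT NULL":
        return lambda v: v is not None
    m = re.fullmatch(rf"([<>])\s*{NUM}", text)
    if m:
        op, n = m.group(1), float(m.group(2))
        if op == ">":
            return lambda v: v is not None and v > n
        return lambda v: v is not None and v < n
    m = re.fullmatch(rf"{NUM}\s*\+/-\s*{NUM}", text)
    if m:
        mid, delta = float(m.group(1)), float(m.group(2))
        return lambda v: v is not None and mid - delta <= v <= mid + delta
    m = re.fullmatch(rf"{NUM}\s*-\s*{NUM}", text)
    if m:
        low, high = float(m.group(1)), float(m.group(2))
        return lambda v: v is not None and low <= v <= high
    raise ValueError(f"unrecognized criterion: {text!r}")


more_than_two = parse_criterion("> 2")
print(more_than_two(3), more_than_two(2))
print(parse_criterion("100 +/- 10")(95))
print(parse_criterion("IS NOT NULL")(None))
```

Each call returns a small test function, so one criterion can be checked against many values.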
Further Information and Examples
  • Asterisk is wild. An asterisk (*) on either end of your search string, or embedded in the middle of the string, is interpreted as "any character".
    • Stocks|Symbol mam*
    • Alleles|CV:Phenotype Class *maternal*
    • Insertions|Symbol *ptc*
  • Wild cards are not automatically added to QB searches. If a query is unproductive, try it again with * on one or both ends.
  • Search Flag fields with * or any string of letters.
    • Genes|Flag:InteractiveFly default
    • Polypeptides|Flag:Antibody URL (DSHB Hybridoma): *
  • Case-insensitive searches are standard. There are two exceptions:
    • A case-sensitive Symbol search is available for most data classes.
    • The reserved phrases IS NULL and IS NOT NULL are case sensitive.
  • Multiple words are treated as a phrase.
    • Genes|Text:Other information tissue culture cells
  • Cytological location searches are redirected to the GBrowse dataset, which uses estimated sequence ranges of cytological locations.
  • Join query segments with AND, OR, or BUT NOT.
  • When a query has more than two segments, QB evaluates them in order, giving precedence to the earlier segments.
    • haltere AND wing OR leg is interpreted as (haltere AND wing) OR (leg)
  • Calculation query examples:
    • GBrowse Data|Exact Number of exons > 2
    • Polypeptides|Protein size (kD): < 50
    • Annotations|Map:Sequence range 3L:5,787,637..5,819,561 +/- 5000 (commas are optional)
    • Insertions|Map:Cytogenetic location 67B-D
  • References record sets are created only when the References dataset is searched.
    • References|Author Wakimoto (creates a References dataset)
    • Alleles|Text: Discoverer Wakimoto (does not create a References dataset)
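
The wildcard behaviour described in this section can be approximated in a few lines. This is a sketch under stated assumptions, not QB's matching code: it assumes * stands for any run of characters (including none), that the pattern is compared against a whole symbol or term, and that matching ignores case unless asked otherwise. The function `wildcard_match` and the sample symbols are made up.

```python
import re


def wildcard_match(pattern: str, value: str, case_sensitive: bool = False) -> bool:
    """Match a whole value against a pattern in which * stands for any characters."""
    # Escape every literal piece so that only * acts as a wildcard.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    flags = 0 if case_sensitive else re.IGNORECASE
    # fullmatch: nothing beyond the pattern is accepted unless a * allows it.
    return re.fullmatch(regex, value, flags) is not None


# Made-up symbols, used only to exercise the function.
symbols = ["mam", "mamA", "Xmam", "ptc", "AptcB"]

print([s for s in symbols if wildcard_match("mam", s)])     # no wildcard
print([s for s in symbols if wildcard_match("mam*", s)])    # trailing wildcard
print([s for s in symbols if wildcard_match("*ptc*", s)])   # wildcard on both ends
```

Because no wildcard is added automatically, the first search finds only the exact symbol, which is why retrying an unproductive query with * on one or both ends can help.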
Notes, Known Problems and Features yet to come
  • To find out more about the controlled ontology databases:
    • GO - Gene Ontology: http://www.geneontology.org
    • To search for GO terms and their definitions, we recommend: http://www.ebi.ac.uk/ego
    • To find out more about our Anatomy and Developmental terms, go to Termlink: http://www.flybase.org/cgi-bin/fbcvq.html?start
  • Cross-references to stocks and images are generated, but cross-references from these data types are blocked. This is because these records may include tangentially related objects, such as the set of genes that are mutant in a multiply marked mapping stock.
  • People data are not included in QB.
  • All of the menus and dictionary files are produced automatically. Dictionary files remain on the server for 2 hours. If an index dictionary for a given field isn't already present on the server, it will take a bit of time to generate it.
  • If you encounter any problems with QueryBuilder, or would like help with your queries, please use the contact FlyBase form to write to us.