Sapelo Island Microbial Observatory

 

Custom FASTA File

 

Overview

To support convenient analysis of SIMO sequence data with standard bioinformatics web services and software, the results of all SIMO database queries are available as downloadable text files in FASTA format. A definition line containing the SIMO sequence ID plus taxonomic and ecological context information (e.g. sampling date, environmental characteristics, study site) is generated automatically from the database for each sequence record.
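As a rough illustration of working with a downloaded file, the sketch below splits FASTA text into definition lines and sequences. It relies only on the general FASTA convention that each record starts with a ">" line; the sample record and the function name read_fasta are hypothetical and do not reflect the actual layout of SIMO definition lines.

```python
def read_fasta(text):
    """Yield (definition_line, sequence) pairs from FASTA-formatted text."""
    defline, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record begins; emit the previous one, if any.
            if defline is not None:
                yield defline, "".join(chunks)
            defline, chunks = line[1:], []
        elif defline is not None:
            # Sequence rows are wrapped, so join them back together.
            chunks.append(line)
    if defline is not None:
        yield defline, "".join(chunks)


# Hypothetical sample data, not real SIMO output.
sample = ">seq1 hypothetical description\nACGTAC\nGGTA\n>seq2\nTTGA\n"
for defline, seq in read_fasta(sample):
    print(defline, len(seq))
```

Because wrapped sequence rows are joined back together, the reader should work regardless of the row length used in the file.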

The syntax for creating user-defined HTTP queries (i.e. URL API) is described below for advanced users or projects wishing to mine or create links to the SIMO database. For additional assistance forming custom SIMO queries or linking to the SIMO database, please email the SIMO Database Administrator.


Syntax

Base URL: http://simo.marsci.uga.edu/public_db/fasta_file.asp?

Parameter Syntax (appended to the base URL):

param1=xxx&param2=yyy&param3=zzz...
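One way to assemble such a query string programmatically is sketched below using Python's standard library. The function name build_query_url is an illustrative assumption; the base URL is the one given above. Passing quote_via=quote makes spaces come out as %20 rather than "+".

```python
from urllib.parse import urlencode, quote

BASE_URL = "http://simo.marsci.uga.edu/public_db/fasta_file.asp"


def build_query_url(params):
    """Append params to the base URL as param1=xxx&param2=yyy...

    quote_via=quote encodes spaces as %20 instead of '+'.
    """
    query = urlencode(params, quote_via=quote)
    return f"{BASE_URL}?{query}"


url = build_query_url({"season": "Summer", "envir": "P", "minbases": 300})
print(url)
```

Parameters are emitted in the order they appear in the dictionary. Any parameter left out of the dictionary is simply absent from the URL.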

Supported Parameters and Defaults:

Parameter | Description | Default*
datemin | minimum (earliest) sampling date (e.g. 1/1/2002) | none (any)
datemax | maximum (latest) sampling date (e.g. 12/31/2002) | none (any)
season | sampling season (Spring, Summer, Fall, Winter) | none (any)
envir | code for environment sampled, e.g. P, S, W (see HTML list values) | none (any)
microenv | code for microenvironment sampled, e.g. PLD, WSW (see HTML list values) | none (any)
zone | code for marsh zone sampled, e.g. CB, SS, TS, NO (see HTML list values) | none (any)
max | maximum number of records to return | none (all)
genbank | GenBank accession (e.g. AF439410) | none (any)
simo | SIMO clone/isolate code (e.g. SIMO-300) | none (any)
nucleotides | nucleotide motif (e.g. ATGGA or ATG?A) | none (any)
minbases | minimum sequence length (e.g. 300) | 0
sourcetype | sequence source type (CL or IS for clone, isolate) | none (any)
site | code for SIMO sampling site, e.g. DC, DM, DS (see HTML list values) | none (any)
groupid | taxonomic group numeric id (see HTML list values) | none (any)
groupname | taxonomic group name ** (i.e. Phylum - Class, e.g. Bacteroidetes - Flavobacteria) | none (any)
phylum | phylum name (e.g. Proteobacteria) | none (any)
seqid | SIMO sequence numeric id (e.g. 103) | none (any)
rowsize | maximum length of nucleotide data rows in FASTA file | 60
sortcol | database column to sort by (see HTML list values) | SequenceID
order | sort order (ASC, DESC for ascending, descending) | ASC
defline | FASTA file definition line option: verbose = sequence id followed by full description; sequin = id followed by NCBI Sequin modifiers; id = id only; alias = clone alias only (for SIMO use) | verbose

* default restriction applied if the parameter or its value is omitted from the query string
** spaces should be encoded using %20 to avoid URL parsing errors
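A client might check a parameter set against the table above before sending a query. The sketch below is a client-side convenience only. The parameter names and enumerated values are copied from the table, but the function name check_params is hypothetical, and whether the server itself treats names or values as case-sensitive is not stated here, so the strict matching is an assumption.

```python
# Parameter names and enumerated values copied from the table above.
KNOWN_PARAMS = {
    "datemin", "datemax", "season", "envir", "microenv", "zone", "max",
    "genbank", "simo", "nucleotides", "minbases", "sourcetype", "site",
    "groupid", "groupname", "phylum", "seqid", "rowsize", "sortcol",
    "order", "defline",
}
ENUMERATED = {
    "season": {"Spring", "Summer", "Fall", "Winter"},
    "sourcetype": {"CL", "IS"},
    "order": {"ASC", "DESC"},
    "defline": {"verbose", "sequin", "id", "alias"},
}


def check_params(params):
    """Return a list of problems found in a parameter dict (empty if none)."""
    problems = []
    for name, value in params.items():
        if name not in KNOWN_PARAMS:
            problems.append(f"unknown parameter: {name}")
        elif name in ENUMERATED and str(value) not in ENUMERATED[name]:
            # Assumption: values are matched exactly as listed in the table.
            problems.append(f"unsupported value for {name}: {value}")
    return problems


print(check_params({"season": "Summer", "order": "DESC"}))
print(check_params({"season": "Autumn"}))
```

Codes for envir, microenv, zone, site, groupid and sortcol are not checked because their full value lists are only given in the HTML list values.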

Example Syntax:

http://simo.marsci.uga.edu/public_db/fasta_file.asp?season=Summer&envir=P&
groupname=Proteobacteria%20-%20Gammaproteobacteria&minbases=300
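To see how the example query breaks down into individual parameters, the following sketch decodes it with Python's standard library. The URL is the example above joined onto one line. The variable names are illustrative.

```python
from urllib.parse import urlsplit, parse_qs

# The example query, written as a single string.
example = (
    "http://simo.marsci.uga.edu/public_db/fasta_file.asp?"
    "season=Summer&envir=P&"
    "groupname=Proteobacteria%20-%20Gammaproteobacteria&minbases=300"
)

# parse_qs returns a list per parameter; each appears once here.
params = {k: v[0] for k, v in parse_qs(urlsplit(example).query).items()}
for name, value in params.items():
    print(f"{name} = {value}")
```

The %20 sequences in groupname decode back to spaces, giving "Proteobacteria - Gammaproteobacteria".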

Syntax Notes:

  • queries should be entered on 1 line (example wrapped for web display purposes only)
  • a plain text message is returned instead of FASTA data if no matching records are found or an error occurs
  • some parameter options are mutually exclusive (e.g. envir=P and microenv=WSW, specifying plant environment and surface water microenvironment) and therefore will always return no records
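Because a query can return either FASTA data or a plain text message, a script may want to tell the two apart before parsing. The sketch below is one possible check. The wording of the server's message is not documented here, so it relies only on the FASTA convention that the first record begins with ">". The function names fetch_text and looks_like_fasta are hypothetical, and the UTF-8 decoding is an assumption.

```python
from urllib.request import urlopen


def fetch_text(url, timeout=30):
    """Download the response body as text (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        # Assumption: the response decodes as UTF-8; bad bytes are replaced.
        return response.read().decode("utf-8", errors="replace")


def looks_like_fasta(text):
    """True if the first non-blank line starts with '>' (a definition line)."""
    for line in text.splitlines():
        if line.strip():
            return line.startswith(">")
    return False  # empty response


# Offline illustration with made-up response bodies.
print(looks_like_fasta(">1 hypothetical definition line\nACGT\n"))
print(looks_like_fasta("hypothetical plain text message"))
```

A caller would pass the result of fetch_text to looks_like_fasta and treat anything that fails the check as a message to display rather than sequence data.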
 
 
   
   

The Sapelo Island Microbial Observatory is funded by the National Science Foundation

This material is based upon work supported by the National Science Foundation under grant number MCB-0702125. Any opinions, findings, conclusions, or recommendations expressed in the material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

 

UGA Marine Sciences
