
CAIAP (CAI Analyser PAckage) Server


Welcome to the CAIAP server, where we store the output of the CAIAP programs. What follows is some basic information about the server: how it is organized and how to use it to extract data.
The CAIAP result files are named after the organisms, with different suffixes and file extensions.

SEARCH SECTION

Search for results
From here you can search for result files by choosing the "type". The query is the organism name, since files are named by organism (the common naming convention of the whole CAIAP server).

Search in results
This section offers a full-featured and fairly sophisticated search engine that allows inter- and intra-genomic data comparisons. Each result file on the server is scanned according to the filtering options you set. A brief description of the filter types follows:
String query: filters results at the gene annotation level. Since ambiguities often occur at this level, a collection of uniform annotations has been combined beforehand with COG definitions, in order to perform an exhaustive and impartial search. If an organism has more than one chromosome, the chromosome can also be selected.
COG path: if this option is chosen, the string query field is ignored and a collection of COG codes is used instead to define pre-grouped sets of genes that constitute, for example, metabolic pathways. If you do not feel comfortable with this approach, you can combine your own genes in the String query by separating the entries with the pipe sign "|", which indicates alternative search options.
Threshold value: after selecting the gene type, you can choose to isolate only those genes that have CAI or NOST values higher or lower than your preferred threshold. Remember that CAI values range from 0 to 1, while NOST values are unbounded and can be both positive and negative.
Result type: CAI or NOST. This determines which values the threshold you select is applied to.
Organism query: although it looks like just a file name filter, it is actually much more. For example, querying "escherichia" will apply all the above-mentioned filters to all available Escherichia strains, performing a sort of genus-level comparative genomic analysis. To make this filter more flexible, we collected the taxonomy of each organism so that you can search by taxonomic keys instead of by organism name: every organism belonging to the filtered taxonomy will be displayed.
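
The way these filters combine can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the server's actual code: the GeneRecord type, the filter_genes function and the sample records are all hypothetical, and matching is assumed to be a case-insensitive substring test.

```python
from dataclasses import dataclass


@dataclass
class GeneRecord:
    # Hypothetical record layout; the real result files may differ.
    organism: str
    annotation: str
    cai: float
    nost: float


def filter_genes(records, string_query, organism_query,
                 result_type="CAI", threshold=None, above=True):
    """Apply organism, annotation and threshold filters in sequence."""
    # The pipe sign separates alternative search terms.
    alternatives = [t.strip().lower() for t in string_query.split("|") if t.strip()]
    selected = []
    for rec in records:
        # Organism query: assumed to be a case-insensitive substring match.
        if organism_query.lower() not in rec.organism.lower():
            continue
        # String query: keep the gene if any alternative matches its annotation.
        annotation = rec.annotation.lower()
        if alternatives and not any(t in annotation for t in alternatives):
            continue
        # Result type decides which value the threshold is compared with.
        value = rec.cai if result_type == "CAI" else rec.nost
        if threshold is not None:
            if above and value <= threshold:
                continue
            if not above and value >= threshold:
                continue
        selected.append(rec)
    return selected


# Made-up records, for illustration only.
records = [
    GeneRecord("Escherichia coli K12", "ribosomal protein L1", 0.82, 2.1),
    GeneRecord("Escherichia coli O157", "elongation factor Tu", 0.79, 1.9),
    GeneRecord("Escherichia coli K12", "hypothetical protein", 0.30, -0.8),
    GeneRecord("Bacillus subtilis 168", "ribosomal protein L2", 0.75, 1.7),
]
hits = filter_genes(records, "ribosomal|elongation factor", "escherichia",
                    result_type="CAI", threshold=0.5, above=True)
```

With these sample records, hits contains the two Escherichia genes whose annotation matches one of the alternatives and whose CAI exceeds 0.5.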

PLOT SECTION

The CAIAP server includes a plotting engine that lets you view the available data graphically. Three graph types are defined:

Bars and Lines: used to compare genomic CAI distributions or CUTs of up to 5 different organisms.
Pie: provided with a threshold mechanism, it is useful for categorizing or functionally clustering the CAI or NOST values of a specific organism.
XY: given an organism, it lets you observe how different variables (such as local GC%, CAI, NOST, CoA axes) covary by plotting them on two perpendicular axes.

BROWSE SECTION

Browse data
These browsable files are the output of programs included in CAIAP and are all derived from the main result file (the CAI file), which reports the CAI values of all genes for each organism. We did not include this file directly on the server because the NOST file already contains those data.

The following file types are available for download:

file SET: from AutoHighXP, it contains the reference set of genes that are considered to be highly expressed according to the algorithm of Carbone et al. (2003). Each file has a variable number of entries, since the refset should be 1% of the genes of the entire genome.

file CUT: from CUTabler, it contains the codon usage table of the reference set (the one used for generating CAI files).

file gCUT: from CUTabler, it contains the codon usage table of the whole genome and is useful for comparison purposes.

file NOST: from CAInost, it reports both the CAI values built using the refset CUT and their expression as 'number of standard deviations from the average' (NOST), a useful measure for inter-genomic comparisons.
file gNOST: from CAInost, it reports both the CAI values built using the genomic CUT and their expression as 'number of standard deviations from the average' (NOST), a useful measure for inter-genomic comparisons.
file DIST: from CAIdist, it reports the distribution (in classes of size 0.01) of the CAI values of all the genes in the specified organism's genome. In addition, a simulated normal distribution with the same average and standard deviation as the calculated distribution is reported for all classes.

file gDIST: from CAIdist, it reports the distribution (in classes of size 0.01) of the gCAI values of all the genes in the specified organism's genome. In addition, a simulated normal distribution with the same average and standard deviation as the calculated distribution is reported for all classes.
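
To make the NOST and DIST definitions concrete, here is a minimal Python sketch. It is not the code of CAInost or CAIdist: the function names are hypothetical, the population standard deviation is an assumption (the programs may use the sample one), and the normal curve is evaluated as the expected count per class, which is one plausible reading of "normal distribution simulation".

```python
from statistics import NormalDist, mean, pstdev


def nost_values(cai_values):
    """Express each CAI value as a number of standard deviations from the average."""
    avg = mean(cai_values)
    sd = pstdev(cai_values)  # assumption: population standard deviation
    if sd == 0:
        raise ValueError("all CAI values are identical, NOST is undefined")
    return [(v - avg) / sd for v in cai_values]


def cai_distribution(cai_values, class_width=0.01):
    """Count CAI values per class and compute the matching normal expectation."""
    n_classes = round(1.0 / class_width)
    observed = [0] * n_classes
    for v in cai_values:
        # Rounding guards against floating-point artefacts such as 0.29 / 0.01.
        idx = min(int(round(v / class_width, 9)), n_classes - 1)
        observed[idx] += 1
    normal = NormalDist(mean(cai_values), pstdev(cai_values))
    expected = [
        len(cai_values)
        * (normal.cdf((i + 1) * class_width) - normal.cdf(i * class_width))
        for i in range(n_classes)
    ]
    return observed, expected


# Made-up CAI values, for illustration only.
cai = [0.2, 0.4, 0.6, 0.8]
nost = nost_values(cai)
observed, expected = cai_distribution(cai)
```

Because NOST is centered on each genome's own average and scaled by its own spread, values from different organisms end up on a comparable scale, which is why it suits inter-genomic comparisons better than raw CAI.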

Browse plot
These plots are simple, rough graphical representations of the data available in the previous section. We put them online for descriptive purposes only.

type CAI: this unusual plot reports a lot of information. Positive red dots are the CAI values of all genes on the plus strand of the genome; negative points represent genes on the minus strand. The absolute value of the y coordinate is the CAI. With this kind of plot, one can easily see whether codon preferences differ between strands. The green line represents all the genes sorted by CAI value, a useful estimator of the central tendency of the whole genome.

type NOST: red dots are the NOST values of genes on the plus strand; green dots mark genes on the minus strand. That's all.

type CUT: for each codon, the red bar indicates the codon weight (from 0 to 1) in the genome CUT, while the green bar shows the weight in the computed refset. It is very useful for comparing codon preferences at the genomic and refset levels.

type DIST: the main plot of the DIST file (see above). The genome distribution of CAI values is reported, and the normal distribution with the same average and standard deviation is also plotted in green.

type DISTcomp: by joining the distribution plots of CAI values calculated with the genomic CUT and with the refset CUT, very different shapes can be observed.

type SET(pie): a simple pie chart indicating the functional classification (according to COG definitions) of the genes present in the refset of highly expressed genes.

type NOST(pie): this pie chart contains the functional classification of all genes in the genome that have a CAI value more than 1.5 standard deviations above the genomic average (we call this value 1.5 NOST). This is probably useful for roughly estimating metabolic aspects of organisms.

type CAIvsGC3: this plot is useful for understanding the compositional bias of the genome, which can be skewed and therefore cause an unbalanced CAI calculation. Some authors look at this graph to estimate CAI confidence.

type CoA: this Correspondence Analysis plot is generated from the results of J. Peden's CodonW program. It is intended to give a multivariate statistical analysis of codon usage and can be used as an estimator of CAI confidence, since it identifies the most important components of codon usage and relates them to gene expressivity (for more information, see the CodonW home page).
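
The codon weights shown in the CUT plot and the CAI values derived from them can be illustrated with a short Python sketch. It follows the classic Sharp and Li definition (weight = codon count divided by the count of the most used synonymous codon; CAI = geometric mean of the weights along a gene) and is not taken from the CAIAP sources. The function names, the three-amino-acid excerpt of the genetic code and the 0.5 pseudo-count for unseen codons are all assumptions made for this example.

```python
import math
from collections import Counter

# Excerpt of the standard genetic code; a real table covers all amino acids.
SYNONYMOUS = {
    "Lys": ["AAA", "AAG"],
    "Glu": ["GAA", "GAG"],
    "Phe": ["TTT", "TTC"],
}


def codon_weights(reference_genes):
    """Weight of each codon relative to its most used synonym in the reference set."""
    counts = Counter()
    for seq in reference_genes:
        for i in range(0, len(seq) - 2, 3):
            counts[seq[i:i + 3]] += 1
    weights = {}
    for codons in SYNONYMOUS.values():
        # Assumption: codons never seen get a pseudo-count of 0.5 to avoid log(0).
        adjusted = {c: counts[c] or 0.5 for c in codons}
        top = max(adjusted.values())
        for c in codons:
            weights[c] = adjusted[c] / top
    return weights


def cai(seq, weights):
    """Geometric mean of the weights of the codons of one gene."""
    logs = []
    for i in range(0, len(seq) - 2, 3):
        w = weights.get(seq[i:i + 3])
        if w is not None:  # codons outside the table are skipped
            logs.append(math.log(w))
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


# Made-up reference set, for illustration only.
weights = codon_weights(["AAAAAAAAGGAAGAATTC"])
optimal = cai("AAAGAATTC", weights)  # uses only the preferred codons
biased = cai("AAGAAG", weights)      # uses only the less frequent Lys codon
```

Building the weights from the refset or from the whole genome is what distinguishes the CUT and gCUT tables, and hence the CAI and gCAI values, described in the Browse data section.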

DOWNLOAD SECTION

Since CAIAP is updated often, you can download the latest version from here.
We also maintain a manual and work to keep it "synchronized" with the programs. Any help with this will be appreciated.
Finally, we prepared the local BLAST database needed to use the COGClust program of CAIAP. Since this file is rather large (approx. 50 MB), we keep it separate from the "light" CAIAP.

Last modified: June 11 2007

For any problems, questions or help, please contact matteo_dot_ramazzotti_at_unifi_dot_it
