Abstract:
BACKGROUND: Microarray technology has matured over the past fifteen years into a cost-effective solution with
established data analysis protocols for global gene expression profiling. The Agilent-016047 maize 44 K microarray
was custom-designed from EST sequences, but only reporter sequences with EST accession numbers are publicly
available. The following information is lacking: (a) reporter-to-gene model match, (b) number of reporters per gene
model, (c) potential for cross hybridization, (d) sense/antisense orientation of reporters, (e) position of reporter on
B73 genome sequence (for eQTL studies), and (f) functional annotations of genes represented by reporters. To
address this, we developed a strategy to annotate the Agilent-016047 maize microarray, and built a publicly
accessible annotation database.
DESCRIPTION: Genomic annotation of the 42,034 reporters on the Agilent-016047 maize microarray was based on
BLASTN results of the 60-mer reporter sequences and their corresponding ESTs against the maize B73 RefGen v2
“Working Gene Set” (WGS) predicted transcripts and the genome sequence. The agreement between the EST, WGS
transcript and gDNA BLASTN results was used to assign the reporters to six genomic annotation groups. These
annotation groups were: (i) “annotation by sense gene model” (23,668 reporters); (ii) “annotation by antisense gene
model” (4,330); (iii) “annotation by gDNA” without a WGS transcript hit (1,549); (iv) “annotation by EST”, in which case
the EST from which the reporter was designed, but not the reporter itself, has a WGS transcript hit (3,390); (v)
“ambiguous annotation” (2,608); and (vi) “inconclusive annotation” (6,489). Functional annotations of reporters were
obtained by BLASTX and Blast2GO analysis of corresponding WGS transcripts against GenBank.
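As a quick consistency check on the figures above, the following minimal sketch tallies the six annotation groups and expresses each as a share of all reporters. The counts and the total of 42,034 are taken from the text; the short group labels and the function name `group_shares` are illustrative assumptions and are not part of the database.

```python
# Reporter counts per annotation group, as stated in the text above.
# The dictionary keys are shortened labels chosen for this sketch.
GROUP_COUNTS = {
    "sense gene model": 23668,
    "antisense gene model": 4330,
    "gDNA only": 1549,
    "EST only": 3390,
    "ambiguous": 2608,
    "inconclusive": 6489,
}
TOTAL_REPORTERS = 42034


def group_shares(counts, total):
    """Return each group's share of all reporters, in percent (one decimal)."""
    if sum(counts.values()) != total:
        raise ValueError("group counts do not add up to the reporter total")
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


shares = group_shares(GROUP_COUNTS, TOTAL_REPORTERS)
for name, pct in shares.items():
    print(f"{name:22s} {pct:5.1f}%")
```

The six groups are mutually exclusive and together account for every reporter, so the check raises an error if the counts and the total ever disagree.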
The annotations are available in the Maize Microarray Annotation Database http://MaizeArrayAnnot.bi.up.ac.za/, as
well as through a GBrowse annotation file that can be uploaded to the MaizeGDB genome browser as a custom
track.
The database was used to re-annotate lists of differentially expressed genes reported in case studies of published
work using the Agilent-016047 maize microarray. Up to 85% of reporters in each list could be annotated with
confidence by a single gene model; however, up to 10% of reporters had ambiguous annotations. Overall, more
than 57% of reporters gave a measurable signal in tissues as diverse as anthers and leaves.
CONCLUSIONS: The Maize Microarray Annotation Database will assist users of the Agilent-016047 maize microarray in
(i) refining gene lists for global expression analysis, and (ii) confirming the annotation of candidate genes before
functional studies.