BAC-based Comparative Genomics

We are engaged in two comparative genomics projects at JCVI.

1. In collaboration with Ian Bancroft at the John Innes Centre, we have sequenced a set of orthologous BACs derived from a triplicated region of the A genome of B. rapa, the C genome of B. oleracea, and the A and C genomes of B. napus. BAC sequences and annotation can be found in GenBank, and the published analysis can be found here.

2. Under an NSF-funded Comparative Genomics project, we are sequencing BACs derived from triplicated regions of Brassica oleracea (TO1000) and from the corresponding regions of the Sisymbrium irio genome, which did not undergo triplication. Annotation of these BACs, searchable by keyword, gene identifier, or chromosomal location, can be found here.

BAC Ends

As part of the comparative genomics project, we have sequenced BAC ends from Brassica oleracea TO1000 and Sisymbrium irio. These can be searched by BLAST and can be downloaded from our FTP site.

Whole Genome Shotgun Sequencing

Sanger Sequencing

The first set of whole genome shotgun sequences was generated by Sanger technology under the aegis of the Arabidopsis Genome Project. Approximately 0.5x coverage of the B. oleracea TO1000 genome was generated and used to improve Arabidopsis genome annotation, including the creation of new gene models.

Towards a complete genome sequence of B. oleracea TO1000

A multinational consortium has identified B. oleracea TO1000 as the strain for the generation of a reference genome sequence. The strategy will combine existing Sanger sequences, including BAC ends, with 454 and Illumina sequencing of fragment libraries and paired-end libraries of various sizes.

Brassica Transcript Assembly and Microarray Development

All publicly available Brassica ESTs, as well as some 454 transcript data, have been assembled into contigs, which were then used to design an all-Brassica microarray. The assembled sequences can be downloaded from our FTP site.

Repeat Databases

A repeat database constructed by similarity to known repeat elements can be found as part of the TIGR Repeat Databases.

For comments or questions, send mail to [email protected].