Structure, expression, and evolution of a heat shock gene locus in Caenorhabditis elegans that is flanked by repetitive elements.

A locus containing two hsp16 genes in Caenorhabditis elegans has been characterized by DNA sequencing. Each gene encodes a 16-kDa polypeptide that is expressed following heat induction. The two genes, designated hsp16-2 and hsp16-41, are arranged in divergent orientations, and each contains a single intron, of 46 and 58 base pairs, respectively. Although both gene transcripts are spliced efficiently in vivo, hsp16-41 corresponds to a previously isolated cDNA that contains an unspliced intron sequence. The 5'-noncoding regions of both genes contain TATA boxes preceded 18 or 19 nucleotides upstream by a heat shock regulatory sequence. The 3'-noncoding regions contain polyadenylation signals (AATAAA) either downstream of (hsp16-2) or immediately adjacent to (hsp16-41) a sequence capable of forming a hairpin. This pair of hsp16 genes is flanked by three copies of an approximately 200-bp dispersed repetitive element (two copies on one side of the locus and a single copy on the other), which occurs in at least 70 copies throughout the C. elegans genome and has been designated CeRep-16. Together with data described previously (Russnak, R. H., and Candido, E. P. M. (1985) Mol. Cell. Biol. 5, 1268-1278), the results presented here define a family of four distinct, related small heat shock protein genes. These are arranged in divergently transcribed pairs at two loci. The hsp16-48/41 genes code for one class of HSP16, 143 amino acid residues long, while the hsp16-1/2 genes encode the other class, which is 2 amino acid residues longer. Thus each locus codes for the two major types of HSP16. The two loci differ in a number of respects, including the presence of a tandem inverted duplication of two heat shock protein genes at one locus, and of repetitive elements at the other.
Sequence comparisons allow us to propose a scheme for the evolution of the four genes and reveal conserved features of noncoding regions which may be involved in the regulation of their transcription, RNA processing, or translation. Using locus-specific hybridization probes, we have found that the genes at locus hsp16-2/41 are expressed at levels approximately 20- to 40-fold higher than those at locus hsp16-1/48.[1]


