The nucleotide sequence of a 1,163-base-pair fragment that encodes the entire thyA gene of Escherichia coli K-12 was determined. The strategy involved sequence determination of both DNA strands by using overlapping deletions that had been generated in vitro from the two ends of the fragment with BAL-31 nuclease. The amino-terminal sequence of thymidylate synthase (5,10-methylenetetrahydrofolate:dUMP C-methyltransferase, EC 2.1.1.45), the product of the thyA gene, located the 792-base-pair open reading frame, which codes for the 264 amino acid residues of this enzyme. The amino acid sequence deduced from the nucleotide data was confirmed to the extent of 40% by partial sequence analysis of the enzyme purified from extracts of the amplified cloned gene. Transcriptional and translational control areas were apparent in the regions flanking the structural gene. The 5-fluorodeoxyuridylate-binding residue of the active site was identified as cysteine-146. Comparison of the E. coli and Lactobacillus casei synthase sequences reveals consistent homology (62%) over extensive regions. This homology is particularly striking in a very hydrophobic region bordering cysteine-146. In the two enzymes, this region, which probably defines the active site, is 82% homologous. However, a dramatic difference between the two sequences is reflected by the surprising finding that a 51-amino-acid stretch, present midway through the L. casei sequence, is completely absent from the E. coli enzyme.
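
The two quantitative relationships stated above (a 792-base-pair reading frame corresponding to 264 residues, and percent homology between aligned sequences) can be illustrated with a minimal Python sketch. The function names `codons_in_orf` and `percent_identity` are hypothetical, the example peptides are invented toy strings rather than thyA data, and the identity measure shown (identical residues over gap-free aligned columns) is one common convention that may differ from the procedure used in this work.

```python
def codons_in_orf(orf_bp: int) -> int:
    """Number of codons in a reading frame of orf_bp base pairs.

    Assumes orf_bp counts coding codons only (no stop codon),
    which matches 792 bp <-> 264 residues as stated in the abstract.
    """
    if orf_bp % 3 != 0:
        raise ValueError("reading-frame length must be a multiple of 3")
    return orf_bp // 3


def percent_identity(a: str, b: str) -> float:
    """Percent identical residues over aligned columns without gaps.

    a and b are pre-aligned sequences of equal length; '-' marks a gap.
    This is an assumed convention, not necessarily the authors' method.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    print(codons_in_orf(792))                    # reading frame from the abstract
    print(percent_identity("MKQY-L", "MKEYAL"))  # toy alignment, not real data
```

Under this convention, a long insertion present in only one sequence (such as the 51-residue stretch noted for L. casei) contributes gap columns and is excluded from the percentage rather than counted as mismatches.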