The set of genes involved in a biological process (e.g., circadian clock)
are sometimes modulated by two complementary transcription factors, i.e. one
transcription factor regulates (activates or represses) a subset of genes, and the
other factor regulates the remaining genes. To address this problem, we consider
the following twin motifs finding problem. Given a set t DNA sequences
Dna={Dna1, Dna2,..., Dnat}, two sets of k-mers, motif1 and motif2, are called a
pair of twin motifs in Dna, if Dna can be partitioned into two subsets Dna1 and
Dna2, and motif1 and motif2 each contains one k-mer from each sequence in Dna1
and Dna2, respectively. The twin motifs finding problem is defined as:
Input: a set t DNA sequences Dna={Dna1, Dna2,..., Dnat} and integer k.
Output: a pair of twin motif motif1 and motif2, minimizing score(motif1) +
score(motif2).
Note: the score of the motif is defined as the sum of the hamming distances
between each k-mer and the consensus k-mer (slides 27-30 of Chapter 2). Devise
an algorithm to solve the twin motif finding problem.