Help wanted: how do I design my webpage to sort through data from a database?

I apologize in advance for the long question. I am designing a webpage for a DNA research lab and I am stuck on one particular point. The webpage accesses a single MySQL database, but the database contains dozens of tables. Each table corresponds to one experiment. The tables each follow the same general format: one column lists DNA gene names and the next column displays the amount of the DNA gene present. However, each table contains a different set of genes (the genes in one experiment aren't always the same as from another experiment).

At this point, I want the user to input which gene he is interested in and then the webpage will display which experiments have data for that gene. Basically, I need to figure out which MySQL tables in the database have the data that I want.

The way I see it, I need to cycle through each table in the MySQL database and do a SELECT WHERE query on each table. If it returns something, it is a table that I want and I will add the table name to an array. If not, I just move on to the next table.

What is the best way to do this, and what languages do I need? I will use HTML and PHP for the webpage and MySQL for the database queries. However, what can I use to cycle through the tables? I was thinking JavaScript or ASP?

It sounds like you may need to redesign your database. I think you only need one table, and the "gene set" that currently distinguishes the tables should be a non-unique key on that table.

Then, you should be able to query that single table WHERE the gene set equals the set you are looking for...

Since you are already planning to use PHP, it is a good choice for performing the logic that you need.

Do you have control over the structure of the database? If you do, it may be easier to restructure the database itself to support the types of queries that you need. For instance, you can have a single table listing the experiments, another table listing the genes, and a third table connecting the experiment to the gene and the other data that goes with it. This would avoid all the searching through tables for data that you have to do. The advantage would then be that as more experiments are added the application would continue to work without modifying the PHP code.
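A minimal sketch of that three-table layout, written as the statements a PHP setup script might run. Every table and column name here (experiments, genes, experiment_genes, amount) and every column type is an assumption for illustration, not something taken from the existing database.

```php
<?php
// Hypothetical schema for the three-table design. All names and types
// below are assumptions; adjust them to the lab's real data.
function schemaStatements(): array
{
    return [
        // One row per experiment.
        "CREATE TABLE experiments (
            id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
            experiment_name VARCHAR(255) NOT NULL
        ) ENGINE=InnoDB",

        // One row per distinct gene.
        "CREATE TABLE genes (
            id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
            gene_name VARCHAR(64) NOT NULL UNIQUE
        ) ENGINE=InnoDB",

        // One row per (experiment, gene) measurement.
        "CREATE TABLE experiment_genes (
            experiment_id INT UNSIGNED NOT NULL,
            gene_id INT UNSIGNED NOT NULL,
            amount DOUBLE NOT NULL,
            PRIMARY KEY (experiment_id, gene_id),
            FOREIGN KEY (experiment_id) REFERENCES experiments (id),
            FOREIGN KEY (gene_id) REFERENCES genes (id)
        ) ENGINE=InnoDB",
    ];
}
```

With a PDO connection in $pdo, something like foreach (schemaStatements() as $sql) { $pdo->exec($sql); } would create the tables. Adding a new experiment then means inserting rows, not creating another table.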

Let's assume that you can't change your database structure. You can get a list of all of the tables in your database using the query:

SHOW TABLES

Next, you need to know which tables are for experiments you care about. You'll probably have to do some kind of string matching -- hopefully they have names that start with "experiment_" or something.

Then you just run a SELECT statement looking for that gene in the table. Finally, you somehow map the experiment names to the table names, and display those experiment names. The code would be something like:

$result = mysql_query("show tables");

$tables = array();

while ($row = mysql_fetch_array($result)) {
   // Determine whether this is an experiment table.
   if (preg_match("/^experiment_/", $row[0])) {
       $tables[] = $row[0];
   }
}

$tables_with_gene = array();

// As you can see, every search runs one query per experiment table.
foreach ($tables as $table_name) {
    // Look for the gene in this experiment's table.
    $result = mysql_query("select gene_name from $table_name where gene_name = '$gene_name'");

    if (mysql_num_rows($result)) {
        $tables_with_gene[] = $table_name;
    }
}

// Now you look up the experiment names
$experiment_names = array();

foreach ($tables_with_gene as $table_name) {
    $result = mysql_query("select experiment_name from experiments where table_name = '$table_name'");
    while ($row = mysql_fetch_array($result)) {
        $experiment_names[] = $row[0];
    }
}

At the end of all this, $experiment_names has a list of the experiments that include the gene in question.

Note that if the gene name is user input you'll want to sanitize it first to avoid SQL injection.
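One way to handle that is a prepared statement, so the gene name is never pasted into the SQL string. The sketch below uses PDO, partly because the mysql_* functions were removed in PHP 7. The function name findTablesWithGene and the "experiment_" naming rule are assumptions carried over from the example above. Table names cannot be bound as parameters, so they are checked against a strict pattern instead.

```php
<?php
// Hypothetical helper: returns the experiment tables that contain $geneName.
// Assumes each table has a gene_name column, as in the example above.
function findTablesWithGene(PDO $pdo, array $tables, string $geneName): array
{
    $matches = [];

    foreach ($tables as $table) {
        // Identifiers cannot be bound, so only accept names that match
        // the assumed "experiment_" convention and nothing else.
        if (!preg_match('/^experiment_[A-Za-z0-9_]+$/', $table)) {
            continue;
        }

        // The gene name is bound as a parameter, never interpolated.
        $stmt = $pdo->prepare(
            "SELECT 1 FROM `$table` WHERE gene_name = ? LIMIT 1"
        );
        $stmt->execute([$geneName]);

        // fetchColumn() returns false when no row matched.
        if ($stmt->fetchColumn() !== false) {
            $matches[] = $table;
        }
    }

    return $matches;
}
```

The table list could come from $pdo->query('SHOW TABLES')->fetchAll(PDO::FETCH_COLUMN), and the gene name straight from the submitted form value.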

But yeah, you probably want one table that looks like:

experiment_id
gene_name
gene_frequency

Then you could do it all with one query:

SELECT e.experiment_name FROM experiment_data d JOIN experiments e 
ON d.experiment_id = e.id
WHERE d.gene_name = 'your gene name'
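From PHP, that single query might be wrapped roughly as follows. This is only a sketch: the function name experimentsForGene is made up, it assumes a PDO connection, and it assumes the experiment_data and experiments tables described above. DISTINCT and ORDER BY are additions so each experiment is listed once, in a stable order.

```php
<?php
// Hypothetical helper: lists the experiments that have data for a gene,
// using the single-table layout (experiment_data joined to experiments).
function experimentsForGene(PDO $pdo, string $geneName): array
{
    $stmt = $pdo->prepare(
        'SELECT DISTINCT e.experiment_name
           FROM experiment_data d
           JOIN experiments e ON d.experiment_id = e.id
          WHERE d.gene_name = :gene
          ORDER BY e.experiment_name'
    );

    // The user-supplied gene name is bound, not concatenated.
    $stmt->execute([':gene' => $geneName]);

    // FETCH_COLUMN returns a flat array of experiment names.
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}
```

An index on experiment_data.gene_name would likely help this lookup as the table grows.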

You should really consider that redesign people have mentioned if at all possible. Your data format has some real problems. If it were not done this way you wouldn't have this problem. 28000 records is quite small in database terms and it doesn't matter if the gene is involved in more than one experiment. That's really the whole point of multiple fields in databases. They are meant to work with data of exactly that type. You just need another field denoting which experiment the data in the amount column refers to.

So rather than....

-----------------
| Gene | Amount |
-----------------
| abc  |   123  |
| xyz  |   789  |
-----------------

You have:

------------------------------
| Experiment | Gene | Amount |
------------------------------
|     ex1    | abc  |   123  |
|     ex2    | abc  |   456  |
|     ex2    | xyz  |   789  |
|     ex1    | xyz  |   058  |
------------------------------

etc, etc, etc

Then if you need to see just the data from ex1 it's:

SELECT *
FROM tblGeneData
WHERE Experiment = 'ex1'

That query will give you the same rows (plus the Experiment column) as:

SELECT *
FROM tblExperiment1

This is how relational databases are meant to work. They are not generally meant to keep the same type of data in two different tables just because there is a differentiating property.
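If you do go that route, the existing per-experiment tables can be folded into the combined table with INSERT ... SELECT. A rough sketch, assuming a PDO connection, a combined table named tblGeneData with Experiment, Gene and Amount columns as drawn above, and old tables that each have Gene and Amount columns. The function name migrateExperimentTable is made up.

```php
<?php
// Hypothetical one-off migration: copies every row of one per-experiment
// table into tblGeneData, tagging each row with the experiment label.
function migrateExperimentTable(PDO $pdo, string $table, string $experiment): int
{
    // Table names cannot be bound as parameters, so validate strictly.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $table)) {
        throw new InvalidArgumentException("Unexpected table name: $table");
    }

    $stmt = $pdo->prepare(
        "INSERT INTO tblGeneData (Experiment, Gene, Amount)
         SELECT ?, Gene, Amount FROM `$table`"
    );
    $stmt->execute([$experiment]);

    // Number of rows copied.
    return $stmt->rowCount();
}
```

Calling migrateExperimentTable($pdo, 'tblExperiment1', 'ex1') once per old table would fill the new table; the old tables can be dropped after the counts have been checked.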

EDIT:

I should also point out that you would generally want an additional field to serve as a unique key for the table. I would add a field called "Id" and make it an autonumber (AUTO_INCREMENT in MySQL). You could use a compound key made up of your data, but the generally accepted "best practice" is to have a separate unique key field that is meaningless outside the inner workings of the database. This field would be the primary key for the table.
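For an existing tblGeneData table in MySQL, adding that key could look something like the statements below, shown as a PHP helper so they can be run through PDO. The function name is made up, and the extra unique key on (Experiment, Gene) is my own assumption that a gene appears at most once per experiment; leave it out if that does not hold.

```php
<?php
// Hypothetical MySQL statements for adding a surrogate primary key.
function surrogateKeyStatements(): array
{
    return [
        // Meaningless auto-numbered Id, used as the primary key.
        "ALTER TABLE tblGeneData
            ADD COLUMN Id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST",

        // Assumption: one row per gene per experiment. This keeps the
        // compound key's uniqueness guarantee alongside the surrogate Id.
        "ALTER TABLE tblGeneData
            ADD UNIQUE KEY uq_experiment_gene (Experiment, Gene)",
    ];
}
```

Each statement can be passed to $pdo->exec() in turn.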