MySQL performance: one table or two?

I use PHP and MySQL.

Let's say I have a database table with 10,000 rows. Which of the cases below is best performance-wise?

Case 1

Two tables, products and categories.

SELECT * FROM products INNER JOIN categories ON products.category_id = categories.id

Products

id
name
category_id

Categories

id
name

Case 2

One table, products, containing all the data.

SELECT * FROM products

Products

id
name
category_name

Question(s)

  • Which of these cases has the best performance?
  • As a guess, would it take long to fetch 10,000 rows with a structure like this?
  • Are there any pitfalls with either case?

From my perspective, Case 1 is the "correct" way of doing it, but I would save some development time by using Case 2. Maybe it performs better too?

This is too long for a comment. The first is the correct (i.e. SQLish) way of storing this data. It allows you to do the following:

  • Validate the category names as they are inserted and updated, using standard foreign key relationships.
  • Change a category name and have it affect all products.
  • Include other information about a category, such as short names, long descriptions, date added, and so on.
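A minimal sketch of the first two points, using Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server. The column types, the sample rows, and the helper names (build_case1, insert_is_rejected, product_categories) are my assumptions, not part of the question. In MySQL you would typically declare the constraint on an InnoDB table with a separate FOREIGN KEY (category_id) REFERENCES categories(id) clause.

```python
import sqlite3


def build_case1():
    """Create the two-table (Case 1) schema in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE categories (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE products (
            id          INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            category_id INTEGER NOT NULL REFERENCES categories(id)
        );
        INSERT INTO categories (id, name) VALUES (1, 'drinks');
        INSERT INTO products (id, name, category_id)
        VALUES (1, 'cola', 1), (2, 'juice', 1);
    """)
    return conn


def insert_is_rejected(conn):
    """Try to insert a product whose category does not exist."""
    try:
        conn.execute(
            "INSERT INTO products (name, category_id) VALUES ('ghost', 999)"
        )
    except sqlite3.IntegrityError:
        return True  # the foreign key stopped the bad row
    return False


def product_categories(conn):
    """Return (product name, category name) pairs via the Case 1 join."""
    return conn.execute("""
        SELECT products.name, categories.name
        FROM products
        INNER JOIN categories ON products.category_id = categories.id
        ORDER BY products.id
    """).fetchall()


conn = build_case1()
rejected = insert_is_rejected(conn)
print(rejected)  # True

# Renaming the category once is visible for every product that uses it.
conn.execute("UPDATE categories SET name = 'beverages' WHERE id = 1")
renamed = product_categories(conn)
print(renamed)  # [('cola', 'beverages'), ('juice', 'beverages')]
```

Adding a description or a date-added column later would only touch categories; the products table stays as it is.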

Performance is not the main consideration. The SQL engine takes care of performance through the use of fancy join algorithms and indexes. It does this so you can structure the data in the most sensible and maintainable way for your application.

That said, which performs better depends on a number of factors (how long the category names are, how many different names there are, how wide the product record is). Differences in performance between the two scenarios are probably not at all important in getting an application to work optimally.
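If you want numbers rather than guesses, it is easy to measure. Below is a rough sketch that builds both layouts with 10,000 rows and times the two queries from the question. It uses an in-memory SQLite database through Python's sqlite3 module purely for convenience, so the absolute timings say nothing about your MySQL server. The table names products_case1 and products_case2, the 20 made-up categories, and the timed_fetch helper are all assumptions for illustration.

```python
import sqlite3
import time

N = 10_000            # row count from the question
CATEGORY_COUNT = 20   # assumed number of distinct categories

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE products_case1 (
        id          INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        category_id INTEGER NOT NULL REFERENCES categories(id)
    );
    CREATE TABLE products_case2 (
        id            INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        category_name TEXT NOT NULL
    );
""")

# Fill both layouts with the same logical data.
conn.executemany(
    "INSERT INTO categories (id, name) VALUES (?, ?)",
    [(c, f"category {c}") for c in range(1, CATEGORY_COUNT + 1)],
)
conn.executemany(
    "INSERT INTO products_case1 (name, category_id) VALUES (?, ?)",
    [(f"product {i}", i % CATEGORY_COUNT + 1) for i in range(N)],
)
conn.executemany(
    "INSERT INTO products_case2 (name, category_name) VALUES (?, ?)",
    [(f"product {i}", f"category {i % CATEGORY_COUNT + 1}") for i in range(N)],
)


def timed_fetch(sql):
    """Run a query, fetch every row, and return (rows, elapsed seconds)."""
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    return rows, time.perf_counter() - start


case1_rows, case1_seconds = timed_fetch("""
    SELECT * FROM products_case1
    INNER JOIN categories ON products_case1.category_id = categories.id
""")
case2_rows, case2_seconds = timed_fetch("SELECT * FROM products_case2")

print(f"Case 1 (join):      {len(case1_rows)} rows in {case1_seconds:.4f} s")
print(f"Case 2 (one table): {len(case2_rows)} rows in {case2_seconds:.4f} s")
```

Run it a few times and vary N and the name lengths; a single run is too noisy to draw conclusions from.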

Case 1 is better than Case 2, because Case 2 leaves you with duplicated data: the same value appears many times in the "category_name" field. This is bad for two reasons. First, the redundant data makes the table larger than it needs to be, which can slow queries down. Second, updates become less efficient. Suppose you want to change a category name from drinks to drink: Case 2 has to update every product row in that category, while Case 1 updates a single row in categories. So, to answer your first question, Case 1 is the way to do it.

As you can imagine from my answer to the first question, Case 1 should also be the faster of the two, because Case 2 carries unnecessary data.

As for your last question: as I explained in my answer to the first one, a pitfall of Case 2 is that changing a category name takes far more work than in Case 1. To my knowledge, Case 1 has no pitfalls.
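The rename cost is easy to see by counting the rows each UPDATE touches. This is only a sketch: it uses Python's sqlite3 module instead of MySQL so it can run anywhere, and the table names products_case1 and products_case2 plus the 1,000 sample products are assumptions of mine.

```python
import sqlite3

PRODUCT_COUNT = 1000  # assumed number of products in the 'drinks' category

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Case 1: the category name is stored once
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE products_case1 (
        id          INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        category_id INTEGER NOT NULL REFERENCES categories(id)
    );
    -- Case 2: the category name is repeated in every product row
    CREATE TABLE products_case2 (
        id            INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        category_name TEXT NOT NULL
    );
    INSERT INTO categories (id, name) VALUES (1, 'drinks');
""")

names = [(f"product {i}",) for i in range(PRODUCT_COUNT)]
conn.executemany(
    "INSERT INTO products_case1 (name, category_id) VALUES (?, 1)", names
)
conn.executemany(
    "INSERT INTO products_case2 (name, category_name) VALUES (?, 'drinks')",
    names,
)

# rowcount reports how many rows each UPDATE statement modified.
case1_updated = conn.execute(
    "UPDATE categories SET name = 'drink' WHERE name = 'drinks'"
).rowcount
case2_updated = conn.execute(
    "UPDATE products_case2 SET category_name = 'drink' "
    "WHERE category_name = 'drinks'"
).rowcount

print(case1_updated, case2_updated)  # 1 1000
```

In Case 2 the work grows with the number of products in the category; in Case 1 it stays at one row.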

I think the question is database-design centric.

Now, to answer your questions:

  1. Which case will give the best performance?

    Answer - Case 1.

    Why?

    • It follows the basic SQL rule of normalization, which will help you in the long run. If you later have more than 10,000 rows, redundant data in a single table will become tedious to handle.
    • If you index the key columns, join queries will run faster over a large number of rows.
    • Two separate tables reduce data redundancy.

    Why not case 2?

    The single table violates the normalization rules. Your example shows this: the category name is repeated in every product row.

  2. Will it take long to get 10,000 rows with a structure like this?

    With case 1: It will take slightly longer than Case 2 because a join is involved. The difference will be negligible, though, and indexing can reduce it further.

    With case 2: It will take slightly less time than Case 1, but performance may suffer from the redundant data as the number of records grows.

  3. Possible pitfalls?

    With case 1 -

    • You may end up writing complex join queries for some difficult scenarios.

    With case 2 -

    • Data redundancy / duplication
    • Lower performance in the long run
    • Poor readability
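To check the indexing point from item 1 for yourself, ask the database how it plans to run a lookup on the key column before and after adding an index. The sketch below does this with SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module, as a stand-in for MySQL's EXPLAIN. The index name idx_products_category_id and the query_plan helper are hypothetical, and the exact plan wording depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE products (
        id          INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        category_id INTEGER NOT NULL REFERENCES categories(id)
    );
""")


def query_plan(sql):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


# The kind of lookup by category_id that a join on this key relies on.
lookup = "SELECT name FROM products WHERE category_id = 1"

plan_before = query_plan(lookup)  # no index yet: expect a full table scan
conn.execute(
    "CREATE INDEX idx_products_category_id ON products (category_id)"
)
plan_after = query_plan(lookup)   # the new index should now be mentioned

print("before:", plan_before)
print("after: ", plan_after)
```

On MySQL, running EXPLAIN on your real join query gives the equivalent information, so you can confirm the index is used instead of assuming it.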

Hope this helps.