I use PHP and MySQL.
Let's say I have a database table with 10,000 rows. Which of the cases below is best performance-wise?
Case 1: Two tables, products and categories.
SELECT * FROM products INNER JOIN categories ON products.category_id = categories.id
Products: id, name, category_id
Categories: id, name
Case 2: One table, products, containing all the data.
SELECT * FROM products
Products: id, name, category_name
From my perspective, Case 1 is the "correct" way of doing it, but I would save some development time by using Case 2. Maybe it would perform better too?
This is too long for a comment. The first is the correct (i.e. SQLish) way of storing this data. It allows you to do the following:
Performance is not the main consideration. The SQL engine takes care of performance through the use of fancy join algorithms and indexes. It does this so you can structure the data in the most sensible and maintainable way for your application.
That said, which performs better depends on a number of factors (how long the category names are, how many different names there are, how wide the product record is). Differences in performance between the two scenarios are probably not at all important in getting an application to work optimally.
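If you want to see how the two layouts compare on your own data, you can simply measure them. Here is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL (the SQL is the same shape, but the timings will not transfer directly). The table name products_flat, the 20 categories, and the generated row values are made up for illustration.

```python
import sqlite3
import time

# Hypothetical sample data: 10,000 products spread over 20 categories.
N_PRODUCTS, N_CATEGORIES = 10_000, 20

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Case 1: normalized, two tables
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        category_id INTEGER NOT NULL REFERENCES categories(id)
    );
    -- Case 2: one flat table, category name repeated on every row
    -- (named products_flat here only so both cases can coexist)
    CREATE TABLE products_flat (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        category_name TEXT NOT NULL
    );
""")

conn.executemany(
    "INSERT INTO categories VALUES (?, ?)",
    [(c, f"category {c}") for c in range(1, N_CATEGORIES + 1)],
)
rows = [(p, f"product {p}", (p % N_CATEGORIES) + 1) for p in range(1, N_PRODUCTS + 1)]
conn.executemany("INSERT INTO products VALUES (?, ?, ?)", rows)
conn.executemany(
    "INSERT INTO products_flat VALUES (?, ?, ?)",
    [(p, name, f"category {c}") for p, name, c in rows],
)


def timed(sql):
    """Run a query, fetch every row, and return (rows, elapsed seconds)."""
    start = time.perf_counter()
    result = conn.execute(sql).fetchall()
    return result, time.perf_counter() - start


joined, t_join = timed("""
    SELECT products.id, products.name, categories.name
    FROM products INNER JOIN categories ON products.category_id = categories.id
    ORDER BY products.id
""")
flat, t_flat = timed("SELECT id, name, category_name FROM products_flat ORDER BY id")

print(f"join: {t_join * 1000:.1f} ms, flat: {t_flat * 1000:.1f} ms")
```

Both queries return the same 10,000 rows, so whatever difference you see is purely the cost of the join versus the cost of the wider rows on your engine and your data.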
Case 1 is better than Case 2 because Case 2 leaves you with duplicated data: the same value appears many times in the "category_name" field. This is bad for two reasons. First, the unnecessary duplicated data will slow things down. Second, it is less efficient to maintain: suppose you want to rename a category, say from drinks to drink; that takes far more work in Case 2 than in Case 1. So, to answer your first question, Case 1 is the way to do it.
As for your second question, as you can imagine from the above, Case 1 is faster than Case 2 because Case 2 carries unnecessary data.
As for your last question, one pitfall of Case 2 is the one described above: if you want to change a category name, you end up with far more work than in Case 1. To my knowledge, Case 1 has no pitfalls.
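To make the rename point concrete, here is a small sketch that counts how many rows each design has to touch when drinks becomes drink. It assumes Python's sqlite3 module in place of MySQL, and the products_flat table name and the 1,000 sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Case 1: category names live in one place
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        category_id INTEGER NOT NULL REFERENCES categories(id)
    );
    -- Case 2: category name copied onto every product row
    CREATE TABLE products_flat (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        category_name TEXT NOT NULL
    );
    INSERT INTO categories VALUES (1, 'drinks'), (2, 'snacks');
""")

# Hypothetical sample rows: 1,000 products, odd ids in 'drinks', even ids in 'snacks'.
for p in range(1, 1001):
    cat_id = 1 if p % 2 else 2
    cat_name = "drinks" if p % 2 else "snacks"
    conn.execute("INSERT INTO products VALUES (?, ?, ?)", (p, f"product {p}", cat_id))
    conn.execute("INSERT INTO products_flat VALUES (?, ?, ?)", (p, f"product {p}", cat_name))

# Case 1: the rename is a single-row update in categories.
case1_rows = conn.execute(
    "UPDATE categories SET name = 'drink' WHERE name = 'drinks'"
).rowcount

# Case 2: every product row carrying the old name has to be rewritten.
case2_rows = conn.execute(
    "UPDATE products_flat SET category_name = 'drink' WHERE category_name = 'drinks'"
).rowcount

print(case1_rows, case2_rows)  # 1 500
```

With real data the gap grows with the number of products per category, and in Case 2 a missed or misspelled row leaves you with two spellings of the same category.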
I think the question is database-design centric.
Now, to answer your questions:
Which case will give the best performance?
Answer - Case 1.
Why?
It follows the SQL rule of normalization, which will help you in the long run. If you have more than 10,000 rows in the future, a single table with redundant data will be tedious to handle.
You can use indexing over the key columns, which helps join queries execute faster over a large number of rows.
It avoids redundancy.
Why not Case 2?
A single table violates the normalization rules, as your example shows.
Will it take long to get 10,000 rows with a structure like this?
With Case 1: It will take a bit longer than Case 2, as join queries are involved. But this time will be negligible and can be reduced further by indexing.
With Case 2: It will take a bit less time than Case 1, but its performance may suffer from the redundant data as the number of records grows.
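As a rough illustration of the indexing point for Case 1, the sketch below adds an index on the join column and asks the engine how it plans to run the join. It assumes Python's sqlite3 module rather than MySQL, so the plan output format is SQLite's (MySQL has its own EXPLAIN statement for the same purpose), and the index name idx_products_category is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        category_id INTEGER NOT NULL REFERENCES categories(id)
    );
    -- Hypothetical index name; the indexed column is the join key.
    CREATE INDEX idx_products_category ON products (category_id);
""")

# Ask the engine how it would execute the join for a single category.
plan = conn.execute(
    """
    EXPLAIN QUERY PLAN
    SELECT products.name, categories.name
    FROM products INNER JOIN categories ON products.category_id = categories.id
    WHERE categories.id = ?
    """,
    (3,),
).fetchall()

# The human-readable step description is the last column of each plan row.
plan_text = "\n".join(row[-1] for row in plan)
print(plan_text)
```

If the index name shows up in the plan, the engine is looking products up by category_id instead of scanning the whole table, which is what keeps the join cheap as the row count grows.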
Possible pitfalls?
With Case 1 - join queries for some difficult scenarios.
With Case 2 -
Hope this helps you.