How to store a large amount of data in a database and automatically delete old data

Here is a made-up example to illustrate my question:

Every 15 seconds I run a PHP script which checks the status of 5+ servers. It gets the ping, status, version, and MOTD. The data should be stored for two weeks; after that it can be removed.

At the moment I would store it like this:

Table 1:
server_ID    |    name    |    ip    |    last_update

Table 2:
ID    |     server_ID    |    status    |    ping    |    version    |    motd    |    timestamp

I think this way nothing is duplicated, and I can easily rename a server or change its IP.
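
A minimal sketch of that schema in MySQL (the column names follow the tables above; all types are assumptions):

CREATE TABLE table1 (
  server_ID   INT AUTO_INCREMENT PRIMARY KEY,
  name        VARCHAR(64) NOT NULL,
  ip          VARCHAR(45) NOT NULL,   -- 45 characters also fits IPv6
  last_update TIMESTAMP   NOT NULL
);

CREATE TABLE table2 (
  ID          INT AUTO_INCREMENT PRIMARY KEY,
  server_ID   INT NOT NULL,
  status      VARCHAR(16),
  ping        FLOAT,
  version     VARCHAR(32),
  motd        TEXT,
  `timestamp` TIMESTAMP NOT NULL,
  FOREIGN KEY (server_ID) REFERENCES table1 (server_ID)
);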

But how should I drop the old rows? Is it too much if I do this every time the script runs (every 15 seconds)? 14 * 24 * 60 * 4 = 80640 rows per server for the two weeks.

Or how should I store the data so that I have the latest values and can also display a two-week line chart?

Maybe a third table, which has hourly data.


Edit:

Thank you for your answers. What about this:

On the 15-second update I put all the data into Table 1, but as an UPDATE, so there is only one row per server.

Then every hour I use an event to copy the current data into a new table and delete everything older than 2 weeks from it?
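
A minimal sketch of such an hourly event, assuming Table 1 is extended to hold the latest status, ping, version, and motd per server and table2 serves as the history table (names are illustrative; the event scheduler must be enabled, see the answer below):

-- In the mysql client, wrap this in DELIMITER // ... // so the
-- semicolons inside BEGIN ... END do not end the statement early.
CREATE EVENT copy_and_prune
ON SCHEDULE EVERY 1 HOUR
DO
BEGIN
  -- copy the current per-server snapshot into the history table
  INSERT INTO table2 (server_ID, status, ping, version, motd, `timestamp`)
  SELECT server_ID, status, ping, version, motd, last_update
  FROM table1;

  -- prune history older than two weeks
  DELETE FROM table2
  WHERE `timestamp` < NOW() - INTERVAL 2 WEEK;
END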

At the moment I have server_ID in Table 1 and ID in Table 2 set as primary keys. Is there a good reason to set the timestamp as the primary key instead?

First of all, let us calculate the number of records entered over two weeks with one check every 15 seconds:

60 / 15 = 4 times in a min
4 * 60 = 240 times in an hour
240 * 24 = 5760 times in a day
5760 * 14 = 80640 times in two weeks

80640 * 5 = 403200 records for 5 servers

That is a large chunk of data entered every day. I assume you're using a MySQL database. Since you want old rows removed automatically while the line chart keeps showing current data, you should create a scheduled event.

Scheduled Event

Events run only if the event scheduler is enabled (and you need the EVENT privilege to create them). So first you have to turn the scheduler on as follows:

SET GLOBAL event_scheduler = ON;
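
You can check that the scheduler is actually running by inspecting the server variable:

SHOW VARIABLES LIKE 'event_scheduler';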

Now that the scheduler is on, create the following event:

CREATE EVENT
    event_name
ON SCHEDULE
EVERY 5 MINUTE
DO
  DELETE
  FROM
     table2
  WHERE
     `timestamp` < NOW() - INTERVAL 14 DAY;

It is scheduled every 5 minutes and deletes the records that are more than two weeks old. Hope this will resolve your issue: the table then only ever holds two weeks of records, since old rows are deleted every 5 minutes while insertion happens every 15 seconds anyway.

Edited Question

The edited question changes things considerably. Now data is entered into table2 continually, table1 is updated every 15 seconds, and every hour you copy into a new table and delete old entries. I don't fully understand what you want to achieve there, but I will answer the last question, about the primary key.

What is a Primary Key

  • The PRIMARY KEY constraint uniquely identifies each record in a database table.

  • Primary keys must contain unique values.

  • A primary key column cannot contain NULL values.

  • Each table should have a primary key, and each table can have only ONE primary key.

Now it depends on your requirements: if every timestamp entry must be unique, change the primary key to the timestamp; if you want a single entry per server, change it to the server. But I recommend another way: keep a separate ID column as the primary key and create a unique index on the combination of the two columns server_ID and timestamp.
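
A minimal sketch of that composite unique index, assuming the column names from the question (the index name is illustrative):

ALTER TABLE table2
  ADD UNIQUE KEY uq_server_time (server_ID, `timestamp`);

With this in place, a server can never have two rows for the same timestamp, while ID stays a simple surrogate primary key.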

Your question was how to delete rows, right? This is possible with DELETE:

DELETE FROM table2 WHERE `timestamp` < NOW() - INTERVAL 2 WEEK

I assume you're using a database which supports partitioning (e.g. MySQL >= 5.5 with InnoDB). In such a case I would add the timestamp column to the primary key of the second table (table2) and create daily partitions. Then removing older data is simply a matter of dropping the daily partitions older than 2 weeks, which should be a very fast operation.
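
A minimal sketch under those assumptions, taking the timestamp column to be of type TIMESTAMP (partition names and boundaries are illustrative, and new daily partitions would have to be added on a schedule):

-- the partitioning column must be part of every unique key
ALTER TABLE table2
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (ID, `timestamp`);

ALTER TABLE table2
  PARTITION BY RANGE (UNIX_TIMESTAMP(`timestamp`)) (
    PARTITION p20131104 VALUES LESS THAN (UNIX_TIMESTAMP('2013-11-05 00:00:00')),
    PARTITION p20131105 VALUES LESS THAN (UNIX_TIMESTAMP('2013-11-06 00:00:00')),
    PARTITION pmax      VALUES LESS THAN MAXVALUE
  );

-- dropping a partition discards a whole day of rows almost instantly
ALTER TABLE table2 DROP PARTITION p20131104;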

If you need to display a 2-week chart with hourly resolution, I would recommend first checking whether your current setup is fast enough, and only if it's too slow creating an additional table where you would store aggregated data (which might be generated by some worker querying the main table every few minutes, for instance).
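
A minimal sketch of that aggregation, assuming a hypothetical table2_hourly table (all names, types, and the 2-hour lookback are illustrative):

CREATE TABLE table2_hourly (
  server_ID  INT      NOT NULL,
  hour_start DATETIME NOT NULL,
  avg_ping   FLOAT,
  PRIMARY KEY (server_ID, hour_start)
);

-- run periodically; re-aggregating a partially filled hour is harmless
INSERT INTO table2_hourly (server_ID, hour_start, avg_ping)
SELECT server_ID,
       DATE_FORMAT(`timestamp`, '%Y-%m-%d %H:00:00'),
       AVG(ping)
FROM table2
WHERE `timestamp` >= NOW() - INTERVAL 2 HOUR
GROUP BY server_ID, DATE_FORMAT(`timestamp`, '%Y-%m-%d %H:00:00')
ON DUPLICATE KEY UPDATE avg_ping = VALUES(avg_ping);

The chart then reads 14 * 24 = 336 rows per server instead of tens of thousands.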

Since there is no mention of having to use an SQL-based database, just for fun you could avoid storing this in a relational database structure.

One option would be a Hash, but perhaps you could also use a Sorted Set in something like Redis. With sorted sets you can add, remove, or update elements very quickly (in time proportional to the logarithm of the number of elements).

Every member of a Sorted Set is associated with a score, which is used to order the sorted set from the smallest to the greatest score. While members are unique, scores may be repeated.

Your score would be the epoch time (or similar) and the member could be a JSON object of the data you're looking to store.

Because this type of operation is so fast, you can easily read back all your JSON objects, already ordered by score. You would want to store the epoch time in the JSON object as well, to ensure members are unique. Note that 80640 rows per server is tiny relative to what Redis can handle. To add:

ZADD yourset 1383553120 "<JSON OBJECT>"

The score argument (1383553120 above) would be the current epoch time. Since you know the current epoch time, you can easily remove all records that are too old. To remove:

ZREMRANGEBYSCORE yourset -inf 1382344314

The last argument should be the epoch time from two weeks ago (now minus 14 * 86400 seconds).

JSON and PHP play nice, so it might be something you could look into.