MySQL/PHP: I want to create a record of the activities that people perform on the site.
Table ADDED -> EventID, UserID, Time, IP
Table DELETED -> EventID, UserID, Time, IP
Table SHARED -> EventID, UserID, Time, IP
Is it more efficient to join these tables when querying, for example to read the last 10 actions performed by a UserID, or would it be more efficient to structure it like this?
Table EVERYTHING -> EventID, EventType (e.g. ADDED, DELETED, SHARED), UserID, Time, IP
Using one table is the right thing to do because it is properly normalized. Adding a new event type should not require a new table. It's also much easier to maintain referential integrity and make use of indexes for retrieving and sorting all events for a user. (If you had them in separate tables, getting all events for a user and sorting them by time could be much, much slower than using one table!)
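To make the comparison concrete, here is a rough sketch of the "last 10 actions" query against each layout. The table names ADDED, DELETED, SHARED and EVERYTHING come from the question; the user id 42 is only a placeholder.

```sql
-- Three tables: a join does not really fit, because each event lives in
-- exactly one table. The rows have to be stitched together with UNION ALL
-- and sorted afterwards.
(SELECT EventID, 'ADDED' AS EventType, UserID, `Time`, IP
   FROM ADDED WHERE UserID = 42)
UNION ALL
(SELECT EventID, 'DELETED', UserID, `Time`, IP
   FROM DELETED WHERE UserID = 42)
UNION ALL
(SELECT EventID, 'SHARED', UserID, `Time`, IP
   FROM SHARED WHERE UserID = 42)
ORDER BY `Time` DESC
LIMIT 10;

-- One table: a single lookup by user, sorted and limited directly.
SELECT EventID, EventType, UserID, `Time`, IP
FROM EVERYTHING
WHERE UserID = 42
ORDER BY `Time` DESC
LIMIT 10;
```

The first form has to probe three tables and merge the results before the LIMIT can apply, while the second needs only one indexed lookup on UserID.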
There are ways you can make these tables smaller, though, to save space and keep your indexes small:
- Use enum() to define your event types. If you have a small number of events, you use at most one byte per row.
- Use an UNSIGNED integer type to get more EventIDs and UserIDs out of the same number of bytes.
I recommend a table format like this:
CREATE TABLE Events (
  `EventID` INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
  `UserID` MEDIUMINT UNSIGNED NOT NULL COMMENT 'this allows a bit more than 16 million users, and your indexes will be smaller',
  `EventType` ENUM('add','delete','share') NOT NULL,
  `Time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `IP` INTEGER UNSIGNED NOT NULL DEFAULT 0,
  PRIMARY KEY (`EventID`),
  FOREIGN KEY (`UserID`) REFERENCES `Users` (`UserId`) ON UPDATE CASCADE ON DELETE CASCADE,
  KEY (`UserID`)
);
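As a rough usage sketch for this table: because IP is an unsigned integer, IPv4 addresses would be converted with MySQL's INET_ATON() on the way in and INET_NTOA() on the way out. The user id and address below are made-up sample values, and the insert assumes a matching row already exists in Users.

```sql
-- Record a 'share' event for user 42.
-- EventID (AUTO_INCREMENT) and Time (DEFAULT CURRENT_TIMESTAMP) fill themselves in.
INSERT INTO Events (UserID, EventType, IP)
VALUES (42, 'share', INET_ATON('192.0.2.10'));

-- Read back that user's 10 most recent events with a readable IP address.
SELECT EventID, EventType, `Time`, INET_NTOA(IP) AS IPAddress
FROM Events
WHERE UserID = 42
ORDER BY `Time` DESC
LIMIT 10;
```

An INTEGER UNSIGNED column only holds IPv4 addresses; storing IPv6 would need a different column type.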
If you store this table using MyISAM, your row length will be about 16 bytes, using a fixed format. This means every million rows requires roughly 16 MB of space for the data, and probably half that for indexes (depending on what indexes you use). This is so compact that MySQL can probably keep the entire working portion of the table in memory most of the time. Note that MyISAM parses the FOREIGN KEY clause but does not enforce it; use InnoDB if you need the constraint enforced.
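The 16-byte figure is the sum of the column widths: 4 for EventID, 3 for UserID, 1 for EventType, 4 for Time and 4 for IP. If you would rather check what a real table uses than rely on the estimate, one option is to ask information_schema. This sketch assumes the table is named Events and lives in the currently selected database.

```sql
-- Report the storage engine, row format and on-disk sizes (in bytes) for Events.
SELECT TABLE_NAME, ENGINE, ROW_FORMAT, TABLE_ROWS,
       AVG_ROW_LENGTH, DATA_LENGTH, INDEX_LENGTH
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'Events';
```

The reported numbers may come out slightly higher than the raw column sum because of per-row bookkeeping, and for InnoDB the TABLE_ROWS value is only an estimate.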
Then it's a matter of creating the indexes you need for the operations that are most common. For example, if you always show all of a user's events in a certain time range, replace KEY (UserID) with INDEX userbytime (UserID, Time). Queries like SELECT * FROM Events WHERE UserID=? AND Time BETWEEN ? AND ? will then be very fast, because MySQL can satisfy both conditions from that one index.
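A sketch of that index change on an existing table might look like the following. It assumes the unnamed KEY (UserID) ended up with MySQL's default name UserID (SHOW INDEX will confirm this), and the user id and dates are placeholder values.

```sql
-- Check the current index names first.
SHOW INDEX FROM Events;

-- Add the composite index before dropping the old one, so the foreign key
-- on UserID always has an index available.
ALTER TABLE Events ADD INDEX userbytime (UserID, `Time`);
ALTER TABLE Events DROP INDEX UserID;

-- EXPLAIN is expected to show userbytime in the "key" column for this query.
EXPLAIN
SELECT * FROM Events
WHERE UserID = 42
  AND `Time` BETWEEN '2024-01-01 00:00:00' AND '2024-01-31 23:59:59';
```

Column order matters here: UserID comes first because it is matched by equality, and Time second because it is matched by range.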
Use one table which logs all events and differentiates the event type, as in your second suggestion.
You are storing only one type of data here, and it is therefore appropriate to store it in one table. In the early stages, you ought not worry too much about the size the table will grow to over time. Having only a few columns in a table like this, it can easily grow to many millions of rows before you would even need to consider partitioning it.
If you have a limited number of event types, you might consider using the ENUM() data type for the EventType column.
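For instance, adding a new kind of event to an ENUM column is a column change rather than a new table. The sketch below assumes the Events table and the three values from the earlier answer; 'edit' is a hypothetical new event type.

```sql
-- Append the new value at the end of the list so existing rows keep
-- their stored values.
ALTER TABLE Events
  MODIFY COLUMN EventType ENUM('add','delete','share','edit') NOT NULL;

-- List the values the column currently accepts.
SHOW COLUMNS FROM Events LIKE 'EventType';
```

On a large table it is worth checking how your MySQL version handles this ALTER, since some versions rebuild the table to apply it.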