在非简单情况下处理MySQL存储过程中的并发性

I've got a table where the validity of an insert is a function of all other rows in the table. The table manages date periods for resources and a new row is only valid if no other row overlaps with the same resource.

So the table is

resource_id INT
start_date DATE
end_date DATE

And conceptually a row can be inserted provided there are 0 results for the following query (where newResourceId, newStartDate, newEndDate is the tuple to insert)

select count(entity_id) from mytable where resource_id = newResourceId and start_date < newEndDate and end_date > newStartDate;

The question is how to ensure concurrent inserts are handled correctly? I need some way to ensure no rows get inserted between executing the above select and the subsequent insert. So a concurrent operation should block on the above select until the first operation commits. To achieve this I use the following stored procedure:

DELIMITER //
drop procedure if exists rp_insert //
create procedure rp_insert
(
    IN p_resource_id INT(10), 
    IN p_start_date DATE, 
    IN p_end_date DATE,
    OUT result INT
) 
BEGIN   
    DECLARE num_conflicts INT;
    DECLARE conflicts CURSOR FOR SELECT entity_id from mytable WHERE start_date < p_end_date and end_date > p_start_date and resource_id = p_resource_id FOR UPDATE; 

    START TRANSACTION;
    OPEN conflicts;
    SELECT FOUND_ROWS() into num_conflicts;
    if num_conflicts = 0 then
        insert ignore into mytable (resource_id, start_date, end_date) values (p_resource_id, p_start_date, p_end_date); 
        set result = LAST_INSERT_ID();      
        COMMIT;
    else
        set result = -2;
        ROLLBACK;
    end if; 
END //
DELIMITER ;

So I'd appreciate it if someone can tell me if this is the right track to handle concurrency, specifically:

  • Does the 'For update' on the select cursor mean that a concurrent operation with overlapping where criteria will block on the select until the first transaction commits / rollbacks?

  • In the same concurrent scenario, will the second operation SELECT see the row inserted by the first operation (given that insert will commit after the 2nd transaction starts, but before the 2nd select executes)?

  • is there any risk a smart db engine will realise that i never update any rows returned by the select and decide to not enforce the select for update (e.g. it's easy to see there's no update statement in the transaction)

  • Any other comments on performance, is there a better way etc?

For info, I decided against:

  • locking the table as overkill (e.g. the 2nd operation only needs to block if there is overlap in the select results). also locks dont seem to be allowed in mysql stored procedures

  • coding this in PHP rather than a stored procedure (there's no reason to do the db round trips as I don't need to know what the conflicts are, just whether the insert succeeds).

Ok so at the risk of monologue'ing here's what I did:

The below solution seems to work; I tested with additional params for sleep statements at the 3 points that are interesting for race conditions:

  • before the Select for update
  • after the select for update but before
  • the test for conflicts after the test for conflicts but before the insert

I then ran the stored procedures from two different mysql session, one of which would sleep and the other would run normally with the opporunity to get ahead of the first. In each case the expected stored procedure inserted a row and the other failed. So this implementation is safe for concurrent invocation

I'm still interested if someone knows a better way than this or has other comments. I guess something like this is only necessary when normal db constraints arent able to enforce the table's integrity.

DELIMITER //
create procedure rp_insert(
IN p_resource_id INT(10), IN p_start_date DATE, IN p_end_date DATE, OUT result INT)
BEGIN

    DECLARE num_conflicts INT DEFAULT 0;
    DECLARE num_locked_resources INT DEFAULT 0;
    DECLARE lock_resources CURSOR FOR SELECT resource_id from myresources WHERE resource_id = p_resource_id FOR UPDATE;
    DECLARE conflicts CURSOR FOR SELECT entity_id from mytable WHERE start_date < p_end_date and end_date > p_start_date and resource_id = p_resource_id;

    SET result = 0;
    START TRANSACTION;      
    OPEN lock_resources;
    SELECT FOUND_ROWS() into num_locked_resources;
    IF 0 < num_locked_resources THEN
        OPEN conflicts;
        SELECT FOUND_ROWS() into num_conflicts;
        if num_conflicts = 0 then
            insert ignore into mytable (resource_id, start_date, end_date) values (p_resource_id, p_start_date, p_end_date);
            SET result = LAST_INSERT_ID();
            COMMIT;
        else
            SET result = -2;
            ROLLBACK;
        end if; 
    ELSE
        set result = -3;
    END IF;
END //
DELIMITER ;