I'm reading the amazing "T-SQL Querying" book (http://www.amazon.com/Inside-Microsoft-Querying-Developer-Reference/dp/0735626030). It has an interesting example in which, while one connection reads all rows from a table, a concurrent update of a clustered index key column can cause the reader to miss a row or read a row twice. Updating an entity key is not a good idea anyway, so this behavior seemed acceptable.
I've tried to adapt this example so that the same happens with a non-clustered index. This does not seem acceptable for non-clustered indexes, since their keys are not domain keys and the underlying values may very well change.
To simulate such a situation, we need to create a covering non-clustered index that occupies at least two pages. Then we update its key column so that the first row (in index order) becomes the last one (jumps to the second page of the index) and then moves back again. While this jumping runs in an endless loop, we select all rows of the table in a separate connection. Very soon this second connection gets a result set in which not all rows are present.
If we put the second connection into the REPEATABLE READ transaction isolation level, the data inconsistency disappears, but we run into a deadlock error there very soon. It looks like SQL Server does not handle such high-frequency page jumps very well.
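For reference, putting the reading connection into this isolation level is a single statement (the full scripts follow below):

```sql
-- Run in the reading connection before the scan loop:
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
```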
Following are the SQL statements reproducing the problem.
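The scripts below are a minimal sketch rather than the exact original code: the table name dbo.TestTable, the column names Id, SortKey and Padding, the index name IX_TestTable_SortKey, and the row count and padding width are all illustrative assumptions; any covering non-clustered index whose leaf level spans at least two pages should behave the same way.

Table structure:

```sql
-- Illustrative schema (all names and sizes are assumptions).
CREATE TABLE dbo.TestTable
(
    Id      INT       NOT NULL CONSTRAINT PK_TestTable PRIMARY KEY CLUSTERED,
    SortKey INT       NOT NULL,
    Padding CHAR(600) NOT NULL DEFAULT ('x')  -- wide enough that ~13 index rows fill an 8 KB page
);

-- Covering non-clustered index; INCLUDE makes it cover the reading query.
CREATE NONCLUSTERED INDEX IX_TestTable_SortKey
    ON dbo.TestTable (SortKey)
    INCLUDE (Padding);

-- 20 rows of ~600 bytes each push the index leaf level onto a second page.
DECLARE @i INT = 1;
WHILE @i <= 20
BEGIN
    INSERT INTO dbo.TestTable (Id, SortKey) VALUES (@i, @i);
    SET @i += 1;
END;
```

First connection, causing row jumping:

```sql
-- Endless loop: the row with the lowest key jumps to the end of the index
-- and back, moving between leaf pages on every iteration.
WHILE 1 = 1
BEGIN
    UPDATE dbo.TestTable SET SortKey = 1000 WHERE Id = 1;  -- first row (in index order) becomes last
    UPDATE dbo.TestTable SET SortKey = 0    WHERE Id = 1;  -- and jumps back again
END;
```

Second connection, reading data inconsistently:

```sql
-- Keep scanning through the covering index until one scan misses a row
-- (or reads it twice); the loop then stops and shows the wrong count.
DECLARE @rows INT = 20;
WHILE @rows = 20
BEGIN
    SELECT @rows = COUNT(*)
    FROM dbo.TestTable WITH (INDEX (IX_TestTable_SortKey));
END;
SELECT @rows AS RowsSeen;  -- 19 (row missed) or 21 (row read twice)
```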
UPDATE: This SO answer proves that such behavior can happen even within a single index page: http://stackoverflow.com/a/27765798/101685
ANOTHER UPDATE: Using the snapshot isolation level (just adding a SET TRANSACTION ISOLATION LEVEL SNAPSHOT statement to the reading connection) also solves the deadlocking problem.
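For clarity, a sketch of that change (the database name is a placeholder; SNAPSHOT isolation must first be allowed at the database level):

```sql
-- One-time database setting (YourDatabase is a placeholder):
ALTER DATABASE YourDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Then, at the start of the reading connection:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
```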
UPDATE 3: Actually, when you enable snapshots on your database ("Is Read Committed Snapshot On" property in SSMS), the problem is gone even without setting the transaction isolation level explicitly.
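That SSMS property corresponds to the following database setting (YourDatabase is again a placeholder; the statement waits until no other connections are active in the database):

```sql
ALTER DATABASE YourDatabase SET READ_COMMITTED_SNAPSHOT ON;
```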