Wednesday, March 21, 2012

Read Consistency

What is the mechanism used by select statements to return point-in-time data?

I have a test setup: table t1 has 1,000,000 rows. A query (call it q1) that selects all the rows (in NOLOCK mode) and processes them takes 10 minutes. At the same time another process inserts another 1,000,000 rows into the same table t1. As expected, the client that issued query q1 sees just 1,000,000 rows.
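To make the setup concrete, here is a minimal two-session sketch of the scenario (t1 is the table from the setup; t1_source is a hypothetical table standing in for the concurrent load):

-- session 1 (query q1): a long-running read that takes no shared locks
SELECT COUNT(*) FROM t1 WITH (NOLOCK);

-- session 2, started while q1 is still running
INSERT INTO t1
SELECT * FROM t1_source;  -- another 1,000,000 rows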

My understanding is that NOLOCK does not hold any locks. So, how did SQL Server know that it should not return the rows that were inserted after I issued query q1? An explanation or a link to a whitepaper would be helpful.

Thanks

Unless you use the row-level versioning that was introduced in SQL Server 2005, there is no notion of 'point-in-time' data in a locking-based database engine, except in a limited sense: the view holds only until the transaction commits, and only for the precise subset of the data read or written by the transaction (and of the data the transaction intended to read but that didn't exist when it tried to access it).

Instead, the engine can provide an illusion of serialized execution, as if the view of the data accessed and touched by a transaction were 'frozen' while the transaction is processed. That is, whatever other transactions did either occurred entirely in the past, will occur entirely in the future, or doesn't matter because those transactions never read or wrote the data our transaction touched. As a result, the data seen by a single transaction can ultimately be viewed as a consistent slice of some relevant subset of the entire database: as of 'now' while the transaction is active, and as of commit time after the transaction commits.
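For example, here is a minimal two-session sketch of that locking behavior under the default READ COMMITTED isolation level (t1 and its id column are illustrative):

-- session A: writer with an open transaction
BEGIN TRAN;
INSERT INTO t1 (id) VALUES (1);
-- ...not yet committed...

-- session B: default READ COMMITTED, no hints; blocks on session A's
-- exclusive lock until A commits or rolls back, so it never reads
-- uncommitted data
SELECT COUNT(*) FROM t1;

-- session A:
COMMIT;  -- session B unblocks and counts only committed rows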

Having said that, I realize it might sound really cryptic and confusing, but this is how things are done in locking-based transaction processing systems. If you come from the Oracle world, it will take some time to adjust to a different paradigm.

|||

Thanks Tengiz but ...

In my scenario, let us say that query q1 started at 10 AM and finished at 10:10 AM. At 10 AM the table had 1,000,000 rows, and by 10:05 AM other transactions had inserted 1,000,000 more, making it a total of 2,000,000 rows. Why didn't SQL Server return 2,000,000 rows to the q1 client? Somehow SQL Server knew to provide the rows that existed at 10 AM (which I call point-in-time data; maybe wrong terminology?) and to ignore the rows that were added after that point. What is the internal mechanism the engine uses to give that illusion? I am inclined to think that even though it does not take locks, it might create some sort of semaphores, latches, or in-memory tables to keep track of the rows that need to be returned to the q1 client. As you guessed, I am from an Oracle background; maybe you have already answered my question and it will take me a little longer to get it.

|||

Could you please be more specific in describing your scenario? The fact that the query only returned the initial set of rows and didn't see the rows that were inserted after the query started does not really mean that the server somehow knows or takes into account the time when the rows were inserted.

Again, things are different if you use row-level versioning: you get it either by switching to snapshot isolation (after enabling it for the database) or by allowing versioning-based read-committed isolation.
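For reference, a minimal sketch of turning on either form of row-level versioning (YourDatabase is a placeholder):

-- allow sessions to explicitly request SNAPSHOT isolation
ALTER DATABASE YourDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

-- or make the default READ COMMITTED level use row versions instead of locks
ALTER DATABASE YourDatabase SET READ_COMMITTED_SNAPSHOT ON;

-- a session opting in to snapshot isolation (needs the first option enabled)
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;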

But assuming you don't use row-level versioning: depending on the existing indexes, the actual query plans, the key values that existed in the table before the insert, and the key values inserted, a select query with the right timing could easily skip the newly inserted records. But that would have nothing to do with the read consistency provided by row-level versioning.
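To illustrate that timing effect (my sketch, not a claim about the original test): suppose q1 scans t1 in clustered key order. Rows inserted behind the current scan position are never visited, while rows inserted ahead of it may be picked up:

-- session 1: ordered scan in progress, say currently positioned at key 500000
SELECT * FROM t1 ORDER BY id;  -- id is an assumed clustered key column

-- session 2, while the scan is running:
INSERT INTO t1 (id) VALUES (100);     -- behind the scan position: skipped
INSERT INTO t1 (id) VALUES (900000);  -- ahead of the scan position: may be seen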

|||

Thanks, Tengiz, for your interest and perseverance in helping me out.

The database is not set up to use row versioning. Also, this is a data warehouse system, so DML statements can come only from ETL. No other concurrent user is touching this table while this program is running. However, my SSIS package that maintains this table opens two sessions (a reader and a writer), as I explain in step 4. My concern is that these two sessions are stepping on each other.

-

Step 0:

-- display isolation level
dbcc useroptions
-- relevant output: isolation level = read committed

-

Step 1:

CREATE TABLE Table1 (
    Column1 [int] NOT NULL,
    Column2 [int] NOT NULL,
    Column3 [bit] NOT NULL,
    Column4 [datetime] NOT NULL,
    Column5 [int] NOT NULL,
    Column6 [int] NOT NULL,
    Column7 [varchar](255) NULL,
    Column8 [int] NULL,
    Column9 [int] NULL,
    Column10 [int] NULL,
    CONSTRAINT [PKC_REALDB_StatusHistory] PRIMARY KEY CLUSTERED (
        [Column1] ASC,
        [Column2] ASC,
        [Column4] ASC,
        [Column5] ASC,
        [Column6] ASC
    )
)

-

Step 2:

INSERT INTO Table1
SELECT *
FROM Source_Table1

10,556,214 rows inserted.

-

Step 3:

-- delete rows from the target to simulate unexpected results due to dirty reads
delete
from Table1
where Column1 % 2 = 0

5,288,119 rows deleted

-

Step 4:

Now, I have an SSIS package that does an update-else-insert operation. It does an SSIS left outer merge join to decide update versus insert. Since I use a table lock in the destination component, the source component (the one that feeds the data from the target table into the merge join) uses the NOLOCK hint. The destination component uses fast load with a batch size of 1,000 rows. The rows to be updated are saved to an empty intermediate table, and a Transact-SQL UPDATE statement is run after the data flow is done to apply them to Table1.
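A minimal sketch of that post-data-flow UPDATE, assuming the intermediate table is called Table1_Updates and the join is on the clustered key columns (both the name and the column list are assumptions; the actual statement is not shown in this thread):

UPDATE t
SET t.Column3  = u.Column3,
    t.Column7  = u.Column7,
    t.Column8  = u.Column8,
    t.Column9  = u.Column9,
    t.Column10 = u.Column10
FROM Table1 AS t
INNER JOIN Table1_Updates AS u  -- hypothetical intermediate table
    ON  t.Column1 = u.Column1
    AND t.Column2 = u.Column2
    AND t.Column4 = u.Column4
    AND t.Column5 = u.Column5
    AND t.Column6 = u.Column6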

5,288,119 rows inserted

5,268,095 rows updated

-

I repeated the above steps without the clustered index and got the same row counts. The row counts show no surprises, which leads me to believe that there is some kind of lock. I have to make sure this works 100% before I put this code into production. The documentation leads me to believe that it does not work 100% of the time. If that is the case, I should be able to simulate a scenario where I get wacky row counts.

Here is what I think is the reason for not getting wacky row counts in my setup: (a) when the clustered index is in place, the tree is rebalanced during the delete operation, so the inserted rows go into new pages and SQL Server somehow knows to ignore them; (b) when there is no clustered index, the data is sorted in tempdb before being fed to SSIS, so the data is essentially sourced from tempdb during the SSIS pipeline operation. Am I thinking in the right direction?

So the big question is, can I use NOLOCK without any bad effects in this scenario? If you believe this will lead to some dirty read scenario, how can I simulate it?

|||

A quick answer to your question "can I use NOLOCK without any bad effects in this scenario?" is NO. The NOLOCK hint relaxes certain concurrency-related guarantees in the engine and essentially nullifies the notion of transactional consistency. It doesn't mean that you will never get any consistency if you use NOLOCK, but you will not, in general, have predictable results.
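To simulate a dirty read directly (a minimal two-session sketch; the table and values are illustrative):

-- session A: modify a row inside an open transaction
BEGIN TRAN;
UPDATE Table1 SET Column3 = 1 WHERE Column1 = 42;
-- ...pause here, do not commit...

-- session B: NOLOCK reads the uncommitted value
SELECT Column3 FROM Table1 WITH (NOLOCK) WHERE Column1 = 42;  -- returns 1

-- session A: undo the change
ROLLBACK;  -- session B has read data that never logically existed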

I'm not an SSIS expert, so I'm not sure how SSIS really performs the 'insert else update' operation, and I still don't quite get what you actually do in this scenario. But from the description it looks like the plan does include spooling in tempdb for a sort: the fast load option normally feeds data in through the BCP API, which, if the destination table has a clustered index, assumes that the data needs to be sorted before it gets delivered to the destination. Hence, if the input provided by SSIS is not sorted (there is a special hint that SSIS can specify in order to avoid the extra sort), the query optimizer adds the sort operator.
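One way to check for that sort while the package runs (my suggestion, not something from this thread) is to look for a Sort operator in the plan of the 'insert bulk' statement that fast load generates:

SELECT r.session_id, t.text, p.query_plan
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
CROSS APPLY sys.dm_exec_query_plan(r.plan_handle) AS p
WHERE t.text LIKE 'insert bulk%';  -- the statement issued by the BCP API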

Spooling can certainly make it look like there indeed was some kind of 'read consistency' provided, but, again, depending on the actual query and specific conditions the optimizer is free to choose other options as well.

|||

Thanks for the reply. Thinking about it further, when I don't use the NOLOCK hint, my SSIS package just waits forever. When I use the NOLOCK hint, it seems to work fine. However, what if SQL Server does not honour the NOLOCK hint? My package might wait forever. So, I decided to use the Lookup Transformation of SSIS rather than the Merge Join Transformation. The Lookup Transformation can cache the data upfront before the Data Flow Task starts processing source rows. This way there is no contention.
