Pages

Wednesday, December 29, 2010

Use (UPDLOCK, READPAST) to read a SQL Server data queue

The problem
A useful business scenario involves reading data from a "queue", which would be a table consisting of rows generated by one process, but read by a separate process that runs independently.  The second process (queue reader) may be running with multiple threads, or having multiple instances (such as a multi-user desktop or web application).  This can often lead to unexpected results, such as duplicate data retrieval, deadlocks, or other concurrency issues.  The SQL Server stored procedure called by such an application may involve a set of steps similar to:
  1. SELECT the ID of a specific row (meeting some criteria) into a temporary variable
  2. UPDATE a flag on that row to mark it as "processed"
  3. SELECT the data from that row along with rows from any related tables
Enclosing the above logic inside a BEGIN TRANSACTION - COMMIT TRANSACTION segment does not prevent any parallel processes from reading the same ID in step-1 above. This is because while the transaction prevents both processes from trying to update the same row, but the SELECT in step-1 of one transaction does not block the other transaction from also doing the same SELECT.

The solution
An efficient approach to solving this is to use the WITH (UPDLOCK, READPAST) table hint in the SELECT statement. UPDLOCK specifies that update locks are to be taken and held until the transaction completes, and READPAST specifies that SQL Server not read rows (i.e. skip them) that are locked by other transactions.

The combined effect of these table hints is that the first transaction to get to the SELECT statement establishes a lock on the row (or page), and the second transaction simply skips the locked rows and reads the next available row.  Without the READPAST, the second transation would get blocked by the first one until the locks are released.

Here's a sample code segment to use these table hints in a stored procedure:

 
BEGIN TRANSACTION

SELECT TOP 1 @rowid = a.RowID
FROM TestTable a WITH (UPDLOCK, READPAST)
ORDER BY a.ProcessedOnUtc, a.RowID


UPDATE TestTable SET ProcessedOnUtc = GETUTCDATE()
WHERE RowID = @rowid


-- (rest of the data retrieval and processing statements)
 
COMMIT TRANSACTION

A very important Gotcha!
To enable this to work correctly, there must be an index on the column(s) being used to identify the selected row(s), in this case the RowID column.  Without that index, SQL Server will need to look through the table to find the correct row for the second transaction, which will then get blocked since it won't be able to get past the row that's locked by the first transaction.  The result in such a case will be that duplicates will be avoided but deadlocks will still occur - the same as using UPDLOCK without READPAST.

1 comment:

  1. Hi,

    when i use your queue method with multiple table using join, the same id has been selected multiple times.

    My Sql Qry is
    insert into CafEntryAllocation(CA_AuditId,CA_CUSTOMERID,CA_UPLOADID,CA_USERID,CA_ALLOTEDDATE)
    select TOP(1) AM_ID,am_CUSTOMERID,AM_UPLOADID,@userid,GETDATE() from SS_CustomerMaster with (INDEX(cmid_modeofunbar_simregdate))
    join ss_scanimage with (index(indexCMID_UplodDate_distId)) on CM_ID=SI_CUSTOMERID -- and SI_ID =(Select MAX(si_id) from ss_scanimage where CM_ID=SI_CUSTOMERID)
    inner join SS_AuditMaster a WITH (updlock, readpast) on AM_ID=(Select MAX(AM_ID) from SS_AuditMaster where SI_CUSTOMERID=AM_CUSTOMERID)
    left join ss_Formstatus on FS_CM_ID=si_customerid and FS_LCS1RECDATE >=AM_AUDITDATE
    left join ss_dnoteinfo on si_customerid=di_customerid and di_dnotedate>=AM_AUDITDATE
    join ss_distributormaster on si_dist_id=dm_id
    inner join SS_ZoneMaster on ZN_ID=DM_ZONE
    -- left join ss_cafentryallocation on CA_AUDITID=am_id
    where NOT exists (select 1 from CafEntryAllocation WITH (updlock, readpast) where CA_AUDITID=am_id )
    -- NOT exists (select 1 from ss_cafentryallocation where CA_AUDITID=am_id )
    and am_auditdate >=GETDATE()-2 and (am_status = 'Direct Accept' or am_status ='Temporary Accept')
    and cm_modeofunbar in('MNP UNBAR','Normal','POST2PRE','RV','VF DONGLE') and CM_SIMREGDATE is not null
    and DM_DISRIBUTORTYPE in('HUB','Distributor') and ZN_ID=coalesce(4,ZN_ID)
    order by DM_ZONE, CASE WHEN FS_LCS1RECDATE Is NULL Then 1 Else 0 End,
    CASE WHEN di_dnotedate Is NULL Then 1 Else 0 End,di_dnotedate,AM_AUDITDATE

    Babu.M

    insert into CafEntryAllocation(CA_AuditId,CA_CUSTOMERID,CA_UPLOADID,CA_USERID,CA_ALLOTEDDATE)
    select TOP(1) AM_ID,am_CUSTOMERID,AM_UPLOADID,@userid,GETDATE() from SS_CustomerMaster with (INDEX(cmid_modeofunbar_simregdate))
    join ss_scanimage with (index(indexCMID_UplodDate_distId)) on CM_ID=SI_CUSTOMERID -- and SI_ID =(Select MAX(si_id) from ss_scanimage where CM_ID=SI_CUSTOMERID)
    inner join SS_AuditMaster a WITH (updlock, readpast) on AM_ID=(Select MAX(AM_ID) from SS_AuditMaster where SI_CUSTOMERID=AM_CUSTOMERID)
    left join ss_Formstatus on FS_CM_ID=si_customerid and FS_LCS1RECDATE >=AM_AUDITDATE
    left join ss_dnoteinfo on si_customerid=di_customerid and di_dnotedate>=AM_AUDITDATE
    join ss_distributormaster on si_dist_id=dm_id
    inner join SS_ZoneMaster on ZN_ID=DM_ZONE
    -- left join ss_cafentryallocation on CA_AUDITID=am_id
    where NOT exists (select 1 from CafEntryAllocation WITH (updlock, readpast) where CA_AUDITID=am_id )
    -- NOT exists (select 1 from ss_cafentryallocation where CA_AUDITID=am_id )
    and am_auditdate >=GETDATE()-2 and (am_status = 'Direct Accept' or am_status ='Temporary Accept')
    and cm_modeofunbar in('MNP UNBAR','Normal','POST2PRE','RV','VF DONGLE') and CM_SIMREGDATE is not null
    and DM_DISRIBUTORTYPE in('HUB','Distributor') and ZN_ID=coalesce(4,ZN_ID)
    order by DM_ZONE, CASE WHEN FS_LCS1RECDATE Is NULL Then 1 Else 0 End,
    CASE WHEN di_dnotedate Is NULL Then 1 Else 0 End,di_dnotedate,AM_AUDITDATE

    Please quid me.

    babu.m

    ReplyDelete