I need to do a bulk update of 1 field in a data-table (removing
zero-padding). The update is expected to update 52.5 million records.
What techniques could I use so I can ensure
1) Consistency in the transactional log backups (every 10 mins)
2) Consistency in the full DB backup every day
3) The disk drive holding the transaction logs does not fill up completely (I suspect
even if I set ROWCOUNT to 10,000 and loop through, disk space would still
be used)
Target environment
SQL Server 2000 cluster on 2 nodes running Windows 2003 Server
(Different disk drives for data and log files)

A normal set-based operation will do fine:

update mytable set myfield = 'mynewvalue'

However, this will create significant locking, so you're only going to be
this aggressive on a database that's not in use (overnight, for most), or you
can write a cursor that includes some batching logic, but that's going to be slow.
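For the zero-padding case in the original question, that single statement might
look something like the sketch below (the table and column names, cor_transactions
and id_Entity, are taken from a later post in this thread, and the assumption is
that exactly one leading zero is being stripped):

UPDATE cor_transactions
SET id_Entity = SUBSTRING(id_Entity, 2, LEN(id_Entity) - 1)  -- drop the leading '0'
WHERE id_Entity LIKE '0%'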
DatabaseAdmins.com
Databases Reliable. Available.
"Patrick" wrote:
> I need to do a bulk update of 1 field in a data-table (removing
> zero-padding). The update is expected to update 52.5 million records.
> What techniques could I use so I can ensure
> 1) Consistency in the transactional log backups (every 10 mins)
> 2) Consistency in the full DB backup every day
> 3) The disk drive holding the transaction logs does not fill up completely (I suspect
> even if I set ROWCOUNT to 10,000 and loop through, disk space would still
> be used)
> Target environment
> SQL Server 2000 cluster on 2 nodes running Windows 2003 Server
> (Different disk drives for data and log files)

SQL Server takes care of consistency. To make sure you don't blow up the
transaction log, dividing the update into smaller chunks (e.g. 10,000 rows
at a time, as you mentioned) is the right approach in general.
I'd like to add that you may have to pace your updates (e.g. sleep for some
time after each iteration) so that your transaction log backups at the
10-minute interval have a chance to come in and clear out the inactive
portion of the log for reuse. Alternatively, you can back up the transaction
log after each iteration (or every so many iterations) in the loop. You need
to experiment a bit to find the best chunk size and whether a loop without
any wait time would be too tight in your particular configuration.
Also, it's certainly possible you can choose to update all the rows in a
single shot, depending on the size of your transaction log.
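To make that concrete, here is a rough sketch of such a chunked loop (an
illustration only, not a tested script; the table and column names are taken
from the later post in this thread, and the batch size, pause length, database
name and backup path are assumptions):

-- Batched update: SET ROWCOUNT caps each UPDATE at one chunk, and the WAITFOR
-- pause gives the scheduled log backups a chance to run between chunks.
DECLARE @rows int
SET @rows = 1
SET ROWCOUNT 10000                 -- chunk size (assumption)
WHILE @rows > 0
BEGIN
    UPDATE cor_transactions        -- table/column names borrowed from the later post
    SET id_Entity = SUBSTRING(id_Entity, 2, LEN(id_Entity) - 1)
    WHERE id_Entity LIKE '0%'

    SET @rows = @@ROWCOUNT         -- capture immediately after the UPDATE

    -- Or back up the log yourself every so many iterations (names are assumptions):
    -- BACKUP LOG MyDatabase TO DISK = 'E:\LogBackups\MyDatabase_log.bak'

    WAITFOR DELAY '00:00:05'       -- pause so log backups can clear the inactive log
END
SET ROWCOUNT 0                     -- reset the session to no row limit

Each UPDATE above runs as its own autocommit transaction, so the log records
from earlier chunks become reusable as soon as a log backup runs.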
Linchi
"burt_king" wrote:
> A normal set-based operation will do fine:
> update mytable set myfield = 'mynewvalue'
> However, this will create significant locking, so you're only going to be
> this aggressive on a database that's not in use (overnight, for most), or you
> can write a cursor that includes some batching logic, but that's going to be slow.
> --
> DatabaseAdmins.com
> Databases Reliable. Available.
>
> "Patrick" wrote:
>

I have composed the following SQL; could you tell me what is wrong? The
SELECT GETDATE() only printed fewer than 10 dates, when I am expecting
millions of records to be updated!
DECLARE @recCount int

SELECT @recCount = 50001
SET ROWCOUNT 50000
WHILE @recCount > 0
BEGIN
    SELECT GETDATE()
    WAITFOR DELAY '000:00:5'
    BEGIN TRAN
    UPDATE cor_transactions
    SET id_Entity = SUBSTRING(id_Entity, 2, 3)
    WHERE len(id_Entity) > 3
      AND id_Entity LIKE '0%'
    IF @@ERROR = 0
        COMMIT TRAN
    ELSE
    BEGIN
        ROLLBACK TRAN
        PRINT 'ERROR' + CAST(@@ERROR AS varchar(50))
        BREAK
    END
    SELECT @recCount = @@ROWCOUNT
END

SELECT @recCount = 50001
SET ROWCOUNT 50000
WHILE @recCount > 0
BEGIN
    SELECT GETDATE()
    WAITFOR DELAY '000:00:5'
    BEGIN TRAN
    UPDATE cor_stocks
    SET id_Entity = SUBSTRING(id_Entity, 2, 3)
    WHERE len(id_Entity) > 3
      AND id_Entity LIKE '0%'
    IF @@ERROR = 0
        COMMIT TRAN
    ELSE
    BEGIN
        ROLLBACK TRAN
        PRINT 'ERROR' + CAST(@@ERROR AS varchar(50))
        BREAK
    END
    SELECT @recCount = @@ROWCOUNT
END

SELECT @recCount = 50001
SET ROWCOUNT 50000
WHILE @recCount > 0
BEGIN
    SELECT GETDATE()
    WAITFOR DELAY '000:00:5'
    BEGIN TRAN
    UPDATE cor_inventory
    SET id_Entity = SUBSTRING(id_Entity, 2, 3)
    WHERE len(id_Entity) > 3
      AND id_Entity LIKE '0%'
    IF @@ERROR = 0
        COMMIT TRAN
    ELSE
    BEGIN
        ROLLBACK TRAN
        PRINT 'ERROR' + CAST(@@ERROR AS varchar(50))
        BREAK
    END
    SELECT @recCount = @@ROWCOUNT
END
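For reference, one likely explanation for the early exit (offered as an
observation, not a verified diagnosis of this exact script): COMMIT TRANSACTION
resets @@ROWCOUNT to 0, so reading it only after the IF ... COMMIT block no
longer reflects the UPDATE; @recCount then becomes 0 and each WHILE loop ends
after a single pass. A minimal adjustment, assuming an extra int variable @err
is declared alongside @recCount, is to capture both system functions in one
statement immediately after the UPDATE:

UPDATE cor_transactions
SET id_Entity = SUBSTRING(id_Entity, 2, 3)
WHERE len(id_Entity) > 3
  AND id_Entity LIKE '0%'
-- Read both values before COMMIT/IF can reset them (@err is a hypothetical variable)
SELECT @recCount = @@ROWCOUNT, @err = @@ERROR
IF @err = 0
    COMMIT TRAN
ELSE
BEGIN
    ROLLBACK TRAN
    PRINT 'ERROR ' + CAST(@err AS varchar(50))
    BREAK
END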
"Linchi Shea" wrote:
> SQL Server takes care of consistency. To make sure you don't blow up the
> transaction log, dividing the update into smaller chunks (e.g. 10,000 rows
> at a time, as you mentioned) is the right approach in general.
> I'd like to add that you may have to pace your updates (e.g. sleep for some
> time after each iteration) so that your transaction log backups at the
> 10-minute interval have a chance to come in and clear out the inactive
> portion of the log for reuse. Alternatively, you can back up the transaction
> log after each iteration (or every so many iterations) in the loop. You need
> to experiment a bit to find the best chunk size and whether a loop without
> any wait time would be too tight in your particular configuration.
> Also, it's certainly possible you can choose to update all the rows in a
> single shot, depending on the size of your transaction log.
> Linchi
> "burt_king" wrote:
>