Wednesday, March 7, 2012

Bulk Insert Data in Millions - Lock Issue

Hi all,
Hope there is a quick fix for this:

I am inserting data from one table to another on the same DB. The
insert is pretty simple as in:

insert into datatable(field1, field2, field3)
select a1, a2, a3 from temptable...

This inserts about 4 millions rows in one go. And since I had the
'cannot obtain lock resources' problem, several methods were suggested
by some web sites:
1) one to split the insert into smaller chunks (I have no idea how I
can spit a insert to insert only n records at a time..)

2)to use waitfor - which I did but did not fix the error.

3)use bulk insert (in t-sql) - I dont know how to do this?

As I see I am simply trying to move data from one table to another
(ofcourse lots of data) in SQL Server 2000 and I dont see one simple
solution to the locking problem.

any ideas on how best I can do this will save my day!

thanks all.Hi,

Adding in smaller chunks doesn't solve the problem. Adding small chunk is
the same as adding big chunk. Size doesn't matter.

Looks like you are out of option.

How about adding the records one by one in the middle of the night instead?

"adi" <Adityanad@.gmail.com> wrote in message
news:1129063850.247250.54770@.g49g2000cwa.googlegro ups.com...
> Hi all,
> Hope there is a quick fix for this:
>
> I am inserting data from one table to another on the same DB. The
> insert is pretty simple as in:
> insert into datatable(field1, field2, field3)
> select a1, a2, a3 from temptable...
>
> This inserts about 4 millions rows in one go. And since I had the
> 'cannot obtain lock resources' problem, several methods were suggested
> by some web sites:
> 1) one to split the insert into smaller chunks (I have no idea how I
> can spit a insert to insert only n records at a time..)
>
> 2)to use waitfor - which I did but did not fix the error.
>
> 3)use bulk insert (in t-sql) - I dont know how to do this?
>
> As I see I am simply trying to move data from one table to another
> (ofcourse lots of data) in SQL Server 2000 and I dont see one simple
> solution to the locking problem.
>
> any ideas on how best I can do this will save my day!
>
> thanks all.|||adi (Adityanad@.gmail.com) writes:
> I am inserting data from one table to another on the same DB. The
> insert is pretty simple as in:
> insert into datatable(field1, field2, field3)
> select a1, a2, a3 from temptable...
>
> This inserts about 4 millions rows in one go. And since I had the
> 'cannot obtain lock resources' problem, several methods were suggested
> by some web sites:
> 1) one to split the insert into smaller chunks (I have no idea how I
> can spit a insert to insert only n records at a time..)

There are a couple of techniques. But for best performance, you need
a clustered index in the table. This does not have to be unqiue though.
Say that you have a date column in the table that has a decent distribution.
Create a clustered index on tbat column:

CREATE CLUSTERED INDEX ON temptable(datecol)

Then you can do:

DECLARE @.this_date datetime,
@.next_date datetim
SELECT @.this_date = MIN(datecol) FROM temptable
WHILE @.this_date <= (SELECT datecol FROM temptable)
BEGIN
SELECT @.next_date = dateadd(MONTH, 1, @.this_date)
INSERT dataable (...)
SELECT ...
FROM temptable
WHERE datecol >= @.this_date
AND daetcol < @.next_date
SELECT @.this_date = @.next_date
END

It's important that the index is clustered to minimize seek times.
In the example I used month, but this can be changed as you see fit.

If you don't have a datetime column, maybe there is some id column or
similar. But you could even use a value like max temperature, as long
there is a decent disitribution.

You can also use SET ROWCOUNT to TOP to batch, but it's messier to
keep track of what you already inserted. (Or you use a WHERE NOT EXISTS,
with could have poor performance.)

> 2)to use waitfor - which I did but did not fix the error.

I will have to admit that I don't see the point with WAITFOR here.

> 3)use bulk insert (in t-sql) - I dont know how to do this?

This requires you to first unload the temptable to file with BCP, and
then load it to the target table with BCP or BULK INSERT. I skip
examples, as this would be a last resort for me.

Then again, your also try:

INSERT datatable (...) WITH (TABLOCKX)
SELECT ...

I have never tried it, but I expect it to lead to a single table lock
in the table, which which address the locking issue.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||Tom (nospam@.yahoo.com) writes:
> Adding in smaller chunks doesn't solve the problem. Adding small chunk is
> the same as adding big chunk. Size doesn't matter.

Huh? Care to elaborate?

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||There is no guarantee that the record in a smaller chunk will not be locked
by another user.

"Erland Sommarskog" <esquel@.sommarskog.se> wrote in message
news:Xns96ED17409A33Yazorman@.127.0.0.1...
> Tom (nospam@.yahoo.com) writes:
>> Adding in smaller chunks doesn't solve the problem. Adding small chunk
>> is
>> the same as adding big chunk. Size doesn't matter.
> Huh? Care to elaborate?
>
> --
> Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
> Books Online for SQL Server SP3 at
> http://www.microsoft.com/sql/techin.../2000/books.asp|||Tom (nospam@.yahoo.com) writes:
> There is no guarantee that the record in a smaller chunk will not be
> locked by another user.

I did not really recognize the error messages that Adi got, but it seemed
to me that the problem was to get locks on the table he is inserting to,
because of resource constraints, not because he was locked by another
user.

And even in a blocking scenario, smaller batches help as it reduces the
risk for exposure for blocking.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||Just a thought, its too early :). What would happen if he used NOLOCK
option? He doesnt really need locks around the place does he?

MC

"Erland Sommarskog" <esquel@.sommarskog.se> wrote in message
news:Xns96ECF2B98CD3FYazorman@.127.0.0.1...
> adi (Adityanad@.gmail.com) writes:
>> I am inserting data from one table to another on the same DB. The
>> insert is pretty simple as in:
>>
>> insert into datatable(field1, field2, field3)
>> select a1, a2, a3 from temptable...
>>
>>
>> This inserts about 4 millions rows in one go. And since I had the
>> 'cannot obtain lock resources' problem, several methods were suggested
>> by some web sites:
>> 1) one to split the insert into smaller chunks (I have no idea how I
>> can spit a insert to insert only n records at a time..)
> There are a couple of techniques. But for best performance, you need
> a clustered index in the table. This does not have to be unqiue though.
> Say that you have a date column in the table that has a decent
> distribution.
> Create a clustered index on tbat column:
> CREATE CLUSTERED INDEX ON temptable(datecol)
> Then you can do:
> DECLARE @.this_date datetime,
> @.next_date datetim
> SELECT @.this_date = MIN(datecol) FROM temptable
> WHILE @.this_date <= (SELECT datecol FROM temptable)
> BEGIN
> SELECT @.next_date = dateadd(MONTH, 1, @.this_date)
> INSERT dataable (...)
> SELECT ...
> FROM temptable
> WHERE datecol >= @.this_date
> AND daetcol < @.next_date
> SELECT @.this_date = @.next_date
> END
> It's important that the index is clustered to minimize seek times.
> In the example I used month, but this can be changed as you see fit.
> If you don't have a datetime column, maybe there is some id column or
> similar. But you could even use a value like max temperature, as long
> there is a decent disitribution.
> You can also use SET ROWCOUNT to TOP to batch, but it's messier to
> keep track of what you already inserted. (Or you use a WHERE NOT EXISTS,
> with could have poor performance.)
>> 2)to use waitfor - which I did but did not fix the error.
> I will have to admit that I don't see the point with WAITFOR here.
>> 3)use bulk insert (in t-sql) - I dont know how to do this?
> This requires you to first unload the temptable to file with BCP, and
> then load it to the target table with BCP or BULK INSERT. I skip
> examples, as this would be a last resort for me.
> Then again, your also try:
> INSERT datatable (...) WITH (TABLOCKX)
> SELECT ...
> I have never tried it, but I expect it to lead to a single table lock
> in the table, which which address the locking issue.
>
> --
> Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
> Books Online for SQL Server SP3 at
> http://www.microsoft.com/sql/techin.../2000/books.asp|||MC (marko_culo#@.#yahoo#.#com#) writes:
> Just a thought, its too early :). What would happen if he used NOLOCK
> option? He doesnt really need locks around the place does he?

If the problem is on the table he is reading from yes. But I suspect that
it's on the table he is inserting to.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||Having used this approach I still want to improve its performance.
My DBA says that splitting the insert statement into smaller chunks
will be very helpful - that is insert n number of records at a time.
so if I have 1 million rows to insert, he says its good to insert 50k
at a time.

Can someone help me on how to split the below insert to do that?

insert into datatable(field1, field2, field3)
select a1, a2, a3 from temptable.|||adi (Adityanad@.gmail.com) writes:
> Having used this approach I still want to improve its performance.
> My DBA says that splitting the insert statement into smaller chunks
> will be very helpful - that is insert n number of records at a time.
> so if I have 1 million rows to insert, he says its good to insert 50k
> at a time.
> Can someone help me on how to split the below insert to do that?
> insert into datatable(field1, field2, field3)
> select a1, a2, a3 from temptable.

Which approach?

I did suggest a method for batching earlier in the thread. For your con-
venience I repost this here:

........................

There are a couple of techniques. But for best performance, you need
a clustered index in the table. This does not have to be unqiue though.
Say that you have a date column in the table that has a decent distribution.
Create a clustered index on tbat column:

CREATE CLUSTERED INDEX ON temptable(datecol)

Then you can do:

DECLARE @.this_date datetime,
@.next_date datetim
SELECT @.this_date = MIN(datecol) FROM temptable
WHILE @.this_date <= (SELECT datecol FROM temptable)
BEGIN
SELECT @.next_date = dateadd(MONTH, 1, @.this_date)
INSERT dataable (...)
SELECT ...
FROM temptable
WHERE datecol >= @.this_date
AND daetcol < @.next_date
SELECT @.this_date = @.next_date
END

It's important that the index is clustered to minimize seek times.
In the example I used month, but this can be changed as you see fit.

If you don't have a datetime column, maybe there is some id column or
similar. But you could even use a value like max temperature, as long
there is a decent disitribution.

You can also use SET ROWCOUNT to TOP to batch, but it's messier to
keep track of what you already inserted. (Or you use a WHERE NOT EXISTS,
with could have poor performance.)

> 2)to use waitfor - which I did but did not fix the error.

I will have to admit that I don't see the point with WAITFOR here.

> 3)use bulk insert (in t-sql) - I dont know how to do this?

This requires you to first unload the temptable to file with BCP, and
then load it to the target table with BCP or BULK INSERT. I skip
examples, as this would be a last resort for me.

Then again, your also try:

INSERT datatable (...) WITH (TABLOCKX)
SELECT ...

I have never tried it, but I expect it to lead to a single table lock
in the table, which which address the locking issue.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp

No comments:

Post a Comment