Saturday, February 25, 2012

Bulk insert - to log or not to log

Dear all
I have a web application which has many customers. Each customer has their
own system, which means they have their own database. Each customer has
many users.
Out of office hours, I allow the customer to run very large data updates
against their systems. These are bulk inserts which use minimal logging. I
do this to ensure that they take as little time as possible.
However, some customers demand that they can do "small" data updates during
the day. If the data updates fall below a particular size threshold, then
they run during office hours. If one runs, there's minimal performance loss
to the other users on the server. However, if more than one runs at the
same time, then the server basically grinds to a halt, giving interactive
users extremely poor performance.
It was recommended to me that during office hours I should do bulk inserts
that were fully logged instead of minimally logged. The reasoning was that
the logging operation takes time, ensuring that the bulk insert can't hog
the processor, giving the interactive user far better performance. The
upload may take up to 10 times longer though.
This makes sense to me, but unfortunately I can't test it without doing it
on the live server. Any thoughts on this?
Thanks in advance
Griff
Griff
>It was recommended to me that during office hours I >should do bulk inserts
>that were fully logged instead of minimally logged.
It depends as far as I know on recovery model of the database. Only if you
have full recovery model the bulk operation is fully logged. if the
transaction is failed you are able to restore the log file at point of time.
"Griff" <Howling@.The.Moon> wrote in message
news:uVuQX7$jEHA.2664@.TK2MSFTNGP11.phx.gbl...
> Dear all
> I have a web application which has many customers. Each customer has
their
> own system, which means they have their own database. Each customer has
> many users.
> Out of office hours, I allow the customer to run very large data updates
> against their systems. These are bulk inserts which use minimal logging.
I
> do this to ensure that they take as little time as possible.
> However, some customers demand that they can do "small" data updates
during
> the day. If the data updates fall below a particular size threshold, then
> they run during office hours. If one runs, there's minimal performance
loss
> to the other users on the server. However, if more than one runs at the
> same time, then the server basically grinds to a halt, giving interactive
> users extremely poor performance.
> It was recommended to me that during office hours I should do bulk inserts
> that were fully logged instead of minimally logged. The reasoning was
that
> the logging operation takes time, ensuring that the bulk insert can't hog
> the processor, giving the interactive user far better performance. The
> upload may take up to 10 times longer though.
> This makes sense to me, but unfortunately I can't test it without doing it
> on the live server. Any thoughts on this?
> Thanks in advance
> Griff
>
|||
> It depends as far as I know on recovery model of the database. Only if you
> have full recovery model the bulk operation is fully logged. if the
> transaction is failed you are able to restore the log file at point of
time.
Hi Uri
True, but presumably the recovery option can be changed so that it's FULL
during office hours and BULK-LOGGED at other times. I'm not sure though
whether it's sensible to keep changing the recovery model...my guess is
not.
Griff
|||Full recovery model will certainly cause a big increase in IO activity in
your log file during th day. Writing those log records also cnosume a lot of
CPU. It is doubtful that other database users can see better performance
while this is going on. I can not say for sure that you will see a
performance degradation, but the chance is pretty high.
I would start with checking the following things first:
Do you have blocking and deadlock problem when you introduce more than one
update workload (bulk insert)?
Is there lock escalation?
Is it 100% CPU? -- check performance monitor
Or Is it very long disk queue length? -- check performance monitor
A starting point is to read this article:
http://msdn.microsoft.com/library/de...l7perftune.asp
and you will find some really great SQL Server performance tuning articles
on the web.
I recommend that you have a test system to experiment with various settings.
If you are MSDN subscriber, you may also want to check out SQL Server 2005
beta 2. It has a feature called "read committed snapshot" that may increase
throughput if you have a blocking and deadlock problem.
Wei Xiao [MSFT]
SQL Server Storage Engine Development
This posting is provided "AS IS" with no warranties, and confers no rights.
"Griff" <Howling@.The.Moon> wrote in message
news:uVuQX7$jEHA.2664@.TK2MSFTNGP11.phx.gbl...
> Dear all
> I have a web application which has many customers. Each customer has
their
> own system, which means they have their own database. Each customer has
> many users.
> Out of office hours, I allow the customer to run very large data updates
> against their systems. These are bulk inserts which use minimal logging.
I
> do this to ensure that they take as little time as possible.
> However, some customers demand that they can do "small" data updates
during
> the day. If the data updates fall below a particular size threshold, then
> they run during office hours. If one runs, there's minimal performance
loss
> to the other users on the server. However, if more than one runs at the
> same time, then the server basically grinds to a halt, giving interactive
> users extremely poor performance.
> It was recommended to me that during office hours I should do bulk inserts
> that were fully logged instead of minimally logged. The reasoning was
that
> the logging operation takes time, ensuring that the bulk insert can't hog
> the processor, giving the interactive user far better performance. The
> upload may take up to 10 times longer though.
> This makes sense to me, but unfortunately I can't test it without doing it
> on the live server. Any thoughts on this?
> Thanks in advance
> Griff
>

No comments:

Post a Comment