Thursday, March 8, 2012

BULK INSERT errors

I'm writing a .NET component that will bulk insert a file into a table.
I also have a string array inside the class whose purpose is to
capture the errors during the bulk insert process. Can this be done?

If you are using the SqlBulkCopy class, an error will roll back the batch
containing the problem row and the WriteToServer operation will terminate.
To trap insert errors on individual rows and still save the good data,
you could use a hybrid technique in which you write a batch at a time using
WriteToServer and fall back on individual insert commands if an error is
encountered.
1) Read a batch of records from file (e.g. 1000)
2) Write records as a single batch via SqlBulkCopy.WriteToServer
3) If a SqlException is thrown, insert rows individually and trap errors
Note that you'll also need to address file parsing errors, such as an
unexpected number of fields in a CSV record.
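The hybrid approach above might look something like this sketch. It assumes a DataTable named batch has already been loaded from the file; the table and column names (dbo.MyTable, Col1, Col2) are placeholders, not anything from the original posts.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

static void WriteBatchWithFallback(string connStr, DataTable batch)
{
    using (SqlConnection conn = new SqlConnection(connStr))
    {
        conn.Open();
        try
        {
            // Fast path: bulk copy the whole batch at once.
            using (SqlBulkCopy bcp = new SqlBulkCopy(conn))
            {
                bcp.DestinationTableName = "dbo.MyTable"; // placeholder
                bcp.WriteToServer(batch);
            }
        }
        catch (SqlException)
        {
            // Slow path: the batch failed, so insert rows one at a
            // time and record which individual rows are bad.
            foreach (DataRow row in batch.Rows)
            {
                try
                {
                    using (SqlCommand cmd = new SqlCommand(
                        "INSERT INTO dbo.MyTable (Col1, Col2) VALUES (@c1, @c2)",
                        conn))
                    {
                        cmd.Parameters.AddWithValue("@c1", row[0]);
                        cmd.Parameters.AddWithValue("@c2", row[1]);
                        cmd.ExecuteNonQuery();
                    }
                }
                catch (SqlException ex)
                {
                    Console.WriteLine("Row failed: " + ex.Message);
                }
            }
        }
    }
}
```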
Hope this helps.
Dan Guzman
SQL Server MVP
<tobinlim@.gmail.com> wrote in message
news:1149281083.751510.288450@.y43g2000cwc.googlegroups.com...

That's my biggest hurdle. I'm tied to the 1.1 Framework, which doesn't
have the SqlBulkCopy class :(
I do plan on sending a batch parameter to control for that. Maybe
there's a property for the OleDbCommand that captures errors.
Dan Guzman wrote:
> If you are using the SqlBulkCopy class, an error will roll back the batch
> containing the problem row and the WriteToServer operation will terminate.

With OleDbCommand (and SqlCommand), about all you can do is try/catch. If
you send multiple statements in a batch, it's hard to identify which insert
caused the error.
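On .NET 1.1, the try/catch-per-statement idea amounts to running one parameterized insert per row and collecting the failures, along the lines of this sketch (table and column names are placeholders):

```csharp
using System.Collections;
using System.Data;
using System.Data.OleDb;

static ArrayList InsertRows(OleDbConnection conn, DataTable rows)
{
    ArrayList errors = new ArrayList();
    OleDbCommand cmd = new OleDbCommand(
        "INSERT INTO MyTable (Col1, Col2) VALUES (?, ?)", conn);
    cmd.Parameters.Add("c1", OleDbType.VarChar, 50);
    cmd.Parameters.Add("c2", OleDbType.VarChar, 50);

    foreach (DataRow row in rows.Rows)
    {
        cmd.Parameters[0].Value = row[0];
        cmd.Parameters[1].Value = row[1];
        try
        {
            cmd.ExecuteNonQuery();
        }
        catch (OleDbException ex)
        {
            // One statement per command, so we know exactly which
            // row caused the error.
            errors.Add(ex.Message);
        }
    }
    return errors;
}
```

Because each ExecuteNonQuery sends a single statement, the ambiguity of multi-statement batches goes away, at the cost of one round trip per row.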
Hope this helps.
Dan Guzman
SQL Server MVP
<tobinlim@.gmail.com> wrote in message
news:1149365527.051997.242230@.u72g2000cwu.googlegroups.com...

What I meant is that my batch size is something that gets passed to a BULK
INSERT statement: "BATCHSIZE = " + intBatchSize + ", ..."
Anyway, if what you're saying is true, I might have to force the
batchsize to 1 and bulk insert one row at a time and try to catch
errors.
Is this what you're saying?
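For context, issuing a BULK INSERT statement with a BATCHSIZE option from .NET 1.1 code might look like this sketch. The file path, table name, and terminator are placeholders; the string concatenation mirrors the style of the post above, though the file path should come from trusted configuration, not user input.

```csharp
using System.Data.OleDb;

static void BulkInsert(OleDbConnection conn, string filePath, int intBatchSize)
{
    // BATCHSIZE controls how many rows are committed per batch.
    string sql = "BULK INSERT dbo.MyTable FROM '" + filePath + "' " +
                 "WITH (BATCHSIZE = " + intBatchSize + ", FIELDTERMINATOR = ',')";

    OleDbCommand cmd = new OleDbCommand(sql, conn);
    cmd.CommandTimeout = 0; // bulk loads of large files can run long
    cmd.ExecuteNonQuery();
}
```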
Dan Guzman wrote:
> With OleDbCommand (and SqlCommand), about all you can do is try/catch.

Let me try to clarify. Some errors, like a primary key violation, will
abort the entire BULK INSERT batch. You'll need to resort to individual
inserts to identify the problem data and persist the good records. I
wouldn't recommend BULK INSERT with a batch size of 1 because BULK INSERT is
optimized for large volumes of data. A batch size of 1 will be much slower
than regular inserts and defeats the purpose of BULK INSERT.
Other errors, such as a data type conversion error, will err the problem
record but not the batch. This will still throw a SqlException and you can
iterate over the Errors collection to get the details of the individual
errors. Unfortunately, it's not easy to programmatically distinguish
between errors that abort the BULK INSERT batch and those that err
individual records.
If your data is usually clean and you want the performance of BULK INSERT,
you might consider a modified version of the approach I suggested earlier:
1) BEGIN TRAN
2) BULK INSERT entire file
3) If no errors, COMMIT
4) If errors, ROLLBACK and use individual inserts to identify problem data
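The four steps above can be sketched as follows, using an explicit transaction so a failed bulk load leaves nothing behind. Names are placeholders, and RowByRowInsert stands in for the individual-insert fallback described earlier.

```csharp
using System.Data.OleDb;

static void LoadFile(OleDbConnection conn, string filePath)
{
    OleDbTransaction tran = conn.BeginTransaction();       // 1) BEGIN TRAN
    try
    {
        OleDbCommand cmd = new OleDbCommand(
            "BULK INSERT dbo.MyTable FROM '" + filePath + "'", conn, tran);
        cmd.ExecuteNonQuery();                             // 2) BULK INSERT file
        tran.Commit();                                     // 3) no errors: COMMIT
    }
    catch (OleDbException)
    {
        tran.Rollback();                                   // 4) errors: ROLLBACK...
        RowByRowInsert(conn, filePath);                    // ...then find bad rows
    }
}
```

When the data is usually clean, the expensive fallback runs rarely and the common case gets full bulk-load performance.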
Hope this helps.
Dan Guzman
SQL Server MVP
<tobinlim@.gmail.com> wrote in message
news:1149512589.648528.145350@.g10g2000cwb.googlegroups.com...

Thanks. I discovered the SqlException object in .NET. Very useful.
It's pretty much what I've been looking for. Now I noticed that if
multiple errors occur within a row insert, SqlException captures the
first error and moves on. I don't necessarily need to have all the
errors for each row. Just want to understand the nature of BULK INSERT
errors.
Dan Guzman wrote:
> Some errors, like a primary key violation, will abort the entire BULK
> INSERT batch.

> Thanks. I discovered the SqlException object in .NET. Very useful.
> It's pretty much what I've been looking for. Now I noticed that if
> multiple errors occur within a row insert, SqlException captures the
> first error and moves on.
Have you looked at SqlException.Errors? It contains a collection of
SqlError objects and, in my tests, included details of row-level errors.
However, as I mentioned earlier, you might find it problematic to
distinguish between errors that roll back the entire batch and errors that
err individual records.
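Walking that collection is straightforward; a minimal sketch for logging every error the server reported, not just the first message:

```csharp
using System;
using System.Data.SqlClient;

static void LogBulkInsertErrors(SqlException ex)
{
    // SqlException.Errors holds one SqlError per server message.
    foreach (SqlError err in ex.Errors)
    {
        // Number is the server error code (e.g. 2627 is a primary key
        // violation, which aborts the whole BULK INSERT batch).
        Console.WriteLine("Error {0}, line {1}: {2}",
            err.Number, err.LineNumber, err.Message);
    }
}
```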
Hope this helps.
Dan Guzman
SQL Server MVP
<tobinlim@.gmail.com> wrote in message
news:1149605426.090781.127220@.j55g2000cwa.googlegroups.com...
