Showing posts with label contains. Show all posts
Showing posts with label contains. Show all posts

Sunday, March 25, 2012

BULK INSERT: high disk queues (SQL Server 2000 SP4)

Hi,
I am using BUK INSERT to fill a table that usually contains about 100
mio. rows with about 10000 new rows per batch. This usually works fine
and takes about 1-2 seconds. Such bulk insert operation is executed
about every five seconds.
The problem is, that after about every 2 minutes one bulk exec will
take about 75 seconds, causing enourmous disk queues. After that
everything is fine again - for about two minutes.
There are some other strange facts:
1) this behaviour did only show up after the installation of service
pack 4;
2) this behaviour usually vanished after a couple of days with
continuous bulk inserts and would then only show up again after a sql
server server restart (but currently I have one server that shows the
problem since three weeks already);
3) I have this behaviour on both of our production databases but on one
server it simply disappeared (but will probably be there again after a
server restart) and on the other server it won't go away now.
I am quiet desperate with this situation and would be most thankful for
any hint.
Regards
DCIs this slowdown and high disk queues associated with a checkpoint?
Checkthe Sql perfmon counter for Checkpoint Pages Per second.
Andrew J. Kelly SQL MVP
<dc@.upsize.de> wrote in message
news:1147686058.831935.28980@.v46g2000cwv.googlegroups.com...
> Hi,
> I am using BUK INSERT to fill a table that usually contains about 100
> mio. rows with about 10000 new rows per batch. This usually works fine
> and takes about 1-2 seconds. Such bulk insert operation is executed
> about every five seconds.
> The problem is, that after about every 2 minutes one bulk exec will
> take about 75 seconds, causing enourmous disk queues. After that
> everything is fine again - for about two minutes.
> There are some other strange facts:
> 1) this behaviour did only show up after the installation of service
> pack 4;
> 2) this behaviour usually vanished after a couple of days with
> continuous bulk inserts and would then only show up again after a sql
> server server restart (but currently I have one server that shows the
> problem since three weeks already);
> 3) I have this behaviour on both of our production databases but on one
> server it simply disappeared (but will probably be there again after a
> server restart) and on the other server it won't go away now.
> I am quiet desperate with this situation and would be most thankful for
> any hint.
> Regards
> DC
>|||Dear Andrew,
I added the counter and yes: the very moment when the disk queues
start, the number of checkpoints per seconds jumps right up from about
0-2 per second to a number of 500-1000 checkpoints per second. I don't
know what that means - but sure hope you have another clue for me.
Regards
DC|||May I add to this: the secondary server, which is identical (physical
and configuration-wise) to the queuing server and also receives the
exact same bulk inserts, does only show a few checkpoint pages per
second and only every few minutes for only two or three seconds.|||Are both databases using the same recovery model ..?
SELECT DATABASEPROPERTYEX('MyDatabase', 'Recovery')
Also, is the recovery interval the same for both databases, sp_configure
HTH. Ryan
<dc@.upsize.de> wrote in message
news:1147700793.823997.289510@.g10g2000cwb.googlegroups.com...
> May I add to this: the secondary server, which is identical (physical
> and configuration-wise) to the queuing server and also receives the
> exact same bulk inserts, does only show a few checkpoint pages per
> second and only every few minutes for only two or three seconds.
>|||Both servers use the simple recovery model.
sp_configure output is also identical on both server, recovery interval
(min) is set to the default 0.|||In simple recovery mode CHECKPOINT happens automatically when the log
reaches 70% full. Are the log files the same size ..?
dbcc sqlperf(logspace)
HTH. Ryan
<dc@.upsize.de> wrote in message
news:1147703050.779437.211430@.u72g2000cwu.googlegroups.com...
> Both servers use the simple recovery model.
> sp_configure output is also identical on both server, recovery interval
> (min) is set to the default 0.
>|||Both machines are configured to use up to 50 GB of logspace (on three
different harddrives; harddrive size and raid type is the same on both
boxes).
The server with the massive queuing currently utilizes only 500 MB of
transaction log size (and displays that 20% of that is being used),
while the server which is working fine utilizes 37 GB of log files (and
reports that 97% of that is being used).|||> The server with the massive queuing currently utilizes only 500 MB of
> transaction log size (and displays that 20% of that is being used),
> while the server which is working fine utilizes 37 GB of log files (and
> reports that 97% of that is being used).
Seems that the physical file size for the log files are not the same, and yo
u are seeing a side
effect of having a "too small" log file. SQL Server does a checkpoint when t
he log file is 70% full,
which turn out to be very frequently on the system with the smaller file. Gr
ow the file and you will
probably see a difference.
Good catch by Ryan, btw...
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
<dc@.upsize.de> wrote in message news:1147706961.661923.196630@.u72g2000cwu.googlegroups.com..
.
> Both machines are configured to use up to 50 GB of logspace (on three
> different harddrives; harddrive size and raid type is the same on both
> boxes).
> The server with the massive queuing currently utilizes only 500 MB of
> transaction log size (and displays that 20% of that is being used),
> while the server which is working fine utilizes 37 GB of log files (and
> reports that 97% of that is being used).
>|||How can I force the log file to grow? I thought it would grow
automatically on demand. Can I for example set the log file to a fixed
size?
BTW: I have been watching dbcc sqlperf(logspace) on the server with the
massive checkpoints and queueing. The log file utilization was growing
from around 10 to 20 percent when the queueing started - it definitely
did not hit 70%.

BULK INSERT: high disk queues (SQL Server 2000 SP4)

Hi,
I am using BUK INSERT to fill a table that usually contains about 100
mio. rows with about 10000 new rows per batch. This usually works fine
and takes about 1-2 seconds. Such bulk insert operation is executed
about every five seconds.
The problem is, that after about every 2 minutes one bulk exec will
take about 75 seconds, causing enourmous disk queues. After that
everything is fine again - for about two minutes.
There are some other strange facts:
1) this behaviour did only show up after the installation of service
pack 4;
2) this behaviour usually vanished after a couple of days with
continuous bulk inserts and would then only show up again after a sql
server server restart (but currently I have one server that shows the
problem since three weeks already);
3) I have this behaviour on both of our production databases but on one
server it simply disappeared (but will probably be there again after a
server restart) and on the other server it won't go away now.
I am quiet desperate with this situation and would be most thankful for
any hint.
Regards
DCIs this slowdown and high disk queues associated with a checkpoint?
Checkthe Sql perfmon counter for Checkpoint Pages Per second.
--
Andrew J. Kelly SQL MVP
<dc@.upsize.de> wrote in message
news:1147686058.831935.28980@.v46g2000cwv.googlegroups.com...
> Hi,
> I am using BUK INSERT to fill a table that usually contains about 100
> mio. rows with about 10000 new rows per batch. This usually works fine
> and takes about 1-2 seconds. Such bulk insert operation is executed
> about every five seconds.
> The problem is, that after about every 2 minutes one bulk exec will
> take about 75 seconds, causing enourmous disk queues. After that
> everything is fine again - for about two minutes.
> There are some other strange facts:
> 1) this behaviour did only show up after the installation of service
> pack 4;
> 2) this behaviour usually vanished after a couple of days with
> continuous bulk inserts and would then only show up again after a sql
> server server restart (but currently I have one server that shows the
> problem since three weeks already);
> 3) I have this behaviour on both of our production databases but on one
> server it simply disappeared (but will probably be there again after a
> server restart) and on the other server it won't go away now.
> I am quiet desperate with this situation and would be most thankful for
> any hint.
> Regards
> DC
>|||Dear Andrew,
I added the counter and yes: the very moment when the disk queues
start, the number of checkpoints per seconds jumps right up from about
0-2 per second to a number of 500-1000 checkpoints per second. I don't
know what that means - but sure hope you have another clue for me.
Regards
DC|||May I add to this: the secondary server, which is identical (physical
and configuration-wise) to the queuing server and also receives the
exact same bulk inserts, does only show a few checkpoint pages per
second and only every few minutes for only two or three seconds.|||Are both databases using the same recovery model ..?
SELECT DATABASEPROPERTYEX('MyDatabase', 'Recovery')
Also, is the recovery interval the same for both databases, sp_configure
--
HTH. Ryan
<dc@.upsize.de> wrote in message
news:1147700793.823997.289510@.g10g2000cwb.googlegroups.com...
> May I add to this: the secondary server, which is identical (physical
> and configuration-wise) to the queuing server and also receives the
> exact same bulk inserts, does only show a few checkpoint pages per
> second and only every few minutes for only two or three seconds.
>|||Both servers use the simple recovery model.
sp_configure output is also identical on both server, recovery interval
(min) is set to the default 0.|||In simple recovery mode CHECKPOINT happens automatically when the log
reaches 70% full. Are the log files the same size ..?
dbcc sqlperf(logspace)
HTH. Ryan
<dc@.upsize.de> wrote in message
news:1147703050.779437.211430@.u72g2000cwu.googlegroups.com...
> Both servers use the simple recovery model.
> sp_configure output is also identical on both server, recovery interval
> (min) is set to the default 0.
>|||Both machines are configured to use up to 50 GB of logspace (on three
different harddrives; harddrive size and raid type is the same on both
boxes).
The server with the massive queuing currently utilizes only 500 MB of
transaction log size (and displays that 20% of that is being used),
while the server which is working fine utilizes 37 GB of log files (and
reports that 97% of that is being used).|||> The server with the massive queuing currently utilizes only 500 MB of
> transaction log size (and displays that 20% of that is being used),
> while the server which is working fine utilizes 37 GB of log files (and
> reports that 97% of that is being used).
Seems that the physical file size for the log files are not the same, and you are seeing a side
effect of having a "too small" log file. SQL Server does a checkpoint when the log file is 70% full,
which turn out to be very frequently on the system with the smaller file. Grow the file and you will
probably see a difference.
Good catch by Ryan, btw...
--
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
<dc@.upsize.de> wrote in message news:1147706961.661923.196630@.u72g2000cwu.googlegroups.com...
> Both machines are configured to use up to 50 GB of logspace (on three
> different harddrives; harddrive size and raid type is the same on both
> boxes).
> The server with the massive queuing currently utilizes only 500 MB of
> transaction log size (and displays that 20% of that is being used),
> while the server which is working fine utilizes 37 GB of log files (and
> reports that 97% of that is being used).
>|||How can I force the log file to grow? I thought it would grow
automatically on demand. Can I for example set the log file to a fixed
size?
BTW: I have been watching dbcc sqlperf(logspace) on the server with the
massive checkpoints and queueing. The log file utilization was growing
from around 10 to 20 percent when the queueing started - it definitely
did not hit 70%.|||If you do enough changes the recovery interval can force a checkpoint before
the 70% fullness is reached. You can change the size of the log file with
alter database command or thru Enterprise Manager. I would make both of them
the same size so you can rule that part out. Just because the machines seem
identical they may not be. One thing that affects how long a checkpoint
takes is the size and configuration of the cache on the disk controllers or
the SAN. IS the cache the same on both servers? Do you have write back
cache enabled and if so what is the read to write ratio?
--
Andrew J. Kelly SQL MVP
<dc@.upsize.de> wrote in message
news:1147726750.397611.19450@.i39g2000cwa.googlegroups.com...
> How can I force the log file to grow? I thought it would grow
> automatically on demand. Can I for example set the log file to a fixed
> size?
> BTW: I have been watching dbcc sqlperf(logspace) on the server with the
> massive checkpoints and queueing. The log file utilization was growing
> from around 10 to 20 percent when the queueing started - it definitely
> did not hit 70%.
>|||A slightly different approach would be to take control of the CHECKPOINT
yourself.
Try a different batch size (say 1000) and issue a CHECKPOINT after each
batch, that way the disk controller should receive a steady trickle of data
rather than being swamped
--
HTH. Ryan
"Andrew J. Kelly" <sqlmvpnooospam@.shadhawk.com> wrote in message
news:ePFexUGeGHA.2188@.TK2MSFTNGP05.phx.gbl...
> If you do enough changes the recovery interval can force a checkpoint
> before the 70% fullness is reached. You can change the size of the log
> file with alter database command or thru Enterprise Manager. I would make
> both of them the same size so you can rule that part out. Just because the
> machines seem identical they may not be. One thing that affects how long a
> checkpoint takes is the size and configuration of the cache on the disk
> controllers or the SAN. IS the cache the same on both servers? Do you
> have write back cache enabled and if so what is the read to write ratio?
> --
> Andrew J. Kelly SQL MVP
>
> <dc@.upsize.de> wrote in message
> news:1147726750.397611.19450@.i39g2000cwa.googlegroups.com...
>> How can I force the log file to grow? I thought it would grow
>> automatically on demand. Can I for example set the log file to a fixed
>> size?
>> BTW: I have been watching dbcc sqlperf(logspace) on the server with the
>> massive checkpoints and queueing. The log file utilization was growing
>> from around 10 to 20 percent when the queueing started - it definitely
>> did not hit 70%.
>|||Coincidentally this morning the queueing on the problematic server has
disappeared. Without obvious reason, after three weeks of problems.
However, I am sure that the problems will reappear once I restart the
server, since that has happened in the past.
When I manually launch a CHECKPOINT command on the server, it will
queue for about five minutes. And when I launch another checkpoint just
a few seconds later, it will again cause the disks to spin for a couple
of minutes.
dbcc sqlperf(logspace) now shows, that the database on the problematic
server is using 20 GB of log space. Maybe
ALTER DATABASE MyDB
MODIFY FILE
(NAME = MyLogfile,
SIZE = 40GB)
will help to avoid the automatic checkpoints once I have to restart one
of the servers.
I will also try to install the post SP4 hotfixes next time. I think
that these bulk insert related queues / checkpoints are a buggy sql
server behaviour since it did not appear before SP4.|||Today the problematic server started unwanted checkpoints and queueing
again. Which is strange, since in the past the servers were fine once
the queueing ceased - until the server had to be restarted.
I increased the logfile size with ALTER DATABASE - no change. I
installed the post-SP4 hotfixed - still no success.
The only countermeasure that had an effect is
sp_configure "recovery_interval", 60
RECONFIGURE WITH OVERRIDE
but this will still result in longer (about 3 minutes) queueing in
bigger intervals (~20 minutes).
Can I somehow find out what exactly is causing the checkpoint? Can I
see the checkpoint time and reason somewhere in the system tables? The
only difference between the two servers that I can see now, is that the
problematic server uses only a few percent of logspace (never hitting
70%), while the OK server always uses > 95% of the log space.
It is too bad that I cannot use non-logged or minimally-logged BULK
INSERTS (I cannot use TABLOCK because the table is also permanently
being queryied) since I believe that would make a difference in my
scenario.
I am wondering if SQL Server 2005 will perform better in my scenario
with permanent bulk inserts and also massive updating and simultaneous
querying. Data integrity is really not that important for my
application, but speed is. I wish I could deactivate all logging.|||Are you sure they are all in the same recovery model? In any case I really
think you are going down the wrong road. The bottom line is that if
checkpoints cause too much pain and last too long you do not have a proper
disk subsystem or configuration to handle the load.IS the log file on a RAID
1 or RAID 10 with no other types of files? What is the array configuration
for the data? How much and what is the configuration for the cache on the
controller or SAN processor?
--
Andrew J. Kelly SQL MVP
<dc@.upsize.de> wrote in message
news:1147959206.497801.43880@.j55g2000cwa.googlegroups.com...
> Today the problematic server started unwanted checkpoints and queueing
> again. Which is strange, since in the past the servers were fine once
> the queueing ceased - until the server had to be restarted.
> I increased the logfile size with ALTER DATABASE - no change. I
> installed the post-SP4 hotfixed - still no success.
> The only countermeasure that had an effect is
> sp_configure "recovery_interval", 60
> RECONFIGURE WITH OVERRIDE
> but this will still result in longer (about 3 minutes) queueing in
> bigger intervals (~20 minutes).
> Can I somehow find out what exactly is causing the checkpoint? Can I
> see the checkpoint time and reason somewhere in the system tables? The
> only difference between the two servers that I can see now, is that the
> problematic server uses only a few percent of logspace (never hitting
> 70%), while the OK server always uses > 95% of the log space.
> It is too bad that I cannot use non-logged or minimally-logged BULK
> INSERTS (I cannot use TABLOCK because the table is also permanently
> being queryied) since I believe that would make a difference in my
> scenario.
> I am wondering if SQL Server 2005 will perform better in my scenario
> with permanent bulk inserts and also massive updating and simultaneous
> querying. Data integrity is really not that important for my
> application, but speed is. I wish I could deactivate all logging.
>|||====================================="Andrew J. Kelly" <sqlmvpnooospam@.shadhawk.com> wrote in message
news:#jJQM#teGHA.4828@.TK2MSFTNGP05.phx.gbl...
> Are you sure they are all in the same recovery model? In any case I
really
> think you are going down the wrong road. The bottom line is that if
> checkpoints cause too much pain and last too long you do not have a proper
> disk subsystem or configuration to handle the load.IS the log file on a
RAID
> 1 or RAID 10 with no other types of files? What is the array
configuration
> for the data? How much and what is the configuration for the cache on the
> controller or SAN processor?
> --
> Andrew J. Kelly SQL MVP
>
> <dc@.upsize.de> wrote in message
> news:1147959206.497801.43880@.j55g2000cwa.googlegroups.com...
> > Today the problematic server started unwanted checkpoints and queueing
> > again. Which is strange, since in the past the servers were fine once
> > the queueing ceased - until the server had to be restarted.
> >
> > I increased the logfile size with ALTER DATABASE - no change. I
> > installed the post-SP4 hotfixed - still no success.
> >
> > The only countermeasure that had an effect is
> >
> > sp_configure "recovery_interval", 60
> >
> > RECONFIGURE WITH OVERRIDE
> >
> > but this will still result in longer (about 3 minutes) queueing in
> > bigger intervals (~20 minutes).
> >
> > Can I somehow find out what exactly is causing the checkpoint? Can I
> > see the checkpoint time and reason somewhere in the system tables? The
> > only difference between the two servers that I can see now, is that the
> > problematic server uses only a few percent of logspace (never hitting
> > 70%), while the OK server always uses > 95% of the log space.
> >
> > It is too bad that I cannot use non-logged or minimally-logged BULK
> > INSERTS (I cannot use TABLOCK because the table is also permanently
> > being queryied) since I believe that would make a difference in my
> > scenario.
> >
> > I am wondering if SQL Server 2005 will perform better in my scenario
> > with permanent bulk inserts and also massive updating and simultaneous
> > querying. Data integrity is really not that important for my
> > application, but speed is. I wish I could deactivate all logging.
> >
>|||I am ruling out disk performance, since the problem occurs on two
servers with identical hardware and configuration (including recovery
model) and it only occurs after the servers have been restarted. And on
one of the servers the problem seems to have manifested since it is
still there even weeks after restarting.
I never had this "too many checkpoints" problem before SP4. I am
thinking about going back to SP3. However, if I could somehow diagnose
why the checkpoints are occuring that would help. As mentioned before,
the situation got better with changing the "recovery_interval" to 300.
However, if I change the interval to 3000 or 30000 that does not seem
to make much of a difference, still a checkpoint will occur in 20-30
minute intervals.|||I'm not sure if this posted since it's been over 10 minutes and I got a blank
page after clicking the post button, so please ignore if my response is
already there:
Just out of curiosity, is there any reason why you're using the Simple
Recovery Model on a Production server that also receives a majority of their
inserts through Bulk-Insert? Why aren't you using the Bulk-Logged Recovery
Model which is designed for this particular situation? Each time you do a
Bulk-Insert of any magnitude, it is being logged, which will cause your
transaction logs to grow because each row that's being inserted is being
recorded, whereas using the Bulk-Logged Recovery will only record that you're
doing a Bulk-Insert, which will be less writes to your transaction log, less
auto-growth, and possibly less automatic checkpoints. I would follow the
suggestion of doing a manual checkpoint before each Bulk-Insert for piece of
mind though. Hopefully this helps.
Laurence
"Andrew J. Kelly" wrote:
> Are you sure they are all in the same recovery model? In any case I really
> think you are going down the wrong road. The bottom line is that if
> checkpoints cause too much pain and last too long you do not have a proper
> disk subsystem or configuration to handle the load.IS the log file on a RAID
> 1 or RAID 10 with no other types of files? What is the array configuration
> for the data? How much and what is the configuration for the cache on the
> controller or SAN processor?
> --
> Andrew J. Kelly SQL MVP
>
> <dc@.upsize.de> wrote in message
> news:1147959206.497801.43880@.j55g2000cwa.googlegroups.com...
> > Today the problematic server started unwanted checkpoints and queueing
> > again. Which is strange, since in the past the servers were fine once
> > the queueing ceased - until the server had to be restarted.
> >
> > I increased the logfile size with ALTER DATABASE - no change. I
> > installed the post-SP4 hotfixed - still no success.
> >
> > The only countermeasure that had an effect is
> >
> > sp_configure "recovery_interval", 60
> >
> > RECONFIGURE WITH OVERRIDE
> >
> > but this will still result in longer (about 3 minutes) queueing in
> > bigger intervals (~20 minutes).
> >
> > Can I somehow find out what exactly is causing the checkpoint? Can I
> > see the checkpoint time and reason somewhere in the system tables? The
> > only difference between the two servers that I can see now, is that the
> > problematic server uses only a few percent of logspace (never hitting
> > 70%), while the OK server always uses > 95% of the log space.
> >
> > It is too bad that I cannot use non-logged or minimally-logged BULK
> > INSERTS (I cannot use TABLOCK because the table is also permanently
> > being queryied) since I believe that would make a difference in my
> > scenario.
> >
> > I am wondering if SQL Server 2005 will perform better in my scenario
> > with permanent bulk inserts and also massive updating and simultaneous
> > querying. Data integrity is really not that important for my
> > application, but speed is. I wish I could deactivate all logging.
> >
>
>|||It is always OK for us to go back to the last backup. I thought that in
that scenario "simple" is the least demanding recovery model in terms
of logging. My understanding is that "bulk logged" mimics "simple" for
the bulk operations, but "full" for other operations. Since one of our
tables also receives a massive amount of updates, "bulk logged" mode
should result in more log file action.
However, out of desperation I switched to "bulk logged" on the
problematic server. That did not fix the problem, there is still a
checkpoint about every two minutes. I am currently working around by
setting the recovery_interval to 32767, which will still issue a
checkpoint or two per day and then block the server for about five
minutes (I wonder why the checkpoint does not take 32767 minutes since
that is what I thought the recovery_interval option indicates). I am
quiet sure that it will not take my database 32767 minutes to recover
after restarting the server but I am not willing to try.
I think that Sql Server is not calculating the time for a checkpoint
correctly under some circumstances with SP4 (don't know what exactly
the circumstances are since the other server is still operating
perfectly). I believe that the rows inserted with a bulk insert should
not trigger a checkpoint as quickly as they currently do, in other
words (sorry for my naive assumption) Sql Server takes to bulk inserted
rows for a reason to make a checkpoint appear.
Laurence schrieb:
> I'm not sure if this posted since it's been over 10 minutes and I got a blank
> page after clicking the post button, so please ignore if my response is
> already there:
> Just out of curiosity, is there any reason why you're using the Simple
> Recovery Model on a Production server that also receives a majority of their
> inserts through Bulk-Insert? Why aren't you using the Bulk-Logged Recovery
> Model which is designed for this particular situation? Each time you do a
> Bulk-Insert of any magnitude, it is being logged, which will cause your
> transaction logs to grow because each row that's being inserted is being
> recorded, whereas using the Bulk-Logged Recovery will only record that you're
> doing a Bulk-Insert, which will be less writes to your transaction log, less
> auto-growth, and possibly less automatic checkpoints. I would follow the
> suggestion of doing a manual checkpoint before each Bulk-Insert for piece of
> mind though. Hopefully this helps.
> Laurence
> "Andrew J. Kelly" wrote:
> > Are you sure they are all in the same recovery model? In any case I really
> > think you are going down the wrong road. The bottom line is that if
> > checkpoints cause too much pain and last too long you do not have a proper
> > disk subsystem or configuration to handle the load.IS the log file on a RAID
> > 1 or RAID 10 with no other types of files? What is the array configuration
> > for the data? How much and what is the configuration for the cache on the
> > controller or SAN processor?
> >
> > --
> > Andrew J. Kelly SQL MVP
> >
> >
> > <dc@.upsize.de> wrote in message
> > news:1147959206.497801.43880@.j55g2000cwa.googlegroups.com...
> > > Today the problematic server started unwanted checkpoints and queueing
> > > again. Which is strange, since in the past the servers were fine once
> > > the queueing ceased - until the server had to be restarted.
> > >
> > > I increased the logfile size with ALTER DATABASE - no change. I
> > > installed the post-SP4 hotfixed - still no success.
> > >
> > > The only countermeasure that had an effect is
> > >
> > > sp_configure "recovery_interval", 60
> > >
> > > RECONFIGURE WITH OVERRIDE
> > >
> > > but this will still result in longer (about 3 minutes) queueing in
> > > bigger intervals (~20 minutes).
> > >
> > > Can I somehow find out what exactly is causing the checkpoint? Can I
> > > see the checkpoint time and reason somewhere in the system tables? The
> > > only difference between the two servers that I can see now, is that the
> > > problematic server uses only a few percent of logspace (never hitting
> > > 70%), while the OK server always uses > 95% of the log space.
> > >
> > > It is too bad that I cannot use non-logged or minimally-logged BULK
> > > INSERTS (I cannot use TABLOCK because the table is also permanently
> > > being queryied) since I believe that would make a difference in my
> > > scenario.
> > >
> > > I am wondering if SQL Server 2005 will perform better in my scenario
> > > with permanent bulk inserts and also massive updating and simultaneous
> > > querying. Data integrity is really not that important for my
> > > application, but speed is. I wish I could deactivate all logging.
> > >
> >
> >
> >|||Hi
Don't forget, a checkpoint is a flush of dirty buffers to disk. This flush
will take a lot less time than the recovery interval setting. The recovery
interval setting is based on a log replay, not a simple flush to disk.
The amount of available RAM also play a role in the checkpoint interval
(e.g. if your DB is 20GB, and you have 2GB buffer RAM, if every page was
modified, ideally at least 10 checkpoints have to occur). If you are in
simple recovery mode, 70% log full generates a checkpoint.
Making sure that your disk subsystem is optimal is the answer to checkpoint
performance problems. Andrew listed the points to look for.
--
Mike
This posting is provided "AS IS" with no warranties, and confers no rights.
<dc@.upsize.de> wrote in message
news:1149591061.614095.295260@.f6g2000cwb.googlegroups.com...
> It is always OK for us to go back to the last backup. I thought that in
> that scenario "simple" is the least demanding recovery model in terms
> of logging. My understanding is that "bulk logged" mimics "simple" for
> the bulk operations, but "full" for other operations. Since one of our
> tables also receives a massive amount of updates, "bulk logged" mode
> should result in more log file action.
> However, out of desperation I switched to "bulk logged" on the
> problematic server. That did not fix the problem, there is still a
> checkpoint about every two minutes. I am currently working around by
> setting the recovery_interval to 32767, which will still issue a
> checkpoint or two per day and then block the server for about five
> minutes (I wonder why the checkpoint does not take 32767 minutes since
> that is what I thought the recovery_interval option indicates). I am
> quiet sure that it will not take my database 32767 minutes to recover
> after restarting the server but I am not willing to try.
> I think that Sql Server is not calculating the time for a checkpoint
> correctly under some circumstances with SP4 (don't know what exactly
> the circumstances are since the other server is still operating
> perfectly). I believe that the rows inserted with a bulk insert should
> not trigger a checkpoint as quickly as they currently do, in other
> words (sorry for my naive assumption) Sql Server takes to bulk inserted
> rows for a reason to make a checkpoint appear.
>
> Laurence schrieb:
>> I'm not sure if this posted since it's been over 10 minutes and I got a
>> blank
>> page after clicking the post button, so please ignore if my response is
>> already there:
>> Just out of curiosity, is there any reason why you're using the Simple
>> Recovery Model on a Production server that also receives a majority of
>> their
>> inserts through Bulk-Insert? Why aren't you using the Bulk-Logged
>> Recovery
>> Model which is designed for this particular situation? Each time you do
>> a
>> Bulk-Insert of any magnitude, it is being logged, which will cause your
>> transaction logs to grow because each row that's being inserted is being
>> recorded, whereas using the Bulk-Logged Recovery will only record that
>> you're
>> doing a Bulk-Insert, which will be less writes to your transaction log,
>> less
>> auto-growth, and possibly less automatic checkpoints. I would follow the
>> suggestion of doing a manual checkpoint before each Bulk-Insert for piece
>> of
>> mind though. Hopefully this helps.
>> Laurence
>> "Andrew J. Kelly" wrote:
>> > Are you sure they are all in the same recovery model? In any case I
>> > really
>> > think you are going down the wrong road. The bottom line is that if
>> > checkpoints cause too much pain and last too long you do not have a
>> > proper
>> > disk subsystem or configuration to handle the load.IS the log file on a
>> > RAID
>> > 1 or RAID 10 with no other types of files? What is the array
>> > configuration
>> > for the data? How much and what is the configuration for the cache on
>> > the
>> > controller or SAN processor?
>> >
>> > --
>> > Andrew J. Kelly SQL MVP
>> >
>> >
>> > <dc@.upsize.de> wrote in message
>> > news:1147959206.497801.43880@.j55g2000cwa.googlegroups.com...
>> > > Today the problematic server started unwanted checkpoints and
>> > > queueing
>> > > again. Which is strange, since in the past the servers were fine once
>> > > the queueing ceased - until the server had to be restarted.
>> > >
>> > > I increased the logfile size with ALTER DATABASE - no change. I
>> > > installed the post-SP4 hotfixed - still no success.
>> > >
>> > > The only countermeasure that had an effect is
>> > >
>> > > sp_configure "recovery_interval", 60
>> > >
>> > > RECONFIGURE WITH OVERRIDE
>> > >
>> > > but this will still result in longer (about 3 minutes) queueing in
>> > > bigger intervals (~20 minutes).
>> > >
>> > > Can I somehow find out what exactly is causing the checkpoint? Can I
>> > > see the checkpoint time and reason somewhere in the system tables?
>> > > The
>> > > only difference between the two servers that I can see now, is that
>> > > the
>> > > problematic server uses only a few percent of logspace (never hitting
>> > > 70%), while the OK server always uses > 95% of the log space.
>> > >
>> > > It is too bad that I cannot use non-logged or minimally-logged BULK
>> > > INSERTS (I cannot use TABLOCK because the table is also permanently
>> > > being queryied) since I believe that would make a difference in my
>> > > scenario.
>> > >
>> > > I am wondering if SQL Server 2005 will perform better in my scenario
>> > > with permanent bulk inserts and also massive updating and
>> > > simultaneous
>> > > querying. Data integrity is really not that important for my
>> > > application, but speed is. I wish I could deactivate all logging.
>> > >
>> >
>> >
>> >
>|||rahul sharma
rahul.sharma822@.gmail.com
*** Sent via Developersdex http://www.developersdex.com ***

Thursday, March 22, 2012

Bulk Insert Unicode

Good day,

We are using bulk insert with a formatfile to load a text file into sqlexpress. One field in the text file contains non-ascii (unicode) charaters and the corresponding database field is nvarchar.

When the record and row are specified in the format file as:

<FIELD ID="23" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="400" COLLATION="Latin1_General_CI_AS"/>
<COLUMN SOURCE="23" NAME="FullName" xsi:type="SQLNVARCHAR"/>

the value "Boca CurĂ¡" gets imported as "Boca Cur??". This is true even with datafiletype set to widenative or widechar, and/or codepage set to raw or acp.

When the field is specfied in the record section of the format file as NCharTerm, the bulk insert terminates immediately with:

Msg 4863, Level 16, State 1, Server MINT\SQLEXPRESS, Line 1
Bulk load data conversion error (truncation) for row 2, column 23 (FullName).

This is true regardless of which bulk insert options specified.

What could be going wrong and how can we address it?

Thanks in advance...

Hi,

I see it's been a while since you posted this, but I have just run into this same issue today and I was wondering if you were able to resolve it gracefully?

regards

|||

Hi knightEknight!

We solved this with MS using a developer support incident. I should have updated this post when the information was fresh! Just went back through my notes and the readme for the project, and the solution was to convert the input file from UTF8 text encoding to ANSI text encoding. This is a Save As option in Notepad and most text editors. If you have a large file and need some C# code to do this, let me know....

|||Thanks! My input comes from various sources and I've pretty much concluded the same thing. I'll just have to find a way to convert all the input to ANSI. - Regards|||

Here is a C# example:

// utf8 in, ansi out

StreamReader inStream = File.OpenText(@."E:\Data\DEM\Panama\panama_canal_final.txt");

FileStream outStream = File.Create(@."E:\Data\DEM\Panama\panama_canal_fsubset.txt");

Encoding ansi = Encoding.Default;

string inLine;

byte[] outBytes;

while (true)

{

inLine = inStream.ReadLine();

if (inLine == null)

break;

outBytes = ansi.GetBytes(inLine + "\r\n");

outStream.Write(outBytes, 0, outBytes.Length);

}

inStream.Close();

outStream.Close();

|||

Instead of converting your files to ANSI as suggested you could also fix your delimiter problem for Unicode files. The TERMINATOR should be '\t\0' instead of just '\t' for tabs and '\r\0\n\0' for the row delimiter if you are using standard CR+LF.

Regards,

Lars

sql

Bulk Insert Unicode

Good day,

We are using bulk insert with a formatfile to load a text file into sqlexpress. One field in the text file contains non-ascii (unicode) charaters and the corresponding database field is nvarchar.

When the record and row are specified in the format file as:

<FIELD ID="23" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="400" COLLATION="Latin1_General_CI_AS"/>
<COLUMN SOURCE="23" NAME="FullName" xsi:type="SQLNVARCHAR"/>

the value "Boca CurĂ¡" gets imported as "Boca Cur??". This is true even with datafiletype set to widenative or widechar, and/or codepage set to raw or acp.

When the field is specfied in the record section of the format file as NCharTerm, the bulk insert terminates immediately with:

Msg 4863, Level 16, State 1, Server MINT\SQLEXPRESS, Line 1
Bulk load data conversion error (truncation) for row 2, column 23 (FullName).

This is true regardless of which bulk insert options specified.

What could be going wrong and how can we address it?

Thanks in advance...

Hi,

I see it's been a while since you posted this, but I have just run into this same issue today and I was wondering if you were able to resolve it gracefully?

regards

|||

Hi knightEknight!

We solved this with MS using a developer support incident. I should have updated this post when the information was fresh! Just went back through my notes and the readme for the project, and the solution was to convert the input file from UTF8 text encoding to ANSI text encoding. This is a Save As option in Notepad and most text editors. If you have a large file and need some C# code to do this, let me know....

|||Thanks! My input comes from various sources and I've pretty much concluded the same thing. I'll just have to find a way to convert all the input to ANSI. - Regards|||

Here is a C# example:

// utf8 in, ansi out

StreamReader inStream = File.OpenText(@."E:\Data\DEM\Panama\panama_canal_final.txt");

FileStream outStream = File.Create(@."E:\Data\DEM\Panama\panama_canal_fsubset.txt");

Encoding ansi = Encoding.Default;

string inLine;

byte[] outBytes;

while (true)

{

inLine = inStream.ReadLine();

if (inLine == null)

break;

outBytes = ansi.GetBytes(inLine + "\r\n");

outStream.Write(outBytes, 0, outBytes.Length);

}

inStream.Close();

outStream.Close();

|||

Instead of converting your files to ANSI as suggested you could also fix your delimiter problem for Unicode files. The TERMINATOR should be '\t\0' instead of just '\t' for tabs and '\r\0\n\0' for the row delimiter if you are using standard CR+LF.

Regards,

Lars

Bulk Insert Unicode

Good day,

We are using bulk insert with a formatfile to load a text file into sqlexpress. One field in the text file contains non-ascii (unicode) charaters and the corresponding database field is nvarchar.

When the record and row are specified in the format file as:

<FIELD ID="23" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="400" COLLATION="Latin1_General_CI_AS"/>
<COLUMN SOURCE="23" NAME="FullName" xsi:type="SQLNVARCHAR"/>

the value "Boca CurĂ¡" gets imported as "Boca Cur??". This is true even with datafiletype set to widenative or widechar, and/or codepage set to raw or acp.

When the field is specfied in the record section of the format file as NCharTerm, the bulk insert terminates immediately with:

Msg 4863, Level 16, State 1, Server MINT\SQLEXPRESS, Line 1
Bulk load data conversion error (truncation) for row 2, column 23 (FullName).

This is true regardless of which bulk insert options specified.

What could be going wrong and how can we address it?

Thanks in advance...

Hi,

I see it's been a while since you posted this, but I have just run into this same issue today and I was wondering if you were able to resolve it gracefully?

regards

|||

Hi knightEknight!

We solved this with MS using a developer support incident. I should have updated this post when the information was fresh! Just went back through my notes and the readme for the project, and the solution was to convert the input file from UTF8 text encoding to ANSI text encoding. This is a Save As option in Notepad and most text editors. If you have a large file and need some C# code to do this, let me know....

|||Thanks! My input comes from various sources and I've pretty much concluded the same thing. I'll just have to find a way to convert all the input to ANSI. - Regards|||

Here is a C# example:

// utf8 in, ansi out

StreamReader inStream = File.OpenText(@."E:\Data\DEM\Panama\panama_canal_final.txt");

FileStream outStream = File.Create(@."E:\Data\DEM\Panama\panama_canal_fsubset.txt");

Encoding ansi = Encoding.Default;

string inLine;

byte[] outBytes;

while (true)

{

inLine = inStream.ReadLine();

if (inLine == null)

break;

outBytes = ansi.GetBytes(inLine + "\r\n");

outStream.Write(outBytes, 0, outBytes.Length);

}

inStream.Close();

outStream.Close();

Bulk Insert Unicode

Good day,

We are using bulk insert with a formatfile to load a text file into sqlexpress. One field in the text file contains non-ascii (unicode) charaters and the corresponding database field is nvarchar.

When the record and row are specified in the format file as:

<FIELD ID="23" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="400" COLLATION="Latin1_General_CI_AS"/>
<COLUMN SOURCE="23" NAME="FullName" xsi:type="SQLNVARCHAR"/>

the value "Boca CurĂ¡" gets imported as "Boca Cur??". This is true even with datafiletype set to widenative or widechar, and/or codepage set to raw or acp.

When the field is specfied in the record section of the format file as NCharTerm, the bulk insert terminates immediately with:

Msg 4863, Level 16, State 1, Server MINT\SQLEXPRESS, Line 1
Bulk load data conversion error (truncation) for row 2, column 23 (FullName).

This is true regardless of which bulk insert options specified.

What could be going wrong and how can we address it?

Thanks in advance...

Hi,

I see it's been a while since you posted this, but I have just run into this same issue today and I was wondering if you were able to resolve it gracefully?

regards

|||

Hi knightEknight!

We solved this with MS using a developer support incident. I should have updated this post when the information was fresh! Just went back through my notes and the readme for the project, and the solution was to convert the input file from UTF8 text encoding to ANSI text encoding. This is a Save As option in Notepad and most text editors. If you have a large file and need some C# code to do this, let me know....

|||Thanks! My input comes from various sources and I've pretty much concluded the same thing. I'll just have to find a way to convert all the input to ANSI. - Regards|||

Here is a C# example:

// utf8 in, ansi out

StreamReader inStream = File.OpenText(@."E:\Data\DEM\Panama\panama_canal_final.txt");

FileStream outStream = File.Create(@."E:\Data\DEM\Panama\panama_canal_fsubset.txt");

Encoding ansi = Encoding.Default;

string inLine;

byte[] outBytes;

while (true)

{

inLine = inStream.ReadLine();

if (inLine == null)

break;

outBytes = ansi.GetBytes(inLine + "\r\n");

outStream.Write(outBytes, 0, outBytes.Length);

}

inStream.Close();

outStream.Close();

Bulk Insert Unicode

Good day,
We are using bulk insert with a formatfile to load a text file into
sqlexpress. One field in the text file contains non-ascii (unicode)
charaters and the corresponding database field is nvarchar.
When the record and row are specified in the format file as:
<FIELD ID=3D"23" xsi:type=3D"CharTerm" TERMINATOR=3D"\t" MAX_LENGTH=3D"400"
COLLATION=3D"Latin1_General_CI_AS"/>
<COLUMN SOURCE=3D"23" NAME=3D"FullName" xsi:type=3D"SQLNVARCHAR"/>
the value "Boca Cur=E1" gets imported as "Boca Cur=C3=A1". This is true
even with datafiletype set to widenative or widechar, and/or codepage
set to raw or acp.
When the field is specfied in the record section of the format file as
NCharTerm, the bulk insert terminates immediately with:
Msg 4863, Level 16, State 1, Server MINT\SQLEXPRESS, Line 1
Bulk load data conversion error (truncation) for row 2, column 23
(FullName).
This is true regardless bulk insert options specified.
What could be going wrong and how can we address it?
Thanks in advance...Hi
Can you specify SQLNVARCHAR(50) for example? If I remember well it truncates
the string if you do not specify a length.As it is NVARCHAR , so it is like
NVARCHAR(2)
<rgreene@.icanmarine.com> wrote in message
news:1143039002.249599.122720@.g10g2000cwb.googlegroups.com...
Good day,
We are using bulk insert with a formatfile to load a text file into
sqlexpress. One field in the text file contains non-ascii (unicode)
charaters and the corresponding database field is nvarchar.
When the record and row are specified in the format file as:
<FIELD ID="23" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="400"
COLLATION="Latin1_General_CI_AS"/>
<COLUMN SOURCE="23" NAME="FullName" xsi:type="SQLNVARCHAR"/>
the value "Boca Cur" gets imported as "Boca CurĂ¡". This is true
even with datafiletype set to widenative or widechar, and/or codepage
set to raw or acp.
When the field is specfied in the record section of the format file as
NCharTerm, the bulk insert terminates immediately with:
Msg 4863, Level 16, State 1, Server MINT\SQLEXPRESS, Line 1
Bulk load data conversion error (truncation) for row 2, column 23
(FullName).
This is true regardless bulk insert options specified.
What could be going wrong and how can we address it?
Thanks in advance...|||Hi Uri. Thanks for the quick reply.
When I use NCharTerm in the record section of the formatfile and
SQLNVARCHAR(200) in the row section I get:
Msg 4858, Level 16, State 1, Server MINT\SQLEXPRESS, Line 1
Line 49 in format file "C:\Documents and Settings\rgreene\My
Documents\PlaceName
s\format.xml": bad value SQLNVARCHAR(200) for attribute "xsi:type".
Any other ideas?|||Well, can you show us a format file? Is it a XML file?
<rgreene@.icanmarine.com> wrote in message
news:1143039873.112229.184880@.g10g2000cwb.googlegroups.com...
> Hi Uri. Thanks for the quick reply.
> When I use NCharTerm in the record section of the formatfile and
> SQLNVARCHAR(200) in the row section I get:
> Msg 4858, Level 16, State 1, Server MINT\SQLEXPRESS, Line 1
> Line 49 in format file "C:\Documents and Settings\rgreene\My
> Documents\PlaceName
> s\format.xml": bad value SQLNVARCHAR(200) for attribute "xsi:type".
> Any other ideas?
>|||Here is the original XML format file (the one that imports, but
changes, the unicode text):
<?xml version="1.0"?>
<BCPFORMAT
xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="5"/>
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="12"/>
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="12"/>
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="30"/>
<FIELD ID="5" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="30"/>
<FIELD ID="6" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="7"/>
<FIELD ID="7" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="8"/>
<FIELD ID="8" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="4"/>
<FIELD ID="9" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="7"/>
<FIELD ID="10" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="2"
COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="11" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="10"
COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="12" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="5"/>
<FIELD ID="13" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="4"
COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="14" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="4"
COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="15" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="400"
COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="16" xsi:type="CharTerm" TERMINATOR="\t"
MAX_LENGTH="12"/>
<FIELD ID="17" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="4"
COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="18" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="2"
COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="19" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="6"
COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="20" xsi:type="CharTerm" TERMINATOR="\t"
MAX_LENGTH="128"/>
<FIELD ID="21" xsi:type="CharTerm" TERMINATOR="\t"
MAX_LENGTH="128"/>
<FIELD ID="22" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="400"
COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="23" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="400"
COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="24" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="400"
COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="25" xsi:type="CharTerm" TERMINATOR="\r\n"
MAX_LENGTH="24"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="RegionCode" xsi:type="SQLTINYINT"/>
<COLUMN SOURCE="2" NAME="UniqueFeatureIdentifier"
xsi:type="SQLINT"/>
<COLUMN SOURCE="3" NAME="UniqueNameIdentifier" xsi:type="SQLINT"/>
<COLUMN SOURCE="4" NAME="Lat" xsi:type="SQLFLT8"/>
<COLUMN SOURCE="5" NAME="Lon" xsi:type="SQLFLT8"/>
<COLUMN SOURCE="10" NAME="FeatureClassificationCode"
xsi:type="SQLNCHAR"/>
<COLUMN SOURCE="11" NAME="FeatureDesignationCode"
xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="12" NAME="PopulatedPlaceClassification"
xsi:type="SQLTINYINT"/>
<COLUMN SOURCE="13" NAME="PrimaryCountryCode"
xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="14" NAME="ADM1Code" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="15" NAME="ADM2" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="16" NAME="Dimension" xsi:type="SQLINT"/>
<COLUMN SOURCE="17" NAME="SecondaryCountryCode"
xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="18" NAME="NameType" xsi:type="SQLNCHAR"/>
<COLUMN SOURCE="19" NAME="LanguageCode" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="22" NAME="SortName" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="23" NAME="FullName" xsi:type="SQLNVARCHAR()"/>
<COLUMN SOURCE="24" NAME="FullNameND" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="25" NAME="ModificationDate"
xsi:type="SQLDATETIME"/>
</ROW>
</BCPFORMAT>

Bulk Insert Task - Where Does The First Row Begin?

Hi,

I am using the Bulk Insert Task to bulk copy data from a flat file to a SQL Server 2005 table. The flat file contains pipe ( | ) delimited columns and several thousand records which are {CR}{LF} delimited. The file also contains a header record and trailer record, which contain several words but also contains a ( | ) symbol among those words.

E.g.:

HEDR | yadi yadi yada
500 | Data Data | More Data | 600
460 | Datum | More More | 705
550 | Data | Data | 603
FOOTR | yadi yadi yada

I need to ignore the header and trailer records and just pickup the proper records. But even though I manually set the First Row property to 2 and the Last Row property to 4, It does not pickup the first true record i.e. the record which begins with 500, and if I set the First Row to 1, it throws me an error citing 'truncation error' or similar. I think it is taking the first record (i.e. header row along with the first row as one, and since there are now more pipes ( | ) the error is thrown)

I've tried setting different values for these properties but to no avail... Any help will be deeply appreciated...

Thanks
Gogula

Bulk insert task expects the header and footer records also to be in the same format as other Data Rows, so to do, what you are trying to do, your data needs to be in this format:

HEDR | yadi | yadi | yada
500 | Data Data | More Data | 600
460 | Datum | More More | 705
550 | Data | Data | 603
FOOTR| yadi | yadi | yada

This is the same behavior as Transact-SQL bulk insert, and if you try something like the following with your file, you will encounter errors as well. Therefore, this is by design, according to what bulk insert supports.

BULK INSERT tempdb.dbo.Table_1

FROM 'C:\\File.txt'

WITH

(

FIELDTERMINATOR ='|',

ROWTERMINATOR ='\n',

firstrow = 2,

lastrow = 4

)

|||Hi Rangeeta,

Thanks for the reply. But what if in case there might be more than one header and footer (which I programmatic ally count and take into consideration) and those headers cannot be separated by the delimiters (since the client needs to send it to us)? It seems a little unfair that Bulk Insert expects the headers to be in the same format as the other rows...
Are there any alternate methods to do this?

Thanks again
Gogula

Bulk Insert remove extra quotes

I am using Bulk Insert to Import a Tab Delimeted file that contains quotes(") in some of the fields.

Example 1: """"This is a just a test."""

Results should be "This is just a test"

Example 2: """This is only a test"" and nothing more"

Results should be:"This is only a test" and nothing more.

Now.

If I use DTS Import Wizard process and select "Double Quotes" as the Text Qualifier, I get the result I am expecting. The extra quotes are gone.

Question: Is there a way or a switch in Bulk Insert that I could use to get rid of the extra quotes?

Below is my syntax

SET @.sql ='BULK INSERT '+ @.table_name + ' FROM "' + @.conversion_data_in + '\' + @.source_file + '"
WITH ( DATAFILETYPE='NATIVE' FIELDTERMINATOR = ''\t'',ROWTERMINATOR = ''\n'')'


Thanks
Conrad...

There is no special switch in BULK INSERT or BCP to strip double-quotes. If you know the number of double-quotes that needs to be removed then you can use a format file. Of course in this case, each line/value should have the same format. See BOL for more details on using format files especially the topic that talks about inserting data files that has more columns than the table.|||

Thanks for the information.

When you use DTS Import Wizard, it asks you to select "Text Qualifier". If you select "Double Quotes {"}" and Tab Delimiter on the next screen, the data gets imported without the extra quotes.

On the Other hand, I do not know, how to remove the extra quotes using BCP or Bulk Insert.

Keep in mind, that the quotes may not appeared in all of the fields.

Thanks

Conrad...

|||As I said, you will have to use a format file to strip the text qualifiers but these have to exist on all values of a particular column for example. So you cannot have a value for one column in a row with text qualifiers and another row having no text qualifiers. If this doesn't work then you will have to stick with the DTS approach or use another utility or program to format the data file so that you can use BCP/BULK INSERT directly.|||

Umachandar Jayachandran - MS wrote:

As I said, you will have to use a format file to strip the text qualifiers but these have to exist on all values of a particular column for example. So you cannot have a value for one column in a row with text qualifiers and another row having no text qualifiers. If this doesn't work then you will have to stick with the DTS approach or use another utility or program to format the data file so that you can use BCP/BULK INSERT directly.

Hello,

I have a similar problem that might be solved with a format file. In my files, the decimal separator is a comma and the format for datetime is dd.mm.yyyy. But I do not find any documentation on that topic. I found a general description of XML format files at http://msdn2.microsoft.com/en-us/library/ms189327.aspx but nothing with respect to stripping or replacing characters.

Any links to documentation or examples would be welcome!

Thanks a lot

Detlef

Bulk Insert remove extra quotes

I am using Bulk Insert to Import a Tab Delimeted file that contains quotes(") in some of the fields.

Example 1: """"This is a just a test."""

Results should be "This is just a test"

Example 2: """This is only a test"" and nothing more"

Results should be:"This is only a test" and nothing more.

Now.

If I use DTS Import Wizard process and select "Double Quotes" as the Text Qualifier, I get the result I am expecting. The extra quotes are gone.

Question: Is there a way or a switch in Bulk Insert that I could use to get rid of the extra quotes?

Below is my syntax

SET @.sql ='BULK INSERT '+ @.table_name + ' FROM "' + @.conversion_data_in + '\' + @.source_file + '"
WITH ( DATAFILETYPE='NATIVE' FIELDTERMINATOR = ''\t'',ROWTERMINATOR = ''\n'')'


Thanks
Conrad...

There is no special switch in BULK INSERT or BCP to strip double-quotes. If you know the number of double-quotes that needs to be removed then you can use a format file. Of course in this case, each line/value should have the same format. See BOL for more details on using format files especially the topic that talks about inserting data files that has more columns than the table.|||

Thanks for the information.

When you use DTS Import Wizard, it asks you to select "Text Qualifier". If you select "Double Quotes {"}" and Tab Delimiter on the next screen, the data gets imported without the extra quotes.

On the Other hand, I do not know, how to remove the extra quotes using BCP or Bulk Insert.

Keep in mind, that the quotes may not appeared in all of the fields.

Thanks

Conrad...

|||As I said, you will have to use a format file to strip the text qualifiers but these have to exist on all values of a particular column for example. So you cannot have a value for one column in a row with text qualifiers and another row having no text qualifiers. If this doesn't work then you will have to stick with the DTS approach or use another utility or program to format the data file so that you can use BCP/BULK INSERT directly.|||

Umachandar Jayachandran - MS wrote:

As I said, you will have to use a format file to strip the text qualifiers but these have to exist on all values of a particular column for example. So you cannot have a value for one column in a row with text qualifiers and another row having no text qualifiers. If this doesn't work then you will have to stick with the DTS approach or use another utility or program to format the data file so that you can use BCP/BULK INSERT directly.

Hello,

I have a similar problem that might be solved with a format file. In my files, the decimal separator is a comma and the format for datetime is dd.mm.yyyy. But I do not find any documentation on that topic. I found a general description of XML format files at http://msdn2.microsoft.com/en-us/library/ms189327.aspx but nothing with respect to stripping or replacing characters.

Any links to documentation or examples would be welcome!

Thanks a lot

Detlef

bulk insert question? Another way?

hi guys,
I have a text file contains about 1 million rows, I tried to import to a
table but I received data conversion errors for some rows, however, the row
number is like 504500 for example, I want to troubleshoot this row but the
only way I know is to open the text file (300MB) which will take forever to
open on notepad, I'm not sure if there is an easy way to troubleshoot such
a problem. Another thing is, I set Batchsize=10000, so I guess if I
received errors for 20 rows, it would mean that 200000 rows (20 *10000)
didn't get inserted into table, correct?
ThanksKevin,
Get a copy of TextPad or one of the many shareware or freeware text editors
that can handle large files with ease.
Another alternative is to import the file into a one-column staging table on
your SQL Server, and then scrub the data before inserting it into the target
table.
Steve Kass
Drew University
kevin wrote:

>hi guys,
> I have a text file contains about 1 million rows, I tried to import to
a
>table but I received data conversion errors for some rows, however, the ro
w
>number is like 504500 for example, I want to troubleshoot this row but the
>only way I know is to open the text file (300MB) which will take forever to
>open on notepad, I'm not sure if there is an easy way to troubleshoot such
>a problem. Another thing is, I set Batchsize=10000, so I guess if I
>received errors for 20 rows, it would mean that 200000 rows (20 *10000)
>didn't get inserted into table, correct?
>Thanks
>
>
>

Tuesday, March 20, 2012

Bulk insert problem

Hi guys
I am using the Bulk insert command to import a flat file to the SQL server
2005.
My flat file contains about 1 million rows.
I have already created an empty table to insert the rows from the flat file.
But I dont want to import the first row into my table in the database.
Is there a way I can give some some sort of condition in my bulk insert
command so that the first row from the flat file does not get imported'
Here is my bulk insert statement:
BULK INSERT new1 FROM 'f:\My Documents\flatFileBaseline\one.txt' WITH
(FIELDTERMINATOR = ',')
Cheers
Mita"mita" <mita@.discussions.microsoft.com> wrote in message
news:F742B532-8FA9-4AC1-89AC-7E50FC757619@.microsoft.com...
> Hi guys
> I am using the Bulk insert command to import a flat file to the SQL server
> 2005.
> My flat file contains about 1 million rows.
> I have already created an empty table to insert the rows from the flat
> file.
> But I dont want to import the first row into my table in the database.
> Is there a way I can give some some sort of condition in my bulk insert
> command so that the first row from the flat file does not get imported'
> Here is my bulk insert statement:
> BULK INSERT new1 FROM 'f:\My Documents\flatFileBaseline\one.txt' WITH
> (FIELDTERMINATOR = ',')
>
Read the entry in Books Online for BULK INSERT.
David

Bulk insert problem

Hi guys
I am using the Bulk insert command to import a flat file to the SQL server
2005.
My flat file contains about 1 million rows.
I have already created an empty table to insert the rows from the flat file.
But I dont want to import the first row into my table in the database.
Is there a way I can give some some sort of condition in my bulk insert
command so that the first row from the flat file does not get imported'
Here is my bulk insert statement:
BULK INSERT new1 FROM 'f:\My Documents\flatFileBaseline\one.txt' WITH
(FIELDTERMINATOR = ',')
Cheers
Mita"mita" <mita@.discussions.microsoft.com> wrote in message
news:F742B532-8FA9-4AC1-89AC-7E50FC757619@.microsoft.com...
> Hi guys
> I am using the Bulk insert command to import a flat file to the SQL server
> 2005.
> My flat file contains about 1 million rows.
> I have already created an empty table to insert the rows from the flat
> file.
> But I dont want to import the first row into my table in the database.
> Is there a way I can give some some sort of condition in my bulk insert
> command so that the first row from the flat file does not get imported'
> Here is my bulk insert statement:
> BULK INSERT new1 FROM 'f:\My Documents\flatFileBaseline\one.txt' WITH
> (FIELDTERMINATOR = ',')
>
Read the entry in Books Online for BULK INSERT.
David

Monday, March 19, 2012

Bulk insert of bit values

Hi Gang:
I've inherited a database that contains a number of tables with bit value
columns. I am trying to write a script with bulk insert statements to
reconstruct the database. When I export the table data, the bit fields get
converted to string equivalents (True and False.) When the bulk insert
statement for the same table executes, I get errors saying that these values
can't be converted to bits.
Can you tell me how to export the data from my tables into text files so
that the bit values can be re-imported / bulk inserted correctly?Hi Robert,
Should be 1,0 or NULL
In what format did you export the data, if you export them to excel its
getting converted to True or False...
HTH, Jens Suessmeyer.
http://www.sqlserver2005.de
--
"Robert Burdick [eMVP]" <RobertBurdickeMVP@.discussions.microsoft.com>
schrieb im Newsbeitrag
news:2530932F-BFEC-48D6-B30A-A0C9F03EEC25@.microsoft.com...
> Hi Gang:
> I've inherited a database that contains a number of tables with bit value
> columns. I am trying to write a script with bulk insert statements to
> reconstruct the database. When I export the table data, the bit fields
> get
> converted to string equivalents (True and False.) When the bulk insert
> statement for the same table executes, I get errors saying that these
> values
> can't be converted to bits.
> Can you tell me how to export the data from my tables into text files so
> that the bit values can be re-imported / bulk inserted correctly?
>|||Hi
Valid BIT values are 0, 1 and NULL.
True and False are not supported.
VB.NET, VB, VBScript regard True as -1, so that is not convertible. Most
other languages regard True as +1
Regards
--
Mike Epprecht, Microsoft SQL Server MVP
Zurich, Switzerland
IM: mike@.epprecht.net
MVP Program: http://www.microsoft.com/mvp
Blog: http://www.msmvps.com/epprecht/
"Robert Burdick [eMVP]" <RobertBurdickeMVP@.discussions.microsoft.com> wrote
in message news:2530932F-BFEC-48D6-B30A-A0C9F03EEC25@.microsoft.com...
> Hi Gang:
> I've inherited a database that contains a number of tables with bit value
> columns. I am trying to write a script with bulk insert statements to
> reconstruct the database. When I export the table data, the bit fields
> get
> converted to string equivalents (True and False.) When the bulk insert
> statement for the same table executes, I get errors saying that these
> values
> can't be converted to bits.
> Can you tell me how to export the data from my tables into text files so
> that the bit values can be re-imported / bulk inserted correctly?
>|||I exported it from SQL Server using the import / export wizard. I exported
it as a comma delimited text file. Are there other export options I need to
check to make this work?
"Jens SĂ¼?meyer" wrote:

> Hi Robert,
> Should be 1,0 or NULL
> In what format did you export the data, if you export them to excel its
> getting converted to True or False...
>
> HTH, Jens Suessmeyer.
> --
> http://www.sqlserver2005.de
> --
> "Robert Burdick [eMVP]" <RobertBurdickeMVP@.discussions.microsoft.com>
> schrieb im Newsbeitrag
> news:2530932F-BFEC-48D6-B30A-A0C9F03EEC25@.microsoft.com...
>
>|||Hi
I don't have any issue with the following:
CREATE TABLE [MyBits] (
[Id] [int] IDENTITY (1, 1) NOT NULL ,
[Bit1] [bit] NOT NULL CONSTRAINT [DF_MyBits_Bit1] DEFAULT (0),
[Bit2] [bit] NOT NULL CONSTRAINT [DF_MyBits_Bit2] DEFAULT (0),
[Bit3] [bit] NOT NULL CONSTRAINT [DF_MyBits_Bit3] DEFAULT (1),
[Bit4] [bit] NOT NULL CONSTRAINT [DF_MyBits_Bit4] DEFAULT (1),
CONSTRAINT [PK_MyBits] PRIMARY KEY CLUSTERED
(
[Id]
) ON [PRIMARY]
)
GO
INSERT INTO MYBits ( Bit1 )
SELECT 1
UNION ALL SELECT 0
UNION ALL SELECT 1
UNION ALL SELECT 0
UNION ALL SELECT 1
UNION ALL SELECT 0
SELECT * FROM MyBits ORDER BY ID
/*
Id Bit1 Bit2 Bit3 Bit4
-- -- -- -- --
1 1 0 1 1
2 0 0 1 1
3 1 0 1 1
4 0 0 1 1
5 1 0 1 1
6 0 0 1 1
(6 row(s) affected)
*/
-- Create new table
SELECT * INTO MyBits2 FROM MyBits WHERE 1 = 0
-- Command Prompt
-- bcp test..mybits2 IN mybits.txt -c -T
/* MyBits.txt
1 1 0 1 1
2 0 0 1 1
3 1 0 1 1
4 0 0 1 1
5 1 0 1 1
6 0 0 1 1
*/
-- At command prompt import
-- bcp test..mybits2 IN mybits.txt -c -T
SELECT * FROM MyBits2 ORDER BY ID
/* Output
Id Bit1 Bit2 Bit3 Bit4
-- -- -- -- --
1 1 0 1 1
2 0 0 1 1
3 1 0 1 1
4 0 0 1 1
5 1 0 1 1
6 0 0 1 1
(6 row(s) affected)
*/
Can you post DDL, example data and your statements
http://www.aspfaq.com/etiquett_e.asp?id=5006
John
"Robert Burdick [eMVP]" <RobertBurdickeMVP@.discussions.microsoft.com> wrote
in message news:2530932F-BFEC-48D6-B30A-A0C9F03EEC25@.microsoft.com...
> Hi Gang:
> I've inherited a database that contains a number of tables with bit value
> columns. I am trying to write a script with bulk insert statements to
> reconstruct the database. When I export the table data, the bit fields
> get
> converted to string equivalents (True and False.) When the bulk insert
> statement for the same table executes, I get errors saying that these
> values
> can't be converted to bits.
> Can you tell me how to export the data from my tables into text files so
> that the bit values can be re-imported / bulk inserted correctly?
>|||If it wont work for you, you can cast the values to int, thatll work.
Jens Suessmeyer.
"Robert Burdick [eMVP]" <RobertBurdickeMVP@.discussions.microsoft.com>
schrieb im Newsbeitrag
news:A1EF625E-F6AA-4C3F-AD0F-BFCF8CA98375@.microsoft.com...
> I exported it from SQL Server using the import / export wizard. I
> exported
> it as a comma delimited text file. Are there other export options I need
> to
> check to make this work?
> "Jens Smeyer" wrote:
>|||Hi
It seems that this may be feature of the text file driver! As Jens says you
can cast as an int and it will work. To do this choose the option to specify
a query as the source of your export and then put in a statement like:
SELECT [Id],
CAST([Bit1] AS INT) AS [Bit1],
CAST([Bit2] AS INT) AS [Bit2],
CAST([Bit3] AS INT) AS [Bit3],
CAST([Bit4] AS INT) AS [Bit4]
FROM [MyBits]
Alternatively use BCP as in my other post.
John
"Robert Burdick [eMVP]" <RobertBurdickeMVP@.discussions.microsoft.com> wrote
in message news:A1EF625E-F6AA-4C3F-AD0F-BFCF8CA98375@.microsoft.com...
> I exported it from SQL Server using the import / export wizard. I
> exported
> it as a comma delimited text file. Are there other export options I need
> to
> check to make this work?
> "Jens Smeyer" wrote:
>|||Robert,
Ignoring why you have the strings True and False in your text
file, you can import your data into a staging table that matches
your ultimate destination except for the bit column, which in the
staging table should be varchar(5). Then once the 'True' and 'False'
strings are in the staging table,
insert into Destination
select
nonBitcolA,
nonBitcolB,
cast(case when preBit1 = 'True' then 1 when 'False' then 0 end as bit)
as BitCol1,
..
from Staging
Anything from your text file that is not 'True' or 'False' in the bit
columns will import as NULL. You can check for that either in
the staging table or the destination table with
select * from Staging
where preBit1 not in ('True','False')
or preBit2 not in ('True','False')
...
or
select * from Destination
where BitCol1 is null
or BitCol1 is null
...
Steve Kass
Drew University
Robert Burdick [eMVP] wrote:

>Hi Gang:
>I've inherited a database that contains a number of tables with bit value
>columns. I am trying to write a script with bulk insert statements to
>reconstruct the database. When I export the table data, the bit fields get
>converted to string equivalents (True and False.) When the bulk insert
>statement for the same table executes, I get errors saying that these value
s
>can't be converted to bits.
>Can you tell me how to export the data from my tables into text files so
>that the bit values can be re-imported / bulk inserted correctly?
>
>|||Thanks, I've almost got it. SQL Server Books Online doesn't explain how to
tell BCP to comma dellimit the fields. Any ideas?
"Steve Kass" wrote:

> Robert,
> Ignoring why you have the strings True and False in your text
> file, you can import your data into a staging table that matches
> your ultimate destination except for the bit column, which in the
> staging table should be varchar(5). Then once the 'True' and 'False'
> strings are in the staging table,
> insert into Destination
> select
> nonBitcolA,
> nonBitcolB,
> cast(case when preBit1 = 'True' then 1 when 'False' then 0 end as bit)
> as BitCol1,
> ...
> from Staging
> Anything from your text file that is not 'True' or 'False' in the bit
> columns will import as NULL. You can check for that either in
> the staging table or the destination table with
> select * from Staging
> where preBit1 not in ('True','False')
> or preBit2 not in ('True','False')
> ...
> or
> select * from Destination
> where BitCol1 is null
> or BitCol1 is null
> ...
> Steve Kass
> Drew University
>
> Robert Burdick [eMVP] wrote:
>
>|||>> I've inherited a database that contains a number of tables with bit
value columns. <<
The right answer is to re-design the database from an assembly language
model of data to an RDBMS and get rid of the proprietary BIT data.
reconstruct the database. When I export the table data, the bit fields
[sic] get converted to string equivalents (True and False.) <<
Why do you think that (0,1) mapping to (FALSE, TRUE) is a correct model
for casting BITs. Some host languages use (0,-1) and other use (1,0).
And there are no BOOLEANs in SQL-92 for very good reasons that have
to do with 3VL, NULLs and the data model in SQL.
The fifth labor of Hercules was to clean the stables of King Augeas in
a single day. The Augean stables held thousands of animals and were
over a mile long. This story has a happy ending for three reasons: (1)
Hercules solved the problem in a clever way (2) Hercules got one tenth
of the cattle for his work (3) At the end of the story of the Labors of
Hercules, he got to kill the bastard that gave him this job. Maybe you
will be lucky, too.

Sunday, March 11, 2012

Bulk Insert Import Question

Hi:
I am importing data from text files in sql tables using bulk insert object
in dts package. First row of text files contains headings. For some text
files, headings are also imported in the tables and for some headings are no
t
imported. I am not sure why is that happening.
I do not want to import headings. Is there anyway to tell bulk insert
object to not import first row (headings)? Please let me know.
Thanks a lotBULK INSERT [YourTable] FROM 'c:\yourfile.txt'
WITH (FIRSTROW = 5)
Where 5 indicates "start importing from the 5th row of text". Adjust for
your import as necessary. Don't use quotation marks around the number.
"Mike" wrote:

> Hi:
> I am importing data from text files in sql tables using bulk insert object
> in dts package. First row of text files contains headings. For some text
> files, headings are also imported in the tables and for some headings are
not
> imported. I am not sure why is that happening.
> I do not want to import headings. Is there anyway to tell bulk insert
> object to not import first row (headings)? Please let me know.
> Thanks a lot|||Thank you very much
"Mark Williams" wrote:
> BULK INSERT [YourTable] FROM 'c:\yourfile.txt'
> WITH (FIRSTROW = 5)
> Where 5 indicates "start importing from the 5th row of text". Adjust for
> your import as necessary. Don't use quotation marks around the number.
> --
>
> "Mike" wrote:
>

BULK INSERT from a table

Hi there.

I have a table that contains comma delimited text, and I am trying to convert this into another table

eg my target table looks like

Produce|Price|QuantityPerPrice

and my input table contains strings such as

"apples","7.5","10"

"pears","10","8"

"oranges","8","6"

Does anyone have any ideas on how to do this? I am after a solution that does them all at once: I am currently using charindex() to find each column, one at a time, but given the speed of BULK INSERT I would much rather do it as a table. The one solution that I don't want to resort to is to export the table with delimited strings to a data file, then BULK INSERT it...

Cheers

Neil

After reading the problem I am very interested to see the solution as I cant think of one without moving data to file or using charindex.|||Bulk insert or openrowset(bulk) (in SQL Server 2005) requires file as a data source. So what you are trying is not possible. You could however optimize some of the string splitting and do just insert. If there are large number of rows then perform the insert in batches in a transaction to get better performance. But I am confused as to why you would store such inefficient format in the database table and then try to manipulate it relationally. Why can't you store the data directly into the target table?|||

Thanks for the reply.

We are taking inputs with unknown numbers of fields and unknown delimiters, and working with them. We do this by importing the data into a single field and then analysing it. This leads to single-field tables with delimited data in them.

I think we are going to have to export with bcp, create a format file, then re-import with the known delimiter and fieldnames.

Which raises another question: is there a command line parameter I can pass to bulk insert so that it will take a delimiter and use the first row as fieldnames, or do I need a format file?

Regards,

Neil

|||I am still confused as to why you would first import the data into single column values and then try to do the parsing operation on the server. This seems inefficient to me. You could easily design DTS/SSIS packages or write your own scripting code to import the files directly in the first place. There is no facility in any of the bulk API to consider the first row as column names. You will have to skip it using the FIRSTROW parameter in BULK INSERT or similar option in BCP. There are also options to specify delimiter. Use of format files depends on the complexity of your data format. For the simple, delimiter case you may not require it. See BOL for more details on the command syntax and the BCP utility.

BULK INSERT from a table

Hi there.

I have a table that contains comma delimited text, and I am trying to convert this into another table

eg my target table looks like

Produce|Price|QuantityPerPrice

and my input table contains strings such as

"apples","7.5","10"

"pears","10","8"

"oranges","8","6"

Does anyone have any ideas on how to do this? I am after a solution that does them all at once: I am currently using charindex() to find each column, one at a time, but given the speed of BULK INSERT I would much rather do it as a table. The one solution that I don't want to resort to is to export the table with delimited strings to a data file, then BULK INSERT it...

Cheers

Neil

After reading the problem I am very interested to see the solution as I cant think of one without moving data to file or using charindex.|||Bulk insert or openrowset(bulk) (in SQL Server 2005) requires file as a data source. So what you are trying is not possible. You could however optimize some of the string splitting and do just insert. If there are large number of rows then perform the insert in batches in a transaction to get better performance. But I am confused as to why you would store such inefficient format in the database table and then try to manipulate it relationally. Why can't you store the data directly into the target table?|||

Thanks for the reply.

We are taking inputs with unknown numbers of fields and unknown delimiters, and working with them. We do this by importing the data into a single field and then analysing it. This leads to single-field tables with delimited data in them.

I think we are going to have to export with bcp, create a format file, then re-import with the known delimiter and fieldnames.

Which raises another question: is there a command line parameter I can pass to bulk insert so that it will take a delimiter and use the first row as fieldnames, or do I need a format file?

Regards,

Neil

|||I am still confused as to why you would first import the data into single column values and then try to do the parsing operation on the server. This seems inefficient to me. You could easily design DTS/SSIS packages or write your own scripting code to import the files directly in the first place. There is no facility in any of the bulk API to consider the first row as column names. You will have to skip it using the FIRSTROW parameter in BULK INSERT or similar option in BCP. There are also options to specify delimiter. Use of format files depends on the complexity of your data format. For the simple, delimiter case you may not require it. See BOL for more details on the command syntax and the BCP utility.

Thursday, March 8, 2012

Bulk Insert does not handle text qualifier

I'm trying to use Bulk Insert to upload a flat file.
The file is comma delimited but contains one field that has quotation marks around it. That field typically contains a comma (which of course, I don't want to consider as a field delimeter)
here is an example of the layout.

1,test,D,"OCB, England",450727,8575

I use the DTS Bulk Insert task to help me generate a format file to use with the bulk insert statement. I tell the wizard that the file is comma delimited with a text qualifer and it's fine with that. But when I run the bulk insert, it bombs with a "String or Binary data would be trunctated error.
So, any idea's on how to bulk insert to be smart enough to deal with this situation?"String or Binary data would be trunctated" usually means that the string you are trying to insert is longer than that allowed by the field in your destination table. If, for instance, your table's field is defined as 50 characters (the default) and one of your quoted strings is longer than 50 characters, you will get this error.

Either increase the size of your destination field, or cast your string as a field of the proper width before attempting to insert it.

blindman

Friday, February 24, 2012

Bulk insert

hi all,
I am using bulk insert to insert records from a text file to a table.
The text file contains # or comma as fieldterminator. how to handle
both the field terminators in a stored procedure.
thanks to all,
kc.Do you mean that some columns are seperated by # and some columns are
seperated by commas? Or that there's no pattern at all to the separators?
-Paul
<kc4u7@.yahoo.com> wrote in message
news:1141886075.704092.89530@.i39g2000cwa.googlegroups.com...
> hi all,
> I am using bulk insert to insert records from a text file to a table.
> The text file contains # or comma as fieldterminator. how to handle
> both the field terminators in a stored procedure.
> thanks to all,
> kc.
>|||Hi Paul,
I am using bulk insert to transfer hugh data from a .dbf file to my sql
server table. Can you guide me in this case what shoud I specify as
fieldterminator and rowterminator?
Thanks in advance.
Regards,
Shailesh
"Paul Nielsen (MVP)" wrote:

> Do you mean that some columns are seperated by # and some columns are
> seperated by commas? Or that there's no pattern at all to the separators?
> -Paul
>
> <kc4u7@.yahoo.com> wrote in message
> news:1141886075.704092.89530@.i39g2000cwa.googlegroups.com...
>
>|||Shailesh wrote:
> Hi Paul,
> I am using bulk insert to transfer hugh data from a .dbf file to my sql
> server table. Can you guide me in this case what shoud I specify as
> fieldterminator and rowterminator?
> Thanks in advance.
> Regards,
> Shailesh
>
Shailesh,
Please don't post the same question multiple times. I answered in one
of the several other threads you posted to. I'll leave you to work out
which it was! :-(
David Portas, SQL Server MVP
Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.
SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--

Bulk Import Diacritics Issue

Hi There,
When I do a bulk import from a DTS Package into a table the import fails at
record that contains diacritic characters like C, Í and E. The field type is
nvarchar.
Is it a collaction type I need to assign the field?
Do you have any idea how I can bypass this?
Thanks,
Kevin.Hi Kevin
"Kevin Humphreys" wrote:
> Hi There,
> When I do a bulk import from a DTS Package into a table the import fails at
> record that contains diacritic characters like C, Ã? and E. The field type is
> nvarchar.
> Is it a collaction type I need to assign the field?
> Do you have any idea how I can bypass this?
> Thanks,
> Kevin.
>
This seems that you need to change the codepage (-C) or specify a unicode
(-w or -N) parameters for BCP. Check in books online for the values.
John