Thursday, March 8, 2012

Bulk Insert fails to import data files created on Unix

It seems to me that files created on Unix machines with line terminator \n, or chr(10), cannot be imported using the Bulk Insert statement. Is this a bug, or an oversight by Microsoft? Does this mean that unless one replaces all \n with \r\n, there is no way to use Bulk Insert to import Unix files? This is a very strange behavior by MSSQL. Even lessor programs such as Excel and Word automatically recognize chr(10) as a line termination character. Am I missing something, or is this just the way MSSQL is?

You will need to use a format file, in this you can specify the terminator for the last column in a row.

Have a look in BOL. This page shows an example of a file using /r/n which you can obviously change

ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/udb9/html/ecfc546d-f708-45f4-878d-fb71b5fd1a0a.htm

|||

Of if you are using the BULK INSERT TSQL statement look at this page

ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/tsqlref9/html/be3984e1-5ab3-4226-a539-a9f58e1e01e2.htm

|||I figured out that the problem has to do with MSSQL's native behavior. It turns out that whenever it sees \n, it automatically converts it to \r\n, without notifiying the user. There are at least three ways to work around this strange behavior: 1) replace every \n in your file with \r\n before using Bulk Insert, 2) build your sql statement dynamically either in a stored procedure or VB.net, i.e., use & chr(10) & ,or + chr(10) +, instead of '\n' as the line terminator in your statement, and 3) load your file into a Datatable via ADO.net, and then insert the entire Datatable into MSSQL.|||

Note sure what figuring out was required, as BOL has an example that just works.

|||

I have never succeeded to convince BULK INSERT to read Unix

files, and I find one of these solutions usually works:

1. See if the process that moved the files from the Unix

machine can correct the line ends (for example, ftp can do

this)

2. Run the Unix utility unix2dos on the files before they

leave the Unix machine, or afterwards, under the Cygwin

Unix shell for Windows. (Or write a tiny command-line

Windows program to do this.)

Steve Kass

Drew University

ktto@.discussions.microsoft.com wrote:

> I figured out that the problem has to do with MSSQL's native behavior.

> It turns out that whenever it sees \n, it automatically converts it to

> \r\n, without notifiying the user. There are at least three ways to work

> around this strange behavior: 1) replace every \n in your file with \r\n

> before using Bulk Insert, 2) build your sql statement dynamically either

> in a stored procedure or VB.net, i.e., use & chr(10) & ,or + chr(10) +,

> instead of '\n' as the line terminator in your statement, and 3) load

> your file into a Datatable via ADO.net, and then insert the entire

> Datatable into MSSQL.

>

No comments:

Post a Comment