Sunday, February 19, 2012

Bulk CSV File Import?

Hi, I don't know how to do this.

I've got 100+ .CSV text files (50MB in total, not each) that I need to import into a SQL Server database, but I don't know how.

I've tried hunting the web for a file concatenation program, but just came up against pay-for software, which didn't help.

Any ideas? Would the BCP command do it?|||Yes .. you could use the bcp command with the -t flag to do that .. provided you already have the table structure ...|||A possibility is to concatenate the files into one in DOS with

C:\> type *.csv >> big.csv

which should lump all the files into one - then certainly use BCP or DTS to import it.

Rgds

GW|||We're assuming the files are all the same structure... if they're not, you need to do something else...

Are they? Or are they not?

Will they all be going in to the same table?

Do you need to know the source of the data going forward (which file did it come from)?|||Open Enterprise Manager, nav down to the DB, right-click the DB, select "All Tasks", then "Import Data". The Data Source will be a "Text File"; browse to your CSV and click Next. Then set your CSV settings, and go on to the Data Destination, which will be a table in your DB. Keep in mind that if you don't have column names in your CSV, you'll need to direct each unnamed column to its destination column in the destination table (unless the table doesn't already exist).|||Brilliant replies, you lot are geniuses!!!

Yes, all the files follow the same format.
Yes, they will be going into the same table.
No, I don't need to know where they came from.

Hope that helps.|||OK,

Tried

C:\> type *.csv >> big.csv

but at the end it grabbed big.csv as well and put its own contents back into the file (as it's also a .csv), so I re-ran it as

C:\> type *.csv >> big.dat

and no problems. I'll now have a look at the BCP command to import the data directly into the separate database I've got lined up for this. The concatenated output came to a whopping 150MB! So it definitely needs a space of its own.
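Next time I run it I'll probably wrap it up like this (the folder is just where I happen to dump the files - adjust as needed; the "if exist ... del" stops big.dat doubling up if it gets run twice, and writing to .dat keeps it out of the *.csv wildcard):

if exist c:\csvdump\big.dat del c:\csvdump\big.dat
type c:\csvdump\*.csv >> c:\csvdump\big.dat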

Thanks again, will let you know how it goes. Appreciate any further tips if you've got them.|||Well, I tried using the bcp tool to import the data into the table...

C:\Temp> bcp mydatabase..mytable in big.csv -t , -r /r -Sservername -Uusername -Ppassword

got this error:

SQLState = S1000, NativeError = 0
Error = [Microsoft][ODBC SQL Server Driver]Unexpected EOF encountered in BCP data-file

1210124 rows copied

The data is comma-delimited with a carriage return at the end of each row, and the fields in the table are all varchar fields.

But I did have a date field at the end of the table, so the data might not be following the pattern. Any ideas?|||Try using the import/export wizard from EM|||Try using the import/export wizard from EM

Nah...get used to bcp...

Where did you run it from?

Do you know what xp_cmdshell is?|||Ran it from the command line.

Tried the EM Import Wizard - it did the initial job, however this will need to be done each month.

Yep, xp_cmdshell is a system procedure allowing you to run commands outside the scope of SQL 2000, e.g. command lines. I'd probably put this into a batch file and then put it into a timed process on the server, but SQL could help.
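Rough sketch of what the monthly batch file might look like (untested - the folder and file name are placeholders, the database/table/login bits are the same placeholders as before, and the bcp switches will need to be whatever I finally get working; -c is character mode and -t, makes the comma the field terminator, the row terminator may need setting too depending on how the rows end):

rem monthly_import.cmd - concatenate this month's files and load them
if exist c:\csvdump\big.dat del c:\csvdump\big.dat
type c:\csvdump\*.csv >> c:\csvdump\big.dat
bcp mydatabase..mytable in c:\csvdump\big.dat -c -t, -Sservername -Uusername -Ppassword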

Any pointers to where I can find better information on BCP? Books Online isn't that helpful.|||You can't do much with BCP if you keep getting that nagging error ...

SQLState = S1000, NativeError = 0
Error = [Microsoft][ODBC SQL Server Driver]Unexpected EOF encountered in BCP data-file

Bet it's due to the concatenation of files :D|||But I did have a date field at the end of the table, so the data might not be following the pattern. Any ideas?

Any way to lose that last row?

That's the problem

Is the record count the same as what you got, less the last row?

Can the files be fixed width?

What happened to the date field when you DTSed it in?|||Well, I fiddled around with the DTS Import Wizard some more and finally got it to work - much better than BCP. I can fully see the reason to use BCP but can't find that much documentation on it; I've got the 'Guru's Guide To SQL Server Architecture' on order from Amazon, so that should help (I have the rest - brilliant).

Built the package, set it to APPEND the records into the table of my choice, so the extra fields (date stamps) would be ignored in the import, and hey presto, it works.
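If I end up saving the package rather than re-running the wizard every month, I'm assuming it can be kicked off from the command line with dtsrun, something like this (the package name and login are made up, and I haven't tried it yet):

dtsrun /S servername /U username /P password /N "MonthlyCsvImport"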

However, one month's dump for this is gonna cost 200MB of disk space, and they want to keep each month in succession, so we're gonna need HUGE disks for this. Ah well, not really my prob, but I can hear the screams somewhere ;-)

Thanks guys for all the help, much appreciated, and keep up the good work - hope someday I'll be as good as you lot.

Cheers All,|||Run bcp manually for one of the files without specifying -c or -n, so that you can create a format file. Once the file is created, edit it to skip the fields you don't want to insert (see BOL - there are examples available on how to skip fields).
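For example, if the data files have three comma-separated fields and the table has an extra date column on the end, the edited format file might look something like this (column names and lengths are made up - a table column you leave out of the format file just gets NULL or its default, so it has to allow that):

8.0
3
1   SQLCHAR   0   255   ","      1   col1   ""
2   SQLCHAR   0   255   ","      2   col2   ""
3   SQLCHAR   0   255   "\r\n"   3   col3   ""

The first line is the version, the second is the number of fields in the data file, then one line per field: field number, data type, prefix length, max length, terminator, target column number, target column name and collation.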

The script below will give you an idea of how to create one file out of your CSVs. It's crude, and could actually be rewritten without using xp_cmdshell.

declare @cmd varchar(8000)

-- temp tables to hold the directory listing and the xp_cmdshell output
create table #tmp (files varchar(128) null)
create table #output ([output] varchar(128) null)

-- list the csv files and clear out any previous big.csv
insert #tmp exec master.dbo.xp_cmdshell 'dir c:\*.csv /a-d /b'
insert #output exec master.dbo.xp_cmdshell 'if exist c:\big.csv del c:\big.csv'

-- build one copy command: copy file1.csv+file2.csv+... c:\big.csv
-- (assumes the file names contain no spaces)
set @cmd = ''
select @cmd = @cmd + 'c:\' + files + ' ' from #tmp where files is not null
set @cmd = 'copy ' + replace(rtrim(@cmd), ' ', '+') + ' c:\big.csv'
select @cmd   -- show the command before running it
insert #output exec master.dbo.xp_cmdshell @cmd
go
drop table #tmp
drop table #output
go

After this, use BULK INSERT and specify the format file that you created earlier.
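Something like this (the format file path is just an example - point it at wherever you saved the edited file):

BULK INSERT mydatabase..mytable
FROM 'c:\big.csv'
WITH (FORMATFILE = 'c:\mytable.fmt')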
