To understand the problem that I have, I will exemplify but the problem it is general in the SQL2k5.
So any BCP insert it is slower with the SQL2k5 in compare with the SQL2k, doenst matter the edition of SQL2k5.
1. First test environment comparing Enterprise versions of SQL2k and SQL2k5 with memory seted 4GB RAM available on the computer it is 6GB Ram(awe enabled), Windows Server 2003 R2 SP1(this window version was special installed because I got the Info that can work better with the SQL2k5).Changing the allocated memory for the SQL Server makes no difference because the SQL Server need only maximum 200MB Ram for the import.
I was installing the system database from the both system on the same hardisk and the client database on the second harddisk.
I was creating a table with this query:
create table T02
(
autoid int identity (1, 1) not null, --identity (1, 1)
intValue int null,
floatValue float null,
stringValue varchar(255) null
) on [primary]
go
and I was generating a .dat file with 1 milion rows to be imported with the bulk insert Transact-SQL command.
1,1,0.002220,Dies ist eine nicht ganz so zuf?llige Zufallszahl: 1
2,2,0.239019,Dies ist eine nicht ganz so zuf?llige Zufallszahl: 2
3,3,1.219092,Dies ist eine nicht ganz so zuf?llige Zufallszahl: 3
4,4,0.390967,Dies ist eine nicht ganz so zuf?llige Zufallszahl: 4
5,5,1.102109,Dies ist eine nicht ganz so zuf?llige Zufallszahl: 5
6,1,0.202700,Dies ist eine nicht ganz so zuf?llige Zufallszahl: 6
7,2,1.383550,Dies ist eine nicht ganz so zuf?llige Zufallszahl: 7
8,3,20.287169,Dies ist eine nicht ganz so zuf?llige Zufallszahl: 8
9,4,0.250796,Dies ist eine nicht ganz so zuf?llige Zufallszahl: 9
10,5,0.887969,Dies ist eine nicht ganz so zuf?llige Zufallszahl: 10
This are the first 10 lines from the file. You can duplicate this line to obtain the milion lines file.
Now the next step was stopping SQL2k and starting new the SQL2k5 services(no cache, no memory already allocated).
Running the next import command:
bulk insert T02 from 'D:\bulk_insert.dat' with (fieldterminator=',')
obtaining the next time results: ~ 17 seconds.
The next step was stopping SQL2k5 and starting new the SQL2k services.
the same command the the time was about 11 seconds.
2. Second test environment comparing Standard versions of SQL2k and SQL2k5 with memory seted 2GB RAM available on the computer it is 4GB Ram, Windows 2000 SP4(a different computer as the first time).Changing the allocated memory for the SQL Server makes no difference because the SQL Server need only maximum 200MB Ram for the import.
I was repeting the tests before and getting the next results:
With SQL2k about 12 seconds.
With SQL2k5 about 20 seconds.
For me this performance it is not acceptable becuase some parts of my application are using the BCP Inserts as an optimize method and when this optimzed parts it is running slower the effect are very strong.
At the end the application it is about 25% slower with Sql2k5 as with Sql2k.
As I sayd at the beginning this problem it is general I can attach some tests programs thats are using the BCP Insert from the DB Library of OLE DB Consumer Templates from ATL. I've choosed this example just becuase it is easear to reproduce.
This programs are having the same procent of performance lost. As the Transact SQL Command the programs are about 50% slower with SQL2k5 as with SQL2k.
Please help me if you know any solution to optimize BCP Inserts on SQL2k5.
Vasile Mirea, Diplom Computer Science.
To what I already sayd I will add some remarks about the tests.
The Sql Servers 2005 had the Service Pack 1 already installed and the SQL2k the service pack 4.
About the tests using the Perfmon tool from windows I saw that the Sql2k5 it is using excesiv the hardware in compare with 2k that it is using only at the beggining strong and after that only little bit the hard work doing it the CPU's.
So seams that are important implementation difference between the versions in Sql Server but unfortunatly not positives ones(for the performance).
Another important fact was the memory used by the SQL Server 2k and SQL Server 2k5 the 2k it is using about 250MBRam and 2k5 it is using 150MBRam for the same BCP Import. This means that Microsoft tryed to improved the memory use but paying with performance.
I can send in an email the test data file and the print screens with the perfmon from the Sql Server.
So if everybody can help I wait your answear.
Thanks,
Vasile.
|||Vasile,I can think of a few possibilities. #1: For SQL Server 2005, BULK INSERT enforces data validation more carefully, and this enhancement may result in a performance penalty:
http://msdn2.microsoft.com/en-us/library/ms188365.aspx
"In SQL Server 2005, BULK INSERT enforces new data validation and data checks
that could cause existing scripts to fail when they are executed on invalid data
in a data file."
#2: If your SQL Server 2000 and 2005 instances do not use the same character collation., perhaps code page translation is causing more work in 2005?
#3: This might be most noticeable for the first BULK INSERT after starting the server, if some library code must be loaded initially.
If you cannot improve BULK INSERT performance with the parameters available to you (ROWS_PER_BATCH, for example), you will have to decide how serious this slowdown is relative to the benefits of upgrading to 2005. Since you are not seeing a dramatic slowdown (you are not seeing the insert take twice as long or more), improvements in hardware in just a few months may overcome the difference.
Steve Kass
Drew University
www.stevekass.com|||
Hi Steve,
So I've checked the possibilities that you gave me.
#3 I've repeated the test until 10 times on every server the times stays the same(without restarting the server). So for the second time the libraryes must be already loaded. This new tests should exclude this possibilities.
#2 I've checked the collation and on the both servers was the same: Latin1_General_CI_AS the default collation for the west Europe. Collation excluded, for sure still can be that the collation operators are new implemented but this I dont think it is the reason because this should produce a performance problem for the select queryes with a order list too. So collation excluded.
#1 I think this one can be the problem as I supposed too, a different implementation. Not necesary a "new data validation " can be something else too.
But this should be a Microsoft problem. I hope somebody will find this performance not acceptable and in the SP2 from SQL Server 2005 will come with a fix.
Or with some option for BCP Insert like "BCP data validation inactive", exactly as it is for the log BCP.
50% less performant for a query like BCP Insert it should not be ok for Microsoft.
Still looking for a solution to improve the performance. In our case the application it runs about 12 hours so 20% it is a lot.
|||
Hi Steve,
I made some research of my own.
Asking one consultant from Microsoft that was having direct access to the SQL Server development team I got a nice answear.
It seams that in the SQL Server 2005 implementation are some new layers to optimize the BCP Insert.
But this new layers can slow a normal BCP Insert on a table with lets say 20 fields(The size of a row it is not big enought).
If the size of the row it is bigger then a limit(i dont know the limit exactly) then the BCP Insert with SQL Server 2005 can be faster then in the SQL Server 2000.
We opened a ticket to Microsoft-Support to be sure and we got the answear: Slow by Design! (For our case of course)
This means for a normal case for lets say normal tables that dont have images or bulk fields then the new implementation of the SQL Server 2005 bulk Insert it is slower.
Are my research results true ?
I wait your answear,
Thanks, Vasile.
|||
In SQL Server 2005 some of the bulk load code indeed was re-architected. However, for the most common scenarios it did not cause any measurable performance slowdowns and instead many 'normal' scenarios run faster than in SQL Server 2000. There are some corner cases though that may run slower, however your scenario certainly doesn't look like one that should experience performance problems when moving from SQL Server 2000 to SQL Server 2005 provided that you make use of the bulk load optimizations.
Please make sure that your database recovery mode is either SIMPLE or BULK_LOGGED and make sure that you specify the TABLOCK option when issuing the BULK INSERT statement. You may as well want to consider using the CODEPAGE='RAW' to bypass the CP translation code if you know that it would be safe for the particular data you are loading.
Please also note that the actual performance of the BULK INSERT statement may depend on the specifics of the test scenario that expose some other differences in the SQL Server 2005 behavior - for instance: do you truncate or drop/re-create the table before repeating the bulk loads or you just keep appending the data to the table? Do you issue a checkpoint before/after the BULK INSERT statement?
No comments:
Post a Comment