Sunday, March 25, 2012

BULK INSERT, setting static data using the format file

Hello dbforums,

I are using a BULK INSERT to insert the data from a ascii file to a sql table. The table has a ProductInstanceId column that exists in the tables but does not exist in the ascii DICast data. I am setting the ProductInstanceId to a Guid that will be used for Metrics. I would like to create the Guid in C++ and then set it somehow during the BULK INSERT DICastRaw1hr and DICastRaw6hr. I am calling the BULK INSERT from C++/ADO. I do not see how you can set a static data in the BULK INSERT for a column that exists in the table but does not the source data ... seems there should be a way to do this with the format file?

The other way to do this is with a TRIGGER. I have the TRIGGER below. Prior to the calling the BULK INSERT using ADO I will use ADO to ALTER the TRIGGER with the new Guid. When the BULK INSERT runs the ProductInstanceId will be populated with the new Guid.

ALTER TRIGGER DICastRaw1hrInsertGuid
ON Alphanumericdata.dbo.DICastRaw1hr
FOR INSERT AS UPDATE dbo.DICastRaw1hr SET ProductInstanceId = '4f9a44eb-092b-445b-a224-cc7cdd207092'
WHERE modelrundatetime = (select max(modelrundatetime) from Alphanumericdata.dbo.DICastraw1hr(NOLOCK))

More Questions:

- The Trigger is slow. The Bulk Insert without the Trigger runs in about 10 sec ... with the Trigger in about 40 sec. I tried to use the sql code below in the TRigger but it was only doing the UPDATE on the last row. The TRIGGER must run after the BULK INSERT is complete. Now I am using the select (bad). Any comments ...

ALTER TRIGGER DICastRaw1hrInsertDate
ON Alphanumericdata.dbo.DICastRaw1hr
FOR INSERT
AS
DECLARE @.ID as integer
SELECT @.ID = i.recordid from inserted i
UPDATE dbo.DICastRaw1hr SET ProductInstanceId = '4f9a44eb-092b-445b-a224-cc7cdd207092'
WHERE recordid = @.ID

- I understand that I could set the Guid in the Default Value part of the table definition using the NEWID() function. I need the Guid to be the same for all the rows that are inserted during the BULK INSERT (all have the same modelrundatetime) ... how would I do this?

Thanks,
ChrisOr create a table that matches your file and load it...then transfer the data to it's final destination with a sproc...|||Loading into one table and then moving data into another? How about adding one more for fun?

Just look for this topic in BOL:

Using a Data File with Fewer Fields|||Originally posted by rdjabarov
Loading into one table and then moving data into another? How about adding one more for fun?

Just look for this topic in BOL:

Using a Data File with Fewer Fields

Why not...a staging table works great, especially when you need to audit data, add stuff ect.

Yes, he can use a format file...just set the server column order to 0

Lots of posts seem to deal with "unknown" file structures or missing data...or data types that are "iffy"..

Me, I prefer to let the feeder know there file is garbage, and have them resend it and have a locked down process...

Not always the case though...|||i can help with the Bulk Insert / ADO problem
but give me an example of what you are trying to load
in what kind of table|||Originally posted by Karolyn
i can help with the Bulk Insert / ADO problem
but give me an example of what you are trying to load
in what kind of table

Thanks to all for your responses.

I think I have a solution. I will create a sql function and use it to set the GUID in the Default Field attribute of the ProductInstanceId column.

Before I call the BULK INSERT from ADO I will get the new GUID (in C++) and then ALTER the FUNCTION using ADO. I will then call BULK INSERT and the ProductInstanceId will be set to the same Guid for all the rows that are inserted during the BULK INSERT.

The next time ... I will ALTER the FUNCTION again getting a new Guid and then do the same thing. This should be VERY fast.

I have only done the prototype. When I get all the code I will post it.

Off to my Neighborhood Caucus to remove Bush from office :--)

Thanks Again,
Peace and Love,
Chris|||Hi ...

Below is the code in C++ / ADO to achieve what I did. Again, During and BULK INSERT operation we want to set a column in the table that does not exist in the source file to a specific value for all the rows that will be inserted into the table for the BULK INSERT.

To do this we are using the ALTER TABLE ADD CONTRAINT to add a static string to the default value of a specific field of the table. We will set this to a GUID and then do the BULK INSERT. When we are done with the BULK INSERT we will ALTER TABLE DROP CONSTRAINT just be be safe.

Below is the code:

UINT CDICastLoadProcess::ThreadProcessLongTermWithBulkI nsert(LPVOID pParam)
{

// Define ADO connection pointers
_ConnectionPtr pConnection = NULL;
_RecordsetPtr pRecordset = NULL;

// Set the stored procedure name
csProcedureNameForecast6Hr = "DICastForecast6HrInsert";

try
{

// Get the starting tick count to do some duration measurment
nStart = GetTickCount();

// When we open the application we will open the ADO connection
pConnection.CreateInstance(__uuidof(Connection));

csConnect.Format("Provider='%s';Data Source='%s';",
pDICastLoadProcess->GetProvider(),
pDICastLoadProcess->GetDataSource());
csTemp.Format("Initial Catalog='%s';",
pDICastLoadProcess->GetInitialCatalog());
csConnect += csTemp;
csTemp.Format("User Id=%s;Password=%s",
pDICastLoadProcess->GetUserId(),
pDICastLoadProcess->GetPassword());
csConnect += csTemp;

bstrConnect = csConnect.AllocSysString();

pConnection->Open(bstrConnect,"","",adConnectUnspecified);

pConnection->CommandTimeout = DICL_ADO_CONNECTION_TIMEOUT;

// Create the GUID object and then create the GUID using the
// CoCreateGuid() method
pguidProductInstanceId = new GUID;
hrReturn = CoCreateGuid(pguidProductInstanceId);

// Convert the GUID to a string
// must link in Rpcrt4.lib for UuidToString
UuidToString(pguidProductInstanceId, &strProductInstanceId);
delete pguidProductInstanceId;

// We are going to drop the constraint just in case that it
// exists.
try
{
// Drop the guid constrain to the table, incase it exisits
csSql = "ALTER TABLE Alphanumericdata.dbo.DICastRaw6hr";
csSql += " DROP CONSTRAINT DF_DICastRaw6hr_ProductInstanceId";
bstrSql = csSql.AllocSysString();
pConnection->Execute(bstrSql, NULL, adExecuteNoRecords);
}
catch(...)
{
// If the contraint doe not exist, as it should not then we will
// fall into here
csMessage.Format("Guid contraint does not exist prior to alter, normal status. Thread:%d", nThreadCount);
pDICastLoadProcess->m_pLog->Write(MXLOG_SEVERITY_DEBUG, csMessage, csMethodName);
}

// Add the guid constrain to the table
csMessage.Format("Add guid constraint to table. Thread:%d", nThreadCount);
pDICastLoadProcess->m_pLog->Write(MXLOG_SEVERITY_INFO, csMessage, csMethodName);
csMessage.Format("Guid value strProductInstanceId: %s Thread:%d",strProductInstanceId, nThreadCount);
pDICastLoadProcess->m_pLog->Write(MXLOG_SEVERITY_DEBUG, csMessage, csMethodName);

csSql = "ALTER TABLE Alphanumericdata.dbo.DICastRaw6hr";
csSql += " ADD CONSTRAINT DF_DICastRaw6hr_ProductInstanceId";
// csSql += " DEFAULT 'AA9a44eb-092b-445b-a224-cc7cdd207092' FOR ProductInstanceId";
csTemp.Format(" DEFAULT '%s' FOR ProductInstanceId", strProductInstanceId);
csSql += csTemp;
bstrSql = csSql.AllocSysString();
pConnection->Execute(bstrSql, NULL, adExecuteNoRecords);

// Set the ProductId (GUID) in the recordset
// sval.SetString((const char *)strProductInstanceId);
// pRecordset->Fields->GetItem(L"ProductInstanceId")->PutValue(sval);

// pDICastLoadProcess->GetInputFilename()

// Typically the filename will come to use with a path. We only want
// the filename, we will remove the path if we find one. We are adding
// one to the position as ReverseFind() returned the position of
// the '\\', and we want the position of the next char
csInputFilename = pDICastLoadProcess->GetInputFilename();
nPos = csInputFilename.ReverseFind('\\');
nPos++;
csInputFilename = csInputFilename.Mid(nPos);

csMessage.Format("Begin sql bulk insert: %s Thread:%d", csInputFilename, nThreadCount);
pDICastLoadProcess->m_pLog->Write(MXLOG_SEVERITY_INFO, csMessage, csMethodName);

// Create the SQL string using
csTemp = "BULK INSERT [Alphanumericdata].[dbo].[DICastRaw6Hr]";
//csTemp += " FROM '\\\\filer1\\DATA.TEST\\csv\\DICASTST.CSV'";
csTemp += " FROM '\\\\filer1\\DATA.TEST\\csv\\";
csTemp += csInputFilename;
csTemp += "' WITH (FIELDTERMINATOR = ',',";
csTemp += " FIRSTROW = 2,";
// csTemp += " LASTROW = 5000,";
csTemp += " ROWTERMINATOR = '\n',";
csTemp += " FIRE_TRIGGERS,";
csTemp += " FORMATFILE = '\\\\filer1\\DATA.TEST\\csv\\DICASTLTF.CSV')";

// Set the bstr and then execute the sql
bstrSql = csTemp.AllocSysString();
pConnection->Execute(bstrSql, NULL, adExecuteNoRecords);

csMessage.Format("Bulk insert completed successfully: %s Thread:%d", csInputFilename, nThreadCount);
pDICastLoadProcess->m_pLog->Write(MXLOG_SEVERITY_STATIC, csMessage, csMethodName);

nEnd = GetTickCount();
nElapsedTime = (float)(nEnd - nStart)/(float)1000;

// Set the end time
COleDateTime oledtEndDateTime = COleDateTime::GetCurrentTime();

// Create the time span object using the values passed
COleDateTimeSpan oledtDuration = oledtEndDateTime - oledtStartDateTime;

// Format the elapse time
CString csDuration = oledtDuration.Format("%H:%M:%S");

// write message to introduce the Method
csLogMessage.Format("Bulk insert stats: %s Count: %d Error: %d Elapsed time: %s Thread:%d",
csInputFilename,
nCount,
nError,
csDuration,
nThreadCount);
pDICastLoadProcess->m_pLog->Write(MXLOG_SEVERITY_INFO, csLogMessage, csMethodName);

// Drop the guid constrain to the table
csMessage.Format("Drop guid constraint from table. Thread:%d", nThreadCount);
pDICastLoadProcess->m_pLog->Write(MXLOG_SEVERITY_INFO, csMessage, csMethodName);
csSql = "ALTER TABLE Alphanumericdata.dbo.DICastRaw6hr";
csSql += " DROP CONSTRAINT DF_DICastRaw6hr_ProductInstanceId";
bstrSql = csSql.AllocSysString();
pConnection->Execute(bstrSql, NULL, adExecuteNoRecords);


// write message to introduce the Method
csLogMessage.Format("ThreadProcessLongTermWithBulkInsert() thread ending normally. Thread:%d", nThreadCount);
pDICastLoadProcess->m_pLog->Write(MXLOG_SEVERITY_STATIC, csLogMessage, csMethodName);

}
catch(_com_error *e)
{
}
catch(...)
{
}

// Clean up objects before exit the thread.
if (pConnection)
{
if (pConnection->State == adStateOpen)
{
pConnection->Close();
}
}

delete pvTemp;

if (pDICastLoadProcess->GetOwner())
{
// Send a message that we are done
::PostMessage(pDICastLoadProcess->GetOwner()->m_hWnd,
WM_PROCESS_DICAST_THREAD,
(WPARAM) 103,
(LPARAM) nRowCount);
}

// We need these messages at the end.
// We need a flush in the Logging Class ?
csLogMessage.Format("Posting message to calling application. Thread:%d", nThreadCount);
pDICastLoadProcess->m_pLog->Write(MXLOG_SEVERITY_STATIC, csLogMessage, csMethodName);

csLogMessage.Format("Ending ThreadProcessLongTermWithBulkInsert(). Thread:%d", nThreadCount);
pDICastLoadProcess->m_pLog->Write(MXLOG_SEVERITY_STATIC, csLogMessage, csMethodName);

// Decrement the thread count on the way out
pDICastLoadProcess->SetThreadCountDec();

return 0;
}

No comments:

Post a Comment