Showing posts with label ascii. Show all posts
Showing posts with label ascii. Show all posts

Sunday, March 25, 2012

BULK INSERT, setting static data using the format file

Hello dbforums,

I are using a BULK INSERT to insert the data from a ascii file to a sql table. The table has a ProductInstanceId column that exists in the tables but does not exist in the ascii DICast data. I am setting the ProductInstanceId to a Guid that will be used for Metrics. I would like to create the Guid in C++ and then set it somehow during the BULK INSERT DICastRaw1hr and DICastRaw6hr. I am calling the BULK INSERT from C++/ADO. I do not see how you can set a static data in the BULK INSERT for a column that exists in the table but does not the source data ... seems there should be a way to do this with the format file?

The other way to do this is with a TRIGGER. I have the TRIGGER below. Prior to the calling the BULK INSERT using ADO I will use ADO to ALTER the TRIGGER with the new Guid. When the BULK INSERT runs the ProductInstanceId will be populated with the new Guid.

ALTER TRIGGER DICastRaw1hrInsertGuid
ON Alphanumericdata.dbo.DICastRaw1hr
FOR INSERT AS UPDATE dbo.DICastRaw1hr SET ProductInstanceId = '4f9a44eb-092b-445b-a224-cc7cdd207092'
WHERE modelrundatetime = (select max(modelrundatetime) from Alphanumericdata.dbo.DICastraw1hr(NOLOCK))

More Questions:

- The Trigger is slow. The Bulk Insert without the Trigger runs in about 10 sec ... with the Trigger in about 40 sec. I tried to use the sql code below in the TRigger but it was only doing the UPDATE on the last row. The TRIGGER must run after the BULK INSERT is complete. Now I am using the select (bad). Any comments ...

ALTER TRIGGER DICastRaw1hrInsertDate
ON Alphanumericdata.dbo.DICastRaw1hr
FOR INSERT
AS
DECLARE @.ID as integer
SELECT @.ID = i.recordid from inserted i
UPDATE dbo.DICastRaw1hr SET ProductInstanceId = '4f9a44eb-092b-445b-a224-cc7cdd207092'
WHERE recordid = @.ID

- I understand that I could set the Guid in the Default Value part of the table definition using the NEWID() function. I need the Guid to be the same for all the rows that are inserted during the BULK INSERT (all have the same modelrundatetime) ... how would I do this?

Thanks,
ChrisOr create a table that matches your file and load it...then transfer the data to it's final destination with a sproc...|||Loading into one table and then moving data into another? How about adding one more for fun?

Just look for this topic in BOL:

Using a Data File with Fewer Fields|||Originally posted by rdjabarov
Loading into one table and then moving data into another? How about adding one more for fun?

Just look for this topic in BOL:

Using a Data File with Fewer Fields

Why not...a staging table works great, especially when you need to audit data, add stuff ect.

Yes, he can use a format file...just set the server column order to 0

Lots of posts seem to deal with "unknown" file structures or missing data...or data types that are "iffy"..

Me, I prefer to let the feeder know there file is garbage, and have them resend it and have a locked down process...

Not always the case though...|||i can help with the Bulk Insert / ADO problem
but give me an example of what you are trying to load
in what kind of table|||Originally posted by Karolyn
i can help with the Bulk Insert / ADO problem
but give me an example of what you are trying to load
in what kind of table

Thanks to all for your responses.

I think I have a solution. I will create a sql function and use it to set the GUID in the Default Field attribute of the ProductInstanceId column.

Before I call the BULK INSERT from ADO I will get the new GUID (in C++) and then ALTER the FUNCTION using ADO. I will then call BULK INSERT and the ProductInstanceId will be set to the same Guid for all the rows that are inserted during the BULK INSERT.

The next time ... I will ALTER the FUNCTION again getting a new Guid and then do the same thing. This should be VERY fast.

I have only done the prototype. When I get all the code I will post it.

Off to my Neighborhood Caucus to remove Bush from office :--)

Thanks Again,
Peace and Love,
Chris|||Hi ...

Below is the code in C++ / ADO to achieve what I did. Again, During and BULK INSERT operation we want to set a column in the table that does not exist in the source file to a specific value for all the rows that will be inserted into the table for the BULK INSERT.

To do this we are using the ALTER TABLE ADD CONTRAINT to add a static string to the default value of a specific field of the table. We will set this to a GUID and then do the BULK INSERT. When we are done with the BULK INSERT we will ALTER TABLE DROP CONSTRAINT just be be safe.

Below is the code:

UINT CDICastLoadProcess::ThreadProcessLongTermWithBulkI nsert(LPVOID pParam)
{

// Define ADO connection pointers
_ConnectionPtr pConnection = NULL;
_RecordsetPtr pRecordset = NULL;

// Set the stored procedure name
csProcedureNameForecast6Hr = "DICastForecast6HrInsert";

try
{

// Get the starting tick count to do some duration measurment
nStart = GetTickCount();

// When we open the application we will open the ADO connection
pConnection.CreateInstance(__uuidof(Connection));

csConnect.Format("Provider='%s';Data Source='%s';",
pDICastLoadProcess->GetProvider(),
pDICastLoadProcess->GetDataSource());
csTemp.Format("Initial Catalog='%s';",
pDICastLoadProcess->GetInitialCatalog());
csConnect += csTemp;
csTemp.Format("User Id=%s;Password=%s",
pDICastLoadProcess->GetUserId(),
pDICastLoadProcess->GetPassword());
csConnect += csTemp;

bstrConnect = csConnect.AllocSysString();

pConnection->Open(bstrConnect,"","",adConnectUnspecified);

pConnection->CommandTimeout = DICL_ADO_CONNECTION_TIMEOUT;

// Create the GUID object and then create the GUID using the
// CoCreateGuid() method
pguidProductInstanceId = new GUID;
hrReturn = CoCreateGuid(pguidProductInstanceId);

// Convert the GUID to a string
// must link in Rpcrt4.lib for UuidToString
UuidToString(pguidProductInstanceId, &strProductInstanceId);
delete pguidProductInstanceId;

// We are going to drop the constraint just in case that it
// exists.
try
{
// Drop the guid constrain to the table, incase it exisits
csSql = "ALTER TABLE Alphanumericdata.dbo.DICastRaw6hr";
csSql += " DROP CONSTRAINT DF_DICastRaw6hr_ProductInstanceId";
bstrSql = csSql.AllocSysString();
pConnection->Execute(bstrSql, NULL, adExecuteNoRecords);
}
catch(...)
{
// If the contraint doe not exist, as it should not then we will
// fall into here
csMessage.Format("Guid contraint does not exist prior to alter, normal status. Thread:%d", nThreadCount);
pDICastLoadProcess->m_pLog->Write(MXLOG_SEVERITY_DEBUG, csMessage, csMethodName);
}

// Add the guid constrain to the table
csMessage.Format("Add guid constraint to table. Thread:%d", nThreadCount);
pDICastLoadProcess->m_pLog->Write(MXLOG_SEVERITY_INFO, csMessage, csMethodName);
csMessage.Format("Guid value strProductInstanceId: %s Thread:%d",strProductInstanceId, nThreadCount);
pDICastLoadProcess->m_pLog->Write(MXLOG_SEVERITY_DEBUG, csMessage, csMethodName);

csSql = "ALTER TABLE Alphanumericdata.dbo.DICastRaw6hr";
csSql += " ADD CONSTRAINT DF_DICastRaw6hr_ProductInstanceId";
// csSql += " DEFAULT 'AA9a44eb-092b-445b-a224-cc7cdd207092' FOR ProductInstanceId";
csTemp.Format(" DEFAULT '%s' FOR ProductInstanceId", strProductInstanceId);
csSql += csTemp;
bstrSql = csSql.AllocSysString();
pConnection->Execute(bstrSql, NULL, adExecuteNoRecords);

// Set the ProductId (GUID) in the recordset
// sval.SetString((const char *)strProductInstanceId);
// pRecordset->Fields->GetItem(L"ProductInstanceId")->PutValue(sval);

// pDICastLoadProcess->GetInputFilename()

// Typically the filename will come to use with a path. We only want
// the filename, we will remove the path if we find one. We are adding
// one to the position as ReverseFind() returned the position of
// the '\\', and we want the position of the next char
csInputFilename = pDICastLoadProcess->GetInputFilename();
nPos = csInputFilename.ReverseFind('\\');
nPos++;
csInputFilename = csInputFilename.Mid(nPos);

csMessage.Format("Begin sql bulk insert: %s Thread:%d", csInputFilename, nThreadCount);
pDICastLoadProcess->m_pLog->Write(MXLOG_SEVERITY_INFO, csMessage, csMethodName);

// Create the SQL string using
csTemp = "BULK INSERT [Alphanumericdata].[dbo].[DICastRaw6Hr]";
//csTemp += " FROM '\\\\filer1\\DATA.TEST\\csv\\DICASTST.CSV'";
csTemp += " FROM '\\\\filer1\\DATA.TEST\\csv\\";
csTemp += csInputFilename;
csTemp += "' WITH (FIELDTERMINATOR = ',',";
csTemp += " FIRSTROW = 2,";
// csTemp += " LASTROW = 5000,";
csTemp += " ROWTERMINATOR = '\n',";
csTemp += " FIRE_TRIGGERS,";
csTemp += " FORMATFILE = '\\\\filer1\\DATA.TEST\\csv\\DICASTLTF.CSV')";

// Set the bstr and then execute the sql
bstrSql = csTemp.AllocSysString();
pConnection->Execute(bstrSql, NULL, adExecuteNoRecords);

csMessage.Format("Bulk insert completed successfully: %s Thread:%d", csInputFilename, nThreadCount);
pDICastLoadProcess->m_pLog->Write(MXLOG_SEVERITY_STATIC, csMessage, csMethodName);

nEnd = GetTickCount();
nElapsedTime = (float)(nEnd - nStart)/(float)1000;

// Set the end time
COleDateTime oledtEndDateTime = COleDateTime::GetCurrentTime();

// Create the time span object using the values passed
COleDateTimeSpan oledtDuration = oledtEndDateTime - oledtStartDateTime;

// Format the elapse time
CString csDuration = oledtDuration.Format("%H:%M:%S");

// write message to introduce the Method
csLogMessage.Format("Bulk insert stats: %s Count: %d Error: %d Elapsed time: %s Thread:%d",
csInputFilename,
nCount,
nError,
csDuration,
nThreadCount);
pDICastLoadProcess->m_pLog->Write(MXLOG_SEVERITY_INFO, csLogMessage, csMethodName);

// Drop the guid constrain to the table
csMessage.Format("Drop guid constraint from table. Thread:%d", nThreadCount);
pDICastLoadProcess->m_pLog->Write(MXLOG_SEVERITY_INFO, csMessage, csMethodName);
csSql = "ALTER TABLE Alphanumericdata.dbo.DICastRaw6hr";
csSql += " DROP CONSTRAINT DF_DICastRaw6hr_ProductInstanceId";
bstrSql = csSql.AllocSysString();
pConnection->Execute(bstrSql, NULL, adExecuteNoRecords);


// write message to introduce the Method
csLogMessage.Format("ThreadProcessLongTermWithBulkInsert() thread ending normally. Thread:%d", nThreadCount);
pDICastLoadProcess->m_pLog->Write(MXLOG_SEVERITY_STATIC, csLogMessage, csMethodName);

}
catch(_com_error *e)
{
}
catch(...)
{
}

// Clean up objects before exit the thread.
if (pConnection)
{
if (pConnection->State == adStateOpen)
{
pConnection->Close();
}
}

delete pvTemp;

if (pDICastLoadProcess->GetOwner())
{
// Send a message that we are done
::PostMessage(pDICastLoadProcess->GetOwner()->m_hWnd,
WM_PROCESS_DICAST_THREAD,
(WPARAM) 103,
(LPARAM) nRowCount);
}

// We need these messages at the end.
// We need a flush in the Logging Class ?
csLogMessage.Format("Posting message to calling application. Thread:%d", nThreadCount);
pDICastLoadProcess->m_pLog->Write(MXLOG_SEVERITY_STATIC, csLogMessage, csMethodName);

csLogMessage.Format("Ending ThreadProcessLongTermWithBulkInsert(). Thread:%d", nThreadCount);
pDICastLoadProcess->m_pLog->Write(MXLOG_SEVERITY_STATIC, csLogMessage, csMethodName);

// Decrement the thread count on the way out
pDICastLoadProcess->SetThreadCountDec();

return 0;
}

Thursday, March 22, 2012

Bulk Insert Question

I need to move some data from an ascii file to a database. I want to use
BULK INSERT. The problem I have is there is no field seperation in the
ascii file. For example, columns 1-5 constitute a number, 6-28 constitute a
description, etc. Is there a way to use BULK INSERT without having to put a
comma or something between each field of data. Below is a small example of
data :
item description cost retail
12345thisistheitemdescription00245903599On Fri, 25 Feb 2005 16:13:13 -0500, sqlnewbie wrote:

>I need to move some data from an ascii file to a database. I want to use
>BULK INSERT. The problem I have is there is no field seperation in the
>ascii file. For example, columns 1-5 constitute a number, 6-28 constitute
a
>description, etc. Is there a way to use BULK INSERT without having to put
a
>comma or something between each field of data. Below is a small example of
>data :
>item description cost retail
> 12345thisistheitemdescription00245903599
>
Hi sqlnewbie,
I think you can do this with a formatfile. The easiest way to do this,
is to first run the bcp utility from a DOS prompt, answering all
questions and saving the information in a format file. Then, use a text
editor to check the contents of the format file and tweak it as needed.
Finally, use the FORMATFILE option of the BULK INSERT to specify this
format file for your date import.
Check out the subjects "bcp Utility (overview)" and "Using Format Files"
in Books Online for more information.
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)|||thanks, I'll play around with it...
"Hugo Kornelis" <hugo@.pe_NO_rFact.in_SPAM_fo> wrote in message
news:g77v11dgleakllnsplfl2aacookufkbcum@.
4ax.com...
> On Fri, 25 Feb 2005 16:13:13 -0500, sqlnewbie wrote:
>
> Hi sqlnewbie,
> I think you can do this with a formatfile. The easiest way to do this,
> is to first run the bcp utility from a DOS prompt, answering all
> questions and saving the information in a format file. Then, use a text
> editor to check the contents of the format file and tweak it as needed.
> Finally, use the FORMATFILE option of the BULK INSERT to specify this
> format file for your date import.
> Check out the subjects "bcp Utility (overview)" and "Using Format Files"
> in Books Online for more information.
> Best, Hugo
> --
> (Remove _NO_ and _SPAM_ to get my e-mail address)|||Use a format file. Here is part of the description:
If specifying a prefix length of 0 and no terminator, bcp allocates the
maximum amount of space shown in the field length prompt because this is the
maximum space that may be needed for the data type in question. The field is
treated as if it were of fixed length so that it is possible to determine
where one field ends and the next begins.
RLF
"sqlnewbie" <1@.1.com> wrote in message
news:uGbyT73GFHA.2736@.TK2MSFTNGP09.phx.gbl...
>I need to move some data from an ascii file to a database. I want to use
>BULK INSERT. The problem I have is there is no field seperation in the
>ascii file. For example, columns 1-5 constitute a number, 6-28 constitute
>a description, etc. Is there a way to use BULK INSERT without having to
>put a comma or something between each field of data. Below is a small
>example of data :
> item description cost retail
> 12345thisistheitemdescription00245903599
>

Thursday, March 8, 2012

Bulk Insert doubts

Hello Guys,

soon soon, ll have to devellop some procedures to read an ASCII file to supply MS SQL tables. As Ive read some old post, Ive understand that I have to use BULK INSERT , or else, BCP or DTS. Id like to know the diference between this commands and witch of them is more powerful, faster and efficient. If you can give me some implementation tips, I will be very grateful.

thanx allDid you look it up in BOL...

How is the data stored?

anyway...I almost exclusivley use bcp for Production code

BULK INSERT usually for quick data analysis..

and DTS for analysis, if the data is in Excel, Access, whatever...|||DTS is the most flexible, because it can go from more or less automated (using the wizard), to very task oriented (using the painter), to micro-managing the transfer (using VBA).

BCP and BULK INSERT are both tools that expose the current equivalents to the old Bulk Copy API. They are very efficient, but not extremely flexible or friendly.

-PatP|||But what's the most effecient?

EDIT: And do you not think releasing DTS to a production environment is more painful?|||BCP and BULK INSERT are simply differet front ends to the same code. Performancewise the difference is irrelevant.

I live and breathe in a replicated environment. One of our machines does virtually all of our DTS jobs against an "interface" server that has little or no user load.

We use DTS against our OLTP servers, our DW servers, and a number of "friendly" machines that need data. I've never had any real complaints about it.

-PatP|||Well thanx for all replies... Talking about what I have... Ill have 30 ASCII files to load as I told you, and approximatly 91292,42 KB for each... Itll run in production environment, to synchronize the ambient for a BI system... Which comand should I use?

Originally posted by Brett Kaiser
Did you look it up in BOL...

How is the data stored?

anyway...I almost exclusivley use bcp for Production code

BULK INSERT usually for quick data analysis..

and DTS for analysis, if the data is in Excel, Access, whatever...|||I suggest DTS is best and reliable tool to import those ASCII text files to the database. And even you can schedule the same package if its ongoing BI requirement.

As suggested you should follow books online for all the information.
For DTS specifically keep in touch with http://www.sqldts.com.