Thursday, March 29, 2012
Bulk Move
I am using Microsoft SQL server 2005. I want to move a large set of
data with a specific condition to other database in the same machine.
for example after a period of time I want to archive data by the
condtion of time ( in a specific year ) to other database. that
archive database is going to store the previous data which are
archived.
Am I going to use queries to move the data for each table? or there is
a better way?
what is the best approach to do so?
Thanks in advance,
AliAli
What is large set of data means?
You can have a job which does an INSERT statement every day ( that's
probably is not too large set fo data) or
INSERT INTO.....dbname.dbo.tabl and having properly created indexes on the
source tables
"Ali" <nikzad.a@.gmail.com> wrote in message
news:1191158805.093132.214280@.22g2000hsm.googlegroups.com...
> hi,
> I am using Microsoft SQL server 2005. I want to move a large set of
> data with a specific condition to other database in the same machine.
> for example after a period of time I want to archive data by the
> condtion of time ( in a specific year ) to other database. that
> archive database is going to store the previous data which are
> archived.
> Am I going to use queries to move the data for each table? or there is
> a better way?
> what is the best approach to do so?
> Thanks in advance,
> Ali
>|||Why are you archiving data? If the purpose is manageability, you might
consider partitioning instead of moving data to another database (assuming
SQL 2005 Enterprise Edition). You can also use a partition switch to
efficiently delete or insert mass amounts of data by partition as part of
your move process.
You'll need to use queries to extract and delete data if partitioning is not
an option. Bulk insert (especially minimally logged) is faster than INSERT
when large amounts of data are involved. A common technique to load large
amounts of data is to remove indexes on the destination table, perform a
minimally logged bulk insert (perhaps in a SSIS package) and then recreate
indexes and constraints.
--
Hope this helps.
Dan Guzman
SQL Server MVP
"Ali" <nikzad.a@.gmail.com> wrote in message
news:1191158805.093132.214280@.22g2000hsm.googlegroups.com...
> hi,
> I am using Microsoft SQL server 2005. I want to move a large set of
> data with a specific condition to other database in the same machine.
> for example after a period of time I want to archive data by the
> condtion of time ( in a specific year ) to other database. that
> archive database is going to store the previous data which are
> archived.
> Am I going to use queries to move the data for each table? or there is
> a better way?
> what is the best approach to do so?
> Thanks in advance,
> Ali
>|||do you want to COPY or MOVE the data?
if it's a copy you can use the synchronization feature of SQL Server. this
process take a copy of the source database at a regular basis (up to a
realtime synchonization)
if you want to control everything (like which rows you want to copy by
applying filters) and/or want to delete the source data (so moving the
data):
you can do it by running SQL scripts or using SSIS
so you have to write the SQL statements required to copy the expected rows
and delete them in the source if you want to remove them.
"Ali" <nikzad.a@.gmail.com> wrote in message
news:1191158805.093132.214280@.22g2000hsm.googlegroups.com...
> hi,
> I am using Microsoft SQL server 2005. I want to move a large set of
> data with a specific condition to other database in the same machine.
> for example after a period of time I want to archive data by the
> condtion of time ( in a specific year ) to other database. that
> archive database is going to store the previous data which are
> archived.
> Am I going to use queries to move the data for each table? or there is
> a better way?
> what is the best approach to do so?
> Thanks in advance,
> Ali
>|||On Sep 30, 5:02 pm, "Dan Guzman" <guzma...@.nospam-
online.sbcglobal.net> wrote:
> Why are you archiving data? If the purpose is manageability, you might
> consider partitioning instead of moving data to another database (assuming
> SQL 2005 Enterprise Edition). You can also use a partition switch to
> efficiently delete or insert mass amounts of data by partition as part of
> your move process.
> You'll need to use queries to extract and delete data if partitioning is not
> an option. Bulk insert (especially minimally logged) is faster than INSERT
> when large amounts of data are involved. A common technique to load large
> amounts of data is to remove indexes on the destination table, perform a
> minimally logged bulk insert (perhaps in a SSIS package) and then recreate
> indexes and constraints.
> --
> Hope this helps.
> Dan Guzman
> SQL Server MVP
> "Ali" <nikza...@.gmail.com> wrote in message
> news:1191158805.093132.214280@.22g2000hsm.googlegroups.com...
> > hi,
> > I am using Microsoft SQL server 2005. I want to move a large set of
> > data with a specific condition to other database in the same machine.
> > for example after a period of time I want to archive data by the
> > condtion of time ( in a specific year ) to other database. that
> > archive database is going to store the previous data which are
> > archived.
> > Am I going to use queries to move the data for each table? or there is
> > a better way?
> > what is the best approach to do so?
> > Thanks in advance,
> > Ali
Thanks for the answers. so I think I'd better use partitioning than
copying or moving them to other database.
My reason was the speed of retrieving data. I think partitioning can
do the Job.
Thanks,
Ali|||> My reason was the speed of retrieving data. I think partitioning can
> do the Job.
Partitioning will help manageability but not necessarily performance. The
key to data retrieval performance is indexing and query tuning. Appropriate
indexes reduce the amount of data that needs to be touched without
physically moving data data via archiving or partitioning.
--
Hope this helps.
Dan Guzman
SQL Server MVP
"Ali" <nikzad.a@.gmail.com> wrote in message
news:1191166646.655159.267330@.g4g2000hsf.googlegroups.com...
> On Sep 30, 5:02 pm, "Dan Guzman" <guzma...@.nospam-
> online.sbcglobal.net> wrote:
>> Why are you archiving data? If the purpose is manageability, you might
>> consider partitioning instead of moving data to another database
>> (assuming
>> SQL 2005 Enterprise Edition). You can also use a partition switch to
>> efficiently delete or insert mass amounts of data by partition as part of
>> your move process.
>> You'll need to use queries to extract and delete data if partitioning is
>> not
>> an option. Bulk insert (especially minimally logged) is faster than
>> INSERT
>> when large amounts of data are involved. A common technique to load
>> large
>> amounts of data is to remove indexes on the destination table, perform a
>> minimally logged bulk insert (perhaps in a SSIS package) and then
>> recreate
>> indexes and constraints.
>> --
>> Hope this helps.
>> Dan Guzman
>> SQL Server MVP
>> "Ali" <nikza...@.gmail.com> wrote in message
>> news:1191158805.093132.214280@.22g2000hsm.googlegroups.com...
>> > hi,
>> > I am using Microsoft SQL server 2005. I want to move a large set of
>> > data with a specific condition to other database in the same machine.
>> > for example after a period of time I want to archive data by the
>> > condtion of time ( in a specific year ) to other database. that
>> > archive database is going to store the previous data which are
>> > archived.
>> > Am I going to use queries to move the data for each table? or there is
>> > a better way?
>> > what is the best approach to do so?
>> > Thanks in advance,
>> > Ali
> Thanks for the answers. so I think I'd better use partitioning than
> copying or moving them to other database.
> My reason was the speed of retrieving data. I think partitioning can
> do the Job.
> Thanks,
> Ali
>
Tuesday, March 27, 2012
Bulk insertion
Hi,
I am working on an application that is to read a large number of XML files, take out specific values from each file, and store these in a SQL server so that reports can be generated from these values. There are some 15-20,000 files for each month of the year. I am OK with parsing the files and getting the fields that I need but I don't want to insert one record at a time as I parse the files. I was told that I can create a .exe file that parses the xml files and stores the required values in a csv file and use these csv files to initiate a bulk insert, using Business Intelligence Studio. I have not been able to find any info or article on how to do this. Any help on how I can accomplish this, or alternate solutions is greatly appreciated.
You can make use of SQL BulkCopy feature of ADO.NET. Load your XML into a DataSet/DataTable and run SQL Bulk copy into your table. Very few lines of code.
using (SqlBulkCopy bulkCopy = new SqlBulkCopy(connectionString))
{
foreach (string tableName in tableNames)
{
SqlBulkCopy(dataSet.Tables[tableName], tableName, bulkCopy);
}
}
private static void SqlBulkCopy(DataTable dataTable, string tableName, SqlBulkCopy bulkCopy)
{
bulkCopy.DestinationTableName = tableName;
bulkCopy.ColumnMappings.Clear();
foreach (DataColumn myCol in dataTable.Columns)
bulkCopy.ColumnMappings.Add(myCol.ColumnName, myCol.ColumnName);
bulkCopy.WriteToServer(dataTable);
}
Thanks for the reply Raghu. I don't need all the data elements in the XML files, only a selected few. This is an example XML file:
<xml xmlns:s='uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882'
xmlns:dt='uuid:C2F41010-65B3-11d1-A29F-00AA00C14882'
xmlns:rs='urn:schemas-microsoft-com:rowset'
xmlns:z='#RowsetSchema'>
<s:Schema id='RowsetSchema'>
<s:ElementType name='row' content='eltOnly' rs:updatable='true'>
<s:AttributeType name='c0' rs:name='AFR Finalization' rs:number='1' rs:write='true'>
<s:datatype dt:type='string' rs:dbtype='str' dt:maxLength='18' rs:precision='0' rs:fixedlength='true' rs:maybenull='false'/>
</s:AttributeType>
<s:AttributeType name='Processed' rs:number='2' rs:write='true'>
<s:datatype dt:type='int' dt:maxLength='4' rs:precision='0' rs:fixedlength='true' rs:maybenull='false'/>
</s:AttributeType>
<s:AttributeType name='Rate' rs:number='3' rs:write='true'>
<s:datatype dt:type='string' rs:dbtype='str' dt:maxLength='7' rs:precision='0' rs:fixedlength='true' rs:maybenull='false'/>
</s:AttributeType>
<s:extends type='rs:rowbase'/>
</s:ElementType>
</s:Schema>
<rs:data>
<rs:insert>
<z:row c0='CIF '/>
<z:row c0=' Input ' Processed='12028'/>
<z:row c0=' Finalized ' Processed='5444' Rate=' 45.26%'/>
<z:row c0='RTS '/>
<z:row c0=' Input ' Processed='9802'/>
<z:row c0=' Finalized ' Processed='5504' Rate=' 56.15%'/>
<z:row c0='Interception '/>
<z:row c0=' Input ' Processed='12639'/>
<z:row c0=' Finalized ' Processed='8220' Rate=' 65.04%'/>
</rs:insert>
</rs:data>
</xml>
I only need CIF/Input and RTS/INPUT from this particular file. There will be one of this file for each day of the month and I will have to process a month's worth of xml files. There are tens of different xml files but they all follow the same schema.
Thanks.
Sunday, March 25, 2012
Bulk Insert with only one column
Hi,
I have a question on inserting data for only a specific column in a table.
I have a table as follows
Table <MyTable>
{
Name varchar,
DateUpdate DateTime
}
I wanted to insert the the from a file to table.
The File contains the list of name as follows (line by line)
name1
name2
name3
name4
......
The file name actually contains the DateTime
I would like to insert the names in the file as wellas the DateTime (i.e. part of the file name ) into <MyTable>
I guess "Bulk insert " doens't allow to insert values for only one column
If i change the contents of my data file to
name1 | DateTime1
name2 | DateTime2
name3 | DateTime3
name4 | DateTime4
Then the follwoin query works fine for me.
Bulk Insert <MyTable> FROM <filePath>
With
{
FIELDTERMINATOR = '|'
ROWTERMINATOR = '\n'
}
But my original file will contains only Names and the file name contains the date that commom for all the names in the file. And also the file may contains millions of names
Is there any way this can be accomplised using " Bulk Insert" ? Or is there any alternative that i can do it fastly
Your answer will be appreciated
~mohan
Instead of achieving this task in one step, can you do in two steps making use of a staging table that matches the file structure and insert to the main table after that. There might be other solutions also, lets hear from other people.
|||Hi Mohan,
Answer for your First Question :
Table <MyTable>
{
Name varchar,
DateUpdate DateTime
}
Create a view like :
create view vwMyTable
as
select Name from MyTable
Use this view in Bulk Insert to load the data instead of the table.
Here the assumption is your DataUpdate allows Null or any default assigned to it.
Answer for your Second Question :
Insert all your data file names into a temp table using xp_cmdshell output.
Read from the table one by one and execute the bulk insert and update the datetime from table.
Thanks
Naras.
|||
If your use SQL Server 2005, you coult use OPENROWSET with BULK option:
Code Snippet
INSERT INTO youTable
SELECT a.Name, <some expression for datetime> FROM OPENROWSET( BULK 'c:\test\values.txt',
FORMATFILE = 'c:\test\values.fmt') AS a;
You can build an SS Integraton Services package
1. For get a new datatime column you can use Derived Column Data Flow Transformation
2. For separating your data in Name and DateUpdate you can use Fixed Width Format in Flat File Connection Manager; in this way you put a column delimiter in desired position of initial string , depends on the size of DateUpdate.
This package can be put in a job a launched periodically if your business logic need it.
|||I would also suggest that you use a stage table for the initial load, then do whatever 'massaging' needs to be done, and after that update your production table.
/Kenneth