At lot of times when working with databases you will find that you need to import a text file format and have to manipulate it for the purpose of sending it back out to another computer processing system. This is where you might find that Microsoft’s Transact-SQL in SQL Server Management Studio (SSMS) will do the job for you. You may also have a look at Microsoft Excel or Access. They will probably work for you as well. In this blog post we will look at an example of a simple comma-separated file (.csv) that we need to format differently so that it matches the requirements of the system we are sending it to. If you have data in Excel, you may be able to copy and paste the data to import it. For instructions on how to do this, see at the bottom of this post.
If your input file is not delimited, there are three posts dealing with that solution.
Sometimes the best way to learn is to jump right in with an example.
Input File – input.csv
The input file is a csv file that is given to us that is delimited with commas. The file has one header line that describes the columns in the file. Each column name is enclosed in double quotes. An example file is shown below. The third column has the code a A or B. The table name in the SQL Server database is [SimpleBulkInsert]. Here is what our input file looks like:
"FirstName","LastName","A","Amount","JulianDate" Bob,"Smith",A,23098,2017029 John,Johnson,A,98341,2017029
Output – Detail Lines
Below is what we want the detail-line output to look like. Our requirements might also include a header line and a trailer line. We may need to produce one text file that has a header line, some detail lines and a trailer line. In this blog post we will only look at the detail lines. The output of our processing will result in a result query that looks like what we see below. To get the formatting in SQL Server itself to look this way you will need to be sure that the button in SQL Server Management Studio called Results to Text is clicked. You can change default settings from SSMS: Tools>>Options>>Query Results>>SQL Server >>General>> you have three options here to save result: Results to grid;Results to text; Result to file.
------------------------------------------------------------- 0000A00022017029 A_CONSTANT 000000098341John Johnson END 0000A00032017029 A_CONSTANT 000000023098Bob Smith END (2 row(s) affected)
Two-Step Process (Load and Select)
The first step is to load the csv file’s data into a previously made table in our database. The second step is to use a SQL SELECT statement to out put the data. Once it is output, we can copy and past the data into another file. Below we use REPLACE to clean the data. Data cleaning by replacing some characters with others is often necessary.
/* This is the load query that imports a csv file */ USE myDatabase GO DELETE FROM [myDatabase].[dbo].[SimpleBulkInsert] GO BULK INSERT [myDatabase].[dbo].[SimpleBulkInsert] FROM 'C:\...\input.csv' WITH ( DATAFILETYPE = 'char', FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0a', FIRSTROW = 2 /* Start at row 2, which is the data */ ) GO /* Get rid of any double quotes in the last name */ UPDATE [myDatabase].[dbo].[SimpleBulkInsert] SET lastname = REPLACE(lastname, '"', '') WHERE CHARINDEX('"', lastname) <> 0 /* REPLACE Replaces all occurrences of a specified string value with another string value. REPLACE ( expressionToFind , expressionToSearch [ , start_location ] ) CHARINDEX Searches an expression for another expression and returns its starting position if found. CHARINDEX ( expressionToFind , expressionToSearch [ , start_location ] ) */ SELECT * FROM [myDatabase].[dbo].[SimpleBulkInsert] GO
Below is the code that will give you the detail-line results.
USE [myDatabase] GO SELECT Cast('0000' AS CHAR(4)) + Cast(CodeAB AS CHAR(1)) + right('0000' + cast((row_number () over (order by LastName)) + 1 as varchar (4)),4) + cast(JulianDate AS CHAR(7)) + cast(' ' AS CHAR(2)) + cast('A_CONSTANT' AS CHAR(12)) + right('000000000000' + cast(Amount AS varchar(12)),12) + left(cast((FirstName + ' ' + LastName) AS char(16)),16) + Cast ('END' AS CHAR(3)) FROM dbo.SimpleBulkInsert
In the above examples, the name of the database is myDatabase. The name of the table that we created to store the data is called SimpleBulkInsert. The SQL script to create the table is shown below. I created the table directly in SQL Server Management Studio by right-clicking the Tables folder, choosing Table… and typing in the field names. SQL Server generates the above script automatically for you is you have a table to use. In the Object Explorer on the left part of SSMS, navigate to your database and right-click the table name, Script Table as…, CREATE To…, New Query Editor Window.
USE [myDatabase] GO /****** Object: Table [dbo].[SimpleBulkInsert] Script Date: 2017-01-29 7:49:32 PM ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE TABLE [dbo].[SimpleBulkInsert]( [FirstName] [nvarchar](50) NOT NULL, [LastName] [nvarchar](50) NOT NULL, [CodeAB] [nvarchar](1) NOT NULL, [Amount] [int] NOT NULL, [JulianDate] [int] NOT NULL ) ON [PRIMARY] GO
Copy Paste from Excel to SQL Server
First, copy the data in Excel, excluding the headers if they exist. Next, in SSMS, open the table for editing the top 200 rows. Select the entire bottom row where the NULLs are by clicking on the left side and then paste. Ensure that the columns line up correctly.
The next post will discuss stored procedures in SSMS. We want to leap over the header and the trailer creation in the blog post series because they are fairly simple, except for the total calculation because we want to use a variable in T-SQL.