SQL Server csv in xml out Project Part 2


This is a continuation of the first post.

Input Data Specifications – Initial Thoughts

In this section we will begin to design the data that is coming into SQL Server, and the data that is sent out into the xml file for the accounting system. There are several things to think about. What data do we need to exactly describe one book? For our project we will keep the number of data elements small. In the real world there would be more data elements which we could add later which won’t effect our overall algorithm.

Each book has a title, ISBN, author, a quantity ordered, and a suggested retail price. In the incoming csv file, there will also be a number field that has contains the number of books we received. In the header of the incoming csv fie there will also be an order number. It uniquely identifies a specific order from a supplier. I Love Books Inc. created a numbering system that uses the first three letters of the company name followed by four numerals.

Below is an example of an incoming csv file from ABC Book Supply Inc. We are still training them, so you will notice that they put a dollar sigh in front of the price. When we import this we will strip that away as part of our data cleaning routine. Also notice that the name of the book has a comma in the title, and the book title is enclosed in quotes. For illustration purposes, I put an ampersand in the author section. This may need to be delt with in the xml file, and replaced with an entity.

Incoming CSV File

Here is an example of the incoming csv file.

ABC Book Supply Inc.,ABC0001,00001,20170402113035
"Microsoft SQL Server 2016: A Beginner's Guide, Sixth Edition",1259641791,Dusan Petovick & nobody else,5,$32.12

Notice that there is no trailer line. We could have included one, but for simplicity we didn’t. The trailer line would have a total of units (in this case 5) and a total value of $160.60 (5 x 32.12). This could serve as a way to validate some of the internal data in the file itself. The XML output file requires this information. Since this information describes the order itself, it will need to be computed and stored in the Orders table.

Tables

In SQL Server we can create a diagram of our tables and show the relationships between primary keys and foreign keys. Suppliers can have many orders and each order can have may books. Below is a database diagram.

To test the tables at the bottom of the previous post I will go right into SSMS and add data from the CSV file into the three tables. Now I realize that we need to make a change to the [Books] table. We do not need to have the column [SupplierCode]. Books are part of an order. Orders have books and suppliers have orders. But suppliers do not, in this database relationship, have books. They must have an order to have books so the [SupplierCode] in the table [Books] is redundant and needs to be removed. The change is reflected in the diagram below.

Will this table design work with our data in the file above?

Additional Tables

After working more on this, and writing more blogs, I have come back to add more tables to our design. When we do a bulk insert we need to create new tables for this. Also, we need history tables to store the supplier, orders and books that have successfully been sent to the accounting department. Now, here is a list of tables in SQL Server.

  • 00FileLocations
  • 01HeaderBulkInsert
  • 02DetailBulkInsert
  • Books
  • BooksHistory
  • Orders
  • OrdersHistory
  • Suppliers
  • SuppliersHistory
  • XmlData

Warning

Make sure that you are leaving enough space in the columns for the incoming data. If you do not you may not get the error you expect or may not get an error at all! It just may stop working part way through. For example, you may import into the Header table and not the Detail table.

Series Navigation<< SQL Server csv in xml out ProjectSQL Server csv in xml out Project Part 3 >>

Leave a comment

Your email address will not be published. Required fields are marked *