SQL Server and XML Introduction


Beginning with SQL Server 2005, Microsoft added the XML data type, the XQuery language, and several new functions for working with XML data in addition to the functionality found in SQL Server 2000. Why use XML? Companies often use XML to exchange data between incompatible internal systems or with their external vendors and customers. SQL Server also extensively uses XML data to store query plans.

In SQL Server there are primarily two ways of handling XML. Either you need to convert an XML document into a rowset (table) or you have a rowset and want to convert it into an XML document.

Converting an XML document into a rowset is called shredding, and this is the purpose of the OPENXML command introduced with SQL Server 2000. OPENXML must also be used in conjunction with two other commands: sp_xml_preparedocument and sp_xml_removedocument.

One useful feature that can be accessed via CLR Integration is the W3C Extensible Stylesheet Language Transformations (XSLT). As defined by the W3C, XSLT is a language designed for the sole purpose of “transforming XML documents into other XML documents.” SQL Server 2014 provides access to XSL transformations via a combination of the built-in xml data type and the .NET Framework XslCompiledTransform class.

Shredding’ XML data is another common request. To ‘shred’ means to strip the actual data away from the markup tags, and organize it into a relational format. For example, shredding is what happens when an XML document is imported into a table, when each node value is mapped to a specific field in the table. A popular method to use for this is to use the OPENXML() function, but XQuery methods can also be engaged to perform the same tasks. OPENXML() was available to use for shredding before the SQL Server XQuery methods were introduced, and is somewhat faster for larger data operations. However, it is decidedly more complex to use, and is more memory intensive. Also, OPENXML() cannot take advantage of XML indexes as XQuery methods can.

How to load an XML document into an XML variable

The following discussion is from SQLServerCentral.com. There are downloadable files there, if you register for free with an email. The zip file is at the bottom of the page. The OpenXML method of shredding an XML document requires that the entire XML document be loaded into system memory. The first step in doing this is accomplished by loading the document into a SQL variable of XML data type using the OPENROWSET(BULK) method, as demonstrated in the following SQL script.

DECLARE @Doc AS XML
SET @Doc = (SELECT * FROM OPENROWSET(BULK 'C:\xml\StockMarketDataAsElements.xml',SINGLE_BLOB) AS x)
SELECT @Doc AS XmlDoc

Execute this script from SQL Server Management Studio. This demonstrates that the XML document ‘StockMarketDataAsElements.xml’ was successfully stored in the XML data type variable. Now that we know how to get an XML document into an XML variable, we are ready to shred the elements-only XML document with OpenXML and insert its data into a table.

OpenXML uses XPath, a specialized expression language that parses through XML nodes to selectively retrieve data from XML documents. When using the OpenXML shredding method, the sp_xml_prepareDocument stored procedure is used to create an instance of an XML document in system memory from a copy of the document stored in an XML-type variable and return an integer handle that is used to address the XML object.

EXEC sp_xml_prepareDocument @hdoc OUTPUT, @Doc

In the above sp_xml_prepareDocument SQL statement, @Doc is the XML-type varialble holding the XML document, while @hdoc is the handle of the instantianted XML-object document. Following is the entire SQL script using this statement.

DECLARE @Doc XML
DECLARE @hdoc INT
DECLARE @iDoc INT
SET @Doc = (SELECT * FROM OPENROWSET(BULK 'C:\xml\StockMarketDataAsElements.xml',SINGLE_BLOB) AS x)
EXEC sp_xml_prepareDocument @hdoc OUTPUT, @Doc
SET @iDoc = 1

BEGIN TRY
    DROP TABLE Stocks
END TRY
BEGIN CATCH
END CATCH

SELECT * INTO Stocks
FROM Openxml (@iDoc,'//StockMarketData/StockData ')
WITH
(
    [Date] VARCHAR(MAX) 'Date',
    [Ticker] VARCHAR(MAX) 'Ticker',
    [Open] VARCHAR(MAX) 'Open',
    [High] VARCHAR(MAX) 'High',
    [Low] VARCHAR(MAX) 'Low',
    [Close] VARCHAR(MAX) 'Close',
    [Volume] VARCHAR(MAX) 'Volume'
)
EXEC sp_xml_removeDocument @iDoc
SELECT * FROM Stocks

Execute the SQL script to shred the ‘StockMarketDataElementsOnly.xml’ document with OpenXML and insert its data into the ‘Stocks’ table. The ‘SELECT * FROM Stocks’ SQL statement confirms that the contents of the ‘StockMarketDataElementsOnly.xml’ document were successfully inserted into the ‘Stocks’ table. The previous script is as simple an example of shredding XML with OpenXML that you are ever likely to see. Study it and the OpenXML syntax reference on MIcrosoft’s site. Write a similar script of your own, then play with the examples in Using OPENXML.

Shredding XML with XQuery

Like OpenXML, XQuery uses XPath to navigate through elements and attributes in an XML document. Shredding XML with XQuery does not require using sp_xml_prepareDocument to instantiate an XML object, meaning that the entire XML document does not have to be loaded into system memory. Execute the SQL script to shred the ‘StockMarketDataElementsOnly.xml’ document with XQuery and insert its data into the ‘Stocks’ table.

DECLARE @Doc XML
SET @Doc = (SELECT * FROM OPENROWSET(BULK 'C:\xml\StockMarketDataAsElements.xml',SINGLE_BLOB) AS x)
BEGIN TRY
    DROP TABLE Stocks
END TRY
BEGIN CATCH
END CATCH
SELECT * INTO Stocks FROM (
SELECT
    X.StockData.query('Date').value('.','VARCHAR(10)') AS 'Date',
    X.StockData.query('Ticker').value('.','VARCHAR(10)') AS 'Ticker',
    X.StockData.query('Open').value('.','VARCHAR(10)') AS 'Open',
    X.StockData.query('High').value('.','VARCHAR(10)') AS 'High',
    X.StockData.query('Low').value('.','VARCHAR(10)') AS 'Low',
    X.StockData.query('Close').value('.','VARCHAR(10)') AS 'Close',
    X.StockData.query('Volume').value('.','VARCHAR(10)') AS 'Volume'
FROM @Doc.nodes('StockMarketData/StockData') AS X(StockData)
) sub
SELECT * FROM Stocks ORDER BY Ticker

The ‘SELECT * FROM Stocks’ SQL statement confirms that the contents of the ‘StockMarketDataElementsOnly.xml’ document were successfully inserted into the Stocks table. The previous script is as simple an example of shredding XML with XQuery that you are ever likely to see. Study it and the XQuery Language Referene on Microsoft’s site. Write a similar script of your own, then play with the examples in XQuery Labs – A Collection of XQuery Sample Scripts.

OpenXML versus XQuery

XQuery is usually faster than OpenXML when shredding small XML documents, but slower when shredding large documents. The OpenXML method requires that the entire document be loaded into system memory, while the XQuery method does not. SQL Server performance can be severely affected when using OpenXML if the size of the XML document approaches the size of available system memory. XQuery is much more flexible than OpenXML. XQuery supports the use of the XML Data Modification Language, and its code is more manageable and scalable.

Leave a comment

Your email address will not be published. Required fields are marked *