Returning flattened data from a SQL table containing pairs of rows

I like little challenges. The ones that don’t take all day to figure out, but are enough to capture your interest for a little while. Yesterday I had such a problem to solve.

I had some data in a table that was basically in ‘pairs’ of rows. It was actually different to the example below, but the example we’ll use is a ‘Message’ table, that contains requests and replies, that are linked through a particular identifier.

Our simple example looks like this (my actual table had more fields).

CREATE TABLE Message
(
    MessageID INT NOT NULL IDENTITY ,
    MessageType CHAR(1) NOT NULL,
    TransactionID INT NOT NULL,
    MessageBody VARCHAR(30),
    CreatedDate DATETIME DEFAULT GetDate()
)

We’ll add a bit of sample data (script generated from my insert generator stored proc)

SET IDENTITY_INSERT Message ON

INSERT Message(MessageID,MessageType,TransactionID,MessageBody,CreatedDate) VALUES('1','Q','1','Request Message 1',convert(datetime,'2012-08-30 13:55:07.213',121))
INSERT Message(MessageID,MessageType,TransactionID,MessageBody,CreatedDate) VALUES('2','R','1','Reply Message 1',convert(datetime,'2012-08-30 13:55:37.680',121))
INSERT Message(MessageID,MessageType,TransactionID,MessageBody,CreatedDate) VALUES('3','Q','2','Request Message 2',convert(datetime,'2012-08-30 13:55:51.183',121))
INSERT Message(MessageID,MessageType,TransactionID,MessageBody,CreatedDate) VALUES('4','R','2','Reply Message 2',convert(datetime,'2012-08-30 13:56:04.020',121))

SET IDENTITY_INSERT Message OFF


SELECT * FROM Message

MessageID   MessageType TransactionID MessageBody                    CreatedDate
----------- ----------- ------------- ------------------------------ -----------------------
1           Q           1             Request Message 1              2012-08-30 13:55:07.213
2           R           1             Reply Message 1                2012-08-30 13:55:37.680
3           Q           2             Request Message 2              2012-08-30 13:55:51.183
4           R           2             Reply Message 2                2012-08-30 13:56:04.020 

We can see that some of the fields are consistent from row to row (in pairs), and some of the fields are unique to each row. My challenge was to represent a pair of messages in one row.

On the face of it, this seems like it would be simple – just grouping by the TransactionID (the field that links the two rows). The problem is that you won’t be able to get the unique information from both rows without some assumptions (that may not be solid).

For example, this will happily give you the MessageID’s of both sides of the transaction (given the assumption that the request comes before the reply, and that there are two messages in a transaction) …

SELECT TransactionID, MIN(MessageID) AS RequestID, MAX(MessageID) AS ReplyID
FROM [Message]
GROUP BY TransactionID HAVING COUNT(*) = 2

TransactionID RequestID   ReplyID
------------- ----------- -----------
1             1           2
2             3           4 

But – it’s doesn’t give you the unique data related to each ID, as you’d need to correlate the MessageBody to the right MessageID – MIN(MessageBody) won’t necessarily relate to the ‘Request’.

So… We can think about how to correlate the data to get the result we want. There’s a few options…

1. Use temporary tables, and get the result in two steps (reusing the query above)..

 

–1 – Two Step Process

SELECT TransactionID, MIN(MessageID) AS RequestID, MAX(MessageID) AS ReplyID
INTO #MessagePair
FROM [Message]
GROUP BY TransactionID HAVING COUNT(*) = 2

SELECT  REQ.MessageID AS RequestMessageID,
        REQ.TransactionId,
        REQ.MessageBody AS RequestBody,
        REQ.CreatedDate AS RequestDate,
        RPY.MessageID AS ReplyMessageID,
        RPY.MessageBody AS ReplyBody,
        RPY.CreatedDate AS ReplyDate
FROM #MessagePair MP
INNER JOIN [Message] REQ
   ON REQ.MessageID = MP.RequestID
INNER JOIN [Message] RPY
   ON RPY.MessageID = MP.ReplyID

RequestMessageID TransactionId RequestBody                    RequestDate             ReplyMessageID ReplyBody                      ReplyDate
---------------- ------------- ------------------------------ ----------------------- -------------- ------------------------------ -----------------------
1                1             Request Message 1              2012-08-30 13:55:07.213 2              Reply Message 1                2012-08-30 13:55:37.680
3                2             Request Message 2              2012-08-30 13:55:51.183 4              Reply Message 2                2012-08-30 13:56:04.020

2. Nasty correlated subquery and joins (not even going there)

3. Single query that makes use of the assmption that a request happens before a reply (meaning the messageId will be a lower value)

SELECT  REQ.MessageID AS RequestMessageID,
        REQ.TransactionId,
        REQ.MessageBody AS RequestBody,
        REQ.CreatedDate AS RequestDate,
        RPY.MessageID AS ReplyMessageID,
        RPY.MessageBody AS ReplyBody,
        RPY.CreatedDate AS ReplyDate
FROM [Message] REQ
INNER JOIN [Message] RPY
    ON REQ.TransactionID = RPY.TransactionID
AND REQ.MessageID < RPY.MessageID 

This produces the same result as above, and is what I ended up going with. I reckon there’s probably a few more potential viable solutions, so I’d be interested to see anyone’s alternative solutions.

Generate SQL Server Inserts from existing data – with Identity Inserts

A good while ago I posted a stored proc that would generate insert statements for table data, along with simple filtering capability.

I broke this out again today, as I needed to recreate part of a database on a local machine.  I didn’t have knowledge of the schema so I just went about fixing each constraint error and adding the reference tables as required to my script.  After manually adding ‘SET IDENTITY_INSERT xx’ about a dozen times I added the functionality to the Stored Proc – so if your table has an identity column it will now wrap the results in IDENTITY_INSERT statements, saving you a bit more time, and headaches.

An interesing side effect if you’re explicitly inserting identity values is that the order of your inserts may become pertinent – especially if you’ve got foreign keys referencing the same table.  I then just added the capability to ‘order by’.
 
There’s no automatic way to switch off column headers, so you’ll need to configure that in the Resutls Output options in Query Analyzer (or management studio).

If you run the proc into ‘Northwind’ and run:

usp_insertgenerator Employees, @order=’EmployeeID’

You’ll get…

SET IDENTITY_INSERT Employees ON

INSERT Employees(EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,
Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath)
  VALUES(‘1′,’Davolio’,’Nancy’,’Sales Representative’,’Ms.’,convert(dateti
INSERT Employees(EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,
Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath)
  VALUES(‘2′,’Fuller’,’Andrew’,’Vice President, Sales’,’Dr.’,convert(datet
INSERT Employees(EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,
Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath)
  VALUES(‘3′,’Leverling’,’Janet’,’Sales Representative’,’Ms.’,convert(date
INSERT Employees(EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,
Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath)
  VALUES(‘4′,’Peacock’,’Margaret’,’Sales Representative’,’Mrs.’,convert(da
INSERT Employees(EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,
Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath)
  VALUES(‘5′,’Buchanan’,’Steven’,’Sales Manager’,’Mr.’,convert(datetime,’1
INSERT Employees(EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,
Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath)
  VALUES(‘6′,’Suyama’,’Michael’,’Sales Representative’,’Mr.’,convert(datet
INSERT Employees(EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,
Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath)
  VALUES(‘7′,’King’,’Robert’,’Sales Representative’,’Mr.’,convert(datetime
INSERT Employees(EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,
Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath)
  VALUES(‘8′,’Callahan’,’Laura’,’Inside Sales Coordinator’,’Ms.’,convert(d
INSERT Employees(EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,
Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath)
  VALUES(‘9′,’Dodsworth’,’Anne’,’Sales Representative’,’Ms.’,convert(datet

SET IDENTITY_INSERT Employees OFF

Here’s the script.

if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[usp_InsertGenerator]') and 
OBJECTPROPERTY(id, N'IsProcedure') = 1)
drop procedure [dbo].[usp_InsertGenerator]
GO


CREATE PROC dbo.usp_InsertGenerator
(@tableName varchar(100), @where varchar(1000) = NULL, @order varchar(1000) = NULL) as

SET NOCOUNT ON

--Check whether the table has an identity column
DECLARE @TableHasIdentityColumn BIT
SELECT @TableHasIdentityColumn = 1
from INFORMATION_SCHEMA.COLUMNS
where TABLE_SCHEMA = 'dbo'
and COLUMNPROPERTY(object_id(TABLE_NAME), COLUMN_NAME, 'IsIdentity') = 1
AND TABLE_NAME = @tableName

IF ISNULL(@TableHasIdentityColumn, 0) = 1
    PRINT 'SET IDENTITY_INSERT ' + @tableName + ' ON'

--Declare a cursor to retrieve column specific information for the specified table
DECLARE cursCol CURSOR FAST_FORWARD FOR 
SELECT column_name,data_type FROM information_schema.columns WHERE table_name = @tableName
OPEN cursCol
DECLARE @string nvarchar(3000) --for storing the first half of INSERT statement
DECLARE @stringData nvarchar(3000) --for storing the data (VALUES) related statement
DECLARE @dataType nvarchar(1000) --data types returned for respective columns
SET @string='INSERT '+@tableName+'('
SET @stringData=''
 
DECLARE @colName nvarchar(50)
 
FETCH NEXT FROM cursCol INTO @colName,@dataType
 
IF @@fetch_status<>0
 begin
 print 'Table '+@tableName+' not found, processing skipped.'
 close curscol
 deallocate curscol
 return
END
 
WHILE @@FETCH_STATUS=0
BEGIN
IF @dataType in ('varchar','char','nchar','nvarchar')
BEGIN
 SET @stringData=@stringData+''''+'''+isnull('''''+'''''+'+ @colName+'+'''''+''''',''NULL'')+'',''+'
END
ELSE
if @dataType in ('text','ntext') --if the datatype is text or something else 
BEGIN
 SET @stringData=@stringData+'''''''''+isnull(cast('+ @colName+' as varchar(2000)),'''')+'''''',''+'
END
ELSE
IF @dataType = 'money' --because money doesn't get converted from varchar implicitly
BEGIN
 SET @stringData=@stringData+'''convert(money,''''''+isnull(cast('+ @colName+' as varchar(200)),''0.0000'')+''''''),''+'
END
ELSE 
IF @dataType='datetime'
BEGIN
 SET @stringData=@stringData+'''convert(datetime,'+'''+isnull('''''+'''''+convert(varchar(200),'+ @colName+',121)+'''''+''''',''NULL'')+'',121),''+'
END
ELSE 
IF @dataType='image' 
BEGIN
 SET @stringData=@stringData+'''''''''+isnull(cast(convert(varbinary,'+ @colName+') as varchar(6)),''0'')+'''''',''+'
END
ELSE --presuming the data type is int,bit,numeric,decimal 
BEGIN
 SET @stringData=@stringData+''''+'''+isnull('''''+'''''+convert(varchar(200),'+ @colName+')+'''''+''''',''NULL'')+'',''+'
END
 
SET @string=@string+@colName+','
 
FETCH NEXT FROM cursCol INTO @colName,@dataType
END
DECLARE @Query nvarchar(4000)
 
SET @query ='SELECT '''+substring(@string,0,len(@string)) + ') VALUES(''+ ' + substring(@stringData,0,len(@stringData)-2)+'''+'')'' FROM '+@tableName
IF @where IS NOT NULL
 SET @query = @query + ' where ' + @where
IF @order IS NOT NULL
 SET @query = @query + ' order by ' + @order
 
exec sp_executesql @query
 
CLOSE cursCol
DEALLOCATE cursCol


IF ISNULL(@TableHasIdentityColumn, 0) = 1
    PRINT 'SET IDENTITY_INSERT ' + @tableName + ' OFF'
GO 

LINQ to SQL Connection Strings with Class Library and Web.Config

Most Microsoft technologies that you can operate with a GUI come with some tradeoffs.  Things have certainly improved over the years and now something like the LINQ to SQL designer is pretty trouble free – unless of course you have something like this fairly common scenario:

I had a class library (Data Access), and decided to add LINQ to SQL classes for a new database that was being introduced.
This class library is also ultimately being consumed by WCF web services.  I have dev, test, prod environments, so I use ASP.NET Web Deployment projects to change configuration per environment for things like appSettings and connectionStrings.

It therefore followed that I wanted to configure the LINQ DataContext connection properties in web.config.  Out of the box you’ll find your connection properties go into your Settings properties class, which gets a little bit in the way.

If you start playing around with the generated classes to change where you’re getting the connection info from then any changes in the designer will wipe them out, so a (relatively) pain free approach to setting your connection safely is the following:

Go to your LINQ to SQL designer and remove the Connection String, and set Application Settings to False

Create a new partial class to mirror your DataContext, and set the constructor to retrieve from your alternative source…

using System;
using System.Linq;
using System.Configuration;
using System.Data.Linq;

namespace CodeBureau.Services.DataAccess
{
    public partial class MyDataContext : DataContext
    {
        public MyDataContext()
            : base(ConfigurationManager.ConnectionStrings["MyConnectionString"].ConnectionString, 
mappingSource) { OnCreated(); } } }

This will leave all your generated code intact, but will sort out your configuration woes.