Returning flattened data from a SQL table containing pairs of rows

I like little challenges. The ones that don’t take all day to figure out, but are enough to capture your interest for a little while. Yesterday I had such a problem to solve.

I had some data in a table that was basically in ‘pairs’ of rows. It was actually different to the example below, but the example we’ll use is a ‘Message’ table, that contains requests and replies, that are linked through a particular identifier.

Our simple example looks like this (my actual table had more fields).

CREATE TABLE Message
(
    MessageID INT NOT NULL IDENTITY ,
    MessageType CHAR(1) NOT NULL,
    TransactionID INT NOT NULL,
    MessageBody VARCHAR(30),
    CreatedDate DATETIME DEFAULT GetDate()
)

We’ll add a bit of sample data (script generated from my insert generator stored proc)

SET IDENTITY_INSERT Message ON

INSERT Message(MessageID,MessageType,TransactionID,MessageBody,CreatedDate) VALUES('1','Q','1','Request Message 1',convert(datetime,'2012-08-30 13:55:07.213',121))
INSERT Message(MessageID,MessageType,TransactionID,MessageBody,CreatedDate) VALUES('2','R','1','Reply Message 1',convert(datetime,'2012-08-30 13:55:37.680',121))
INSERT Message(MessageID,MessageType,TransactionID,MessageBody,CreatedDate) VALUES('3','Q','2','Request Message 2',convert(datetime,'2012-08-30 13:55:51.183',121))
INSERT Message(MessageID,MessageType,TransactionID,MessageBody,CreatedDate) VALUES('4','R','2','Reply Message 2',convert(datetime,'2012-08-30 13:56:04.020',121))

SET IDENTITY_INSERT Message OFF


SELECT * FROM Message

MessageID   MessageType TransactionID MessageBody                    CreatedDate
----------- ----------- ------------- ------------------------------ -----------------------
1           Q           1             Request Message 1              2012-08-30 13:55:07.213
2           R           1             Reply Message 1                2012-08-30 13:55:37.680
3           Q           2             Request Message 2              2012-08-30 13:55:51.183
4           R           2             Reply Message 2                2012-08-30 13:56:04.020 

We can see that some of the fields are consistent from row to row (in pairs), and some of the fields are unique to each row. My challenge was to represent a pair of messages in one row.

On the face of it, this seems like it would be simple – just grouping by the TransactionID (the field that links the two rows). The problem is that you won’t be able to get the unique information from both rows without some assumptions (that may not be solid).

For example, this will happily give you the MessageID’s of both sides of the transaction (given the assumption that the request comes before the reply, and that there are two messages in a transaction) …

SELECT TransactionID, MIN(MessageID) AS RequestID, MAX(MessageID) AS ReplyID
FROM [Message]
GROUP BY TransactionID HAVING COUNT(*) = 2

TransactionID RequestID   ReplyID
------------- ----------- -----------
1             1           2
2             3           4 

But – it’s doesn’t give you the unique data related to each ID, as you’d need to correlate the MessageBody to the right MessageID – MIN(MessageBody) won’t necessarily relate to the ‘Request’.

So… We can think about how to correlate the data to get the result we want. There’s a few options…

1. Use temporary tables, and get the result in two steps (reusing the query above)..

 

–1 – Two Step Process

SELECT TransactionID, MIN(MessageID) AS RequestID, MAX(MessageID) AS ReplyID
INTO #MessagePair
FROM [Message]
GROUP BY TransactionID HAVING COUNT(*) = 2

SELECT  REQ.MessageID AS RequestMessageID,
        REQ.TransactionId,
        REQ.MessageBody AS RequestBody,
        REQ.CreatedDate AS RequestDate,
        RPY.MessageID AS ReplyMessageID,
        RPY.MessageBody AS ReplyBody,
        RPY.CreatedDate AS ReplyDate
FROM #MessagePair MP
INNER JOIN [Message] REQ
   ON REQ.MessageID = MP.RequestID
INNER JOIN [Message] RPY
   ON RPY.MessageID = MP.ReplyID

RequestMessageID TransactionId RequestBody                    RequestDate             ReplyMessageID ReplyBody                      ReplyDate
---------------- ------------- ------------------------------ ----------------------- -------------- ------------------------------ -----------------------
1                1             Request Message 1              2012-08-30 13:55:07.213 2              Reply Message 1                2012-08-30 13:55:37.680
3                2             Request Message 2              2012-08-30 13:55:51.183 4              Reply Message 2                2012-08-30 13:56:04.020

2. Nasty correlated subquery and joins (not even going there)

3. Single query that makes use of the assmption that a request happens before a reply (meaning the messageId will be a lower value)

SELECT  REQ.MessageID AS RequestMessageID,
        REQ.TransactionId,
        REQ.MessageBody AS RequestBody,
        REQ.CreatedDate AS RequestDate,
        RPY.MessageID AS ReplyMessageID,
        RPY.MessageBody AS ReplyBody,
        RPY.CreatedDate AS ReplyDate
FROM [Message] REQ
INNER JOIN [Message] RPY
    ON REQ.TransactionID = RPY.TransactionID
AND REQ.MessageID < RPY.MessageID 

This produces the same result as above, and is what I ended up going with. I reckon there’s probably a few more potential viable solutions, so I’d be interested to see anyone’s alternative solutions.

Generate SQL Server Inserts from existing data – with Identity Inserts

A good while ago I posted a stored proc that would generate insert statements for table data, along with simple filtering capability.

I broke this out again today, as I needed to recreate part of a database on a local machine.  I didn’t have knowledge of the schema so I just went about fixing each constraint error and adding the reference tables as required to my script.  After manually adding ‘SET IDENTITY_INSERT xx’ about a dozen times I added the functionality to the Stored Proc – so if your table has an identity column it will now wrap the results in IDENTITY_INSERT statements, saving you a bit more time, and headaches.

An interesing side effect if you’re explicitly inserting identity values is that the order of your inserts may become pertinent – especially if you’ve got foreign keys referencing the same table.  I then just added the capability to ‘order by’.
 
There’s no automatic way to switch off column headers, so you’ll need to configure that in the Resutls Output options in Query Analyzer (or management studio).

If you run the proc into ‘Northwind’ and run:

usp_insertgenerator Employees, @order=’EmployeeID’

You’ll get…

SET IDENTITY_INSERT Employees ON

INSERT Employees(EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,
Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath)
  VALUES(‘1′,’Davolio’,’Nancy’,’Sales Representative’,’Ms.’,convert(dateti
INSERT Employees(EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,
Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath)
  VALUES(‘2′,’Fuller’,’Andrew’,’Vice President, Sales’,’Dr.’,convert(datet
INSERT Employees(EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,
Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath)
  VALUES(‘3′,’Leverling’,’Janet’,’Sales Representative’,’Ms.’,convert(date
INSERT Employees(EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,
Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath)
  VALUES(‘4′,’Peacock’,’Margaret’,’Sales Representative’,’Mrs.’,convert(da
INSERT Employees(EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,
Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath)
  VALUES(‘5′,’Buchanan’,’Steven’,’Sales Manager’,’Mr.’,convert(datetime,’1
INSERT Employees(EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,
Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath)
  VALUES(‘6′,’Suyama’,’Michael’,’Sales Representative’,’Mr.’,convert(datet
INSERT Employees(EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,
Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath)
  VALUES(‘7′,’King’,’Robert’,’Sales Representative’,’Mr.’,convert(datetime
INSERT Employees(EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,
Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath)
  VALUES(‘8′,’Callahan’,’Laura’,’Inside Sales Coordinator’,’Ms.’,convert(d
INSERT Employees(EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,BirthDate,HireDate,
Address,City,Region,PostalCode,Country,HomePhone,Extension,Photo,Notes,ReportsTo,PhotoPath)
  VALUES(‘9′,’Dodsworth’,’Anne’,’Sales Representative’,’Ms.’,convert(datet

SET IDENTITY_INSERT Employees OFF

Here’s the script.

if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[usp_InsertGenerator]') and 
OBJECTPROPERTY(id, N'IsProcedure') = 1)
drop procedure [dbo].[usp_InsertGenerator]
GO


CREATE PROC dbo.usp_InsertGenerator
(@tableName varchar(100), @where varchar(1000) = NULL, @order varchar(1000) = NULL) as

SET NOCOUNT ON

--Check whether the table has an identity column
DECLARE @TableHasIdentityColumn BIT
SELECT @TableHasIdentityColumn = 1
from INFORMATION_SCHEMA.COLUMNS
where TABLE_SCHEMA = 'dbo'
and COLUMNPROPERTY(object_id(TABLE_NAME), COLUMN_NAME, 'IsIdentity') = 1
AND TABLE_NAME = @tableName

IF ISNULL(@TableHasIdentityColumn, 0) = 1
    PRINT 'SET IDENTITY_INSERT ' + @tableName + ' ON'

--Declare a cursor to retrieve column specific information for the specified table
DECLARE cursCol CURSOR FAST_FORWARD FOR 
SELECT column_name,data_type FROM information_schema.columns WHERE table_name = @tableName
OPEN cursCol
DECLARE @string nvarchar(3000) --for storing the first half of INSERT statement
DECLARE @stringData nvarchar(3000) --for storing the data (VALUES) related statement
DECLARE @dataType nvarchar(1000) --data types returned for respective columns
SET @string='INSERT '+@tableName+'('
SET @stringData=''
 
DECLARE @colName nvarchar(50)
 
FETCH NEXT FROM cursCol INTO @colName,@dataType
 
IF @@fetch_status<>0
 begin
 print 'Table '+@tableName+' not found, processing skipped.'
 close curscol
 deallocate curscol
 return
END
 
WHILE @@FETCH_STATUS=0
BEGIN
IF @dataType in ('varchar','char','nchar','nvarchar')
BEGIN
 SET @stringData=@stringData+''''+'''+isnull('''''+'''''+'+ @colName+'+'''''+''''',''NULL'')+'',''+'
END
ELSE
if @dataType in ('text','ntext') --if the datatype is text or something else 
BEGIN
 SET @stringData=@stringData+'''''''''+isnull(cast('+ @colName+' as varchar(2000)),'''')+'''''',''+'
END
ELSE
IF @dataType = 'money' --because money doesn't get converted from varchar implicitly
BEGIN
 SET @stringData=@stringData+'''convert(money,''''''+isnull(cast('+ @colName+' as varchar(200)),''0.0000'')+''''''),''+'
END
ELSE 
IF @dataType='datetime'
BEGIN
 SET @stringData=@stringData+'''convert(datetime,'+'''+isnull('''''+'''''+convert(varchar(200),'+ @colName+',121)+'''''+''''',''NULL'')+'',121),''+'
END
ELSE 
IF @dataType='image' 
BEGIN
 SET @stringData=@stringData+'''''''''+isnull(cast(convert(varbinary,'+ @colName+') as varchar(6)),''0'')+'''''',''+'
END
ELSE --presuming the data type is int,bit,numeric,decimal 
BEGIN
 SET @stringData=@stringData+''''+'''+isnull('''''+'''''+convert(varchar(200),'+ @colName+')+'''''+''''',''NULL'')+'',''+'
END
 
SET @string=@string+@colName+','
 
FETCH NEXT FROM cursCol INTO @colName,@dataType
END
DECLARE @Query nvarchar(4000)
 
SET @query ='SELECT '''+substring(@string,0,len(@string)) + ') VALUES(''+ ' + substring(@stringData,0,len(@stringData)-2)+'''+'')'' FROM '+@tableName
IF @where IS NOT NULL
 SET @query = @query + ' where ' + @where
IF @order IS NOT NULL
 SET @query = @query + ' order by ' + @order
 
exec sp_executesql @query
 
CLOSE cursCol
DEALLOCATE cursCol


IF ISNULL(@TableHasIdentityColumn, 0) = 1
    PRINT 'SET IDENTITY_INSERT ' + @tableName + ' OFF'
GO 

Use SQL Server Trusted Connections with ASP.NET on Windows 2003 without impersonation

Access control and troubleshooting 401 errors must be one of the most annoying and recurring issues with configuring IIS.  One of the problems is that it’s often quite a long time between issues, and you simply forget how you solved something last time.  This time I decided to write it all down as I set things up.

Target Scenario
My target scenario is a local intranet, where you want to use a ‘service account’ to access SQL Server, directly from your ‘trusted’ web application, removing the need for impersonation. 

The benefits of this are of course that you can take advantage of connection pooling, but also removing the need to configure passwords in web.config for SQL users (or specific, impersonated domain users).  This also removed the overhead of configuring specific domain users and their SQL Server permissions.  It may also be that you just want to simplify your security model to work only on Windows authentication across the stack.  

SQL Server

  1. Create a new role in the database you’re accessing, for the purposes of your application
  2. Add your service domain user account to the role in SQL Server
  3. Assign permissions to objects, stored procedures etc to the role (not directly to the user)

IIS/Web Site

  1. Set up your web site/application as you would normally – one way to do things….
    1. Create your web application root folder on the web server
    2. Copy your files (or use your deployment tools/scripts to do this)
    3. Create a new application pool to house your new web application (probably model this from the default web site).  This is important as this is where the credentials will be set
    4. Create the new IIS web application against the root folder (if not already done as part of step 2)
    5. Associate the new IIS application with the new application pool
    6. Set the ASP.NET version of your IIS application appropriate (may need to restart IIS here)
  2. Ensure ‘Integrated Security’ is set ON in the Directory Security tab, and ‘Anonymous’ access is switched OFF
  3. Set the application pool’s ‘identity’ to the domain user you want to run the application (and connect to SQL Server) as
  4. Open a command window and Go to the windows\Microsoft.NET\Framework\vXXXXX folder
    1. run aspnet_regiis -ga <domain>\<user> to grant necessary access to the metabase etc for the service account (as per http://msdn.microsoft.com/en-us/library/ff647396.aspx#paght000008_step3 )
    2. In the command window go to the inetpub\adminscripts folder, and set
      NTAuthenticationHeaders metabase property as per instructions at
      http://support.microsoft.com/kb/326985.  you can also use MetaEdit from
      the IIS Resource Kit to change this.  If you’re fully configured to use Kerberos then you can potentially skip this step, as it’s all about making IIS use NTLM authentication.
  5. Navigate to ‘Web Service Extensions’ in IIS Manager, and ensure that the ASP.NET version you’re targeting is ‘allowed’.  e.g. ASP.NET 4.0 is ‘prohibited’ by default.

Summary
So here we’ve circumvented the need to use impersonation by running the ASP.NET application as a specific domain user that is configured as a SQL Server Login, and granted the right access by means of a SQL Server role.  The main work is the plumbing to get IIS to work happily with that user in standard NTLM authentication (you may be able to use Kerberos depending on your network configuration).

Other background on creating service accounts can be found at http://msdn.microsoft.com/en-us/library/ms998297.aspx

SQL Server Stored Procedure to disable / enable all foreign keys and constraints in a database

I’ve been doing a bit of batch archiving work, and needed a nice and quick way to disable foreign keys and other constraints without actually removing them.  I found a nice article on disabling all constraints in a database, and thought I’d just take it one step further by making it into a Stored Procedure, and adding a parameter to toggle whether the constraints are enabled or disabled.

Nice and easy.  Here’s the script.

USE [YOURDBNAME]
GO

IF  EXISTS (SELECT FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[usp_SetDBTableConstraints]’AND type in (N’P’N’PC’))
DROP PROCEDURE dbo.usp_SetDBTableConstraints
GO

CREATE PROCEDURE dbo.usp_SetDBTableConstraints(@Enable BIT)

AS

/*** DISABLE/ENABLE ALL TABLE CONSTRAINTS ************************

This script will disable all constraints on all tables within the database
that it is run in.

************************************************************/

SET NOCOUNT ON
SET ROWCOUNT 

DECLARE @Count INT
DECLARE 
@String NVARCHAR (1000)
DECLARE @ConstraintName VARCHAR(128)
DECLARE @TableName VARCHAR(128)

–Find all constraints and their respective tables from the sysobjects table and place into a temp table.
–Primary Key and Unique Constraints via Unique Indexes are not disabled through this command
–You should use the ALTER INDEX…DISABLE command in SQL Server 2005
SELECT 
        name                     AS 
constraintname,
        object_name(parent_obj)  
AS tablename 
INTO #Const_Table
FROM sysobjects s 
where xtype in (‘F’)

SELECT @Count Count(*) FROM #Const_Table

–Setting the rowcount to one allows for one row from the temp table to be picked off at a time.
–Used as an alternative to a cursor.
SET ROWCOUNT 1

–Loop until all rows in temp table have been processed.
WHILE @Count > 0
BEGIN
    
–The rowcount of one ensures that only one tablename and constraint name is picked.
    
SELECT @TableName TableName, @ConstraintName ConstraintName
    
FROM #Const_Table

    –Build execution string to disable constraint.
    
IF @Enable 
        
SET @String ‘ALTER TABLE [‘+ @tablename + ‘] CHECK CONSTRAINT [‘ + @constraintname +‘]’
    
ELSE
        SET 
@String ‘ALTER TABLE [‘+ @tablename + ‘] NOCHECK CONSTRAINT [‘ + @constraintname +‘]’

    –Execute the SQL
    
EXEC sp_executesql @string

    –Remove this row from the temp table, since it has now been processed.
    
DELETE FROM #Const_Table WHERE ConstraintName @ConstraintName and TableName @TableName

    SET @Count @Count – 1
END – Loop

SET ROWCOUNT 0

DROP TABLE #Const_Table
 

GO

HOW TO Use SQL Server DTS Transform Data Task to export to dynamic Excel spreadsheet file

Many people (including me) are often still using SQL 2000 – which means that you might be using DTS (Data Transformation Services) to perform utility jobs on your databases. 
I’ve had a requirement to output data to Excel (not my choice – Excel usually isn’t!) – which is subtly different to text output as CSV*.  Another requirement was to have a ‘generated’ file name in a known folder.  This is as simple as a date/time driven file name.  The ‘job’ runs once (or more) a day and so creates a new file each time.  Easy right?

Well – it turns out that if you go through the normal motions of creating an Excel connection and using a Transform Data Task it’s likely to work first time and leave you thinking ‘great’.  Unfortunately when you run it again you’re likely to get a message similar to the following:

Table ‘New_Table$’ not found. 

After much digging (hence the reason for writing this), I discovered that in order to map the columns successfully you basically have to create the table in the spreadsheet before running the transformation (the ‘create table’ part of creating the transformation).  If you’ve got a dynamic file name then the next time you’ll  effectively have a blank spreadsheet without the required receiving ‘table’, and it will fail, unless you’ve recreated the sheet before the transformation.

OK so assuming you want to output some data to a spreadsheet with a dynamic filename with a Transform Data task then here’s how you do it.

  1. I put all files in the same folder so I use 2 global variables to build the filename – NetworkLocation and FinalLocation
    1. First, Right-click on a blank part of your package screen and choose Package Properties, then Global Variables tab.
    2. Create the two global variables NetworkLocation and FinalLocation
    3. Edit the NetworkLocation value to be the folder (including a final ‘\’) of where you want to store the files.
    4. Leave FinalLocation blank (we’ll create that in the next step.
  2. Add an ActiveX Script Task with the following code (which you can
    obviously tailor the name to your needs).  It basically creates a
    date/time based file name, and stores a concatenation of the network
    location and file name in the FinalLocation global variable.

    ‘**********************************************************************
    ‘  Visual Basic ActiveX Script
    ‘************************************************************************
    Function Main()

        Dim fileName
        fileName = “MyOutput_” & DatePart(“yyyy”,Date) & Right(“0″ & DatePart(“m”,Date), 2) & Right(“0″ & DatePart(“d”,Date), 2) & “_” & Right(“0″ & DatePart(“h”,Now), 2) & Right(“0″ & DatePart(“n”,Now), 2) & Right(“0″ & DatePart(“s”,Now), 2)
        DTSGlobalVariables(“FinalLocation”).Value = DTSGlobalVariables(“NetworkLocation”).Value + fileName + “.xls”

        Main = DTSTaskExecResult_Success

    End Function

  3. Add your data connections
    1. Create your source connection (presumably SQL Server or wherever your data’s coming from)
    2. Create your destination connection (A Microsoft Excel 97-2000
      connection) and set an initial file name (best to be in the folder
      specified in NetworkLocation)
  4. Create a Data Transformation Task connecting your source and destination connections as per normal
    1. Add in your Table/View or SQL Query to define the source data.
    2. In the Destination tab click ‘Create…’ next to ‘Table name:’ (a
      dialog will pop up ‘Create Destination Table’).  Change the table name
      (or column definitions) if you like, then COPY THE ENTIRE TEXT – and
      paste into a text editor for safe keeping.
    3. Click OK (This will create a new sheet in the spreadsheet you’ve
      specified in your connection, ready to receive your data).  Your fields
      should then also be populated in the table below.
    4. Check the transformations tab – you should get a nice one to one mapping between source and destination columns.
  5. Create a Dynamic Properties Task (Here’s where it all comes together)
    1. Click ‘Add…’ to create a new property.  Choose your Excel Connection and pick the ‘DataSource’ property, then click ‘Set…’
    2. Choose ‘Global Variable’ as the source and FinalLocation as the variable – Click OK.
    3. Click ‘Add…’ to create another property.  Choose the
      DTSTask_DTSDataPumpTask_? in the Tasks node (your data transformation
      task) and pick the ‘DestinationObjectName’ property.
    4. Choose ‘Query’ as the source and paste in your ‘CREATE TABLE‘ statement from earlier – Parse to check, then click OK.
  6. Create the Workflow for the task
    1. Select the Dynamic Properties Task, right-click and select Workflow –> Workflow Properties.
    2. Click ‘New’ and set the ActiveX Script Task as ‘Source Step’.
    3. Select the Data Transformation Task, right-click and choose Workflow Properties.
    4. Click ‘New’ and set the Dynamic Properties Task as ‘Source Step’.

That’s basically it – the package should look something like this:

It should also now do the following (in order):

  1. Set filenames (stored in global variables)
  2. Set filename of the excel connection, create the destination ‘sheet’
    from the global variables and the query in the transformation
  3. Do the transformation and store the results in a date/time named Excel spreadsheet in your configured folder

If you ever need to change the query (columns) then you’ll need to regenerate the ‘CREATE TABLE’ statement and reflect it in the Dynamic Properties Task (5.c)

*Typically an Excel export covers your back with datatypes where a CSV export won’t – so when you open in Excel you’ll have to format columns specifically yourself – like converting dates, and also long numbers from scientific format to text format.

Using local SQL 2005 Express databases, |DataDirectory| setting and SQLBulkCopy with SemiColon delimited text files

Had some fun with this today as I was utterly confused why I wasn’t able to save data to my SQL2005 express DB (linked within a Windows Forms VS 2005 project), even though I wasn’t getting any errors from a SQLBulkCopy operation.  The connection string was set (from the wizard) to use ‘|DataDirectory|\dbname.mdf.

It turned out of course that the reason there wasn’t an error was because the data was being written fine, but just not to the database I was looking at.  By default any database linked in your project file will be copied to the output folder (bin\debug) when you build.  It does this ‘every time‘.  I thought I was going mad as I wasn’t getting primary key violations trying to insert the same data over and over again!

The useful info is in this MSDN blog entry.  It talks about different options for working around the local databases, but I plumped for a subfolder (called data) in my project, then setting ‘Copy if newer’ on the MDF file rather than ‘do not copy’.  This just means that if you make a schema change then it will blow everything away (probably what I normally want in debug-land).

The reason I was using the SQLBulkCopy class was so I could take in a dataset picked from a semicolon-delimited text file. 
The first thing to do was read the text file into a dataset – and after some mucking about with the Jet OleDb provider and avoiding changing registry entries (to get it to recognise ‘;’ as the delimiter) I found that a schema.ini file is what’s required.

Here’s some quick code I then used to get a DataSet from a semicolon-delimited Text File:

       

       /// <summary>
        /// Import from a given file to database
        /// </summary>
        /// <param name=”fileName”>File to import from (CSV)</param>
        
public void Import(string fileName)
        {
            
if (File.Exists(fileName))
            {
                SqlConnection connection 
= new SqlConnection(Properties.Settings.Default.RBStatsConnectionString);
                
//Get Dataset from file
                
DataSet importData GetDataSetFromFile(fileName);

                //Import using SQLBulkCopy
                
SqlBulkCopy copyManager = new SqlBulkCopy(connection.ConnectionString, SqlBulkCopyOptions.UseInternalTransaction);
                
copyManager.DestinationTableName “TShirtStatistic”;
                
copyManager.BatchSize importData.Tables[0].Rows.Count;
                try
                
{
                    copyManager.WriteToServer(importData.Tables[
0]);
                
}
                
catch (Exception exception)
                {
                    Console.WriteLine(exception)
;
                
}
                copyManager.Close()
;
                
connection.Close();
            
}

        }

        public DataSet GetDataSetFromFile(string fileName)
        {
            
//Ensure we’ve got a schema file to fudge the semicolon delimited text
            
string schemaFile Path.Combine(Path.GetDirectoryName(fileName), “schema.ini”);
            
File.Delete(schemaFile);
            string 
schemaContents “[” + Path.GetFileName(fileName) + “]\nColNameHeader=True\nCharacterSet=ANSI\nFormat=Delimited(;)”;
            
StreamWriter writer File.CreateText(schemaFile);
            
writer.Write(schemaContents);
            
writer.Close();

            string connectionString “Provider=Microsoft.Jet.OLEDB.4.0;Data Source=” + Path.GetDirectoryName(fileName) + “\\;Extended Properties=’text;HDR=Yes;FMT=Delimited'”;
            string 
query “SELECT * FROM ” + Path.GetFileName(fileName);
            
OleDbConnection connection = new OleDbConnection(connectionString);
            
OleDbCommand command = new OleDbCommand(query, connection);

            connection.Open();

            OleDbDataAdapter adapter = new OleDbDataAdapter(query, connection);
            
DataSet ds = new DataSet();
            
adapter.Fill(ds);

            return ds;

        }

Create SQL Server User and Role Script to execute Stored Procedures Only

I really thought I’d posted this before, but here’s a script I use quite often when I need to create a SQL user that needs to just have ‘execute’ access to Stored Procs. 

This is usually used for web apps where the server is going through as a single user.  I’d encourage anyone to ensure that you always start off with granting the minimum access required and only add more when actually needed.  If you’ve got different roles in the application (viewer, administrator etc) then create different roles, with a user for each.

The intention is to remove the need to give direct table access (as we only call Stored Procs right?).  The script doesn’t actually do the ‘grants’ for you, but will generate another batch that you can execute if you like.  I prefer to put the grants in the script for each proc/function anyway, just so it’s a little easier to maintain, and you can specifically grant access to the ‘admin’ procs to the ‘admin’ role, whilst limiting any ‘normal’ users to the read-only/everyday ones. 

There’s satisfaction to be had from an app that connects to the database via an appropriate user, which is in a role that can only do what is necessary for the role, and can’t access tables directly.  Not only is it neat, it’s also (relatively) secure.

Here’s the script…

– MSIMNER: This script will generate a role and user for SQL Access to a single database
— The role will have no specific access to read or write tables but will be granted access
— through execution of stored procedures ONLY

USE <databasename, sysname, MyDB>

DECLARE @UserName VARCHAR(20)
DECLARE @Password VARCHAR(20)
DECLARE @DBName   VARCHAR(20)
DECLARE @Role     VARCHAR(20)

–SET PASSWORD HERE
SET @Password ‘ixFg2tfk’

SET @UserName ‘<databasename, sysname, MyDB>User’
SET @Role     ‘db_exec_<databasename, sysname, MyDB>’
SET @DBName   ‘<databasename, sysname, MyDB>’

IF EXISTS(SELECT FROM master.dbo.syslogins WHERE name = @UserName)
BEGIN
    exec 
sp_revokedbaccess @name_in_db @UserName
    
exec sp_droplogin @loginame @UserName
    
exec sp_droprole @rolename @Role

END

–Create User
exec sp_addlogin @loginame @UserName, @passwd @Password
exec sp_grantdbaccess @loginame @UserName
exec sp_defaultdb @loginame @UserName, @defdb @DBName
exec sp_addrole @rolename @Role, @ownername ‘dbo’
exec sp_addrolemember @rolename @Role, @membername @UserName

– The output of this command will need to be run separately – i.e. paste the output into a new window and run… 

–SELECT ‘GRANT EXECUTE ON ‘ + name + ‘ TO ‘ + @Role , *
–FROM sysobjects
–WHERE type = ‘P’


Colorized by: CarlosAg.CodeColorizer

Refactoring the inefficient data loop

OK. Last one for today.  I’m not going to go into too much detail except to say that the offending piece of code loads all items in a ‘matrix’ database table and builds a business object collection hierarchy.  This table has 4 fields, one or more of which may be NULL – for different meanings.  I really don’t particularly like this type of approach but changing it right now would involve quite a bit of work.


The table looks something like


Level1ID
Level2aID
Level2bID
Level2cID


This ultimately translates to a business object hierarchy of:


 



  • Rows with Level1ID (only) – all other fields null – are Level1 objects
  • Rows with Level1ID and Level2a ID (only) are Level2a objects

etc…


 


The matrix table just contains the primary keys for each object so the code loads each one in the style below…


 


Load matrix
for each row in matrix
{
 if row type == Level1
     Add to Level1Collection
 else if row type == Level2a
 {
  Load Level2a object
  Add Level2a object to Level1[Level1ID].Level2aCollection
 }
 else if row type == Level2b
 {
  Load Level2b object
  Add Level2b object to Level1[Level1ID].Level2bCollection
 }
 else if row type == Level2c
 {
  Load Level2c object
  Add Level2c object to Level1[Level1ID].Level2cCollection
 }
}


 


This seems reasonable enough (logical anyway) given the way the data’s being retrieved.


This does however load another several hundred rows from more stored proc calls that load each new object into the child collections.  This whole thing consistently takes around 8 seconds.


 


The Refactoring


 


A wise man once told me that if you can be sure the bulk of your operation is data access then lump it all together if you can, and do the loop or other grunt processing on the client. 



  1. I created a new Stored Procedure to return 4 result sets.  These come back as 4 tables in the target DataSet.  Each resultset is qualified by the same master criteria, and just uses joins to get a set of data that can be loaded directly into the collections.  The original stored proc is no longer required and this is now the only data access call.
  2. I changed my collection classes slightly to allow a LoadData method which takes in a dataset, a tablename and a parent key.  This means I can add Level2a objects to the appropriate Level1 collection.  The pseudo code now looks like…

Load multiple results
for each row in matrix
{
 if Level1 Results present 
   LoadData on Level1Collection
    
 if Level2a Results present
  for each Level1 row
   LoadData on Level1[Level1ID].Level2aCollection
   
 if Level2b Results present
  for each Level1 row
   LoadData on Level1[Level1ID].Level2bCollection


 if Level2c Results present
  for each Level1 row
   LoadData on Level1[Level1ID].LevelcbCollection
 
}


As I said at the beginning, there are some definite improvements to be made from changing the data structure, and a lot of this code could look a lot nicer by using Typed Datasets with relationships defined.


 


The new approach actually completes in less than 100ms.  I couldn’t actually believe it myself, and made sure I killed connections, cache etc to make sure the database was coming in cold.  Still the same.


 


This basically proves that for data-heavy operations, things really start to hurt when you’re making repeated client round-trips, however small the call.  This is basically a 99% saving in load time for this business object. 


 


The resulting page is also really snappy now and I’m sure the customer won’t even notice :-)


 


Search SQL Server Stored Procedures for any text string

Here’s the SQL script I probably use most when I’m trying to work out dependencies and scope of development changes.  Query Analyzer allows you to search objects (hit F4 – it’s cool!) and find objects by ‘name’, but there’s no obvious way to search stored procs for a particular string.


SQL Server hides Stored Procedure code in the sysComments table (in each database), and so it’s possible to search.  This is another script I found from not sure where (so can’t really take credit).  It does a great job of scanning and even gives you line numbers.  It would be great to be able to just shove this into the master database, but you need to be in the context of the database you’re searching in (or it would have to be amended somewhat to run some more ‘dynamic’ SQL). 


I have previously front-ended this for development databases with an ASP.NET page that simply runs the query substituting a parameter for the search text and connecting to the correct database.  The results show nicely in a grid.


Here’s the code – just paste into a new query in your chosen database, change @SearchTerm and away you go…


 


SET NOCOUNT ON
GO

declare 
@dbname sysname
,@SearchTerm    varchar(100)
,@BlankSpaceAdded   
int
,@BasePos       int
,@CurrentPos    int
,@TextLength    int
,@LineId        int
,@AddOnLen      int
,@LFCR          int –lengths of line feed carriage return
,@DefinedLength int
,@obj_name    varchar(255)
,@prev_obj_name 
varchar(255)


/* NOTE: Length of @SyscomText is 4000 to replace the length of
** text column in syscomments.
** lengths on @Line, #CommentText Text column and
** value for @DefinedLength are all 255. These need to all have
** the same values. 255 was selected in order for the max length
** display using down level clients
*/
,@SyscomText    nvarchar(4000)
,@Line          
nvarchar(255)

set @SearchTerm ‘MyTable’
set @dbname db_name()


set @SearchTerm ‘%’ + @SearchTerm + ‘%’

Select @DefinedLength 255
SELECT @BlankSpaceAdded /*Keeps track of blank spaces at end of lines. Note Len function ignores
                             trailing blank spaces*/
CREATE TABLE #CommentText
(obj_name 
varchar(255)
 ,LineId    
int
 
,Text  nvarchar(255collate database_default)

/*
**  Make sure the @objname is local to the current database.
*/
if @dbname is not null and @dbname <> db_name()
        
begin
                raiserror
(15250,-1,-1)
        
end

    begin

        DECLARE 
ms_crs_syscom  CURSOR LOCAL
        FOR SELECT 
so.name, sc.text FROM syscomments sc
    
inner join sysobjects so
    
on so.id = sc.id
    where 
so.type ‘P’
    
AND sc.encrypted 0
                
ORDER BY number, colid

        
FOR READ ONLY
    
end

/*
**  Get the text.
*/
SELECT @LFCR 2
SELECT @LineId 1


OPEN ms_crs_syscom

FETCH NEXT FROM ms_crs_syscom into @obj_name, @SyscomText

WHILE @@fetch_status >0
BEGIN

    SELECT  
@BasePos    1
    
SELECT  @CurrentPos 1
    
SELECT  @TextLength = LEN(@SyscomText)

    
IF @prev_obj_name <> @obj_name
        
–start again
        
SELECT @LineId 1
    
    
WHILE @CurrentPos  !0
    
BEGIN
        
–Looking for end of line followed by carriage return
        
SELECT @CurrentPos =   CHARINDEX(char(13)+char(10), @SyscomText, @BasePos)

        
–If carriage return found
        
IF @CurrentPos !0
        
BEGIN
            
/*If new value for @Lines length will be > then the
            **set length then insert current contents of @line
            **and proceed.
            */
            
While (isnull(LEN(@Line),0) + @BlankSpaceAdded + @CurrentPos-@BasePos + @LFCR) > @DefinedLength
            
BEGIN
                SELECT 
@AddOnLen @DefinedLength-(isnull(LEN(@Line),0) + @BlankSpaceAdded)
                
INSERT #CommentText VALUES
                
( @obj_name,
          @LineId,
                  
isnull(@Line, N”) + isnull(SUBSTRING(@SyscomText, @BasePos, @AddOnLen), N”))
                
SELECT @Line = NULL, @LineId @LineId + 1,
                       @BasePos 
@BasePos + @AddOnLen, @BlankSpaceAdded 0
            
END
            SELECT 
@Line    = isnull(@Line, N”) + isnull(SUBSTRING(@SyscomText, @BasePos, @CurrentPos-@BasePos + @LFCR), N”)
            
SELECT @BasePos @CurrentPos+2
            
INSERT #CommentText VALUES( @obj_name, @LineId, @Line )
            
SELECT @LineId @LineId + 1
            
SELECT @Line = NULL
        END
        ELSE
        
–else carriage return not found
        
BEGIN
            IF 
@BasePos <@TextLength
            
BEGIN
                
/*If new value for @Lines length will be > then the
                **defined length
                */
                
While (isnull(LEN(@Line),0) + @BlankSpaceAdded + @TextLength-@BasePos+1 ) > @DefinedLength
                
BEGIN
                    SELECT 
@AddOnLen @DefinedLength – (isnull(LEN(@Line),0) + @BlankSpaceAdded)
                    
INSERT #CommentText VALUES
                    
( @obj_name,
              @LineId,
                      
isnull(@Line, N”) + isnull(SUBSTRING(@SyscomText, @BasePos, @AddOnLen), N”))
                    
SELECT @Line = NULL, @LineId @LineId + 1,
                        @BasePos 
@BasePos + @AddOnLen, @BlankSpaceAdded 0
                
END
                SELECT 
@Line = isnull(@Line, N”) + isnull(SUBSTRING(@SyscomText, @BasePos, @TextLength-@BasePos+1 ), N”)
                
if LEN(@Line) < @DefinedLength and charindex(‘ ‘, @SyscomText, @TextLength+1 ) > 0
                
BEGIN
                    SELECT 
@Line @Line + ‘ ‘, @BlankSpaceAdded 1
                
END
            END
        END
    END
    
–Save previous name
    
SET @prev_obj_name @obj_name
    
FETCH NEXT FROM ms_crs_syscom into @obj_name, @SyscomText
END

IF 
@Line is NOT NULL
    INSERT #
CommentText VALUES( @obj_name, @LineId, @Line )

select from #CommentText 
where  patindex(@SearchTerm, text) <> 0

CLOSE  ms_crs_syscom
DEALLOCATE     ms_crs_syscom

DROP TABLE     #CommentText




Colorized by: CarlosAg.CodeColorizer