Book Review : Zero Day by Mark Russinovich

Zero DayI’ve long followed the output and talent of Mark Russinovich, as any self-respecting Windows tech should know and use his Winternals and Sysinternals utilities. I don’t know what i would have done in some situations in the past without Process Explorer, Process Monitor, and all the other PsTools. Needless to say, Mark knows what he’s talking about in the world of computers.

I was surprised and intrigued recently when catching up on a Scott Hanselman podcast, he was the interviewee, on topics including his new novel.  Scott seemed genuinely surprised too that Mark had moved into this area, and also impressed with the way he’s managed to bridge the often ‘huge’ gulf between normal life and techiedom.

Zero Day is Mark’s first novel. An action thriller, centered around high technology crime (published in 2011), and he’s followed that up with Trojan Horse; further developing the main characters in the first book.

Zero Day features Jeff Aitkin as the lead character. A technical wizard, specialising in computer security, but seemingly normal in many ways that non-technical people think strange for a geek. He’s fit and good looking.  In the interview, Mark admits Jeff is based on him. 🙂

What’s interesting first of all, is that the novel was even published, as the subject matter simply wouldn’t have had mass appeal even just a few short years ago. The exploding popularity of the internet, and its crossover into normal people’s daily lives has obviously made the story more accessible than ever before. Also, the subject matter of coordinated cyber attacks was something even Mark himself thought may have happened for real before he finished the book; making the story less relevant. I guess he’s thankful in more ways than one we’ve survived up to now.

The story starts with the seemingly isolated virus infection of a company’s network, in the midst of a rather racy opening scene. Mark doesn’t hold back on language and imagery to paint the picture. I don’t know why I was surprised by this, but I guess that helps the book to sell.  Maybe it was also an attempt to quickly set the tone that this is not just a book for techheads.

We quickly then learn of many other, and varied incidents, involving computers in all sorts of places, and applications. Some with fatal consequences. What follows is a well constructed plot that explores not only the technical detail of viruses, exploits, triggers and rootkits, but the human factors that motivate individuals away from doing ‘good’, and the ways they justify their actions to themselves.

Clearly the author has done his research on many topics. The technical stuff is a given, but the characters involved are from many different backgrounds, explored well, and their back story plausible in most cases. Jeff’s own previous connection with the government, and the events leading up to 9/11 is interesting, albeit quite convenient for the story line.

My only real character comment is that the story seems to centre around a small group of very attractive computer people.  Jeff finds himself surprised when his new client is attractive, as his previous experience was that of the few women in IT, very few of them were attractive. His emerging sidekick and love interest also just happened to be a whizzkid, and complete knockout. I guess they’d better get moving on the movie!

Things develop, and it seems that this thing is bigger than individual viruses and hackers. The characters start to connect in unexpected ways, often unknown to them, and it’s a good rollercoaster ride across the world, as the characters chase the ‘cure’.  But – would it make any difference anyway on Zero Day?

Whether enough loose ends were resolved at the end of the book I’m still deciding, but that may be another reason to read Trojan Horse.

Being someone who doesn’t read many novels, I enjoyed this a lot as the story and technical detail held my interest throughout. Also, like the information age itself, the book is split into bitesize chunks that move quickly between the various storylines. I found myself wondering which thread I was going to join() again next (pardon the developer pun).

Trojan Horse - the bookI’m looking forward to getting into the next Jeff Aitkin adventure, Trojan Horse, which is also now available.

Check out the rest of Mark’s work on his site http://www.trojanhorsethebook.com/

Returning flattened data from a SQL table containing pairs of rows

I like little challenges. The ones that don’t take all day to figure out, but are enough to capture your interest for a little while. Yesterday I had such a problem to solve.

I had some data in a table that was basically in ‘pairs’ of rows. It was actually different to the example below, but the example we’ll use is a ‘Message’ table, that contains requests and replies, that are linked through a particular identifier.

Our simple example looks like this (my actual table had more fields).

CREATE TABLE Message
(
    MessageID INT NOT NULL IDENTITY ,
    MessageType CHAR(1) NOT NULL,
    TransactionID INT NOT NULL,
    MessageBody VARCHAR(30),
    CreatedDate DATETIME DEFAULT GetDate()
)

We’ll add a bit of sample data (script generated from my insert generator stored proc)

SET IDENTITY_INSERT Message ON

INSERT Message(MessageID,MessageType,TransactionID,MessageBody,CreatedDate) VALUES('1','Q','1','Request Message 1',convert(datetime,'2012-08-30 13:55:07.213',121))
INSERT Message(MessageID,MessageType,TransactionID,MessageBody,CreatedDate) VALUES('2','R','1','Reply Message 1',convert(datetime,'2012-08-30 13:55:37.680',121))
INSERT Message(MessageID,MessageType,TransactionID,MessageBody,CreatedDate) VALUES('3','Q','2','Request Message 2',convert(datetime,'2012-08-30 13:55:51.183',121))
INSERT Message(MessageID,MessageType,TransactionID,MessageBody,CreatedDate) VALUES('4','R','2','Reply Message 2',convert(datetime,'2012-08-30 13:56:04.020',121))

SET IDENTITY_INSERT Message OFF


SELECT * FROM Message

MessageID   MessageType TransactionID MessageBody                    CreatedDate
----------- ----------- ------------- ------------------------------ -----------------------
1           Q           1             Request Message 1              2012-08-30 13:55:07.213
2           R           1             Reply Message 1                2012-08-30 13:55:37.680
3           Q           2             Request Message 2              2012-08-30 13:55:51.183
4           R           2             Reply Message 2                2012-08-30 13:56:04.020 

We can see that some of the fields are consistent from row to row (in pairs), and some of the fields are unique to each row. My challenge was to represent a pair of messages in one row.

On the face of it, this seems like it would be simple – just grouping by the TransactionID (the field that links the two rows). The problem is that you won’t be able to get the unique information from both rows without some assumptions (that may not be solid).

For example, this will happily give you the MessageID’s of both sides of the transaction (given the assumption that the request comes before the reply, and that there are two messages in a transaction) …

SELECT TransactionID, MIN(MessageID) AS RequestID, MAX(MessageID) AS ReplyID
FROM [Message]
GROUP BY TransactionID HAVING COUNT(*) = 2

TransactionID RequestID   ReplyID
------------- ----------- -----------
1             1           2
2             3           4 

But – it’s doesn’t give you the unique data related to each ID, as you’d need to correlate the MessageBody to the right MessageID – MIN(MessageBody) won’t necessarily relate to the ‘Request’.

So… We can think about how to correlate the data to get the result we want. There’s a few options…

1. Use temporary tables, and get the result in two steps (reusing the query above)..

 

–1 – Two Step Process

SELECT TransactionID, MIN(MessageID) AS RequestID, MAX(MessageID) AS ReplyID
INTO #MessagePair
FROM [Message]
GROUP BY TransactionID HAVING COUNT(*) = 2

SELECT  REQ.MessageID AS RequestMessageID,
        REQ.TransactionId,
        REQ.MessageBody AS RequestBody,
        REQ.CreatedDate AS RequestDate,
        RPY.MessageID AS ReplyMessageID,
        RPY.MessageBody AS ReplyBody,
        RPY.CreatedDate AS ReplyDate
FROM #MessagePair MP
INNER JOIN [Message] REQ
   ON REQ.MessageID = MP.RequestID
INNER JOIN [Message] RPY
   ON RPY.MessageID = MP.ReplyID

RequestMessageID TransactionId RequestBody                    RequestDate             ReplyMessageID ReplyBody                      ReplyDate
---------------- ------------- ------------------------------ ----------------------- -------------- ------------------------------ -----------------------
1                1             Request Message 1              2012-08-30 13:55:07.213 2              Reply Message 1                2012-08-30 13:55:37.680
3                2             Request Message 2              2012-08-30 13:55:51.183 4              Reply Message 2                2012-08-30 13:56:04.020

2. Nasty correlated subquery and joins (not even going there)

3. Single query that makes use of the assmption that a request happens before a reply (meaning the messageId will be a lower value)

SELECT  REQ.MessageID AS RequestMessageID,
        REQ.TransactionId,
        REQ.MessageBody AS RequestBody,
        REQ.CreatedDate AS RequestDate,
        RPY.MessageID AS ReplyMessageID,
        RPY.MessageBody AS ReplyBody,
        RPY.CreatedDate AS ReplyDate
FROM [Message] REQ
INNER JOIN [Message] RPY
    ON REQ.TransactionID = RPY.TransactionID
AND REQ.MessageID < RPY.MessageID 

This produces the same result as above, and is what I ended up going with. I reckon there’s probably a few more potential viable solutions, so I’d be interested to see anyone’s alternative solutions.