SQL Server - CSI
Earlier on this blog I have described how we can see who has deleted tables etc from a database, but sometimes need more detail or to know who has updated a row or when a specific row was deleted. If the database is configured to use the full recovery model and/or has transaction log backups from the time range where what you are interested in, it is possible to read some information from these or from the active transaction log. You can read the older blog post I refer to here.
The transaction log and its backups contain the history of all events that have modified something in the database. However, reading from the transaction log is usually a last resort and is quite cumbersome. There are usually hundreds of thousands of records to analyze and they are not readable but have to be processed first. But often the only way is to get details when unwanted changes have occurred.
There are several third-party tools such as Quest's LiteSpeed and ApexSQLLog that have built-in functions to analyze and generate restore scripts etc if the backups are made through them or with SQL Server's built-in backup functionality. Unfortunately, many of them do not fully work if e.g. the built-in backup compression is turned on. But otherwise our experience with them is very positive.
There are also two built-in functions in SQL Server, fn_dblog() to read from the active transaction log and fn_dump_dblog() to read backup files. Unfortunately, both are undocumented and therefore not supported or recommended by Microsoft for internal use only. It is known that they have had (or may have) some bugs that have required reboots etc so it is not recommended to run these in production environments. However, they only return raw data and require some parsing and understanding and here I will just show a very simple example for fn_dump_dblog().
We start by creating a simple table as a test object, add and modify rows and take a transaction log backup on it.
CREATE TABLE demo ( id int NOT NULL IDENTITY(1,1) PRIMARY KEY, comment char(20), createDate datetime DEFAULT GETDATE()); INSERT INTO demo (comment) VALUES ('Example'), ('Example: To remove'), ('Example: To modify'); UPDATE demo SET comment='Example: Changed' WHERE comment = 'Example: to change'; DELETE FROM demo WHERE comment='Example: To delete' GO
There are now two rows left in the table
SELECT [id], [comment], [createDate] FROM [dbo].[demo]
The task now is to try to figure out who added, modified and removed the lines and when.
To use fn_dump_dblog() to read the transaction log backup, all 68(!) input parameters must be specified, but most can use the DEFAULT value. In order they are:
Start LSN, usually NULL which gives all End LSN, usually NULL as above Type of file, DISK or TAPE, in our case DISK
Backup number in the backup file, 1 if multi-backup media set is not used
All file names for all media families for the media set, usually only one file is used but if you strip the backup to multiple files, all must be listed here. Otherwise "DEFAULT" applies.
To simplify things a bit, the first thing we do is load all the data we need into a table which we then use instead of calling the function.
SELECT [Current LSN] [Operation] , [Transaction ID] [Parent Transaction ID] [Begin Time] [Transaction Name] [Transaction SID] [AllocUnitName] , [Page ID] [Slot ID] , [Lock Information] [Num Elements] , [RowLog Contents 0] [RowLog Contents 1] [Log Record] INTO dump_dblog FROM fn_dump_dblog ( NULL, NULL, 'DISK', 1, 'C:tempAddProDBA_demo.trn', DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT)
The first check I do is to look for the operation LOP_BEGIN_XACT. These log lines mark the beginning of each transaction and are the only ones that have with the start time of the transaction and which SID started it. To find out which login it is then you can use SUSER_SNAME() for example.
SELECT [Operation] [Transaction ID] , [Begin Time] , [Transaction Name] , [Transaction SID] FROM dump_dblog WHERE [Operation] = 'LOP_BEGIN_XACT'
For some of these lines, the transaction name looks very familiar, CREATE TABLE, INSERT, UPDATE and DELETE.
We take a closer look at the details of the INSERT, UPDATE and DELETE transactions.
SELECT [Operation] [Lock Information] , [Num Elements] , [RowLog Contents 0] [RowLog Contents 1] FROM dump_dblog WHERE [Transaction ID] IN ('0000:00000337', '0000:0000033b', '0000:0000033c')
The Lock Information column contains information about the locks taken for each activity:
HoBt 72057594048845440:ACQUIRE_LOCK_IX OBJECT: 0:325576198:0 ;ACQUIRE_LOCK_IX PAGE: 0:1:277 ;ACQUIRE_LOCK_X KEY: 0:72057594038845440 (61a06abd401c)
Among other things, the object ID of the current object is retrieved using OBJECT_ID('demo'), in my case 325576198, and using this you can do a further filtering on the specific object: AND [Lock Information] LIKE N'%:325576198:%'. But in our simple case it gives the same lines.
The LOP_INSERT_ROWS operation contains all the information about the added rows, LOP_MODIFY_ROW about the one we did UPDATE on and LOP_DELETE_ROWS for the one we deleted.
The log content columns contain all the information needed for a possible redo or undo operation for the respective modification. However, the format differs between different types of operations and some are quite easy to decipher but in most cases it is much more complicated.
If we start with LOP_MODIFY_ROW and look at the values in [RowLog Contents 0], [RowLog Contents 1] we know that they are binary versions of strings so we try to convert them:
SELECT CAST(0xC46E647261742020 AS varchar(50)),CAST(0x41747420E46E647261 AS varchar(50))
Just the line we are looking for!
For LOP_INSERT_ROWS and LOP_DELETE_ROWS it is a bit worse to decipher Log content, for them you have to know how a log record is constructed and for this I refer to e.g. Paul Randal's post Anatomy of a record. For simple text fields, with a bit of brute force we can figure out which line is correct with a fair amount of probability.
For example, to check if the LOP_DELETE_ROWS we found is for the line with comment='Example: to delete', we can check if the log content for transaction 0000:0000033c contains the binary representation of the string.
SELECT [Transaction ID] , [Operation] FROM dump_dblog WHERE [Transaction ID] = '0000:0000033c' AND [Operation] = 'LOP_DELETE_ROWS' AND CHARINDEX(CAST('Example: Delete' AS varbinary(25)), [Log Record]) > 0
So it looks like the DELETE we found is correct! But note that CHARINDEX(CAST('Example: To delete' AS varbinary(20)), [Log Record]) will return all lines where the string "Example: To delete" is included, but in our case we knew there was only one.
For our simple example it was easy to find the right rows but there is a lot more data to analyze and getting all the data out for a deleted row is quite complex without other tools but it can be done!
Feel free to contact us with questions and comments!