Dynamics GP – How I figured out which user deleted a row from a table using the SQL fn_dump_dblog function and DBCC PAGE command

Introduction

This post is similar to a previous post I wrote investigating deletions on a SOP table; however, it takes things one step further to confirm beyond doubt which user deleted the data.

Along with similar techniques to the previous post, I also use a stored procedure and SQL Agent job to create an alert that highlights when the deletion occurs, as well as DBCC PAGE to query the contents of an 8 KB page of data in a restored SQL data file.

The issue

We had an ongoing issue at a client where, randomly and without explanation, a distribution line went missing from posted payables transactions. This is very strange, as altering distribution lines isn’t allowed once a transaction has been posted, and the distribution lines must balance before the transaction can be posted.

In order to get to the bottom of this I created a SQL stored procedure that would email me and the client whenever it identified any transactions with missing distributions. The stored procedure would run via a SQL Agent job on a 15 minute schedule. As this had occurred in the past, but we had no idea when or how, the idea behind the alert was that it would give us a 15 minute window to figure out what had happened, i.e. I could carry out SQL investigations while the client asked all users what they were doing.

The stored procedure I used for the alert is below:

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
Create Procedure [dbo].[CheckMissingDists]
as

BEGIN

-- Counts of distributions that exist in the PM distribution table (PM80600)
-- but have no matching row in the history or open distribution tables
Declare @NumOfMissingDistsOpen as int
Declare @NumOfMissingDistsHist as int

-- Historical transactions: distributions recorded in PM80600 with no matching row in PM30600
select @NumOfMissingDistsHist = COUNT(*) from PM80600 pmdist
	left join PM30600 pm
		on pmdist.VCHRNMBR=pm.VCHRNMBR and pmdist.DSTSQNUM=pm.DSTSQNUM
	inner join PM30200 pmhdr
		on pmdist.VCHRNMBR=pmhdr.VCHRNMBR and pmdist.DOCTYPE=pmhdr.DOCTYPE
	where pm.VCHRNMBR is null

-- Open transactions: distributions recorded in PM80600 with no matching row in PM10100
select @NumOfMissingDistsOpen = COUNT(*) from PM80600 pmdist
	left join PM10100 pm
		on pmdist.VCHRNMBR=pm.VCHRNMBR and pmdist.DSTSQNUM=pm.DSTSQNUM
	inner join PM20000 pmhdr
		on pmdist.VCHRNMBR=pmhdr.VCHRNMBR and pmdist.DOCTYPE=pmhdr.DOCTYPE
	where pm.VCHRNMBR is null

-- If any distributions are missing, send a high importance email alert
IF (@NumOfMissingDistsHist > 0 or @NumOfMissingDistsOpen > 0)
BEGIN
	EXEC msdb.dbo.sp_send_dbmail
		@profile_name='Email Profile',
		@recipients='myemail@email.com;clientsemail@email.com',
		@importance='High',
		@subject = 'Potential missing distributions',
		@body = 'Potential missing distributions'
END

RETURN

END

I then created an SQL job to execute the stored procedure every 15 minutes.
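
For reference, the job can also be scripted rather than created through the SSMS GUI. Below is a minimal sketch using the standard msdb job procedures; the job name, schedule name and company database name are placeholders.

USE msdb
GO

EXEC dbo.sp_add_job @job_name = N'Check Missing PM Distributions'

EXEC dbo.sp_add_jobstep
	@job_name = N'Check Missing PM Distributions',
	@step_name = N'Run CheckMissingDists',
	@subsystem = N'TSQL',
	@database_name = N'<Company DB>',    -- placeholder company database name
	@command = N'EXEC dbo.CheckMissingDists'

-- Daily schedule that repeats every 15 minutes
EXEC dbo.sp_add_schedule
	@schedule_name = N'Every 15 minutes',
	@freq_type = 4,               -- daily
	@freq_interval = 1,
	@freq_subday_type = 4,        -- repeat in units of minutes
	@freq_subday_interval = 15

EXEC dbo.sp_attach_schedule
	@job_name = N'Check Missing PM Distributions',
	@schedule_name = N'Every 15 minutes'

EXEC dbo.sp_add_jobserver @job_name = N'Check Missing PM Distributions'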

** Please note the queries in the stored procedure run against potentially large tables (PM30200) so may take a while and lock tables on some systems.

This had all been in place for a while and nothing happened, until yesterday when I received an alert informing me of missing distributions.

At this point I emailed the client to ask them to check what users were doing, and I connected to the SQL Server to confirm the document(s) with missing distributions.

Confirming the affected document

First of all I ran the query below, which is based on the code from my alert. This highlighted that the transaction in question was in the OPEN PM tables.

I then ran the query below to find out which transaction had the missing distributions.
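
The query was a variation on the alert code. A sketch along these lines, returning the voucher number, document type and any distribution sequence numbers that exist in PM80600 but no longer exist in PM10100, gives the same result (the exact columns I returned may have differed):

-- Open payables distributions recorded in PM80600 with no matching row in PM10100
select pmhdr.VCHRNMBR, pmhdr.DOCTYPE, pmdist.DSTSQNUM
from PM80600 pmdist
	left join PM10100 pm
		on pmdist.VCHRNMBR=pm.VCHRNMBR and pmdist.DSTSQNUM=pm.DSTSQNUM
	inner join PM20000 pmhdr
		on pmdist.VCHRNMBR=pmhdr.VCHRNMBR and pmdist.DOCTYPE=pmhdr.DOCTYPE
where pm.VCHRNMBR is null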

This showed that the distribution with a DSTSQNUM of 32768 was no longer in the PM10100 table.

Finally, I logged into Dynamics GP, found the invoice and confirmed a distribution was definitely missing.

SQL Investigation

Now that I had confirmed the issue had recurred, I commenced my SQL investigations.

As I knew the issue had happened within a 15 minute window, I needed to query the transaction log backup file from that time period to find all deletions in that time frame. (I couldn’t query the actual log file using fn_dblog as a log backup had already occurred.) To do this I ran the query below:

select	
	[Transaction ID]
	,[begin time] as BeginTime
	,[End Time] as EndTime
	,SUSER_SNAME ([transaction sid]) as UserID
	,object_name(object_id) as TableName
	,[Lock Information]
FROM fn_dump_dblog 
	(NULL, NULL, N'DISK', 1, N'D:\<Path to Transaction Log backup file>\Transaction Log Backup File.trn', 
	 DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	 DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	 DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	 DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	 DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	 DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	 DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	 DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	 DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT) as log1
left join sys.partitions part 
	 on part.partition_id=log1.partitionid
where  
	[transaction id] IN
(select 
	[transaction id]
 FROM fn_dump_dblog 
	(NULL, NULL, N'DISK', 1, N'D:\<Path to Transaction Log backup file>\Transaction Log Backup File.trn',
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT) as log1
left join sys.partitions part 
	on part.partition_id=log1.partitionid
where 
	operation = 'LOP_DELETE_ROWS' and 
	object_name(object_id) in ('PM10100')) 
order by [Transaction ID]

** Please note this query is a hybrid of a query from the SQLskills blog, taken from here: https://www.sqlskills.com/blogs/paul/using-fn_dblog-fn_dump_dblog-and-restoring-with-stopbeforemark-to-an-lsn/

This query highlighted a delete performed by the user ID “sa” in the time frame I was looking for, as per below.

As there were other deletes on PM10100 around the same time, I needed to figure out a way to confirm that this particular delete removed the distribution from my transaction (the other deletes could be genuine, i.e. a transaction moving from the OPEN to the HISTORY tables).

As I couldn’t figure out a way of finding out exactly what data had been deleted using just the information returned by fn_dump_dblog, I utilised the information recorded in the “Lock Information” column (as it didn’t fit in the screenshot above, I’ve expanded this below).

As you can see this provides a Page ID and a Hash Value for the data that was locked and therefore removed.

As this data had now gone from the LIVE SQL database, I restored the database to a new database called “TEST1” at a point in time prior to the issue, and used DBCC PAGE to view the contents of Page ID 3813166.
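
The restore and page inspection look roughly like the sketch below. The backup paths, logical file names and STOPAT time are placeholders, and I’ve assumed the page lives in data file 1 (the file ID is shown alongside the page ID in the lock resource).

-- Restore a copy of the database to a point in time before the deletion
RESTORE DATABASE TEST1
	FROM DISK = N'D:\<Path to full backup>\Full Backup File.bak'
	WITH MOVE N'<Logical data file name>' TO N'D:\Data\TEST1.mdf',
		 MOVE N'<Logical log file name>' TO N'D:\Data\TEST1.ldf',
		 NORECOVERY

RESTORE LOG TEST1
	FROM DISK = N'D:\<Path to Transaction Log backup file>\Transaction Log Backup File.trn'
	WITH STOPAT = N'<Date and time just before the issue>', RECOVERY

-- Send DBCC PAGE output to the results pane rather than the error log
DBCC TRACEON (3604)

-- Dump page 3813166 (assumed file ID 1) with print option 3 to show each row and its column values
DBCC PAGE ('TEST1', 1, 3813166, 3)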

And hey presto, that page did contain the data for the missing distribution from this transaction!

Therefore I can now say with certainty that the user “sa” must have somehow removed the distribution from the table. What’s more, I’m fairly certain that no users use this login to access GP, therefore something else using this account must have removed the distribution (so further investigations now commence).

Conclusion

This process shows how you can use various SQL tools to pinpoint exactly who has deleted a row of data from a table in a SQL Server database.

Please note there might be a more efficient way to do this; however, as fn_dump_dblog and DBCC PAGE are undocumented tools, I couldn’t find much information on how to do this, so I devised this method.

I hope this helps someone in future!

Thanks for reading!

Dynamics GP – SQL parameter sniffing issue causing a routine to hang

Background

The new year started off with a very interesting ticket logged on our support desk. The premise of the issue was that a routine which would normally finish in seconds, or at most minutes, didn’t appear to start, let alone finish. The user had tried crashing out of GP and even asking everyone else to log off; however, the problem persisted.

The Issue

The issue was occurring when running an “Extract” routine in the third party module Nolans Advanced Bank Reconciliation. This routine extracts all the data from the GL into third party tables based on parameters specified by the user.

The window used to initiate the process is below.

** Please note the window accepts a couple of parameters

After entering the above parameters and clicking “Extract” the process “hung” on 0% as per below:

The client left this for over an hour and the progress didn’t move. It remained on 0% the whole time. At this point they crashed out of Dynamics GP manually.

The Troubleshooting Process

The first thing I did to troubleshoot this issue was set up a SQL trace so I could log the SQL calls made after clicking “Extract”. I then set off the process again and checked the trace. This highlighted that the stored procedure running the extract routine was ncExtractGLTransactions, as per below.

I left this running for a few minutes and nothing happened. Next, I wanted to confirm nothing was blocking the SQL process. To do this I used Adam Machanic’s awesome free SQL monitoring tool “sp_whoisactive” (http://whoisactive.com). After running this I could see nothing was blocking the process.
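
If you haven’t used it before, a quick blocking check is as simple as the call below (sp_WhoIsActive must be installed on the instance; the @find_block_leaders parameter is optional and adds a count of sessions blocked behind each session):

-- Show active requests, wait information and any blocking
EXEC sp_WhoIsActive @find_block_leaders = 1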

Therefore the conclusion at this point was that the process was running (via the ncExtractGLTransactions stored procedure) but for some reason it was taking a very long time to complete.

As this routine normally runs daily without issue my SQL senses had me thinking this could be a case of “bad” parameter sniffing.

Parameter sniffing and execution plan reuse

At this point I’ll try and give a very short overview of parameter sniffing and execution plan creation.

When a stored procedure runs, the SQL query optimiser first checks if an execution plan already exists in the plan cache for the stored procedure. If one doesn’t exist, an execution plan is created and cached using the inputs provided (i.e. the parameter values are “sniffed” and used to create the execution plan). This execution plan is then used to execute the query. If the stored procedure is executed again, SQL will reuse that execution plan and therefore save the time of creating and compiling a new one (this is why parameter sniffing is generally regarded as a good thing).

However, issues can arise if the execution plan is reused with parameters that would give completely different results. For example, if the stored procedure is first executed with parameters that only return a very small result set, an execution plan might be created using index seeks and key lookups. If it’s then run with parameters that produce a very large result set, index seeks and lookups might not be the optimal approach; in this case it might be much quicker to perform table/index scans.
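
To make that concrete, here’s a minimal, made-up illustration (the dbo.Orders table and the procedure are purely hypothetical, not GP objects):

-- A procedure whose ideal plan depends heavily on the parameter value
CREATE PROCEDURE dbo.GetOrdersByCustomer
	@CustomerID int
AS
SELECT OrderID, OrderDate, OrderAmount
FROM dbo.Orders
WHERE CustomerID = @CustomerID
GO

-- First execution: a customer with a handful of orders, so the cached plan
-- uses an index seek plus key lookups
EXEC dbo.GetOrdersByCustomer @CustomerID = 42

-- Later execution: a customer with millions of orders reuses the seek/lookup
-- plan, which can be far slower than the scan the optimiser would otherwise choose
EXEC dbo.GetOrdersByCustomer @CustomerID = 7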

The Immediate Fix

To see if parameter sniffing was causing the issue, I decided to find the execution plan for the stored procedure and evict it from the plan cache, thus forcing SQL to create a new plan based on the inputs the client was using the next time the stored procedure was executed.

Please note there are other ways to do this, such as marking the stored procedure for recompilation using the system stored procedure “sp_recompile”; however, I chose the plan cache route. This is mainly because I get to have more of a nosy around 🙂 i.e. running the query below you can see when the execution plan was created and gather a whole host of other information.
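
For reference, the sp_recompile route is a one-liner run in the company database:

-- Marks the procedure so a new plan is compiled on its next execution
EXEC sp_recompile N'dbo.ncExtractGLTransactions'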

To find the execution plan I used the script below:

SELECT plan1.query_plan,qs.plan_handle, a.attrlist, qs.*,est.*
FROM   sys.dm_exec_query_stats qs
CROSS  APPLY sys.dm_exec_sql_text(qs.sql_handle) est
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) plan1
CROSS  APPLY (SELECT epa.attribute + '=' + convert(nvarchar(127), epa.value) + '   '
              FROM   sys.dm_exec_plan_attributes(qs.plan_handle) epa
              WHERE  epa.is_cache_key = 1
              ORDER  BY epa.attribute
              FOR    XML PATH('')) AS a(attrlist)
WHERE  est.objectid = object_id ('dbo.ncExtractGLTransactions')
  AND  est.dbid     = db_id('DBName') 

** Please note I acquired this script from someone else, either via a blog or a book; however, as it was such a long time ago I can’t recall exactly who to credit.

I then took the plan handle and ran the query below to evict the execution plan:

DBCC FREEPROCCACHE (<plan handle>)

After this had been done I ran the process again and it completed as normal! (I also ran the first query again and could see, based on the plan creation date, that it was a brand new execution plan.)

The Long Term Fix

Going forwards I could “fix” this by adding the “WITH RECOMPILE” option to the stored procedure so that an execution plan is created but never cached when it’s executed. Therefore every time the stored procedure is run, a brand new optimal execution plan will be created. The only downside is a CPU hit while the execution plan is created.

Alternatively, I could add the OPTION (RECOMPILE) hint to the statement within the stored procedure that is causing the issue. That way a plan is still created and cached for the rest of the procedure, but the troublesome statement gets a fresh plan on every execution. Both options are sketched below.
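
In the sketch below the parameter list and procedure body are placeholders standing in for the real Nolan code; only the placement of WITH RECOMPILE and OPTION (RECOMPILE) matters here.

-- Option 1: no plan is ever cached for the procedure (placeholder body)
ALTER PROCEDURE dbo.ncExtractGLTransactions
	@StartDate datetime,
	@EndDate   datetime
WITH RECOMPILE
AS
BEGIN
	SELECT JRNENTRY, ACTINDX, DEBITAMT, CRDTAMNT
	FROM GL20000
	WHERE TRXDATE BETWEEN @StartDate AND @EndDate
END
GO

-- Option 2: only the troublesome statement gets a fresh plan each time;
-- the rest of the procedure's plan is still cached and reused
ALTER PROCEDURE dbo.ncExtractGLTransactions
	@StartDate datetime,
	@EndDate   datetime
AS
BEGIN
	SELECT JRNENTRY, ACTINDX, DEBITAMT, CRDTAMNT
	FROM GL20000
	WHERE TRXDATE BETWEEN @StartDate AND @EndDate
	OPTION (RECOMPILE)
END
GO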

Final thoughts

One other thing to bear in mind is that execution plans don’t stay in the plan cache forever, which is why you won’t necessarily come across this issue very often.

For example, index maintenance and updating statistics, which are routines you might run frequently against your SQL Server databases, will invalidate associated execution plans, causing them to recompile the next time the stored procedure is run. Therefore it’s possible that if I’d left this issue overnight and index maintenance had run, the plan would have been invalidated; the next time the user ran the routine a new execution plan would have been created and the routine would have run as normal. This gives the impression the issue has “fixed itself” when in fact a new execution plan was created and used, which resolved the issue.

I hope this helps anyone else who comes across routines that suddenly run slowly when they had previously run fine. It might also explain why, when you leave issues overnight, things can appear to have magically fixed themselves when you try again the next day 🙂

Thanks for reading!

Brief overview and comparison of how summary values are stored and calculated in Dynamics GP, Dynamics NAV and Dynamics 365 Business Central

Introduction

There was an interesting conversation on Twitter over the weekend regarding summary values and Dynamics 365 Business Central. Much like myself, the person asking the question came from a Dynamics GP background and was wondering where in the database Dynamics 365 Business Central stores summary values, i.e. GL period balances.

Now this is one area where I really like the approach Dynamics NAV and Dynamics 365 Business Central take, so I thought I’d write it up.

The Dynamics GP way

Dynamics GP stores summary values in a variety of summary tables. For example, the GL open years transactional data is stored in one table (GL20000) and summarised in a summary balance table (GL10110).

This means you can quickly access the summary values in an enquiry window. For example below is a GL enquiry window showing the GL period balances for a cash account.

What’s great is that, as this window is accessing a summary table rather than summing and sub-totalling the main transaction table, the performance is great, so you have access to summary data very quickly, i.e. you could have 10 million rows of transactional data for a GL code but it’s all nicely summarised into a handful of rows in the summary table.

As I mentioned, this provides fantastic performance; however, anyone who has worked with Dynamics GP for any length of time will know the one issue that comes with storing data in this way. Unfortunately, if there is an issue when posting a transaction, “sometimes” the summary values aren’t updated and they become “out of sync” with the actual transaction data, i.e. if you were to manually sum the transaction data it wouldn’t match the summary data. This can cause problems and prompt awkward conversations with clients as they (understandably) ask “why” this is the case, and (even worse) how they can prevent it in the future. (I hate it when this question crops up.)
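
To make the “out of sync” idea concrete, a comparison along the lines of the sketch below will list accounts where the GL10110 summary no longer agrees with the GL20000 detail for an open year. This is a sketch only: the column names are quoted from memory and worth double checking, and period 0 beginning balances are excluded as they don’t originate from GL20000.

-- Compare open-year GL detail (GL20000) to the summary table (GL10110)
declare @Year smallint = 2019   -- placeholder open year

select coalesce(d.ACTINDX, s.ACTINDX) as ACTINDX,
	isnull(d.DetailNet, 0) as DetailNet,
	isnull(s.SummaryNet, 0) as SummaryNet
from
	(select ACTINDX, SUM(DEBITAMT - CRDTAMNT) as DetailNet
	 from GL20000
	 where OPENYEAR = @Year
	 group by ACTINDX) d
	full outer join
	(select ACTINDX, SUM(PERDBLNC) as SummaryNet
	 from GL10110
	 where YEAR1 = @Year and PERIODID <> 0   -- exclude beginning balances
	 group by ACTINDX) s
		on d.ACTINDX = s.ACTINDX
where isnull(d.DetailNet, 0) <> isnull(s.SummaryNet, 0)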

*** A wider discussion here would be why Dynamics GP doesn’t update all the SQL tables (transaction table and summary table) within one SQL transaction, which would negate this issue; however, that is beyond the scope of this article.

To compensate for this Dynamics GP has “reconcile” features for nearly all ledgers so you can re-align the summary balances with the actual transactions. For example below is the “reconcile” utility for the General Ledger. Running this will recalculate all the summary balances based on the transactional data and update the summary table I mentioned earlier.

The Dynamics NAV \ Dynamics 365 Business Central way

The first thing to say is that Dynamics NAV and Dynamics 365 Business Central don’t store summary values in dedicated SQL tables like Dynamics GP. Instead they use something called “FlowFields” for the summary totals, which are implemented using SQL indexed views (therefore technically the summary values are materialised in the database, just not in actual SQL tables). In the Dynamics NAV and Dynamics 365 Business Central world these are referred to as SIFT indexes.

So, for example, when you view the GL balances in Dynamics NAV as per the screenshot below, the system is running queries against the SQL indexed views to present the summary values, and therefore performance is still extremely good (technically it’s the SQL optimiser that chooses to use this index when asked to return the summary values for this page, because it’s much more efficient).

Digging a little deeper if you go into the Dynamics NAV development environment you can see how the SQL indexed views (SIFT Indexes) are implemented.

For example if I select the GL entry table and click “Design”

And then select “View > Keys” to look at the indexes

You can now see the keys (indexes) on the left, and some of the keys have an associated “SumIndexFields” value (i.e. a SIFT index, which is implemented via a SQL indexed view). In the highlighted example a SQL indexed view is created summing the fields Amount, Debit Amount, Credit Amount, Additional-Currency Amount, Add.-Currency Debit Amount and Add.-Currency Credit Amount, grouped by G/L Account No. and Posting Date.

Therefore, if I open SQL Server Management Studio and browse to my Dynamics NAV database, I can see all the SQL indexed views on the G/L Entry table (there are four, which matches the number of enabled keys with SumIndexFields in my development environment).

Finally, if I take a look at the definition of the SQL indexed view named “CRONUS UK Ltd_$G_L Entry$VSIFT$1”, we can see it implements the key I highlighted earlier:

CREATE VIEW [dbo].[CRONUS UK Ltd_$G_L Entry$VSIFT$1]
WITH SCHEMABINDING
AS
SELECT "17"."g_l account no_",
       "17"."posting date",
       COUNT_BIG(*) AS "$Cnt",
       SUM("17"."amount") AS "SUM$Amount",
       SUM("17"."debit amount") AS "SUM$Debit Amount",
       SUM("17"."credit amount") AS "SUM$Credit Amount",
       SUM("17"."additional-currency amount") AS "SUM$Additional-Currency Amount",
       SUM("17"."add_-currency debit amount") AS "SUM$Add_-Currency Debit Amount",
       SUM("17"."add_-currency credit amount") AS "SUM$Add_-Currency Credit Amount"
FROM dbo."cronus uk ltd_$g_l entry" "17"
GROUP BY "17"."g_l account no_",
         "17"."posting date"
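
For example, a query along the lines below reads the period balances for an account straight from that view; the account number is a placeholder, and the NOEXPAND hint simply forces SQL to use the indexed view directly rather than expanding its definition.

select [g_l account no_],
	[posting date],
	[SUM$Amount],
	[SUM$Debit Amount],
	[SUM$Credit Amount]
from dbo.[CRONUS UK Ltd_$G_L Entry$VSIFT$1] with (noexpand)
where [g_l account no_] = '10100'   -- placeholder account number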

I suppose you could argue the one drawback of this approach is that whenever a transaction is posted in Dynamics NAV or Dynamics 365 Business Central, SQL has extra CPU and disk work to do as it maintains the SQL indexed views associated with the transaction tables.

Conclusion

Although I love so many things about Dynamics GP, I do think Dynamics NAV and Dynamics 365 Business Central have the edge here. By allowing the database to handle the maintenance of summary values through SQL indexed views, the transactional data can never go out of sync with the summary values.

Thanks for reading.