Dynamics GP – 10 tips when running a General Ledger Year End on a large SQL Database

Introduction

A few weeks ago I assisted with a General Ledger year end for a GP user with a very large number of journals and consequently a very large SQL database.

Usually a year end is a fairly straightforward routine. However, as this user has such a large database, along with additional SQL bits and pieces for disaster recovery and bespoke reporting, it required a little more thought to ensure things went as smoothly as possible.

In this post I thought I’d share the steps I took to hopefully assist anyone else facing a similar situation in the future.

Here goes…

1) Do the basics first.

By basics I mean following Microsoft’s general Year End prerequisites which can be found here.

I find the most important steps are ensuring you have a good backup, and ensuring the general ledger codes have the correct posting type, as this can be a pain to correct post year end.

2) Ensure you have enough disk space for the SQL Data and SQL Log files.

This usually isn’t a consideration when running the Year End, however given the size of the data it’s something that needs planning when dealing with GL tables in the hundreds of GBs.

Microsoft’s official guidance on disk space is included in the Year End documentation that I’ve linked to above however I’ve copied the section on disk space below:

During the year-end closing routine, all the records that will be moved are put in a temporary table before they are moved to the GL30000 table. You must have free disk space that is equal to the size of the GL20000 table to perform the routine.

To find the current size of the GL20000 table you can use the sp_spaceused system stored procedure:
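For example (TWO below just stands in for your own GP company database):

USE TWO
GO
EXEC sp_spaceused 'GL20000'
GO

This returns the row count along with the reserved, data and index sizes for the table.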

However I’d also add that when dealing with large data volumes you need to plan for extraordinary SQL transaction log growth as well.

For example, the SQL transaction log grew to 240GB when I did this particular year end. This is because various statements, including the INSERT statements into GL30000, are completed in one SQL transaction. Therefore, no matter what recovery model your database uses, the log file is going to grow while the process completes. This is standard behaviour so that SQL can roll the whole transaction back should any issue occur.

3) Remove any Database Mirroring or Availability Groups.

In my scenario the user has database mirroring for disaster recovery. I could have left this on, however this would result in a lot of transaction log being sent over the wire while the Year End ran, potentially doubling the length of time it takes. I always see the amount of time something takes as the size of the window of opportunity for something to go wrong: the quicker I could get the Year End process to run, the less chance I had of something going wrong. On the downside it did mean I had additional configuration to complete post year end, but I was fine with this.

4) Remove any custom indexes

If you have any additional custom indexes on GL20000 and GL30000 it’s best to remove them. It’s not a problem if you don’t remove them, however they can slow things down and also increase the amount of transaction log that SQL produces. This is because when you DELETE and INSERT data, any indexes on the tables also have to be maintained, which requires additional space and additional log to record the operations. Therefore the fewer indexes you have the better.

**Don’t remove any of the standard indexes though.
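If you’re not sure whether custom indexes exist, one quick way to check is to list every index on the two tables and compare against a standard GP install. A minimal sketch:

SELECT OBJECT_NAME(i.object_id) AS TableName,
       i.name AS IndexName,
       i.type_desc AS IndexType
FROM sys.indexes i
WHERE OBJECT_NAME(i.object_id) IN ('GL20000', 'GL30000')
  AND i.name IS NOT NULL
ORDER BY TableName, IndexName;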

5) Stop the SQL Agent.

Maintenance tasks like reindex routines sometimes use the SQL agent to run. As I suspected the Year End would run for a very long time, I stopped the SQL agent to prevent this from happening. It’s always best to check that critical operations won’t be affected by this though; if so, just disable any SQL jobs that run maintenance on the Dynamics databases.
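If stopping the whole agent feels too heavy-handed, individual jobs can be disabled instead; a minimal sketch (the job name below is made up):

EXEC msdb.dbo.sp_update_job
	@job_name = N'GP Index Maintenance', -- hypothetical job name
	@enabled = 0;

Remember to re-enable the jobs (@enabled = 1) once the year end has completed.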

6) Temporarily revoke permissions for any Reporting Logins.

Ideally I’d have liked to set the database to single user mode while the Year End ran, however as I wanted to monitor the process I didn’t want to restrict things too much. I therefore disabled the logins that I knew were used for reporting, in case anyone set off a large report in the middle of the year end.
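Disabling (and later re-enabling) a SQL login is a one-liner; the login name below is just an example:

ALTER LOGIN [ReportingUser] DISABLE;
-- once the year end is complete:
ALTER LOGIN [ReportingUser] ENABLE;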

7) Ensure SQL log growth on the database is set sensibly.

As I mentioned above, the SQL transaction log is going to grow during this operation regardless of the recovery model. Therefore go to the properties of the SQL database and ensure the growth is set sensibly, and that it’s set in MB rather than a percentage. I set it to grow in 10GB increments for this particular year end process.
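The same change can be scripted; a sketch assuming a database called TWO with a logical log file named TWO_Log (both illustrative):

ALTER DATABASE TWO
MODIFY FILE (NAME = N'TWO_Log', FILEGROWTH = 10GB);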

8) Use SQL monitoring scripts

I like to have a nosy under the hood when running most things, but I especially wanted a sneak peek at the General Ledger Year End. My tool of choice for this is Adam Machanic’s awesome free monitoring tool sp_whoisactive. This is a free script you can use to monitor activity in SQL. Using this I could pretty much see what point the Year End was up to, i.e. INSERTing the data into GL30000 etc.

**If you didn’t know about this free script, and this is all you take away from this blog, I’d be happy :). It’s a really cool free tool you can use when troubleshooting all sorts of things in SQL, i.e. blocking etc.
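Once the script is installed, running it with no parameters shows the currently active sessions and their statements:

EXEC sp_whoisactive;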

I also wanted to monitor the database and log growth as the process was running. By monitoring this I could predict when the log would run out of space and need to grow, then check the disk space to ensure the growth would happen smoothly. To do this I used the script below:

-- Current size and free space (in MB) of each file in the current database
SELECT DB_NAME() AS DbName,
       name AS FileName,
       size/128.0 AS CurrentSizeMB,
       size/128.0 - CAST(FILEPROPERTY(name, 'SpaceUsed') AS INT)/128.0 AS FreeSpaceMB
FROM sys.database_files;

9) Be patient 🙂

One of the most important things is to be patient, as this routine can last hours on a large data set. It’s also important never to cancel the operation by manually closing GP down, even if it says “Not Responding”. If in doubt you can run the sp_whoisactive stored procedure and check what’s happening on the server.

10) Once complete

After a huge sigh of relief I performed an additional backup, then started recreating the indexes, setting the mirroring back up and restarting the SQL agent. I also shrank the SQL transaction log back down to a reasonable size.
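Shrinking the log is a one-liner with DBCC SHRINKFILE; again assuming the illustrative logical log file name TWO_Log:

DBCC SHRINKFILE (N'TWO_Log', 10240); -- target size in MB (10GB here)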

Conclusion

I hope you find the information in this post useful. Although you may not need to follow all of the points I outlined, I think the most important are disk space for the SQL transaction log, monitoring, and lots of patience 🙂

Thanks for reading!

Dynamics GP – How I figured out which user deleted a row from a table using the SQL fn_dump_dblog function and DBCC PAGE command

Introduction

This post is similar to a previous post I did when investigating deletions on a SOP table, however it takes things one step further to confirm 100% which user deleted the data.

Along with similar techniques to the previous post, I also use a stored procedure and SQL job to create an alert that highlights when the deletion occurs, as well as DBCC PAGE to query the contents of an 8KB page of data in a restored SQL data file.

The issue

We had an ongoing issue at a client where, randomly and without explanation, a distribution line went missing on posted payables transactions. This is very strange as altering distribution lines isn’t allowed once a transaction has been posted. The distribution lines must also balance prior to the transaction being posted.

In order to get to the bottom of this I created a SQL stored procedure that would email me and the client when it identified any transactions with missing distributions. The stored procedure would run via a SQL agent job on a 15 minute schedule. As this had occurred in the past, but we had no idea when or how, the idea behind the alert was that it would give us a 15 minute window to figure out what had happened, i.e. I could do SQL investigations and the client could ask all users what they were doing etc.

The stored procedure I used for the alert is below:

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[CheckMissingDists]
AS

BEGIN

	DECLARE @NumOfMissingDistsOpen int
	DECLARE @NumOfMissingDistsHist int

	-- Count distribution lines in PM80600 with no matching row in the
	-- history distribution table PM30600
	SELECT @NumOfMissingDistsHist = COUNT(*)
	FROM PM80600 pmdist
		LEFT JOIN PM30600 pm
			ON pmdist.VCHRNMBR = pm.VCHRNMBR AND pmdist.DSTSQNUM = pm.DSTSQNUM
		INNER JOIN PM30200 pmhdr
			ON pmdist.VCHRNMBR = pmhdr.VCHRNMBR AND pmdist.DOCTYPE = pmhdr.DOCTYPE
	WHERE pm.VCHRNMBR IS NULL

	-- Count distribution lines in PM80600 with no matching row in the
	-- open distribution table PM10100
	SELECT @NumOfMissingDistsOpen = COUNT(*)
	FROM PM80600 pmdist
		LEFT JOIN PM10100 pm
			ON pmdist.VCHRNMBR = pm.VCHRNMBR AND pmdist.DSTSQNUM = pm.DSTSQNUM
		INNER JOIN PM20000 pmhdr
			ON pmdist.VCHRNMBR = pmhdr.VCHRNMBR AND pmdist.DOCTYPE = pmhdr.DOCTYPE
	WHERE pm.VCHRNMBR IS NULL

	-- Email an alert if either check found a missing distribution line
	IF (@NumOfMissingDistsHist > 0 OR @NumOfMissingDistsOpen > 0)
		EXEC msdb.dbo.sp_send_dbmail
			@profile_name = 'Email Profile',
			@recipients = 'myemail@email.com;clientsemail@email.com',
			@importance = 'High',
			@subject = 'Potential missing distributions',
			@body = 'Potential missing distributions'

	RETURN

END

I then created an SQL job to execute the stored procedure every 15 minutes.
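For completeness, the job setup can also be scripted; a hedged sketch (the job, schedule and database names below are made up):

EXEC msdb.dbo.sp_add_job @job_name = N'Check Missing Distributions';

EXEC msdb.dbo.sp_add_jobstep
	@job_name = N'Check Missing Distributions',
	@step_name = N'Run CheckMissingDists',
	@subsystem = N'TSQL',
	@command = N'EXEC dbo.CheckMissingDists;',
	@database_name = N'TWO'; -- hypothetical GP company database

EXEC msdb.dbo.sp_add_jobschedule
	@job_name = N'Check Missing Distributions',
	@name = N'Every 15 minutes',
	@freq_type = 4,              -- daily
	@freq_interval = 1,
	@freq_subday_type = 4,       -- subday units of minutes
	@freq_subday_interval = 15;

EXEC msdb.dbo.sp_add_jobserver @job_name = N'Check Missing Distributions';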

** Please note the query above runs on potentially large tables (PM30200) so may take a while and lock tables on some systems.

This had all been in place for a while, and nothing happened, until yesterday when I received an alert to inform me of missing distributions.

At this point I emailed the client to ask them to check what users were doing, and I connected to the SQL server to confirm the document(s) with missing distributions.

Confirming the affected document

First of all I ran a query based on the code from my alert. This highlighted that the transaction in question was in the OPEN PM tables.

I then ran a query to find out which transaction had the missing distributions.
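A minimal version of that query, derived from the alert stored procedure above, returns the voucher and distribution sequence numbers of the affected rows:

SELECT pmdist.VCHRNMBR, pmdist.DOCTYPE, pmdist.DSTSQNUM
FROM PM80600 pmdist
	LEFT JOIN PM10100 pm
		ON pmdist.VCHRNMBR = pm.VCHRNMBR AND pmdist.DSTSQNUM = pm.DSTSQNUM
	INNER JOIN PM20000 pmhdr
		ON pmdist.VCHRNMBR = pmhdr.VCHRNMBR AND pmdist.DOCTYPE = pmhdr.DOCTYPE
WHERE pm.VCHRNMBR IS NULL;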

This showed that the distribution with DSTSQNUM 32768 was no longer in the PM10100 table.

Finally I logged into Dynamics GP, found the invoice and confirmed a distribution was definitely missing.

SQL Investigation

Now I had confirmed that the issue had recurred, I commenced my SQL investigations.

As I knew the issue had happened within a 15 minute window, I needed to query the SQL transaction log backup file from that time period to find all deletions in that time frame. (I couldn’t query the actual log file using fn_dblog as a log backup had already occurred.) To do this I ran the query below:

select	
	[Transaction ID]
	,[begin time] as BeginTime
	,[End Time] as EndTime
	,SUSER_SNAME ([transaction sid]) as UserID
	,object_name(object_id) as TableName
	,[Lock Information]
FROM fn_dump_dblog 
	(NULL, NULL, N'DISK', 1, N'D:\<Path to Transaction Log backup file>\Transaction Log Backup File.trn', 
	 DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	 DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	 DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	 DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	 DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	 DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	 DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	 DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	 DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT) as log1
left join sys.partitions part 
	 on part.partition_id=log1.partitionid
where  
	[transaction id] IN
(select 
	[transaction id]
 FROM fn_dump_dblog 
	(NULL, NULL, N'DISK', 1, N'D:\<Path to Transaction Log backup file>\Transaction Log Backup File.trn',
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT) as log1
left join sys.partitions part 
	on part.partition_id=log1.partitionid
where 
	operation = 'LOP_DELETE_ROWS' and 
	object_name(object_id) in ('PM10100')) 
order by [Transaction ID]

** Please note this query is a hybrid of a query on SQL Skills blog taken from here: https://www.sqlskills.com/blogs/paul/using-fn_dblog-fn_dump_dblog-and-restoring-with-stopbeforemark-to-an-lsn/

This query highlighted a delete performed by User ID “sa” in the time frame I was looking for, as per below.

As there were other deletes on PM10100 around the same time, I needed to figure out a way to confirm that this particular delete removed the distribution on my transaction. (The other deletes could be genuine, i.e. a transaction moving from the OPEN to the HISTORY tables.)

As I couldn’t figure out a way of finding out exactly what data had been deleted using just the information returned by fn_dump_dblog, I utilised the information recorded in the “Lock Information” column. (As it didn’t fit in the screenshot above, I’ve expanded this below.)

As you can see this provides a Page ID and a Hash Value for the data that was locked and therefore removed.

As this data had now gone from the LIVE SQL database, I restored the database to a new database called “TEST1” at a point in time prior to the issue, and used DBCC PAGE to view the contents of Page ID 3813166.
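DBCC PAGE is undocumented, so treat the following as a sketch. Its output first has to be redirected to the client with trace flag 3604, and I’m assuming the page lives in data file 1:

DBCC TRACEON (3604);                -- send DBCC output to the client rather than the error log
DBCC PAGE ('TEST1', 1, 3813166, 3); -- database, file id (assumed), page id, print option 3 = full row contents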

And hey presto, that page did contain the data for the missing distribution from this transaction!

Therefore I can now say with certainty that the user “sa” removed the distribution from the table. What’s more, I’m fairly certain that no users use this login to access GP, therefore something else using this account must have removed the distribution (so further investigations now commence).

Conclusion

This process shows how you can use various SQL tools to pinpoint exactly who has deleted a row of data from a table in a SQL database.

Please note there might be a more efficient way to do this, however as fn_dump_dblog and DBCC PAGE are undocumented tools I couldn’t find much information on how to do this, so I devised this method.

I hope this helps someone in future!

Thanks for reading!

Dynamics GP – SQL parameter sniffing issue causing a routine to hang

Background

The new year started off with a very interesting ticket logged on our support desk. The premise of the issue was that a routine which would normally finish in seconds, or at most minutes, didn’t appear to start let alone finish. The user had tried crashing out of GP and even asking everyone else to log off, however the problem persisted.

The Issue

The issue was occurring when running an “Extract” routine in the third party module Nolans Advanced Bank Reconciliation. This routine extracts all the data from the GL into third party tables based on parameters specified by the user.

The window used to initiate the process is below.

** Please note the window accepts a couple of parameters

After entering the above parameters and clicking “Extract” the process “hung” on 0% as per below:

The client left this for over an hour and the progress didn’t move. It remained on 0% the whole time. At this point they crashed out of Dynamics GP manually.

The Troubleshooting Process

The first thing I did to troubleshoot this issue was set up a SQL trace so I could log the SQL calls after clicking “Extract”. I then set off the process again and checked the trace. This highlighted that the stored procedure running the extract routine is ncExtractGLTransactions, as per below.

I left this running for a few minutes and nothing happened. Next, I wanted to confirm nothing was blocking the SQL process. To do this I used Adam Machanic’s awesome free SQL monitoring tool “sp_whoisactive” (http://whoisactive.com). After running this I could see nothing was blocking the process.

Therefore the conclusion at this point was that the process was running (via the ncExtractGLTransactions stored procedure) but for some reason it was taking a very long time to complete.

As this routine normally runs daily without issue my SQL senses had me thinking this could be a case of “bad” parameter sniffing.

Parameter sniffing and execution plan reuse

At this point I’ll try and give a very short overview of parameter sniffing and execution plan creation.

When a stored procedure runs, the SQL query optimiser first checks whether an execution plan already exists in the plan cache for that stored procedure. If one doesn’t exist, an execution plan is created and cached using the inputs provided (i.e. the parameter inputs are “sniffed” and used to create the execution plan). This execution plan is then used to execute the query. If the stored procedure is executed again, SQL will reuse that execution plan and save the time of creating and compiling a new one. (This is why parameter sniffing is generally regarded as a good thing.)

However issues can arise if the execution plan is reused with parameters that would give completely different results. For example, if the stored procedure is first executed with parameters that give a very small result set, an execution plan might be created using index seeks and lookups. However if it’s then run with parameters that produce a very large result set, index seeks and lookups might not be the optimal approach; in that case it might be much quicker to perform table\index scans.

The Immediate Fix

To see if parameter sniffing was causing the issue I decided to find the execution plan for the stored procedure and evict it from the plan cache thus forcing SQL to create a new plan based on the inputs the client was using the next time the stored procedure was executed.

Please note there are other ways to do this, such as marking the stored procedure for recompilation using the system stored procedure “sp_recompile”, however I chose the plan cache route. This is mainly because I get to have more of a nosy around 🙂 i.e. running the query below you can see when the execution plan was created and gather a whole host of other information.

To find the execution plan I used the script below:

-- Find the cached plan, plan handle and plan attributes for the stored procedure
SELECT plan1.query_plan, qs.plan_handle, a.attrlist, qs.*, est.*
FROM   sys.dm_exec_query_stats qs
CROSS  APPLY sys.dm_exec_sql_text(qs.sql_handle) est
CROSS  APPLY sys.dm_exec_query_plan(qs.plan_handle) plan1
CROSS  APPLY (SELECT epa.attribute + '=' + convert(nvarchar(127), epa.value) + '   '
              FROM   sys.dm_exec_plan_attributes(qs.plan_handle) epa
              WHERE  epa.is_cache_key = 1
              ORDER  BY epa.attribute
              FOR    XML PATH('')) AS a(attrlist)
WHERE  est.objectid = object_id('dbo.ncExtractGLTransactions')
  AND  est.dbid     = db_id('DBName')

** Please note I acquired this script from someone else, either via a blog or a book, however as it was such a long time ago I can’t recall exactly who to credit.

I then took the plan handle and ran the query below to evict the execution plan:

DBCC FREEPROCCACHE (<plan handle>)

After this had been done I ran the process again and it completed as normal! (I also ran the first query again and could see, based on the plan creation date, that it was a brand new execution plan.)

The Long Term Fix

Going forwards I could “fix” this by adding the “WITH RECOMPILE” option to the stored procedure so that an execution plan is created but never cached when it’s executed. Therefore every time the stored procedure is run, a brand new optimal execution plan will be created. The only downside is a CPU hit while the execution plan is created.

Alternatively I could add the OPTION (RECOMPILE) hint to the statement within the stored procedure that is causing the issue, so a plan is still created and cached for the procedure as a whole, just not for the troublesome statement.
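To illustrate where each option would go, here’s a hypothetical procedure (not the Nolan one, whose source I can’t reproduce here):

CREATE PROCEDURE dbo.DemoExtract
	@FromDate datetime,
	@ToDate datetime
-- Option 1: uncomment the line below and no plan is ever cached for the whole procedure
-- WITH RECOMPILE
AS
BEGIN
	SELECT COUNT(*)
	FROM sys.objects
	WHERE create_date BETWEEN @FromDate AND @ToDate
	-- Option 2: recompile just this statement on each run, leaving the rest of the plan cached
	OPTION (RECOMPILE);
END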

Final thoughts

One other thing to bear in mind is that execution plans don’t stay in the plan cache forever, which is why you won’t necessarily come across this issue very often.

For example, index maintenance and updating statistics, which are routines you might run frequently on your SQL Server databases, will invalidate associated execution plans, causing them to recompile the next time the stored procedure is run. Therefore it’s possible that if I’d left this issue overnight and index maintenance had run, this would have invalidated the plan; the next time the user ran the routine a new execution plan would have been created and the routine would have run as normal. This gives the impression the issue has “fixed itself” when in fact a new execution plan was created and used, which resolved the issue.

I hope this helps anyone else who comes across routines that suddenly run slowly when they had previously run fine. It might also explain why, when you leave issues overnight, things can appear to have magically fixed themselves when you try again the next day 🙂

Thanks for reading!