T-SQL Copy & Paste Pattern – Increasing a performance problem

Disclaimer: The title is my assumption because I saw it in the past happening this way.

This blog post aims to make you remember something: something that is obvious to you, might not be obvious to others.

Scenario:

A client has a process which consists of a stored procedure that wraps a bunch of other stored procedures.
The process runs for about 10 hours.

Taking a look…what is running right now?

I was curious about the process, I’ve seen this running before but never explored the code. After a couple of days of seeing it running for so long, I decided to see what would be the random query I would get executing.

I had some luck and saw one with a pattern that I knew clearly that could be rewritten it and make it faster.

sp_WhoIsActive to the rescue

If you don’t know what sp_WhoIsActive (Adam Machanic‘s creation) stored procedure is, let me copy the short definition from the whoisactive.com website:

sp_whoisactive is a comprehensive activity monitoring stored procedure that works for all versions of SQL Server from 2005 through 2017.

You can download it from the download page or if you use dbatools you can use the command that will download it for you and install it. You can read more about it in my previous blog post New Version Of sp_WhoIsActive (V11.20) Is Available – Deployed On 123 Instances In Less Than 1 Minute

Using sp_WhoIsActive to get the current running query

EXEC sp_WhoIsActive

The query

The query that was being run has the following structure

SELECT column1
  FROM table1
 WHERE EXISTS (
               SELECT column2 FROM table2 WHERE table2.column1 = table1.column2
              )
   AND (
              EXISTS (SELECT column1 FROM table3 WHERE table3.column1 = table1.column1 AND table3.column2 = @param1 and table3.column3 = 0)
           OR EXISTS (SELECT column1 FROM table3 WHERE table3.column1 = table1.column1 AND table3.column2 = @param2 and table3.column3 = 0)
           OR EXISTS (SELECT column1 FROM table3 WHERE table3.column1 = table1.column1 AND table3.column2 = @param3 and table3.column3 = 0)
           OR EXISTS (SELECT column1 FROM table3 WHERE table3.column1 = table1.column1 AND table3.column2 = @param4 and table3.column3 = 0)
           OR EXISTS (SELECT column1 FROM table3 WHERE table3.column1 = table1.column1 AND table3.column2 = @param5 and table3.column3 = 0)
           OR EXISTS (SELECT column1 FROM table3 WHERE table3.column1 = table1.column1 AND table3.column2 IN (@param6, @param7) and table3.column3 = 0)
       )

However this default call does not bring the execution plan, for that you need to run using the @get_plans parameter:

EXEC sp_WhoIsActive @get_plans = 1

I have copied the XML that contains the sqlplan to SentryOne Plan Explorer and this was what I saw

Within the red circle we can see a concatenation operator (first one on top left) which will get the result of each EXISTS sub query (8 in total).

If we want to know which outer batch or stored procedure call was issued by the application or user we can use the @get_outer_command parameter

EXEC sp_WhoIsActive @get_plans = 1, @get_outer_command = 1

However, if we have nested calls, this will not show the current batch or stored procedure where the code of the current query belongs.

To get that we need to use the @get_full_inner_text parameter:

EXEC sp_WhoIsActive @get_plans = 1, @get_outer_command = 1, @get_full_inner_text = 1

This way, the sql_text column will contain the whole batch or stored procedure where the query belongs.

Back to the query – The pattern

Can you see the pattern? A lot of OR EXISTS() conditions. That is odd indeed, it wouldn’t be so odd if each OR EXISTS() was accessing different tables…oh, wait…they are not 🙂 and that is where the problem is.

Easy to improve

We can easily re-write the query without changing the logic or affecting the output data.

I have re-written the code in the following way:

SELECT column1
  FROM table1
 WHERE EXISTS (
               SELECT column2 FROM table2 WHERE table2.column1 = table1.column2
              )
   AND EXISTS (
               SELECT column1 
                 FROM table3 
                WHERE table3.column1 = table1.column1 
                  AND (
                           table3.column2 = @param1 
                        OR table3.column2 = @param2
                        OR table3.column2 = @param3
                        OR table3.column2 = @param4
                        OR table3.column2 = @param5
                        OR table3.column2 IN (@param6, @param7)
                      )
                  AND table3.column3 = 0
           )
       )

This way we will just hit the table3 once instead of one time per OR EXISTS().

The actual plan seems to have a much better shape:

A different approach would be a single IN () condition with all variables comma separated. However, I preferred this way as it’s easy to show to the developement team the differences between now and before.

In fact, when we use the IN operator the optimizer will expand it to various OR conditions. Example:

Result

With this change, I have improved the query by 99%.
Query went down from ~4340 seconds to less than 30 seconds.

NOTE: The table had ~46M records.

Also, the number of logical reads for the table dive-bombed!

Before (optimizer used a Worktable):

Table ‘Worktable’. Scan count 924230, logical reads 1045662012

Table ‘table3’. Scan count 6, logical reads 4967238

SQL Server Execution Times:
CPU time = 3824887 ms, elapsed time = 4344872 ms.

After:

Table ‘table3’. Scan count 9, logical reads 829734,

SQL Server Execution Times:
CPU time = 86345 ms, elapsed time = 26074 ms.

This means that on the whole process we have saved 1h!

How’d this happen?

As said in my title and initial disclaimer, this smells like a copy & paste pattern. Maybe something similar to:

Client: “Hey can we have another validation for a different value?”
Dev: “Sure, it’s pretty easy to do so”
Also Dev: copy & paste existing OR EXISTS () change parameter, commit to source control and push it into QA test (with few data) and it’s good to go into PROD.
Client: “Thanks it’s working just a little bit slower but today things are slower in general”

Wrap

When you find these kind of patterns, invest a couple of minutes to test it with better logic.
You may end saving a “couple” of CPU cycles and saving a lot of time.

Thanks for reading!

Allow SQL Server Agent jobs management for non sysadmins

I don’t know about your experience when it comes to SQL Server Agent jobs but often I receive requests to grant permission so that clients can handle their jobs in an autonomous way.

Any problem with that?

It depends 😄
If you’re not familiarized with the difficulty this can bring, let me share with you a different way to accomplish the task without adding the login to the sysadmin instance role. I’m sure that you don’t want to do that on a production instance.
The next possible way is to make the login as the owner of the job and it needs to belong to one of the msdb fixed database roles! You can take a look at this documentation and see the matrix there for each role.

This means that if we are talking about a single login, you won’t have any problem at all.

Using service accounts instead

It is common having services accounts that are job owners so they can run within the proper context.
In my humble opinion, this starts to be a little strange when it comes to modifying the agent job. It means that the service account needs permissions on agent roles and someone will need to connect to SQL Server using the service account (run as) so they can manage the agent job. It works, but not practical.

AD Groups

Many of the scenarios I get from clients is that they have an AD Group that contains all of all maintainers or support people.
Being able to put groups as job owner would be awesome, but unfortunately, it’s not possible.
I can hear the sad trombone!

My suggestion/recommendation/approach to this:

This may seem to be too much work, but at the end of the day, I feel it’s the best balance between security and the ability for the client to manage their agent jobs as they wish within their context.

Wrappers

I suggest that you create a wrapper for each system stored procedures that client need within msdb. It can be just the sp_update_job_step. You may also get a request to be able to change the schedules of the job and you need to create another wrapper also for the sp_update_job_schedule system stored procedure.

Security context

Use the EXECUTE AS OWNER so they can impersonate the sysadmins permissions and call the system procedure.

Be more granular (HIGHLY RECOMMENDED!!)

Say that you have your administration jobs and the jobs from the client. To narrow down the scope for the client you may want to add an extra validation using the job name prefix.

If they don’t have a naming convention, you can always ask if they can start using one and update the agent jobs accordingly.

Ask them to help you to help them! 🙂

The base principle here is simple if the name starts with the specified prefix the execution can proceed, otherwise, it will stop running and will return an error message saying they can’t update that specific job.

Here is a code example of a wrapper for the sp_update_job system stored procedure:

USE [msdb]
GO

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

CREATE PROC [dbo].[sp_update_job_for_non_admins]
	@job_id                       UNIQUEIDENTIFIER = NULL, -- Must provide this or current_name
	@job_name                     sysname          = NULL, -- Must provide this or job_id
	@new_name                     sysname          = NULL,
	@enabled                      TINYINT          = NULL,
	@description                  NVARCHAR(512)    = NULL,
	@start_step_id                INT              = NULL,
	@category_name                sysname          = NULL,
	@owner_login_name             sysname          = NULL,
	@notify_level_eventlog        INT              = NULL,
	@notify_level_email           INT              = NULL,
	@notify_level_netsend         INT              = NULL,
	@notify_level_page            INT              = NULL,
	@notify_email_operator_name   sysname          = NULL,
	@notify_netsend_operator_name sysname          = NULL,
	@notify_page_operator_name    sysname          = NULL,
	@delete_level                 INT              = NULL,
	@automatic_post               BIT              = 1     -- Flag for SEM use only

WITH EXECUTE AS OWNER
AS
	BEGIN
		
		IF EXISTS (SELECT [name]
				     FROM msdb.dbo.sysjobs
					WHERE job_id = @job_id
					 AND [name] LIKE '<customer prefix>%'
			) 
			BEGIN 
				EXEC msdb.dbo.sp_update_job
					@job_id                      
					,@job_name                    
					,@new_name                    
					,@enabled                     
					,@description                 
					,@start_step_id               
					,@category_name               
					,@owner_login_name            
					,@notify_level_eventlog       
					,@notify_level_email          
					,@notify_level_netsend        
					,@notify_level_page           
					,@notify_email_operator_name  
					,@notify_netsend_operator_name
					,@notify_page_operator_name   
					,@delete_level                
					,@automatic_post
			END
		ELSE
			BEGIN
				RAISERROR ('The job_id used does not belong to an <customer prefix> job.', 16, 1);  
			END         
	END
GO

 

More examples on my GitHub

If you want to leverage on the stored procedures that I have already created you can download them from ManageAgentJobsNonsysAdmin folder on my GitHub repository

Giving permissions to the wrapper objects

Create a role (will be easy to manage) on the msdb database and add the logins (nominal or groups) to it.
Grant EXECUTE permissions for all of your compiled *_for_non_admins stored procedures.

Other options are to grant the EXECUTE permissions to the existing database fixed role where the client login/group is already member.

T-SQL code example:

use [msdb]
GO
GRANT EXECUTE ON [dbo].[sp_update_job_for_non_admins] TO [SQLAgentOperatorRole]
GO

 

You can be more creative but be aware of the maintenance costs

Another option that comes into my mind is to include the agent jobs names (totally or partially) on a control table.
However, this can bring more work when talking about maintaining this process. If the client has somehow a fixed number of jobs maybe it is not too much work otherwise it can be a nightmare.

Wrap up

We have seen how we can provide more control to clients so they can manage their jobs without compromising security.
I have used this method several times with success. By success I mean, I was able to explain to the client the limitation on SQL Server side and, on the other hand I present to them a possible solution.
It is not easy to change the code within a step using the stored procedure instead of using the interface? Sure it is not! But at least you provide a working solution that makes the client autonomous! In order to help them show how they can script out the changes, add the suffix for NonAdmins and they are good to go.

Thanks for reading.

“Ups…I have deleted some data. Can you put it back?” – dbatools for the rescue

Few days ago I received a request to restore a dozen of tables because someone have deleted more data than it was supposed.
I immediately thought about dbatools for the job!

NOTE: I also thought about SSMS “Import/Export Data”. And this is ok when someone says “it’s to run just once, just now”. When you are in the IT for a while you know that is not only once :-). And what if I need to re-run? Or save it as example and share with my colleagues? Doing this using a PowerShell script makes it much more flexible.

At this post, I’m assuming that we already have restored what will be our source database using the backup requested by the client.

Let’s start – Finding the right command

I knew we have a Copy command for table data.

To search for commands within dbatools we can use the Find-DbaCommand.

Find-DbaCommand -Pattern Copy*Table*Data

Note: As you can see, we can use wildcards to do the search when using -Pattern parameter.

From the results list, we know that we have at least three commands (as of v0.9.824) that match the specified pattern. By reading the Synopsis of the first command that seems what we need to accomplish the work.

Get familiarized with the command

Don’t forget to use Get-Help cmdlet to find the available parameters and get some examples on how you can use the command.

Get-Help Copy-DbaDbTableData -Detailed

Let’s try the command and copy the data

$params = @{
    SqlInstance = 'sql1'
    Destination = 'sql2'
    Database = 'db1'
    DestinationDatabase = 'db2'
    Table = '[dbo].[Table]'
    DestinationTable = '[dbo].[Table]'
}
Copy-DbaDbTableData @params

Result

WARNING: [14:26:06][Copy-DbaDbTableData] Table [dbo].[Table] cannot be found in db2. Use -AutoCreateTable to automatically create the table on the destination.

Wait a moment…it yielded a warning?

Oh, I see…this will not work if the table does not exists on the destination database.

At this point I realised that some of the requested tables were dropped. The client did not specify that.

But dbatools got you covered and the warning gives you one hint: use -AutoCreateTable which will, per its description,

Creates the destination table if it does not already exist, based off of the “Export…” script of the source table.

That is nice!

I added the -AutoCreateTable parameter and rerun the command

$params = @{
    SqlInstance         = 'sql1'
    Destination         = 'sql2'
    Database            = 'db1'
    DestinationDatabase = 'db2'
    Table               = '[dbo].[Table]'
    DestinationTable    = '[dbo].[Table]'
    AutoCreateTable     = $true
}
Copy-DbaDbTableData @params

And now it has completed successfully.

At this point you may be thinking…”everything is done!”

The client just said that he wants to replace all the data from last backup for a dozen of tables. In this case some of the tables do not exist anymore, they were dropped.. so adding the -AutoCreateTable parameter helped.

And we are done! Aren’t we?

Not so fast!

If the table don’t have any indexes, triggers, etc, or you just want the table skeleton, yes the work here is done!
Otherwise, we are not finished yet. And this was the scenario.

Primary key was missing and some constraints and indexes as well.

Beyond default scripting options

As I have demonstrated on my last post Scripting SQL Server objects with dbatools – Beyond default options, we can generate an object of ScriptingOptions and use the -ScriptingOptionsObject parameter available on the Export-DbaScript command to specify more than the defaults.

Exploring other options

As stated on the previous post, we have some properties like:

  • Indexes
  • DriPrimaryKey
  • DriForeignKeys
  • etc

By default they are set to $false and that explains why our -AutoCreateTable parameter (which uses default options) didn’t bring all of this details from our source tables.

This means that with a little bit more code we will be able to accomplish our requirements.

The code

Note: This is an modified version (to meet my needs) of the original script that I have borrowed from Andy Levy’s (b | t) blog post Copying Individual Tables with dbatools. Andy is also a dbatools contributor.

$SourceServer = "SQL1";
$DestinationServer = "SQL2"
$SourceDB = "srcDB";
$DestinationDB = "dstDB";
$tables = "Table1", 'Table2', "Table3";

$options = New-DbaScriptingOption

$options.DriPrimaryKey = $true
$options.DriForeignKeys = $true
$options.DriUniqueKeys = $true
$options.DriClustered = $true
$options.DriNonClustered = $true
$options.DriChecks = $true
$options.DriDefaults = $true

$tables | ForEach-Object {
    # Get the table definition from the source
    [string]$tableScript = Get-DbaDbTable -SqlInstance $SourceServer -Database $SourceDB -Table $_ | Export-DbaScript -ScriptingOptionsObject $options -Passthru;

    if (-not [string]::IsNullOrEmpty($tableScript)) {
        if ($null -eq (Get-DbaDbTable -SqlInstance $DestinationServer -Database $DestinationDB -Table $_)) {
        # Run the script to create the table in the destination
        Invoke-DbaQuery -Query $tableScript -SqlInstance $DestinationServer -Database $DestinationDB;
        }
        else {
            Write-Warning "Table $_ already exists in detination database. Will continue and copy the data."
        }

        # Copy the data
        Copy-DbaDbTableData -SqlInstance $SourceServer -Database $SourceDB -Destination $DestinationServer -DestinationDatabase $DestinationDB -KeepIdentity -Truncate -Table $_ -DestinationTable $_;
    }
    else {
        Write-Warning "Table $_ does not exists in source database."
    }
}

This will do the following steps:

  • New-DbaScriptingOption creates a new object of Scripting Options type and then we set a bunch of properties as $true.

For each of the tables that we define on our $tables list variable:

  • Export-DbaScript will generate the T-SQL script from the source table using the properties that we have defined. In this case, Keys (Primary, Foreign, Unique), Defaults and Checks Constraints and also clustered and non-clustered indexes.

  • Invoke-DbaQuery will run the generated script on the destination database. At this point we have the same table structure on both sides. This will only run if the table does not exists on the destination database.

  • Finally we use our Copy-DbaDbTableData command to copy the records. Here I have choosen to truncate the table with -Truncate parameter and keep identity values by specifying the -KeepIdentity.

Special callout:

If these tables have relations between them, you need to specify the table list in a specific order to make sure that parent tables are created before child tables. Otherwise you will get, as expected, some errors.

Final thoughts

dbatools has, literally, hundreds of commands so it’s important to know how to find the right one for the job.
Make sure you take a look on the help to understand what is or not covered by the process. As we saw in this example, assuming that the table would be created is a pitfall. For that, we need to use another parameter, but even so when reading the parameter description we will understand that this table will not be created with its full definition. This means that primary key, constraints, indexes, etc will not be created by default.

Because we are using PowerShell, and there is more than one way to accomplish a task, combining both commands as the New-DbaScriptingOption will make it more flexible and possible to reach the final result that we need.

Thanks for reading!

Scripting SQL Server objects with dbatools – Beyond default options

Probably you had the need to script out some objects from a SQL Server instance/database and this is quite easy. You just need to right click on the object (well…not on every single one, try it with an Availability Group :-), no script option available) select “Script XXXX as” and you have it.

But have you realized that this option doesn’t bring all the stuff?
Let’s say you are scripting a table and you have a Non-Clustered index or a trigger…using this option some of the objects under the table will not be scripted out. I understand this as each one of this is a different object, which means if you go to each one you can right click and script just that one.

SSMS – “Generate Scripts…” option

This is a tedious work and you can easily miss some objects. What can we do on SSMS to make this task easier?

You can accomplish this task by using the “Generate Scripts…” option under “Tasks” when you right-click on the database:

This will open the wizard and at the “Set Scripting Options” you can just click on the “Advanced” button and there you can change the properties.
Here you can see that some defaults are in place, and “Script Indexes” is one of them.

This is much easier right? All-in-one in a single scripting operation.

What about automating this task? You want to script out multiple objects from different instances/databases.

Enter dbatools’ “Export-” commands

To search for commands within dbatools we can use the Find-DbaCommand.

Find-DbaCommand -Tag Export

This command give to us a nice Synopsis text that help to find which command is the one we want. From the output we can see that we have (as of v0.9.824) 5 commands tagged as Export.

To replicate our SSMS example using PowerShell we will use the Export-DbaScript.

Don’t forget to use Get-Help cmdlet to find the available parameters and get some examples on how you can use the command.

Get-Help Export-DbaScript -Detailed

Tip: Try other switches like -Examples or even -ShowWindow (won’t work on PSCore but dbatools does!) switches.

Example using our “MyTable” object

Here is how MyTable looks like:

  • 3 Columns
  • 1 Default constraint
  • 1 Non-Clustered Index
Get-DbaDbTable -SqlInstance SQL1 -Database DB1 -Table MyTable | Export-DbaScript -Passthru

Note: I’m using -PassThru parameter to output the script to the console, by default it will create a SQL file.

The output of this execution is even more incomplete when comparing with SSMS. Here, we dont even get the default constraint scripted.

Using “New-DbaScriptingOption” command

dbatools has a command that makes it possible to create an object of type ScriptingOptions.
Then we can change the properties like we have done before on the “Generate Scripts…” option on SSMS.

$options = New-DbaScriptingOption
$options | Get-Member

Use Get-Member so you can see what properties the object offers.

Here we start seeing what we need.

By default what are the values of properties like NonClusteredIndexes and DriDefaults

$options = New-DbaScriptingOption
$options.NonClusteredIndexes
$options.DriDefaults

False! That explains why they are “missing” from our default Export-DbaScript output.

NOTE: Export-DbaUser comamnd can also leverage on this object. Try it.

Let’s change this options to $true and pass our $options object as the value of the -ScriptingOptionsObject parameter and run the command again.

$options = New-DbaScriptingOption
$options.NonClusteredIndexes = $true
$options.DriDefaults = $true
Get-DbaDbTable -SqlInstance SQL1 -Database DB1 -Table MyTable | Export-DbaScript -Passthru -ScriptingOptionsObject $options

Nice! Now we can see all the stuff.

Try it yourself

See the other options available, change the values, rerun and analyse the output.
Do you need to export it to run on a lower SQL Server version? Change the TargetServerVersion option

#Will script the code like SQL Server 2014
$options.TargetServerVersion = "Version120"

You want to include the “IF NOT EXISTS” statement? Change the “IncludeIfNotExists” option.

Here is an example to script out a list of tables (add more to the $TableName variable):

$SourceServer = "SQL1";
$SourceDB = "DB1";
$TableName = "MyTable", 'YourTable';

$options = New-DbaScriptingOption

$options.DriPrimaryKey = $true
$options.DriForeignKeys = $true
$options.DriNonClustered = $true
$options.DriChecks = $true
$options.DriDefaults = $true
$options.Indexes = $true
$options.IncludeIfNotExists = $true

$TableName | Foreach-Object {
    Get-DbaDbTable -SqlInstance $SourceServer -Database $SourceDB -Table $_ | Export-DbaScript -ScriptingOptionsObject $options -Passthru;
}

Availability Groups example using dbatools

Try it yourself:

Get-DbaAvailabilityGroup -SqlInstance SQL1 -AvailabilityGroup SQL1_AG1 | Export-DbaScript -Passthru

Summary

We have seen how we can leverage on some dbatools commands to generate T-SQL scripts from objects. This way we can versioning or just run them on other instace/database. We have seen what default options it (doesn’t) brings and how to add more options to it.
We also saw that, for some objects, the script option is not available. As example Availability Groups, but dbatools can help you here too.

Thanks for reading!