Don’t cutoff yourself with dates in T-SQL – Did you know…

Almost all the applications we use have a lot of options. And, sometimes we even stumble across them by accident.
Who’s ever heard something like “Normal user don’t even use 5% of Excel capabilities!”?

Other options, we know they exist but because the default value it’s ok (so far) we tend to forget it. It is just sitting there, waiting for the day we want/need to change it.

The oddity

Few days ago I was talking with a friend that show me some “odd behavior” when working with dates on SQL Server.

Are you mad?! Is this SQL Server bugged?

In case you ask, this is my table definition:

CREATE TABLE DatesFormat
(
    dt DATETIMEOFFSET DEFAULT ('1900-01-01')
);
GO

Let’s take a closer look at the SELECT.

  1. On the 1st column, we can see the complete value of the column
  2. The 2nd column picks the year from the date column
  3. The 3rd one also picks the year from a datetime but declared as string.

But why don’t the 2nd and 3rd columns return the exact same value?!

What is your interpretation?

What do you read when you see some date in a format like “01-Jan-00 00:00:00.000”? Keep in mind that I’m talking about the output directly from the table and without any formatting.
1st of January seems to leave no doubt (just because there is no default date format starting with two digits for the year), but…what about the year part ’00’?
It stands for 1900 and the 3rd column is wrong?
Or it stands for 2000 and the DATEPART function is returning the wrong value?

Both are returning the correct value! Say hello to “Two Digit Year Cutoff” configuration

You can find it on the advanced tab in the Server Proprieties:

Or by running the sp_configure command:

EXEC sp_configure 'show advanced options', 1;  
GO  
RECONFIGURE ;  
GO  
EXEC sp_configure 'two digit year cutoff';  
GO 

Or even using dbatools PowerShell module:

Get-DbaSpConfigure -SqlInstance sql2016 -ConfigName 'TwoDigitYearCutoff'

Output:

That’s right! This option is the one that makes the last column value being translated to 2000 and not 1900.

If we change the configuration to 1999 instead of 2049 (default value) the result of the DATEPART will be 1900 but having it as 2049 will convert the year as 2000 (from the date that is a string).

The test:

-- Check the running value
EXEC sp_configure 'two digit year cutoff';
GO

-- Notice the 3rd column
SELECT dt, DATEPART(YEAR, dt), DATEPART(YEAR, '01-Jan-00 12:00:00 AM +00:00') 
  FROM dbo.DatesFormat
GO

-- Change the configuration
EXEC sp_configure 'two digit year cutoff', 1999;  
GO  
RECONFIGURE ;  
GO 

-- Confirm the running value
EXEC sp_configure 'two digit year cutoff';
GO

-- See the different value on the 3rd column
SELECT dt, DATEPART(YEAR, dt), DATEPART(YEAR, '01-Jan-00 12:00:00 AM +00:00') 
  FROM dbo.DatesFormat

Output:

Remember, this only happens when you use a literal string.

To set a new value using dbatools:

Set-DbaSpConfigure -SqlInstance sql2016 -ConfigName 'TwoDigitYearCutoff' -Value 1999

Output:

What about the returning value?

Yeah, I know, why is the value of the first column returned on that format? You are used to seeing in the format of yyyy-MM-dd right?
I’ll explain this in a next post! Stay tuned.

Summary

Next time you have to work with dates in formats like dd-MMM-yy remember that “Two Digit Year Cutoff” configuration exists and may mislead you.

To complete the question…”Did you know that ‘Two Digit Year Cutoff’ configuration can trick you?” Now you do.

Thanks for reading.

Did you know…you can change the default file encoding for new query on SSMS?

If you have been reading my last blog posts, you know that I’m currently working on a SQL code migration from Firebird to SQL Server.

The client provided the scripts with all modules (Stored Procedures, functions, etc) and the steps I’m following (roughly speaking) for converting are:

  1. Open new query window
  2. Copy and paste de object code
  3. Save the file
    This is how the file look like:

  4. Run a PowerShell script that does a find and replace based on a hashtable. Apply all the changes and save the file again.

  5. The file refresh on SSMS
    This is how the file look like after the find and replace:

  6. Unicode characters are broken 😦

So…what is happening?

The file that is used to create a new query window has ANSI encoding but when I save the file on the PowerShell script I save it as UTF-8 because the client have comments on the code with unicode characters.

On this process, the unicode characters are replaced by some symbols.

How to solve?

We can change the default file encoding in order to be the one we want in the first place. What I have done was change from ANSI encoding to UTF-8.

This way, when use the keyboard shortcut CTRL + N to open a new window and hit Save, I’m saving as UTF-8 which means that the PowerShell script will do the find and replace, save the file and preserve the unicode characters. 🙂

Where is that default file?

My path to the file is C:\Program Files (x86)\Microsoft SQL Server\140\Tools\Binn\ManagementStudio\SqlWorkbenchProjectItems\Sql where the 140 stands for the SSMS v17 (in my case, right know I’m using the v17.5).
Inside this folder we can find the file SQLFile.sql.

We just need to open it, for example with notepad, do the Save As and choose the encoding we want.

Now if you go to the SSMS, open a new query window and hit save, you can confirm that the file is saved by default with the new encoding you have setup.

With this I have eliminated one tedious step that forced me to do the Save As and choose the UTF-8 every single file I wanted to save. I have hundreds of objects so, this small improvement times hundreds of iterations save me a good amount of time!

Thanks for reading.

Using CAST() function with User-Defined Data Types…Did you know…

I’m converting some Firebird database code to T-SQL and I’m seeing a lot of code that doesn’t work in the same way that SQL Server.
No surprise – I already expected that but for those that still say that “all engines/databases are equal”…”is SQL right?” here is another proof that is not true.

On Firebird it is possible to use the CAST function using a DOMAIN (the equivalent in SQLServer is UDDT – User-Defined Data Types) as target data-type, well turns out that on SQL Server…that is not possible.

Which means that, even if I have created a UDDT as:

CREATE TYPE dbo.VARCHAR_50 FROM VARCHAR(50);

The following code will not work

SELECT CAST('value' AS VARCHAR_50)

and it yields the following error message:

Msg 243, Level 16, State 2, Line 1
Type VARCHAR_50 is not a defined system type.

this means that we need to change it to the defined system type like:

SELECT CAST('value' as VARCHAR(50))

Maybe it works with CONVERT() function?!…not really, the behaviour is the same.

To finish the title…Using CAST() function with User-Defined Data Types…Did you know… it is not possible. You need to use the system type.

Thanks for reading.

Did you know…you can’t do arithmetic operations with datetime2 like with datetime?

I’m currently working on a SQL code migration from Firebird to SQL Server and I hit an error that I haven’t seen for some time.

The error message is the following:

Msg 206, Level 16, State 2, Line 4
Operand type clash: datetime2 is incompatible with int

This ringed the bell right away! Somewhere on the code someone was trying to do an arithmetic calculation without using the proper function.

How so?

In the early days of my T-SQL coding, I used to do this a lot. Also, I still see some code from other applications that still use it this way. Take, for instance, the following code that returns all orders placed with more than 1 day old:

SELECT OrderId, ClientId, Quantity, OrderDate
FROM dbo.Orders
WHERE OrderDate < GETDATE() -1

For this example let’s say that the OrderDate column is a DATETIME2. This works just fine because the GETDATE() function returns a DATETIME value and thus we can subtract one day from it.

If we define a variable of DATETIME2 datatype and assign it a GETDATE() value, then attempt to subtract-1 from the variable, an error will yield!

DECLARE @vOrderDate DATETIME2 = GETDATE()
SELECT OrderId, ClientId, Quantity, OrderDate
FROM dbo.Orders
WHERE OrderDate < @vOrderDate - 1

Msg 206, Level 16, State 2, Line 20
Operand type clash: datetime2 is incompatible with int

But it was working!?

Yes it was on the source engine (Firebird) and it will still work on the destination (SQLServer) if the datatype is still the same – DATETIME.

What happened here was the column datatype was changed during the schema migration from DATETIME to DATETIME2.

NOTE: The most recent date/time datatypes appeared with SQL Server 2008. They are DATE, TIME, DATETIME2, DATETIMEOFFSET.
Also, bear in mind that actually the DATETIME and SMALLDATETIME datatypes are the only from the date/time family that supports this arithmetic operations.

How to fix this error?

To solve this, we need to convert the

@vOrderDate  -1

to

DATEADD(dd, -1, @vOrderDate) 

Whole code looks like:

DECLARE @vOrderDate DATETIME2 = GETDATE()
SELECT OrderId, ClientId, Quantity, OrderDate
FROM dbo.Orders
WHERE OrderDate < DATEADD(dd, -1, @vOrderDate) 

this way, it will work with DATETIME, DATETIME2, DATE, SMALLDATETIME and DATETIMEOFFSET datatypes.

NOTE: DATEADD also support TIME datatype, I didn’t mention because on our example we are subtracting DAYS, and as (at least I) expected this will give an error.

Wrap up

Are you thinking about changing your DATETIME columns to DATETIME2? Or are you just beginning to use it in your projects?
Documentation encourages you to do so (https://docs.microsoft.com/en-us/sql/t-sql/data-types/datetime-transact-sql) but as you could see from this post, you need to pay attention and do the proper testing and T-SQL code revision.

Thanks for reading.

Using Common Table Expression (CTE) – Did you know…

Today I will write just a short blog post to do a quick reminder!

I still hear a lot of people suggesting CTEs because they think it works like a temporary table (you populate the table and then it can be/is reutilized).

It doesn’t!

From de documentation:

Specifies a temporary named result set, known as a common table expression (CTE).

Maybe they are focusing on the “temporary” word.

Using the CTE two times will perform two different executions! Don’t believe me? See the next example!
If we run the following code do you expect to get the same value for both queries? Note: we have a UNION ALL between them.

WITH cte AS
(
	SELECT NEWID() AS Col1
)
SELECT Col1
  FROM cte
UNION ALL
SELECT Col1
  FROM cte

Sorry to disappoint you but it will run the CTE’s code twice and return the value(s) from each execution.
As we are using the function NEWID(), two different values will be generated.

output.png

To complete the question: “Did you know that CTE’s code will be executed as many times as you use it?”

Thanks for reading!

DELETE data on SQL Server HEAP table – Did you know…

Before I complete my question let me provide context.

I’ve received an alert saying that a specific database could not allocate a new page (disk was full)

The message that you will see on the SQL Server Error log is:

Could not allocate a new page for database ” because of insufficient disk space in filegroup ”. Create the necessary space by dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.

I didn’t know the database structure or what is saved there, so I picked up a script from my toolbelt that shreds all indexes from all table. Just some information like number of rows and space that it is occupying. I have sorted by occupying space in descending order, look what I found…

So…my script has a bug? 🙂 No, it hasn’t!

The joy of heaps

First, the definition:

A heap is a table without a clustered index. One or more nonclustered indexes can be created on tables stored as a heap. Data is stored in the heap without specifying an order. Usually data is initially stored in the order in which is the rows are inserted into the table, but the Database Engine can move data around in the heap to store the rows efficiently; so the data order cannot be predicted. To guarantee the order of rows returned from a heap, you must use the ORDER BY clause. To specify the order for storage of the rows, create a clustered index on the table, so that the table is not a heap.

Source: MS Docs – Heaps (Tables without Clustered Indexes)

Until now, everything seems normal, it is just a table with unordered data.

Why am I talking about heaps?

Not because of table name (was created on propose for this demo), let me show to you the whole row of the script:

Do you have a clue? Yup, index_id = 0. That means that our table does not have a clustered index defined and therefore it is an HEAP.

Even so, how it is possible? 0 rows but occupying several MB…

The answer is…on the documentation 🙂

When rows are deleted from a heap the Database Engine may use row or page locking for the operation. As a result, the pages made empty by the delete operation remain allocated to the heap. When empty pages are not deallocated, the associated space cannot be reused by other objects in the database.

source: DELETE (Transact-SQL) – Locking behavior

That explains it!

So…what should I do in order to get my space back when deleting from a HEAP?

On the same documentation page we can read the following:

To delete rows in a heap and deallocate pages, use one of the following methods.

  • Specify the TABLOCK hint in the DELETE statement. Using the TABLOCK hint causes the delete operation to take an exclusive lock on the table instead of a row or page lock. This allows the pages to be deallocated. For more information about the TABLOCK hint, see Table Hints (Transact-SQL).
  • Use TRUNCATE TABLE if all rows are to be deleted from the table.
  • Create a clustered index on the heap before deleting the rows. You can drop the clustered index after the rows are deleted. This method is more time consuming than the previous methods and uses more temporary resources.

Following the documentation, it suggest we can to use the TABLOCK hint in order to release the empty pages when deleting the data.
Example:

DELETE 
  FROM dbo.Heap WITH (TABLOCK)

What if I didn’t that way or if anyone else run a DELETE without specify it?

You can rebuild your table using this syntax (since SQL Server 2008):

ALTER TABLE dbo.Heap REBUILD

This way, the table will release the empty pages and you will recovery the space to use on other objects in the database.

Wrap up

I hope that with this little post you understood how and why a HEAP can have few rows or even zero but still occupy lots of space. Also I have mentioned two ways to solve this problem.
Also, I have found databases with dozens of HEAPS almost empty or even empty that were occupying more than 50% of the total space allocated to the database. And guess what? People where complaining about space.

To finish, I need to complete the title, Did you know…you should use TABLOCK hint when deleting data from a HEAP?

Thanks for reading!