Identify and resolve “Adding a value to a ‘date’ column caused an overflow”

This is similar to my other post on how to identify conversion problems, but here’s the catch, we don’t always get a “CASE WHEN” statement or “TRY-CONVERT / CAST” function to help us! This makes it a little bit tricky to identify your problem rows, but we’ll tackle that together.

Let’s create a basic table to illustrate the point. We’ll have an ID column so we can identify our rows easily and we’ll have another column holding our date value. But before we go too far into this setup, what would be an example where you would receive such an error?

In my particular experience, I was calculating the start and end of week day based on a date. Take 2018-01-01 for instance, the week for that date starts on 2017-12-31 and the week ends on 2018-01-06. This is a relatively normal problem, but this sets us up for a scenario which we can apply our deductive reasoning and logic.

So let’s configure our environment so I can demonstrate this error along with the error “The conversion of a varchar data type to a datetime data type resulted in an out-of-range value.”

IF OBJECT_ID('tempdb.dbo.#Temp', 'U') IS NOT NULL
   DROP TABLE #Temp; 

CREATE TABLE #Temp 
 (id INT PRIMARY KEY IDENTITY (1,1),
 DateCol DATE
 -- DateCol VARCHAR(25)
 );
 
 INSERT INTO #Temp (DateCol)
 VALUES ('2018-01-01'),('0001-01-01'),('9999-12-31');

/*****************
Show data inserted
*****************/
SELECT 
       [ID], 
       [DateCol]
FROM [#Temp];
Result set from test table.

So far things look fine right? Well, what happens if we take this data and we try to calculate the beginning or ending of the week?

SELECT 
       DATEADD([dd], -(DATEPART([dw], [DateCol]) - 1), [DateCol]) AS [WeekStart], 
       DATEADD([dd], 7 - (DATEPART([dw], [DateCol])), [DateCol]) AS  [WeekEnd]
FROM [#Temp];
“Adding a value to a ‘date’ column caused an overflow” is the error we receive when trying to find the start and end of our date values in this data set.

So this is a pretty easy data set to read through and figure out, you likely will have hundreds/thousands/millions/billions of rows to decipher making this data set child’s play. The theory will scale, albeit it may scan the tables and take longer to reach a conclusion.

Now if your value is being inserted into a DATE type column, you don’t need to worry about verifying the column you are adding or subtracting on is actually a DATE or not. (This is because the DATE type performs this verification on insert/update and will cause invalid data points to fail.) You do need to worry however if the addition or subtraction of that DATE falls out of the range for the data type. For the DATE value, the range is ‘0001-01-01’ to ‘9999-12-31’.

What if your data type is a VARCHAR or similar and it allows for wiggle room? You could have letters, mixed up entries between month/day, or even an invalid date entry like ‘10000-01-01’! In this instance, I would recommend to use the CASE WHEN or TRY CAST or CONVERT functions to find those outliers. I’m not a fan of ISDATE() because it will say any value before ‘1753-01-01’ is an invalid date. (This is our exception to the above statement where I stated you can’t use CASE WHEN or TRY CAST / CONVERT. When the data type is anything but a date type, this method can help.)

Ok, so if the column is NOT a DATE type, we have the tools to find the bad data as described in my last conversion post with the TRY CAST/CONVERT or a CASE WHEN statement. This does not help us with an out of date range value. So how should we tackle this?

I would take your base query starting at the FROM clause all the way to the ORDER clause. Let’s append a where clause from the previous query so I can show you an example of what I mean. You would take this query:

SELECT 
       DATEADD([dd], -(DATEPART([dw], [DateCol]) - 1), [DateCol]) AS [WeekStart], 
       DATEADD([dd], 7 - (DATEPART([dw], [DateCol])), [DateCol]) AS  [WeekEnd]
FROM [#Temp]
WHERE ID > 0 AND DateCol IS NOT NULL;

Now strip out the base query:

SELECT 
id, DateCol
FROM [#Temp]
WHERE ID > 0 AND DateCol IS NOT NULL;

The SELECT piece is what’s causing our error, so we keep the FROM/WHERE clause intact because it will present us the data rows we need to analyze because this will contain both the rows causing a problem and rows not causing a problem. Let’s add an ORDER BY clause now and look at our data set.

SELECT TOP 100 
       [id], 
       [DateCol]
FROM [#Temp]
WHERE [ID] > 0
      AND [DateCol] IS NOT NULL
ORDER BY 
         [DateCol] DESC;
SELECT TOP 100 
       [id], 
       [DateCol]
FROM [#Temp]
WHERE [ID] > 0
      AND [DateCol] IS NOT NULL
ORDER BY 
         [DateCol] ASC;

Each query gives us the top end and bottom end of the values of the DateCol which are very easy to see. It’s very transparent to see in the first query that our DateCol is out of range and likewise, it’s very easy to see the opposite end in the second query.

Some other ideas if you are desperate to isolate the dirty data would be to create an ETL process via SSIS/Python or your choice of tool that has the ability to try the function you are performing and can distribute rows that succeed and rows that fail. You would either notate in a single table the data that succeeded and the data that failed or perhaps you have a single table to route data failures to and also their original value.

Good luck!

I have added in a form for feedback, it would be very helpful if you would take a few minutes to fill this out. My goal is to adjust the content to best help others resolve their problems.

Leave a Reply

Your email address will not be published. Required fields are marked *