Wednesday, April 5, 2017

String to DATE conversion and validation in 12.2

A new little feature in Oracle Database 12.2 is, that you can convert strings to dates without worrying about exception handling. (That goes also for converting to numbers or timestamps or other datatypes, but here I'll concentrate on dates.)

It's just one of a ton of new 12.2 features ranging from minor features that make your daily life slightly easier to major inventive features. I've been part of a Trivadis Team that deep-dived into all of the new features to create the Trivadis TechnoCircle on 12.2 new features and it's been very interesting.

So this is just one new feature, but nice enough for a developer to merit a little blogging about it.

It's actually a double feature - a VALIDATE_CONVERSION function to validate if a conversion can take place, and DEFAULT ON CONVERSION ERROR syntax for TO_DATE to run without raising exceptions for conversion errors.

VALIDATE_CONVERSION

The function returns 1 if the string can be converted according to the format mask:

select validate_conversion(
          '2017-04-05' as date
        , 'YYYY-MM-DD'
       ) is_valid
  from dual
/

  IS_VALID
----------
         1

And it returns 0 if it cannot be converted according to format mask:

select validate_conversion(
          'April 5. 2017' as date
        , 'YYYY-MM-DD'
       ) is_valid
  from dual
/

  IS_VALID
----------
         0

It supports NLS parameter just like TO_DATE:

select validate_conversion(
          '5. Abril 2017' as date
        , 'DD. Month YYYY'
        , 'NLS_DATE_LANGUAGE=SPANISH'
       ) is_valid
  from dual
/

  IS_VALID
----------
         1

If you validate conversion for the NULL string, it returns 1, because a NULL string can be passed to TO_DATE without error and be converted to a NULL date:

select validate_conversion(
          null as date
        , 'YYYY-MM-DD'
       ) is_valid
  from dual
/

  IS_VALID
----------
         1

DEFAULT ON CONVERSION ERROR

Let's set session NLS_DATE_FORMAT to make sure we can view output dates:

alter session set nls_date_format = 'YYYY.MM.DD'
/

In the TO_DATE function, a default can be set that will be returned on conversion error instead of exception raised:

select to_date(
          'April 5. 2017' default '4000-01-01' on conversion error
        , 'YYYY-MM-DD'
       ) the_date
  from dual
/

THE_DATE
----------
4000.01.01

It also works with conversion errors due to NSL errors, not just errors from the format mask itself:

select to_date(
          '5. Abril 2017' default '1. January 4000' on conversion error
        , 'DD. Month YYYY'
        , 'NLS_DATE_LANGUAGE=AMERICAN'
       ) the_date
  from dual
/

THE_DATE
----------
4000.01.01

Note that the default value must be a string that conforms to the format mask. The default string will always be attempted converted to a date, even if there are no conversion errors. So if the default string cannot be converted according to format mask, exception is always raised, irrespective of whether there are errors in the actual value to be converted.

select to_date(
          'April 5. 2017' default date 'January 1. 4000' on conversion error
        , 'YYYY-MM-DD'
       ) the_date
  from dual
/

          'April 5. 2017' default date 'January 1. 4000' on conversion error
                                       *
ERROR at line 2:
ORA-01841: (full) year must be between -4713 and +9999, and not be 0

We might feel it natual to use a DATE for default value, but beware: it is actually implicitly converted to a string, which then afterwards is explicitly converted to a date again (just like if we had entered a default string value.)

This example works only because the session NLS_DATE_FORMAT of 'YYYY.MM.DD' makes the implicit conversion of the DATE return '4000.01.01', which can successfully be converted back to a DATE by the format mask 'YYYY-MM-DD':

select to_date(
          'April 5. 2017' default date '4000-01-01' on conversion error
        , 'YYYY-MM-DD'
       ) the_date
  from dual
/

THE_DATE
----------
4000.01.01

But let us change session NLS_DATE_FORMAT:

alter session set nls_date_format = 'MM/DD-RR'
/

And now the DATE literal in the exact same query is implicitly converted to string '01/01-40', which makes the explicit conversion back to DATE fail and raise an exception:

select to_date(
          'April 5. 2017' default date '4000-01-01' on conversion error
        , 'YYYY-MM-DD'
       ) the_date
  from dual
/

          'April 5. 2017' default date '4000-01-01' on conversion error
                                  *
ERROR at line 2:
ORA-01847: day of month must be between 1 and last day of month

So let us get back to our original NLS_DATE_FORMAT for the rest of the demo:

alter session set nls_date_format = 'YYYY.MM.DD'
/

CAST alternative to TO_DATE

The CAST function now supports format masks like TO_DATE (or like TO_NUMBER or other TO_* functions if CAST to other datatypes), so now is practically an alias syntax that works like TO_* functions:

select cast(
          '2017-04-05' as date
        , 'YYYY-MM-DD'
       ) the_date
  from dual
/

THE_DATE
----------
2017.04.05

CAST also has been given the same DEFAULT ON CONVERSION ERROR clause:

select cast(
          'April 5. 2017' as date default '4000-01-01' on conversion error
        , 'YYYY-MM-DD'
       ) the_date
  from dual
/

THE_DATE
----------
4000.01.01

And NLS parameters can also be used in CAST:

select cast(
          'Abril 5. 2017' as date default 'January 1. 4000' on conversion error
        , 'Month DD. YYYY'
        , 'NLS_DATE_LANGUAGE=AMERICAN'
       ) the_date
  from dual
/

THE_DATE
----------
4000.01.01

Combine validation and conversion

We can use multiple VALIDATE_CONVERSION calls to try out multiple format masks, but beware of having too general format masks.

In this example '5.april 17' will actually match format mask 'DD-MM-RRRR' and output '2005.04.17':

with s(str) as (
   select '2017-04-05' from dual union all
   select '5/4-2017'   from dual union all
   select '17.4.05'    from dual union all
   select '5.apr 2017' from dual union all
   select '5.abril 17' from dual union all
   select '5.april 17' from dual
)
select s.str
     , case
          when validate_conversion(s.str as date, 'RRRR-MM-DD') = 1
                              then to_date(s.str, 'RRRR-MM-DD')
          when validate_conversion(s.str as date, 'DD-MM-RRRR') = 1
                              then to_date(s.str, 'DD-MM-RRRR')
          when validate_conversion(s.str as date, 'DD-MM-RRRR', 'NLS_DATE_LANGUAGE=SPANISH') = 1
                              then to_date(s.str, 'DD-MM-RRRR', 'NLS_DATE_LANGUAGE=SPANISH')
       end the_date
  from s
/

STR        THE_DATE
---------- ----------
2017-04-05 2017.04.05
5/4-2017   2017.04.05
17.4.05    2017.04.05
5.apr 2017 2017.04.05
5.abril 17 2017.04.05
5.april 17 2005.04.17

We can alter the order of the WHEN clauses to help make '5.april 17' convert correctly.

But now '17.4.05' will match 'DD-MM-RRRR' to output '2005.05.17' :-(

with s(str) as (
   select '2017-04-05' from dual union all
   select '5/4-2017'   from dual union all
   select '17.4.05'    from dual union all
   select '5.apr 2017' from dual union all
   select '5.abril 17' from dual union all
   select '5.april 17' from dual
)
select s.str
     , case
          when validate_conversion(s.str as date, 'DD-MM-RRRR', 'NLS_DATE_LANGUAGE=SPANISH') = 1
                              then to_date(s.str, 'DD-MM-RRRR', 'NLS_DATE_LANGUAGE=SPANISH')
          when validate_conversion(s.str as date, 'DD-MM-RRRR') = 1
                              then to_date(s.str, 'DD-MM-RRRR')
          when validate_conversion(s.str as date, 'RRRR-MM-DD') = 1
                              then to_date(s.str, 'RRRR-MM-DD')
       end the_date
  from s
/

STR        THE_DATE
---------- ----------
2017-04-05 2017.04.05
5/4-2017   2017.04.05
17.4.05    2005.04.17
5.apr 2017 2017.04.05
5.abril 17 2017.04.05
5.april 17 2017.04.05

So to make this work better, we can use FX modifier to make our format masks very specific. That reduces the risk of wrong matches of format masks:

with s(str) as (
   select '2017-04-05' from dual union all
   select '5/4-2017'   from dual union all
   select '17.4.05'    from dual union all
   select '5.apr 2017' from dual union all
   select '5.abril 17' from dual union all
   select '5.april 17' from dual
)
select s.str
     , case
          when validate_conversion(s.str as date, 'FXYYYY-MM-DD') = 1
                              then to_date(s.str, 'FXYYYY-MM-DD')
          when validate_conversion(s.str as date, 'DD/MM-FXYYYY') = 1
                              then to_date(s.str, 'DD/MM-FXYYYY')
          when validate_conversion(s.str as date, 'FXRR.FXMMFX.DD') = 1
                              then to_date(s.str, 'FXRR.FXMMFX.DD')
          when validate_conversion(s.str as date, 'DD.FXMon YYYY') = 1
                              then to_date(s.str, 'DD.FXMon YYYY')
          when validate_conversion(s.str as date, 'DD.FXMonth RR', 'NLS_DATE_LANGUAGE=SPANISH') = 1
                              then to_date(s.str, 'DD.FXMonth RR', 'NLS_DATE_LANGUAGE=SPANISH')
          when validate_conversion(s.str as date, 'DD.FXMonth RR', 'NLS_DATE_LANGUAGE=AMERICAN') = 1
                              then to_date(s.str, 'DD.FXMonth RR', 'NLS_DATE_LANGUAGE=AMERICAN')
       end the_date
  from s
/

STR        THE_DATE
---------- ----------
2017-04-05 2017.04.05
5/4-2017   2017.04.05
17.4.05    2017.04.05
5.apr 2017 2017.04.05
5.abril 17 2017.04.05
5.april 17 2017.04.05

But even though it works for this example, it cannot be entirely correct, because if the data would contain '5.4.17' and '4.5.17' it is impossible to know which is April 5th and which is May 4th if both Americans and Europeans can enter data.

VALIDATE_CONVERSION and DEFAULT ON CONVERSION ERROR can be very useful and helpful tools, but you still have to know your data to use the correct format.

No comments:

Post a Comment