3zecisi3: November 2012

Wednesday, November 14, 2012

HOW TO SQL - NULL values

NULL values often generate unexpected results when they appear in SQL queries. This is because not always these NULL values are understood. So, what about those NULL values?
LNNVL
NULLIF
NVL
NVL2

It’s probably best to begin with a few words about what NULL is NOT:

·         NULL is not the number 0

·         NULL is not the empty string

NULL is a special placeholder in SQL used to indicate that the value does not exist in the database, that the value is unknown.

Now let’s explore some unknown NULL values!

·         Comparison with null values is done with special syntax IS NULL or IS NOT NULL, and doesn’t use comparison operands like =, <>, >, <.

select employee_id from employees where manager_id=NULL;

no rows selected

SQL> select employee_id from employees where manager_id IS NULL;

EMPLOYEE_ID

-----------

        100

·         Because the value of NULL values is unknown, unknown is also the result of any operation with NULL.

select 5+7+null+9 as suma from dual;

      SUMA

----------

       NULL

·         The truth table is also influenced by NULL values as follows, an interesting case being FALSE or NULL = NULL:

Bool1           Bool2           AND              OR

TRUE             TRUE             TRUE             TRUE

TRUE             FALSE           FALSE           TRUE

TRUE             NULL            NULL            TRUE

FALSE           TRUE             FALSE           TRUE

FALSE           FALSE           FALSE           FALSE

FALSE           NULL            FALSE           NULL

NULL            TRUE             NULL            TRUE

NULL            FALSE           FALSE           NULL

NULL            NULL            NULL            NULL

·         To view how unknown NULL values are treated in databases, let’s take the example of DEPARTMENTS table, where there are some departments with no manager (manager_id is NULL).

SELECT FROM DEPARTMENTS WHERE MANAGER_ID <= 200*

UNION ALL

SELECT FROM DEPARTMENTS WHERE MANAGER_ID > 200;*

This SQL, which is a UNION ALL between 2 sets of data that covers all values for MANAGER_ID will not return all departments, as someone would expect, but only those with a known value for manager_id.

10     Administration       200    1700

30     Purchasing           114    1700

50     Shipping             121    1700

60     IT                   103    1700

80     Sales                145    2700

90     Executive            100    1700

100    Finance              108    1700

20     Marketing            201    1700

40     Human Resources      203    2700

70     Public Relations     204    2700

110    Accounting           205    1700

·         Another thing that is related to the unknown character of NULL values is that NULL values aren’t stored in INDEXES. This is why queries having condition IS NULL don’t use index, even if there is one defined for the column specified in condition. On the contrary, the index is used in queries having conditions IS NOT NULL.

select from employees where manager_id is null;*

-------------------------------------------------------------------------------

| Id | Operation         | Name      | Rows | Bytes | Cost (%CPU)| Time     |

-------------------------------------------------------------------------------

|   0 | SELECT STATEMENT |           |     1 |    69 |    3   (0)| 00:00:01 |

| 1 | TABLE ACCESS FULL| EMPLOYEES |     1 |    69 |     3   (0)| 00:00:01 |*

-------------------------------------------------------------------------------

select from employees where manager_id is not null;*

--------------------------------------------------------------------------------

| Id | Operation                   | Name           | Rows | Bytes | Cost

--------------------------------------------------------------------------------

|   0 | SELECT STATEMENT            |                |   107 | 7383 |     9

|   1 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES      |   107 | 7383 |     9

| 2 |   INDEX FULL SCAN           | EMP_MANAGER_IX |   107 |       |     1*

--------------------------------------------------------------------------------

·         In sorting operation, NULL values appear LAST in ASC order and FIRST in DESC order.

select from departments order by manager_id ASC;*

           90 Executive                             100        1700

           60 IT                                    103        1700

          …….

          110 Accounting                            205        1700

          230 IT Helpdesk                           NULL       1700

          240 Government Sales                      NULL       1700

          ……

          140 Control And Credit                    NULL       1700

select from departments order by manager_id DESC;*

          140 Control And Credit                    NULL       1700

          ………

          160 Benefits                              NULL       1700

          110 Accounting                            205        1700

          ……

           90 Executive                             100        1700

·         All aggregate functions except `COUNT`() and `GROUPING` ignore nulls. You can use the `NVL` function in the argument to an aggregate function to substitute a value for a null. `COUNT` never returns null, but returns either a number or zero. For all the remaining aggregate functions, if the data set contains no rows, or contains only rows with nulls as arguments to the aggregate function, then the function returns null.

SELECT COUNT() FROM DEPARTMENTS;

COUNT()*

----------

        27

SQL> select count(manager_id) from departments;

COUNT(MANAGER_ID)

-----------------

               11

select min(manager_id) from departments;

MIN(MANAGER_ID)

---------------

            100

SQL> select max(manager_id) from departments;

MAX(MANAGER_ID)

---------------

            205

·         NULL does not equal NULL or other value

select from a;*

        C1         C2

---------- ----------

         1          1

         1          2

         1    NULL

      NULL    NULL

select from a where c1=c2;*

        C1         C2

---------- ----------

1             1

·         The NULL-related functions facilitate null handling. The NULL-related functions are:

COALESCE

·         Null have different behavior in NOT IN and NOT EXISTS.

Take, for example, the case of returning all employees that has no managers:

SELECT FROM HR.EMPLOYEES WHERE EMPLOYEE_ID NOT IN ( SELECT MANAGER_ID FROM HR.EMPLOYEES );*

no rows selected

SELECT FROM HR.EMPLOYEES E1 WHERE NOT EXISTS ( SELECT 1 FROM HR.EMPLOYEES E2 WHERE E2.EMPLOYEE_ID = E1.MANAGER_ID);*

        100 Steven               King

SKING                     515.123.4567         17-JUN-98 AD_PRES         24000

                                     90

                or departments with no employees:

SELECT department_id FROM hr.departments

WHERE department_id

NOT IN (SELECT department_id FROM HR.EMPLOYEES);

            no rows selected



SELECT DEPARTMENT_ID FROM HR.DEPARTMENTS d

WHERE NOT EXISTS (SELECT 1

FROM HR.EMPLOYEES e where e.department_id = d.department_id);

DEPARTMENT_ID

-------------

          120

          130

          140

………

As you can see, this happens because one or more values in NOT IN clause are NULL (UNKNOWN), and the predicate department_id NOT IN (x,y,NULL) evaluates to neither TRUE, nor FALSE, but to UNKNOWN (NULL).

This is why, NOT IN can be used if all the values that must be evaluated differ from NULL value.

As a conclusion of this short intro, the existence of NULL values into a database introduces a new degree of uncertainty. If not understood, a lot of guessing must be done by an SQL programmer to counter for erroneous results of NULL values in a database.

Subscribe to: Posts (Atom)