Friday, December 4, 2015

Vessel Fuel Consumption - or Restarting Running Sum

I got a question recently from Morten Braten whether my FIFO analytic techniques could be helpful for calculating the value of fuel consumption for cargo vessels. It turned out not to be quite the same, but similar. Analytic functions definitely could help.

I'll show how in two parts. First this blog post will calculate the fuel consumption in quantity, then part two will calculate the value by FIFO.

So let's setup a slightly simplified version of Mortens table:

```create table vessel_fuel (
voyage_id            number         not null
, inventory_date       date           not null
, seq                  number         generated as identity primary key
, port                 varchar2(10)   not null
, movement_type        varchar2(3)    not null
, qty                  number         not null
, unit_value           number
)
/
create unique index vessel_fuel_voyage_idx on vessel_fuel (
voyage_id, inventory_date, seq
)
/
alter session set nls_date_format = 'YYYY-MM-DD HH24:MI'
/
insert into vessel_fuel (
voyage_id, inventory_date, port, movement_type, qty, unit_value
) values (
101, '2015-11-01 07:00', 'FREDERICIA', 'OB' , 150, 750
)
/
insert into vessel_fuel (
voyage_id, inventory_date, port, movement_type, qty, unit_value
) values (
101, '2015-11-03 04:00', 'ROTTERDAM' , 'ROB',  70, null
)
/
insert into vessel_fuel (
voyage_id, inventory_date, port, movement_type, qty, unit_value
) values (
101, '2015-11-03 06:00', 'ROTTERDAM' , 'IN' , 100, 700
)
/
insert into vessel_fuel (
voyage_id, inventory_date, port, movement_type, qty, unit_value
) values (
101, '2015-11-04 22:00', 'LONDON'    , 'ROB', 130, null
)
/
insert into vessel_fuel (
voyage_id, inventory_date, port, movement_type, qty, unit_value
) values (
101, '2015-11-04 23:00', 'LONDON'    , 'OUT',  60, null
)
/
insert into vessel_fuel (
voyage_id, inventory_date, port, movement_type, qty, unit_value
) values (
101, '2015-11-05 05:00', 'LONDON'    , 'IN' , 200, 650
)
/
insert into vessel_fuel (
voyage_id, inventory_date, port, movement_type, qty, unit_value
) values (
101, '2015-11-06 22:00', 'ESBJERG'   , 'ROB', 210, null
)
/
commit
/

select voyage_id
, inventory_date
, seq
, port
, movement_type
, qty
, unit_value
from vessel_fuel
order by voyage_id, inventory_date, seq
/
```

``` VOYAGE_ID INVENTORY_DATE          SEQ PORT       MOV        QTY UNIT_VALUE
---------- ---------------- ---------- ---------- --- ---------- ----------
101 2015-11-01 07:00          1 FREDERICIA OB         150        750
101 2015-11-03 04:00          2 ROTTERDAM  ROB         70
101 2015-11-03 06:00          3 ROTTERDAM  IN         100        700
101 2015-11-04 22:00          4 LONDON     ROB        130
101 2015-11-04 23:00          5 LONDON     OUT         60
101 2015-11-05 05:00          6 LONDON     IN         200        650
101 2015-11-06 22:00          7 ESBJERG    ROB        210
```

The idea is that each cargo vessel goes on voyages, each voyage consisting of a series of stops at various ports. The calculations to be performed are to be done for each voyage. The test data I've created is just for a single voyage and I'm assuming there's a cross table somewhere to link the voyage ID with a specific cargo vessel.

The data for the voyage always starts out with an opening balance (OB) that shows how much fuel is in the tanks at the start of the voyage and what it cost per unit.

Then throughout the voyage we either take more fuel on board (IN) at some cost, or we transfer fuel to another vessel in the fleet (OUT). IN can actually be either bought fuel (bunkering) or transfer from another vessel, but that is really irrelevant for this exercise.

Of course the engine of the vessel also consumes fuel. We don't know how much, but we want to calculate it. So from time to time we measure how much fuel is in the tanks and register it in the data using movement type ROB (Remaining On Board). That will enable us to calculate fuel consumption, which is the goal of this part 1 blog post.

Besides the already mentioned movement types, there will be two more that will be introduced later, making for a total of 6 movement types:

 OB opening balance IN from bunkering or transfer from other vessel ROB remaining on board, ie measurement OUT transfer to other vessel CON consumption CB closing balance

If the data had already included consumption (CON) rows, we could have calculated how much fuel is left in the tanks by a simple running sum:

```select voyage_id
, inventory_date
, seq
, port
, movement_type
, qty
, sum(
case
when movement_type in ('OB' , 'IN' ) then  qty
when movement_type in ('CON', 'OUT') then -qty
end
) over (
partition by voyage_id
order by inventory_date, seq
rows between unbounded preceding and current row
) running_qty
from vessel_fuel
order by voyage_id, inventory_date, seq
/
```

``` VOYAGE_ID INVENTORY_DATE          SEQ PORT       MOV        QTY RUNNING_QTY
---------- ---------------- ---------- ---------- --- ---------- -----------
101 2015-11-01 07:00          1 FREDERICIA OB         150         150
101 2015-11-03 04:00          2 ROTTERDAM  ROB         70         150
101 2015-11-03 06:00          3 ROTTERDAM  IN         100         250
101 2015-11-04 22:00          4 LONDON     ROB        130         250
101 2015-11-04 23:00          5 LONDON     OUT         60         190
101 2015-11-05 05:00          6 LONDON     IN         200         390
101 2015-11-06 22:00          7 ESBJERG    ROB        210         390
```

But we don't have CON rows - they are the ones we need to calculate. To do that we need a running sum that "restarts" every time we have a measurement of the tank quantity - i.e. "restarts" for every ROB row, such that for example the running_qty of row 3 becomes 170, because we know for a fact that at row 2 we had 70 in the tanks and row 3 adds 100.

To do that we can group the rows such that every ROB row becomes the first row in each group:

```select voyage_id
, inventory_date
, seq
, port
, movement_type
, qty
, last_value(
case movement_type
when 'ROB' then seq
end
ignore nulls
) over (
partition by voyage_id
order by inventory_date, seq
rows between unbounded preceding and current row
) as group_first_seq
from vessel_fuel
order by voyage_id, inventory_date, seq
/
```

``` VOYAGE_ID INVENTORY_DATE          SEQ PORT       MOV        QTY GROUP_FIRST_SEQ
---------- ---------------- ---------- ---------- --- ---------- ---------------
101 2015-11-01 07:00          1 FREDERICIA OB         150
101 2015-11-03 04:00          2 ROTTERDAM  ROB         70               2
101 2015-11-03 06:00          3 ROTTERDAM  IN         100               2
101 2015-11-04 22:00          4 LONDON     ROB        130               4
101 2015-11-04 23:00          5 LONDON     OUT         60               4
101 2015-11-05 05:00          6 LONDON     IN         200               4
101 2015-11-06 22:00          7 ESBJERG    ROB        210               7
```

Notice the first group gets NULL as GROUP_FIRST_SEQ. This is OK, as all we will be using it for is a PARTITION BY. You may ask why I didn't do it like this:

```     , last_value(
case
when when movement_type in ('OB','ROB') then seq
end
ignore nulls
) over (
```

The reason is that a voyage opening balance might consist of more than one row, since the 150 might have been bought at different values, so our data might have looked like this:

``` VOYAGE_ID INVENTORY_DATE          SEQ PORT       MOV        QTY UNIT_VALUE
---------- ---------------- ---------- ---------- --- ---------- ----------
101 2015-11-01 07:00          1 FREDERICIA OB          40        690
101 2015-11-01 07:00          2 FREDERICIA OB         110        750
...
```

And those two rows we do not want to be part of different groups, they should be in the same group. Therefore it is better with the solution above where first group is NULL.

I'll show example of such opening balance in the part 2 blog post - for now just let's continue with a single OB row and move on to using the GROUP_FIRST_SEQ to create a new running sum:

```select s1.*
, sum(
case
when movement_type in ('OB' , 'IN', 'ROB') then  qty
when movement_type = 'OUT' then -qty
end
) over (
partition by voyage_id, group_first_seq
order by inventory_date, seq
rows between unbounded preceding and current row
) running_qty
from (
select voyage_id
, inventory_date
, seq
, port
, movement_type
, qty
, last_value(
case movement_type
when 'ROB' then seq
end
ignore nulls
) over (
partition by voyage_id
order by inventory_date, seq
rows between unbounded preceding and current row
) as group_first_seq
from vessel_fuel
) s1
order by voyage_id, inventory_date, seq
/
```

``` VOYAGE_ID INVENTORY_DATE    SEQ PORT       MOV  QTY GROUP_FIRST_SEQ RUNNING_QTY
---------- ---------------- ---- ---------- --- ---- --------------- -----------
101 2015-11-01 07:00    1 FREDERICIA OB   150                         150
101 2015-11-03 04:00    2 ROTTERDAM  ROB   70               2          70
101 2015-11-03 06:00    3 ROTTERDAM  IN   100               2         170
101 2015-11-04 22:00    4 LONDON     ROB  130               4         130
101 2015-11-04 23:00    5 LONDON     OUT   60               4          70
101 2015-11-05 05:00    6 LONDON     IN   200               4         270
101 2015-11-06 22:00    7 ESBJERG    ROB  210               7         210
```

Here we can see the running sum "restarting" in every group. If there had been IN or OUT between the first OB and ROB, they would have been part of that first group, since we use GROUP_FIRST_SEQ in the PARTITION BY clause of the sum and NULL values will be in a partition for themselves.

So how can this be used to calculate the consumption? Well, look at row 4 - we have measured there is 130 in the tanks, but the running sum in the row above shows that if there had been no consumption, there ought to have been 170. So the consumption here is 170 minus 130 = 40.

This we can do with LAG:

```select s2.*
, case movement_type
when 'ROB' then
lag(running_qty) over (
partition by voyage_id
order by inventory_date, seq
) - qty
end consumption_qty
from (
select s1.*
, sum(
case
when movement_type in ('OB' , 'IN', 'ROB') then  qty
when movement_type = 'OUT' then -qty
end
) over (
partition by voyage_id, group_first_seq
order by inventory_date, seq
rows between unbounded preceding and current row
) running_qty
from (
select voyage_id
, inventory_date
, seq
, port
, movement_type
, qty
, last_value(
case movement_type
when 'ROB' then seq
end
ignore nulls
) over (
partition by voyage_id
order by inventory_date, seq
rows between unbounded preceding and current row
) as group_first_seq
from vessel_fuel
) s1
) s2
order by voyage_id, inventory_date, seq
/
```

``` VOYAGE_ID INVENTORY_DATE    SEQ PORT       MOV  QTY GROUP_FIRST_SEQ RUNNING_QTY CONSUMPTION_QTY
---------- ---------------- ---- ---------- --- ---- --------------- ----------- ---------------
101 2015-11-01 07:00    1 FREDERICIA OB   150                         150
101 2015-11-03 04:00    2 ROTTERDAM  ROB   70               2          70              80
101 2015-11-03 06:00    3 ROTTERDAM  IN   100               2         170
101 2015-11-04 22:00    4 LONDON     ROB  130               4         130              40
101 2015-11-04 23:00    5 LONDON     OUT   60               4          70
101 2015-11-05 05:00    6 LONDON     IN   200               4         270
101 2015-11-06 22:00    7 ESBJERG    ROB  210               7         210              60
```

So using analytic functions, we have calculated for the entire voyage how much fuel must have been consumed between each time we measured how much fuel was remaining on board in the tanks.

The first part of the task is now done - calculating the fuel consumption. The second part is then to calculate how much that actually cost, since everytime we take fuel aboard it may be at a different price. So we need to calculate price for consumption as well as price for OUT rows, and that needs to be done by First-In-First-Out (FIFO) principle.

That'll be the topic of the next blog post to come ;-)