CalCalcs and the udunits library


The udunits library is a handy set of routines that translate between different units. Also included in version 1 of the library were some routines to handle dates and times. In particular, routine utCalendar could convert temporal intervals into dates, for example, converting "3 days since 1901-01-01" into the date "1901-01-04". Routine utInvCalendar did the reverse transform, for example, converting the date "1901-01-04" into "3 days since 1901-01-01". You could specify the date the conversions were referenced to and the time interval, so you could as easily work in "milliseconds since 1875-09-15 12:00" or many other units.
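To make the arithmetic concrete, here is a minimal sketch in C of the kind of conversion described above. It does not call the udunits library. The helper names days_from_civil, civil_from_days, and offset_to_date are hypothetical, and the sketch assumes a proleptic Gregorian calendar (no Julian/Gregorian transition) and whole-day offsets only.

```c
/* Sketch only: proleptic Gregorian day arithmetic, not the udunits code. */

/* Number of days from 1970-01-01 to year-month-day (negative before 1970). */
static long days_from_civil(long y, int m, int d)
{
    y -= (m <= 2);                            /* count Jan and Feb with the previous year */
    long era = (y >= 0 ? y : y - 399) / 400;  /* 400-year cycle index */
    long yoe = y - era * 400;                 /* year within the cycle, 0..399 */
    long doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;  /* day of year, March-based */
    long doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;           /* day within the cycle */
    return era * 146097 + doe - 719468;
}

/* Inverse of days_from_civil: day count back to year, month, day. */
static void civil_from_days(long z, long *y, int *m, int *d)
{
    z += 719468;
    long era = (z >= 0 ? z : z - 146096) / 146097;
    long doe = z - era * 146097;
    long yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
    long doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
    long mp  = (5 * doy + 2) / 153;           /* March-based month index, 0..11 */
    *d = (int)(doy - (153 * mp + 2) / 5 + 1);
    *m = (int)(mp < 10 ? mp + 3 : mp - 9);
    *y = yoe + era * 400 + (*m <= 2);
}

/* "offset_days days since ref_y-ref_m-ref_d" converted to a calendar date. */
static void offset_to_date(long offset_days, long ref_y, int ref_m, int ref_d,
                           long *y, int *m, int *d)
{
    civil_from_days(days_from_civil(ref_y, ref_m, ref_d) + offset_days, y, m, d);
}
```

With these helpers, offset_to_date(3, 1901, 1, 1, &y, &m, &d) yields 1901-01-04, and subtracting two days_from_civil values gives the reverse transform.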

The udunits library is widely used in the field of climate research, so I wrote utCalendar_cal and utInvCalendar_cal, which extended these routines to support the calendars typically used by climate models.


Version 2 of the udunits library


When version 2 of the udunits library was released, the new API did not include a direct replacement for the utCalendar/utInvCalendar functionality. Additionally, the documentation for the version 2 library states:

"I've come to believe, however, that creating such a unit [a calendar-referenced date unit] was a mistake, primarily because users try to use the unit in ways for which it was not designed (such as converting dates in a calendar whose year is exactly 365 days long). Such activities are much better handled by a dedicated calendar package. Please be careful about using timestamp-units."

In the source code itself, in the file unitcore.c, a comment on the calendar-referenced date unit states:

"A wrong-headed unit that shouldn't exist but does for backward compatibility. It was intended to provide similar functionality as the GalileanUnit, but for time units (e.g., "seconds since the epoch"). Unfortunately, people try to use it for more than it is capable (e.g., days since some time on an imaginary world with only 360 days per year)."


The CalCalcs routines


These statements make me concerned that support for calendrical calculations might be dropped in a future version. And since I was rewriting the utCalendar_cal and utInvCalendar_cal routines to support version 2 of the udunits library anyway, I figured I might as well replace all the calendar functionality of the udunits library.

Here is how I structured it.

The CalCalcs routines themselves work only in integer days, and use no code from or calls to the udunits library (either version 1 or 2). They are completely stand-alone. They are in their own file ("calcalcs.c") and have their own header ("calcalcs.h"), and can be compiled without linking to the udunits library.
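As an illustration of what working in integer days looks like for the calendars climate models use, here is a minimal sketch for the noleap and 360_day calendars. Everything in it (the sketch_cal enum and the sketch_* functions) is hypothetical and is not the calcalcs.h interface. The day count starts at day 0 = 0001-01-01, which is an arbitrary choice made for the sketch.

```c
/* Sketch only: hypothetical names, not the CalCalcs API. */
enum sketch_cal { SKETCH_NOLEAP, SKETCH_360_DAY };

static int sketch_days_in_month(enum sketch_cal cal, int month)
{
    static const int noleap[12] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    return cal == SKETCH_360_DAY ? 30 : noleap[month - 1];
}

static int sketch_days_in_year(enum sketch_cal cal)
{
    return cal == SKETCH_360_DAY ? 360 : 365;
}

/* Date to integer day number, with 0001-01-01 as day 0. */
static long sketch_date2day(enum sketch_cal cal, int year, int month, int day)
{
    long n = (long)(year - 1) * sketch_days_in_year(cal);
    for (int m = 1; m < month; m++)
        n += sketch_days_in_month(cal, m);
    return n + (day - 1);
}

/* Integer day number back to a date in the same calendar. */
static void sketch_day2date(enum sketch_cal cal, long n,
                            int *year, int *month, int *day)
{
    int  ylen = sketch_days_in_year(cal);
    long y    = n / ylen;
    long rem  = n % ylen;
    if (rem < 0) {            /* keep the remainder in 0..ylen-1 for negative n */
        rem += ylen;
        y   -= 1;
    }
    int m = 1;
    while (rem >= sketch_days_in_month(cal, m)) {
        rem -= sketch_days_in_month(cal, m);
        m++;
    }
    *year  = (int)y + 1;
    *month = m;
    *day   = (int)rem + 1;
}
```

Because every year has the same length in these two calendars, the year falls out of a single integer division. Adding one day to 1901-02-28 gives 1901-03-01 in the noleap calendar, while in the 360_day calendar 1901-02-30 is a valid date whose successor is 1901-03-01.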

The new routines utCalendar2_cal and utInvCalendar2_cal do two things: 1) they replace the old version 1 routines with ones that work with the version 2 library; and 2) they extend those routines to the calendars used by climate models. These routines use the udunits library for only two things: parsing the units string, and converting between time units (such as days to seconds, or seconds to hours). They do not use any udunits routines or calls for any calendrical computations. They are in a separate source and header file ("utCalendar2_cal.c" and "utCalendar2_cal.h"). If you use utCalendar2_cal.c you must also include calcalcs.c and link the result with the udunits library.
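One way to picture this division of labor is the step that sits between the two halves: once a time value has been converted to seconds, it can be split into a whole number of days (for the integer-day calendar code) and the seconds left over within that day. The function below is a hypothetical sketch of that step, not code from utCalendar2_cal.c. The seconds_per_unit argument stands in for the unit conversion that the udunits library would supply, and a fixed 86400-second day is assumed.

```c
/* Sketch only: hypothetical helper, not part of utCalendar2_cal.c. */
static void sketch_split_days_seconds(double value, double seconds_per_unit,
                                      long *days, double *seconds)
{
    double total = value * seconds_per_unit;   /* offset expressed in seconds */
    long   d     = (long)(total / 86400.0);    /* truncates toward zero */
    double s     = total - (double)d * 86400.0;
    if (s < 0.0) {                             /* keep seconds in [0, 86400) */
        d -= 1;
        s += 86400.0;
    }
    *days    = d;
    *seconds = s;
}
```

For example, 36 hours (seconds_per_unit = 3600) splits into 1 day and 43200 seconds, and -6 hours splits into -1 day and 64800 seconds, so the leftover seconds always count forward from the start of a day.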


How the udunits library behaves


Because utCalendar2_cal and utInvCalendar2_cal use the udunits library to parse the units string, they inherit some behavior from the udunits-2 library. If you are not aware of that behavior, you might occasionally be surprised by how it acts.

First off, udunits silently and unavoidably converts any reference to "year 0" into year 1. So if you are using the utCalendar2_cal/utInvCalendar2_cal routines, any time you pass a reference to year 0 in the units string -- for example, by specifying "days since 0000-01-01" -- it will be exactly as if you had specified "days since 0001-01-01". This is true even if you specify a noleap or 360_day calendar. My best advice is to never use "year 0" as a reference year in a udunits string.

Second, the udunits-2 library does not completely check its input for validity, so you can, for instance, specify "days since 2001-89-01", and it will happily take it. The result will be as if you had passed "days since 2001-08-01".

Or, take the case of days that do not exist in the Standard calendar. For example, the calendar transitioned from the Julian calendar, which was used up to 4 Oct 1582, to the Gregorian calendar, starting the next day, which was labeled 15 Oct 1582. So what happens if you specify a units string of "days since 1582-10-5"? The udunits library treats it as if you'd specified "days since 1582-10-15". Another example is leap days in non-leap years -- the string "days since 1901-02-29" (i.e., a leap day specified for a non-leap year) is treated as "days since 1901-03-01". That is arguably not unreasonable, but you might still find it unexpected that "days since 1901-02-31" is treated as "days since 1901-03-03" while "days since 1901-02-32" is treated as "days since 1901-02-03".
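If you need to keep passing units strings through the udunits parser, one defensive option is to check the reference date yourself before handing the string over. The sketch below is a hypothetical helper (sketch_valid_standard_date is not part of udunits or CalCalcs) that accepts a bare "YYYY-MM-DD" string and rejects the cases described above for the Standard calendar. As simplifying assumptions, it rejects every year below 1 and expects the date with no time-of-day part.

```c
#include <stdio.h>

/* Sketch only: hypothetical pre-check, not part of udunits or CalCalcs. */

/* Julian leap rule through 1582, Gregorian rule afterwards. */
static int sketch_is_leap(int year)
{
    if (year < 1583)
        return year % 4 == 0;
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Returns 1 if text is "YYYY-MM-DD" naming a day that exists in the
   Standard (mixed Julian/Gregorian) calendar, 0 otherwise. */
static int sketch_valid_standard_date(const char *text)
{
    static const int mdays[12] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    int  y, m, d;
    char extra;

    /* Exactly three fields must convert; a fourth means trailing characters. */
    if (sscanf(text, "%d-%d-%d%c", &y, &m, &d, &extra) != 3)
        return 0;
    if (y < 1)                        /* year 0 (and earlier) rejected in this sketch */
        return 0;
    if (m < 1 || m > 12)              /* catches inputs such as 2001-89-01 */
        return 0;
    int limit = mdays[m - 1] + (m == 2 && sketch_is_leap(y));
    if (d < 1 || d > limit)           /* catches 1901-02-29, 1901-02-31, ... */
        return 0;
    if (y == 1582 && m == 10 && d >= 5 && d <= 14)
        return 0;                     /* days skipped in the calendar transition */
    return 1;
}
```

A caller would run this on the date portion of the units string and refuse to continue when it returns 0, rather than letting the parser silently substitute a different date.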

If you want to avoid these behaviors, you can rewrite your application to use the CalCalcs routines directly (instead of relying on the utCalendar2_cal/utInvCalendar2_cal interface and therefore the udunits units string parsing code), since the CalCalcs routines handle these situations correctly.




(C) 2010 David W. Pierce