Implementing POSIX clocks under Linux
Markus Kuhn
Note: This is an old and early draft document from 1998.
It evolved into a proposed
modernised <time.h> for ISO C 9X, which implements
similar ideas and tries to bring better leap second and timezone
handling into the ISO C standard. Later I came to the conclusion that
attempting to present inserted leap seconds as an out-of-scale
timestamp of the form 23:59:60 to applications at an
operating-system-API or network-protocol level is not very useful for
most applications and that a standardized smoothed form of UTC, such
as my UTC-SLS proposal (2006), is far more
practical in almost all applications, except for those concerned with
precisely tracking the motion of physical masses, where TAI is a more
useful timebase.
The POSIX clock interface ignores the existence of leap seconds in
the commonly used UTC time scale and does not provide a sufficiently
powerful interface for adjusting clocks to an external time reference.
However, the POSIX clock interface was designed in an extensible way.
We discuss here how the clock functionality of POSIX can be improved.
Eventually we might submit the result as a new POSIX standard proposal
(far future) and implement it in Linux (near future).
Participants in this discussion so far have been: Markus Kuhn, Joe Gwinn, Colin Plumb, Andrew Derrick Balsa, Paul Eggert, and probably others
I forgot. If you are interested, please join the tz mailing list.
Before you read this document, you may find it useful to
familiarize yourself with the following references:
- The POSIX.1 standard (ISO/IEC 9945-1:1996) that describes in Section 14 a portable Unix clock
and timer API.
- The sci.astro FAQ section C.02, which gives an overview of
reference time scales such as International Atomic Time (TAI),
Universal Time (UT1), and Coordinated Universal Time (UTC). See
also the NIST Time and
Frequency Glossary. Further useful references are available from
the U.S. Naval
Observatory and the International Earth Rotation
Service.
- ITU-R Recommendation
TF.460-4, which defines how leap seconds are inserted into
Coordinated Universal Time (UTC). Other relevant ITU-R TF
recommendations and drafts
are available from the ITU server.
- Mills, D.L. Improved
algorithms for synchronizing computer network clocks. IEEE/ACM
Trans. Networks (June 1995), 245-254.
- Mills, D.L. Unix kernel modifications for precision time synchronization.
Electrical Engineering Department Report 94-10-1, University of
Delaware, October 1994, 24 pp. (Defines the
ntp_gettime() and ntp_adjtime()
system calls.)
- Various other papers by David Mills
on NTP and the kernel
PLL.
- The International Standard Date and Time
Notation specified in ISO 8601.
- Perhaps you might also want to have a look at the N2620
draft for the next ISO C revision, especially section 7.16 with
the <time.h> definitions.
POSIX.1 specifies its representation of time in the
time_t type as follows:
2.2.2.113 seconds since the Epoch: A value to be interpreted
as the number of seconds between a specified time and the Epoch.
A Coordinated Universal Time name (specified in terms of seconds
(tm_sec), minutes (tm_min), hours (tm_hour),
days since January 1 of the year (tm_yday), and calendar year
minus 1900 (tm_year)) is related to a time represented as seconds since the Epoch, according to the expression below.
If the year < 1970 or the value is negative, the relationship is
undefined. If the year >= 1970 and the value is non-negative, the value
is related to a Coordinated Universal time name according to the
expression:
tm_sec + tm_min*60 + tm_hour*3600 +
tm_yday*86 400 +
(tm_year-70)*31 536 000 +
((tm_year-69)/4)*86 400
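As a minimal sketch, the quoted expression can be written as a C function. The function name and the use of long long (to avoid overflow where long has only 32 bits) are assumptions of this example and are not part of POSIX.1.

```c
#include <time.h>

/* Seconds since the Epoch according to the POSIX.1 expression quoted
 * above. Only tm_sec, tm_min, tm_hour, tm_yday and tm_year are used.
 * The name and the long long return type are choices of this sketch. */
long long posix_seconds_since_epoch(const struct tm *t)
{
    return t->tm_sec
         + t->tm_min  * 60LL
         + t->tm_hour * 3600LL
         + t->tm_yday * 86400LL
         + (t->tm_year - 70) * 31536000LL
         /* one leap day for every fourth year since 1969 */
         + ((t->tm_year - 69) / 4) * 86400LL;
}
```

For 2000-01-01 00:00:00 (tm_year = 100, all other fields zero) this yields 946684800.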
There are two problems with this encoding of UTC second names:
- There will be an 86400 s long gap between 2100-02-28 24:00:00Z and
2100-03-01 00:00:00Z (same for the years 2200, 2300, 2500, ...),
because the leap year rule used in the above formula is only valid in
the year interval 1901..2099. This is of no concern while the seconds
since the Epoch are counted in 32-bit signed integer variables, which
will overflow on 2038-01-19 03:14:08Z. On 64-bit machines, however,
this should be fixed. The Y2K problem has taught us that software may
well remain in use for centuries, and mankind is unlikely to agree on
a new calendar scheme any time soon, given the enormous problems that
even a century roll-over causes in our IT infrastructure.
- The POSIX second count ignores inserted leap seconds (and does not
even provide an encoding for them) and counts deleted leap seconds.
Most networked workstations today receive information from time
broadcasting services and keep their clocks accurate to UTC within
a few tens of milliseconds, so correct leap second handling
would be desirable.
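The 32-bit overflow time mentioned in the first item can be checked with a little arithmetic. The sketch below splits a second count into whole days since the Epoch and a time of day, ignoring leap seconds just as POSIX does. The struct and function names are invented for this example, and negative counts are not handled.

```c
#include <stdint.h>

/* Whole days since 1970-01-01 plus a time of day (hypothetical type). */
struct epoch_offset {
    long days;
    int  hour, min, sec;
};

/* Split a non-negative POSIX second count into days and time of day. */
struct epoch_offset split_seconds(int64_t s)
{
    struct epoch_offset r;
    int rem = (int)(s % 86400);      /* seconds elapsed in the last day */
    r.days = (long)(s / 86400);
    r.hour = rem / 3600;
    r.min  = rem % 3600 / 60;
    r.sec  = rem % 60;
    return r;
}

/* First second count that no longer fits a signed 32-bit integer. */
int64_t first_overflow_second(void)
{
    return (int64_t)INT32_MAX + 1;   /* 2^31 = 2147483648 */
}
```

Applied to first_overflow_second(), this gives day 24855 after the Epoch at 03:14:08. Day 24855 is 2038-01-19, since 2038-01-01 is day 24837 (68 years of 365 days plus 17 leap days).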
Proposed modifications to POSIX.1
Seconds since the Epoch definition
Replace the last paragraph in section 2.2.2.113 with the following
paragraph:
If the year < 1970 or the value is negative, the relationship is
undefined. If the year >= 1970 and the value is non-negative, the value
is related to a Coordinated Universal time name according to the
expression:
tm_sec + tm_min * 60 + tm_hour * 3600 +
tm_yday * 86 400 +
(tm_year-70) * 31 536 000 +
((tm_year-69)/4 -
(tm_year-1)/100 +
(tm_year+299)/400) * 86 400
The seconds since the Epoch should be represented as a numeric type
that covers a value range sufficiently large to handle times from 1970
until the end of the year 9999.
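A minimal C sketch of the proposed expression follows. The function name and the long long result type are assumptions of this example. The leap day term implements the full Gregorian rule: add a day for every fourth year since 1969, subtract one for every century year since 1901, and add one back for every 400th year since 1601.

```c
#include <time.h>

/* Seconds since the Epoch according to the proposed expression.
 * tm_year is the calendar year minus 1900, so tm_year - 1 counts from
 * 1901 and tm_year + 299 counts from 1601. Valid for tm_year >= 70. */
long long proposed_seconds_since_epoch(const struct tm *t)
{
    long long leap_days = (t->tm_year - 69) / 4
                        - (t->tm_year - 1) / 100
                        + (t->tm_year + 299) / 400;

    return t->tm_sec
         + t->tm_min  * 60LL
         + t->tm_hour * 3600LL
         + t->tm_yday * 86400LL
         + (t->tm_year - 70) * 31536000LL
         + leap_days * 86400LL;
}
```

For 2000-01-01 00:00:00 (tm_year = 100) the result is 946684800, the same as with the unmodified expression. For 2101-01-01 00:00:00 (tm_year = 201) it is 4133980800, which is 86400 less than the unmodified expression gives, because 2100 is no longer counted as a leap year.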
Additional clocks
Change the third paragraph in section 14.1.1 into
The tv_nsec member is only valid if greater than or equal
to zero, and less than the number of nanoseconds in a second (1000
million), unless a time within a leap second is represented by a clock
type that does indicate leap seconds. The time interval described by
this structure is (tv_sec × 10^9 +
tv_nsec) nanoseconds. Clocks that represent leap seconds do
so by keeping tv_sec at the value for the preceding second
while adding the value 1000 000 000 to tv_nsec;
that is, a leap second is represented by a tv_nsec value in
the range 1000 000 000 to 1999 999 999.
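To illustrate this encoding, the following sketch shows how an application might detect a leap second in a struct timespec and print the time of day with a seconds field of 60. Both function names are hypothetical, tv_sec is assumed to be non-negative, and no existing system is assumed to deliver such values.

```c
#include <stdio.h>
#include <time.h>

/* Returns 1 if ts lies within an inserted leap second under the
 * proposed encoding (tv_nsec in 1000000000..1999999999), else 0. */
int in_leap_second(const struct timespec *ts)
{
    return ts->tv_nsec >= 1000000000L && ts->tv_nsec <= 1999999999L;
}

/* Writes the UTC time of day as hh:mm:ss.nnnnnnnnn into buf.
 * During a leap second, tv_sec still names the preceding second
 * (23:59:59), so the seconds field is printed as 60.
 * Returns the value of snprintf(). */
int format_utc_time_of_day(const struct timespec *ts, char *buf, size_t len)
{
    long sod  = (long)(ts->tv_sec % 86400);   /* second of the day */
    int  leap = in_leap_second(ts);
    long nsec = leap ? ts->tv_nsec - 1000000000L : ts->tv_nsec;

    return snprintf(buf, len, "%02ld:%02ld:%02ld.%09ld",
                    sod / 3600, sod % 3600 / 60, sod % 60 + leap, nsec);
}
```

With tv_sec = 915148799 (1998-12-31 23:59:59Z) and tv_nsec = 1500000000, the formatted string is 23:59:60.500000000.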
Section 14.1.4 should in the end contain descriptions of the
following clocks:
- CLOCK_REALTIME
- This clock provides a best effort
estimate of UTC in a way that is backwards compatible with existing
practice. Very little is guaranteed for this clock. It will never show
leap seconds. When CLOCK_UTC becomes available, then CLOCK_REALTIME
should be adjusted to match CLOCK_UTC. For small phase adjustments of
the clock (up to 10 minutes difference), the frequency (rate) of this
clock will be increased or decreased by up to 1% until both clocks
show identical times. For larger adjustments (which are only to be
expected when the system is first installed), CLOCK_REALTIME jumps
directly to CLOCK_UTC and the system administrator should be warned
about this unusual event. After CLOCK_UTC has had a leap second,
CLOCK_REALTIME will need at least 100 s until both clocks are
phase synchronous again, because CLOCK_REALTIME has to follow the leap
second phase shift by temporarily changing its frequency slightly. In
BSD compatible systems, gettimeofday() shows the same
time as CLOCK_REALTIME (truncated to microsecond resolution).
CLOCK_REALTIME has a resolution of at least 20 ms (typically much
better) and unspecified accuracy, frequency stability, and
monotonicity.
- CLOCK_UTC
- This clock is only available when the system
knows with high assurance Coordinated Universal Time (UTC) with an
estimated accuracy of at least 1 s (typically much better).
Whether UTC is known with high assurance depends usually on whether
the system clock driver (e.g., Mills' kernel PLL) has recently
received UTC reference signals from an external source (GPS, NTP,
DCF77, WWV, etc.). Clock drivers are required to calculate an estimate
of the accuracy of the current clock value, for instance using a
Kalman filter that observes both the external and the internal
reference oscillators and makes a time estimate by modelling the errors
of both sources. The estimated accuracy decreases when the external
clock signal becomes unavailable for a longer time, and CLOCK_UTC must
be made unavailable when the estimated accuracy has become worse than
some documented limit that is not higher than 1 s. CLOCK_UTC also
becomes unavailable after a system disruption that could have affected
the continuity of the internal clock (e.g., a laptop recovering from a
power saving mode with reduced clock frequency) until an accuracy
estimate has been established again. During inserted leap seconds, the
tv_nsec field will be in the range 1000000000 to 1999999999 in order
to represent the leap second 23:59:60Z for which the POSIX time_t does
not provide any legal value. CLOCK_UTC is the only clock described
here that indicates leap seconds.
- CLOCK_TAI
- This clock is only available when the system
knows International Atomic Time (TAI) with at least an accuracy of
1 s. The only difference between TAI and UTC is that TAI is never
corrected by leap seconds; TAI is therefore an integral number of seconds
ahead of UTC (one second more after every inserted UTC leap second). Some time
broadcasting services such as GPS provide both TAI and UTC (e.g., by
publishing a scale linked to TAI plus the difference to UTC). TAI is
needed for instance to control processes (e.g., astronomical
observations, navigation, etc.) where leap seconds are undesirable.
CLOCK_TAI is handled very similarly to CLOCK_UTC in that it becomes
unavailable when the clock filter algorithm estimates the accuracy of
its output to be worse than 1 s.
- CLOCK_MONOTONIC
- This clock never jumps. It is
guaranteed to be available all the time right after system startup,
and its frequency never varies by more than 500 ppm. It is intended
for systems that might not know UTC or TAI at boot time, but where a
monotonically increasing constant rate clock is needed right from boot
time for highly reliable time interval measurements. This clock's
frequency might be adjusted in a PLL control loop once an external
reference (NTP, GPS, etc.) has been available long enough to measure
the ±500 ppm frequency error and instability of typical
motherboard oscillators. No attempt is made to adjust the phase of
CLOCK_MONOTONIC. Its timestamps are guaranteed to be unique and
monotonically increasing during the uptime of the operating system
(but not necessarily across several reboots). CLOCK_MONOTONIC of
course has no leap seconds. CLOCK_MONOTONIC can be identical to
CLOCK_TAI at boot time if TAI is available, but this is not
guaranteed. CLOCK_MONOTONIC can also start its epoch at system startup
or, preferably, with the best TAI (or UTC) estimate that is
available.
- CLOCK_THREAD
- This clock starts its Epoch when the current
thread was created and runs only when the current thread is running on
the CPU. This is execution time, which always progresses more slowly than
the wall clock times represented by the previous clocks.
- CLOCK_PROCESS
- This clock starts its Epoch when the current
process was created and runs only when a thread of the current process
is running on the CPU. This is execution time, which always progresses
more slowly than the wall clock times.
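The slewing figures given for CLOCK_REALTIME follow from simple arithmetic: removing a phase offset by changing the clock rate takes the offset divided by the relative rate change. The sketch below makes this explicit. The function names are hypothetical; the 1% rate limit and the 10 minute threshold are the values quoted in the CLOCK_REALTIME description.

```c
/* Limits taken from the CLOCK_REALTIME description above. */
#define MAX_RATE_CHANGE   0.01     /* frequency may change by up to 1% */
#define MAX_SLEW_OFFSET_S 600.0    /* larger offsets are stepped */

/* Absolute value without pulling in <math.h>. */
static double abs_offset(double offset_s)
{
    return offset_s < 0 ? -offset_s : offset_s;
}

/* Minimum time in seconds needed to remove a phase offset by running
 * the clock fast or slow at the given relative rate change. */
double slew_duration_s(double offset_s, double rate_change)
{
    return abs_offset(offset_s) / rate_change;
}

/* Returns 1 if the offset is too large to slew and the clock should
 * jump directly to the reference, 0 otherwise. */
int should_step(double offset_s)
{
    return abs_offset(offset_s) > MAX_SLEW_OFFSET_S;
}
```

For the 1 s phase shift of a leap second, slew_duration_s(1.0, MAX_RATE_CHANGE) gives the 100 s stated above. An offset at the 10 minute limit needs 60000 s, or about 16.7 hours, at the full 1% rate change.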
Clock control system call
A new clock_control() function should be introduced
into POSIX.1 to provide a standardized interface for programs such as
xntpd that read external reference clock signals and want
to pass them on to the clock driver, as well as for programs that want
to get more information than CLOCK_UTC can provide, for instance
accuracy estimates and leap second warnings. The functionality could
be roughly along the lines of ntp_gettime() and
ntp_adjtime() by Mills, but somewhat more generalized and
less xntpd implementation specific.
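For comparison, the kind of information meant here can already be read through the existing Mills-style interface. The sketch below assumes Linux with its adjtimex() call from <sys/timex.h>, which uses the same struct timex as ntp_adjtime(). It is not the proposed clock_control(), and the clock_report structure and function name are invented for this example.

```c
#include <string.h>
#include <sys/timex.h>

/* Summary of what the kernel clock driver reports (hypothetical type). */
struct clock_report {
    int  state;         /* TIME_OK, TIME_INS, ..., TIME_ERROR, or -1 */
    long maxerror_us;   /* maximum error in microseconds */
    long esterror_us;   /* estimated error in microseconds */
    int  leap_pending;  /* +1 insertion, -1 deletion, 0 none announced */
};

/* Read-only query: with modes == 0 nothing is adjusted, so no special
 * privileges are needed. */
struct clock_report query_kernel_clock(void)
{
    struct timex tx;
    struct clock_report r;

    memset(&tx, 0, sizeof tx);
    r.state        = adjtimex(&tx);   /* -1 on failure */
    r.maxerror_us  = tx.maxerror;
    r.esterror_us  = tx.esterror;
    r.leap_pending = (tx.status & STA_INS) ?  1
                   : (tx.status & STA_DEL) ? -1 : 0;
    return r;
}
```

A return state of TIME_ERROR indicates that the clock is not synchronized, which roughly corresponds to the condition under which CLOCK_UTC would be unavailable.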
... work in progress ...
created 1998-06-07 -- last modified 1998-09-09 --
http://www.cl.cam.ac.uk/~mgk25/posix-clocks.html