Date: Mon, 14 Sep 1998 17:12:31 -0700 (PDT)
From: Paul Eggert
To: Deborah Donovan
Cc: Rex Jaeschke, "J.Benito", Clive Feather,
    Time zone mailing list, C9x working group
Subject: Summary of problems with draft C9x <time.h>, and a proposed fix

Rex Jaeschke, NCITS/J11 chair, recently sent me a copy of document
J11/98-048, which contains the disposition by J11 of my public review
comments regarding the C9x standard Committee Draft (CD) 1 (CD 9899,
Ballot Document N2620).  One of my comments was not satisfactorily
resolved, and I'd like to follow up in the hope of improving the
eventual standard.

I'm referring to Comment 14 in US0011 (1998-03-04), my comment about
problems in the struct-tmx-related changes by CD 1 to <time.h>.
Unfortunately, many of these problems remain in CD 2, and I've since
learned of other problems.  I summarize these remaining problems in
Appendix 1 below.

Also, Clive Feather (who I understand is responsible for most of the
changes in CDs 1 and 2) has proposed that a new section be written to
address these problems.  I welcome this proposal, and would like to
contribute.  However, I believe that it's too late in the
standardization process to introduce major improvements to <time.h>,
as there will be insufficient time to gain implementation experience
with these changes, experience that is needed for proper review.

Instead, I propose that <time.h>'s problems be fixed by removing the
struct-tmx-related changes to <time.h>, reverting to the current ISO C
standard (C89); we can then come up with a better <time.h> for the
next standard (C0x).  In other words, I propose the following:

 * Change <time.h> to define only the types and functions that were
   defined in C89's <time.h>, and to remove a new requirement on
   mktime.  Appendix 2 gives the details.

 * Work with Clive Feather and other interested parties to write and
   test a revised <time.h> suitable for inclusion in C0x.

Please let me know of any way that I can further help implement this
proposal.

------------------------------------------------------------

Appendix 1.  Problems in the struct-tmx-related part of <time.h>

Here is a summary of technical problems in the struct-tmx-related part
of CD 2 (1998-08-03), section 7.23.  The problems fall into two basic
areas:

 * struct tmx is not headed in the right direction.

   The struct-tmx-related changes do not address several well-known
   problems with C89 <time.h>, and do not form a good basis for
   addressing these problems.  These problems include the following.

   - Lack of precision.  The standard does not require precise
     timekeeping; typically, time_t has only 1-second precision.

   - Inability to determine properties of time_t.  There's no
     portable way to determine the precision or range of time_t.

   - Poor arithmetic support for the time_t type.  difftime is not
     enough for many practical applications.

   - The new interface is not reentrant.  A common extension to C89
     is the support of reentrant versions of functions like
     localtime.  This extension is part of POSIX.1.  There's no good
     reason (other than historical practice) for time-related
     functions to rely on global state; any new extensions should be
     reentrant.

   - No control over time zones.  There's no portable way for an
     application to inquire about the time in New York, for example,
     even if the implementation supports this.

   - Missing conversions.  There's no way to convert between UTC and
     TAI, or between times in different time zones, or to determine
     which time zone is in use.

   - No reliable interval time scale.  If the clock is adjusted to
     keep in sync with UTC, there's no reliable way for a program to
     ignore this change.

   - One cannot apply strftime to the output of gmtime, as the %Z and
     %z formats may be misinterpreted.

   (Credit: I've borrowed many of the above points from discussions
   by Clive Feather and Markus Kuhn.)

 * struct tmx has several technical problems of its own.

   Even on its own terms, struct tmx has several technical problems
   that would need to be fixed before being made part of a standard.
   These problems include the following.

   - In 7.23.1 paragraph 5, struct tmx's tm_zone member counts
     minutes.  This disagrees with common practice, which is to
     extend struct tm by adding a new member tm_gmtoff that is UTC
     offset in seconds.  The extra precision is needed to support old
     time stamps -- UTC offsets that were not a multiple of one
     minute used to be quite common, and in at least one locale this
     practice did not die out until 1972.

   - The tm_leapsecs member defined by 7.23.1 paragraph 5 is an
     integer, but it is supposed to represent TAI - UTC, and this
     value is not normally an integer for time stamps before 1972.
     Also, it's not clear what this value should be for time stamps
     before the introduction of TAI in the 1950s.

   - The tm_ext and tm_extlen members defined by 7.23.1 paragraph 5
     use a new method to allow for future extensions.  This method
     has never before been tried in the C Standard, and is likely to
     lead to problems in practice.  For example, the draft makes no
     requirement on the storage lifetime of storage addressed by
     tm_ext.  This means that an application cannot reliably
     dereference the pointer returned by zonetime, because it has no
     way of knowing when the tm_ext member points to freed storage.

   - 7.23.2.3 paragraph 4 adds the following requirement for mktime
     not present in C89:

         If the call is successful, a second call to the mktime
         function with the resulting struct tm value shall always
         leave it unchanged and return the same value as the first
         call.

     This requirement was inspired by the struct-tmx-related changes
     to <time.h>, but it requires changes to existing practice, and
     it cannot be implemented without hurting performance or breaking
     binary compatibility.

     For example, suppose I am in Sri Lanka, and invoke mktime on the
     equivalent of 1996-10-26 00:15:00 with tm_isdst==0.  There are
     two distinct valid time_t values for this input, since Sri Lanka
     moved the clock back from 00:30 to 00:00 that day, permanently.
     There is no way to select the time_t by inspecting tm_isdst,
     since both times are standard time.

     On examples like these, C89 allows mktime to return different
     time_t values for the same input at different times during the
     execution of the program.  This is common existing practice, but
     it is prohibited by this new requirement.

     It's possible to satisfy this new requirement by adding a new
     struct tm member, which specifies the UTC offset.  However, this
     would break binary compatibility.

     It's also possible to satisfy this new requirement by always
     returning the earlier time_t value in ambiguous cases.  However,
     this can greatly hurt performance, as it's not easy for some
     implementations to determine that the input is ambiguous; it
     would require scouting around each candidate returned value to
     see whether the value might be ambiguous, and this step would be
     expensive.

   - The limits on ranges for struct tmx members in 7.23.2.6
     paragraph 2 are unreasonably tight.  For example, they disallow
     the following program on a POSIX.1 host with a 32-bit `long',
     since `time (0)' currently returns values above 900000000 on
     POSIX.1 hosts, which is well above the limit
     LONG_MAX/8 == 268435455 imposed by 7.23.2.6.

         #include <stdio.h>
         #include <time.h>

         struct tmx tm;

         int main() {
           char buf[1000];
           time_t t = 0;

           /* Add current time to POSIX.1 epoch, using mkxtime.  */
           tm.tm_version = 1;
           tm.tm_year = 1970 - 1900;
           tm.tm_mday = 1;
           tm.tm_sec = time (0);
           if (mkxtime (&tm) == (time_t) -1)
             return 1;

           strfxtime (buf, sizeof buf, "%Y-%m-%d %H:%M:%S", &tm);
           puts (buf);
           return 0;
         }

     The limits in 7.23.2.6 are not needed.  A mktime implementation
     need not check for overflow on every internal arithmetic
     operation; instead, it can cheaply check for overflow by doing a
     relatively simple test at the end of its calculation.

   - 7.23.2.6 paragraph 3 contains several technical problems:

     . In some cases, it requires mkxtime to behave as if each day
       contains 86400 seconds, even if the implementation supports
       leap seconds.
       For example, if the host supports leap seconds and uses Japan
       time, then using mkxtime to add 1 day to 1999-01-01 00:00:00
       must yield 1999-01-01 23:59:59, because there's a leap second
       at 08:59:60 that day in Japan.  This is not what most
       programmers will want or expect.

     . The explanation starts off with ``Values S and D shall be
       determined as follows'', but the code that follows does not
       _determine_ S and D; it consults an oracle to find X1 and X2,
       which means that the code merely places _constraints_ on S and
       D.  A non-oracular implementation cannot in general determine
       X1 and X2 until it knows S and D, so the code, if interpreted
       as a definition, is a circular one.

     . The code suffers from arithmetic overflow problems.  For
       example, suppose tm_hour == INT_MAX && INT_MAX == 32767.  Then
       tm_hour*3600 overflows, even though tm_hour satisfies the
       limits of paragraph 2.

     . The code does not declare the types of SS, M, Y, Z, D, or S,
       thus leading to confusion.  Clearly these values cannot be of
       type `int', due to potential overflow problems like the one
       discussed above.  It's not clear what type would suffice.

     . The definition for QUOT yields numerically incorrect results
       if either (b)-(a) or (b)-(a)-1 overflows.  Similarly, REM
       yields incorrect results if (b)*QUOT(a,b) overflows.

     . The expression Y*365 + (Z/400)*97 + (Z%400)/4 doesn't match
       the Gregorian calendar, which has special rules for years that
       are multiples of 100.

     . The code is uncommented, so it's hard to understand and
       evaluate.  For example, the epoch (D=0, S=0) is not described;
       it appears to be (-0001)-12-31 Gregorian, but this should be
       cleared up.

   - 7.23.3.7 says that the number of leap seconds is the ``UTC-UT1
     offset''.  It should say ``UTC - TAI''.

------------------------------------------------------------

Appendix 2.  Details of proposed change to <time.h>

Here are the details about my proposed change to <time.h>.
This change reverts the <time.h> part of the standard to define only
the types, functions, and macros that were defined in C89's <time.h>.
It also removes the hard-to-implement requirement in 7.23.2.3
paragraph 4.

 * 7.23.1 paragraph 2.  Remove the macros _NO_LEAP_SECONDS and
   _LOCALTIME.

 * 7.23.1 paragraph 3.  Remove the type `struct tmx'.

 * 7.23.1 paragraph 5 (struct tmx).  Remove this paragraph.

 * 7.23.2.3 paragraph 3 (mktime normalization).  Remove this
   paragraph.

 * 7.23.2.3 paragraph 4.  Remove the phrase ``and return the same
   value''.  It's not feasible to return the same value in some
   cases; see the discussion of 7.23.2.3 paragraph 4 above.

 * 7.23.2.4 (mkxtime).  Remove this section.

 * 7.23.2.6 (normalization of broken-down times).  Remove this
   section; this means footnote 252 will be removed.

 * 7.23.3 paragraph 1.  Remove the reference to strfxtime.

 * 7.23.3.6 (strfxtime).  Remove this section.

 * 7.23.3.7 (zonetime).  Remove this section.
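
------------------------------------------------------------

Illustration for the Gregorian-calendar item in Appendix 1 (7.23.2.6
paragraph 3).  The following is only a minimal sketch, not code from
the draft.  It assumes that Z is a nonnegative count of years and
that the term (Z/400)*97 + (Z%400)/4 is meant to count leap days; the
draft's own definitions of Y and Z are not reproduced here.  The
function names draft_leap_days, gregorian_leap_days, and
first_mismatch are hypothetical and were chosen for this example.

```c
/* Leap-day term as it appears in the draft's expression.
   Assumption: z is a nonnegative count of years.  */
long draft_leap_days(long z)
{
    return (z / 400) * 97 + (z % 400) / 4;
}

/* Leap days in years 1..z under the Gregorian rule: every 4th year
   is a leap year, except multiples of 100, except multiples of 400.  */
long gregorian_leap_days(long z)
{
    return z / 4 - z / 100 + z / 400;
}

/* Return the smallest z in [0, limit) for which the two counts
   differ, or -1 if they agree over the whole range.  */
long first_mismatch(long limit)
{
    long z;

    for (z = 0; z < limit; z++)
        if (draft_leap_days(z) != gregorian_leap_days(z))
            return z;
    return -1;
}
```

Under these assumptions, first_mismatch (800) returns 100: the
draft-style term counts 25 leap days through year 100, whereas the
Gregorian rule counts 24 because year 100 is not a leap year.  The
two counts agree again at multiples of 400.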