Bug description
A POSIX TZ string requires an offset after the std abbreviation (e.g. EST5). When the
offset is missing and the std field is just an abbreviation (AAA, A, AA, B, ...),
the two zoneinfo implementations disagree:
- The C accelerator (_zoneinfo) raises ValueError: Invalid STD offset.
- The pure-Python parser (zoneinfo._zoneinfo) silently accepts it and builds a
  fixed-offset-0 zone named after the abbreviation.
Reproduced by embedding the footer in a minimal TZif v2 file and loading it through
ZoneInfo.from_file against both implementations:
>>> import io, struct, datetime as dt
>>> import _zoneinfo # C accelerator
>>> import zoneinfo._zoneinfo # pure-Python reference
>>>
>>> def tzif(footer):  # minimal TZif v2: 0 transitions, 1 ttinfo (UTC), footer
...     def block():
...         return (b"TZif" + b"\x32" + b"\x00" * 15
...                 + struct.pack(">6l", 0, 0, 0, 0, 1, 4)
...                 + struct.pack(">lbb", 0, 0, 0) + b"UTC\x00")
...     return block() + block() + b"\n" + footer.encode() + b"\n"
...
>>> # C accelerator: rejects the offset-less std field
>>> _zoneinfo.ZoneInfo.from_file(io.BytesIO(tzif("AAA")), key="AAA")
Traceback (most recent call last):
...
ValueError: Invalid STD offset in b'AAA'
>>>
>>> # pure-Python: accepts it as a fixed offset-0 zone
>>> zi = zoneinfo._zoneinfo.ZoneInfo.from_file(io.BytesIO(tzif("AAA")), key="AAA")
>>> zi.utcoffset(dt.datetime(2025, 1, 15, 12))
datetime.timedelta(0)
>>> zi.tzname(dt.datetime(2025, 1, 15, 12))
'AAA'
The same divergence occurs for A, AA, B, and any other bare std abbreviation.
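To check several footers at once, the comparison above can be wrapped in a small helper. This is only a sketch: the names `accepts` and `compare` are hypothetical, it assumes a rejected footer surfaces as `ValueError`, and it skips the C accelerator when `_zoneinfo` cannot be imported.

```python
import io
import struct
import zoneinfo._zoneinfo as py_zoneinfo  # pure-Python implementation

try:
    import _zoneinfo as c_zoneinfo  # C accelerator; may be missing on some builds
except ImportError:
    c_zoneinfo = None


def tzif(footer):
    """Minimal TZif v2 payload: 0 transitions, 1 ttinfo (UTC), given footer."""
    block = (
        b"TZif" + b"\x32" + b"\x00" * 15
        + struct.pack(">6l", 0, 0, 0, 0, 1, 4)
        + struct.pack(">lbb", 0, 0, 0) + b"UTC\x00"
    )
    # The v1 block and the v2 block are identical here (no transitions).
    return block + block + b"\n" + footer.encode() + b"\n"


def accepts(module, footer):
    """Return True if `module.ZoneInfo.from_file` loads a file with this footer."""
    try:
        module.ZoneInfo.from_file(io.BytesIO(tzif(footer)), key=footer)
    except ValueError:
        return False
    return True


def compare(footers):
    """Map each footer to {implementation name: accepted?}."""
    impls = {"python": py_zoneinfo}
    if c_zoneinfo is not None:
        impls["c"] = c_zoneinfo
    return {f: {name: accepts(mod, f) for name, mod in impls.items()}
            for f in footers}


if __name__ == "__main__":
    for footer, verdict in compare(["AAA", "A", "AA", "B", "EST5"]).items():
        print(footer, verdict)
```

Any footer for which the two verdicts differ is a parity gap between the implementations.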
Root cause
The cause is the std-offset branch of _parse_tz_str in Lib/zoneinfo/_zoneinfo.py
(around L669-675 on main):
if std_offset := m.group("stdoff"):
    try:
        std_offset = _parse_tz_delta(std_offset)
    except ValueError as e:
        raise ValueError(f"Invalid STD offset in {tz_str}") from e
else:
    std_offset = 0  # <-- treats a missing std offset as 0
When the regex captures a std abbreviation but no stdoff group, the else branch
defaults the offset to 0 instead of raising. The C accelerator has no such default and
raises Invalid STD offset when the std offset is absent.
Spec
POSIX.1-2024 (Issue 8), §8.3 "Other Environment Variables" (the TZ rule format) gives
the std field as std offset and states that the offset following std shall be
required; only the dst offset is optional (DST then defaults to one hour ahead of std).
RFC 8536 §3.3 ("TZif Footer"), which governs the embedded TZ-string footer of a TZif file,
specifies that the footer uses the POSIX TZ grammar from Base Definitions §8.3; its
extensions (§3.3.1) only widen the offset hour range and add year-round-DST syntax; none
relax the required std offset. So a footer like AAA is not a valid POSIX TZ string, and
the C accelerator's rejection is the correct behavior. (For reference, macOS libc likewise
treats TZ=AAA as invalid and falls back to UTC: tzname=('UTC', 'UTC'), offset 0.)
Suggested fix
Make the pure-Python parser reject a missing std offset, matching the C accelerator. The
else branch becomes a raise with the same message wording the accelerator uses:
else:
    raise ValueError(f"Invalid STD offset in {tz_str}")
This is non-breaking for real data: across the full IANA database (598 zones loaded
through the pure parser) no zone is newly rejected, and well-formed strings such as EST5,
<ABC>5, and AAA5 continue to parse identically on both implementations. A PR with the
one-line fix and a regression test (covering AAA, A, AA, B in test_invalid_tzstr,
which runs against both TZStrTest and CTZStrTest) follows.
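The intended strict behavior can be sketched in isolation. The snippet below is not CPython's parser: `_STD_RE` and `parse_std` are hypothetical, and the grammar is deliberately simplified to the leading std/offset pair (no dst field, no transition rules). It only illustrates that a missing offset raises while well-formed strings still parse.

```python
import re
from datetime import timedelta

# Hypothetical, simplified grammar: a std abbreviation (bare or <quoted>)
# followed by an optional offset. The offset is optional in the regex so
# that its absence can be reported with a specific error message.
_STD_RE = re.compile(
    r"(?P<std>[^<0-9:.+-]+|<[a-zA-Z0-9+-]+>)"
    r"(?P<stdoff>[+-]?\d{1,3}(?::\d{2}(?::\d{2})?)?)?"
)


def parse_std(tz_str):
    """Return (abbreviation, UTC offset) or raise ValueError."""
    m = _STD_RE.fullmatch(tz_str)
    if m is None:
        raise ValueError(f"{tz_str} is not a valid TZ string")

    stdoff = m.group("stdoff")
    if stdoff is None:
        # Strict behavior: a bare abbreviation is rejected, not defaulted to 0.
        raise ValueError(f"Invalid STD offset in {tz_str}")

    # POSIX offsets are inverted: a positive value means west of UTC.
    sign = 1 if stdoff.startswith("-") else -1
    parts = [int(p) for p in stdoff.lstrip("+-").split(":")]
    hours, minutes, seconds = parts + [0] * (3 - len(parts))
    total = hours * 3600 + minutes * 60 + seconds

    return m.group("std").strip("<>"), timedelta(seconds=sign * total)
```

With this sketch, `parse_std("EST5")` yields the abbreviation `EST` with an offset of five hours west of UTC, while `parse_std("AAA")` raises `ValueError`.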
Environment
- Reproduced on main (3.16.0a0).
- The pure-Python parser is used whenever the _zoneinfo C extension is unavailable, and
  is also reachable directly via zoneinfo._zoneinfo.
- The same else: std_offset = 0 code is present on 3.13, 3.14, and 3.15, which are
  likewise affected.
This is a correctness/parity issue, not a security issue.
Linked PRs