gh-152248: Reject a POSIX TZ abbreviation with non-ASCII-letter characters in pure-Python zoneinfo #152249

Open
tonghuaroot wants to merge 6 commits into python:main from tonghuaroot:gh-152248-zoneinfo-abbr-charset

tonghuaroot commented Jun 26, 2026

Contributor

The pure-Python zoneinfo parser accepts a POSIX TZ string whose unquoted std/dst abbreviation contains characters other than ASCII letters (for example, an embedded space or a non-ASCII letter), while the C implementation rejects it. The cause is the unquoted alternative in the parser regex: it is a negated class ([^<0-9:.+-]+) that admits anything except a few delimiters. The C parse_abbr instead walks the unquoted form with Py_ISALPHA (ASCII letters only), as POSIX (via RFC 8536) requires for the unquoted form.

This tightens the unquoted alternative to [a-zA-Z]+, matching the C accelerator and POSIX, and leaves the quoted <...> form untouched. Every well-formed TZ string and all bundled IANA zones still parse unchanged; only malformed strings that were previously accepted now raise ValueError.
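To make the difference between the two character classes concrete, here is a minimal sketch that checks only the unquoted-abbreviation alternative in isolation. The names `OLD_UNQUOTED`, `NEW_UNQUOTED`, and `accepted` are hypothetical and used for illustration; this is not the full parser regex from Lib/zoneinfo/_zoneinfo.py.

```python
import re

# Simplified stand-ins for the unquoted-abbreviation alternative only.
# Assumption: the rest of the parser regex is irrelevant to this comparison.
OLD_UNQUOTED = re.compile(r"[^<0-9:.+-]+")  # negated class quoted in the description
NEW_UNQUOTED = re.compile(r"[a-zA-Z]+")     # tightened class proposed in this PR


def accepted(pattern, abbr):
    """Return True if the whole abbreviation matches the given pattern."""
    return pattern.fullmatch(abbr) is not None


# A well-formed abbreviation, an embedded space, a non-ASCII letter,
# and an ASCII non-letter.
samples = ["EST", "A B", "A\u00c9C", "E_T"]

for abbr in samples:
    print(repr(abbr), accepted(OLD_UNQUOTED, abbr), accepted(NEW_UNQUOTED, abbr))
```

Under these assumptions, both patterns accept `EST`, while only the negated class accepts the other three samples.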

The non-ASCII case is reachable through the public from_file path, which UTF-8-decodes the footer, so it is covered by a dedicated regression test in addition to the whitespace cases added to the shared invalid_tzstrs list.
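As a rough illustration of how a non-ASCII letter can arrive in the TZ string, the sketch below decodes a made-up footer and compares Python's Unicode-aware `str.isalpha()` with an ASCII-only check in the spirit of Py_ISALPHA. The footer bytes and the helpers `footer_tz_string` and `is_ascii_letters` are hypothetical; the real reading logic lives in the zoneinfo loader and is not reproduced here.

```python
# Hypothetical footer: a TZ string enclosed in newlines and encoded as UTF-8.
footer = "\nA\u00c9C3\n".encode("utf-8")


def footer_tz_string(raw: bytes) -> str:
    """Strip the enclosing newlines and decode the TZ string as UTF-8."""
    return raw.strip(b"\n").decode("utf-8")


def is_ascii_letters(text: str) -> bool:
    """ASCII-only letter check, analogous in spirit to walking with Py_ISALPHA."""
    return bool(text) and all(
        ("a" <= ch <= "z") or ("A" <= ch <= "Z") for ch in text
    )


tz_str = footer_tz_string(footer)
# Crude split for illustration: drop the trailing offset digits.
abbr = tz_str.rstrip("0123456789")

print(repr(abbr), abbr.isalpha(), is_ascii_letters(abbr))
```

Here `abbr.isalpha()` is true because it follows Unicode rules, while the ASCII-only check is false, which is the mismatch the dedicated regression test guards against.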

Comment thread Lib/test/test_zoneinfo/test_zoneinfo.py Outdated
Comment thread Lib/zoneinfo/_zoneinfo.py
Comment thread Misc/NEWS.d/next/Library/2026-06-26-13-39-11.gh-issue-152248.N2Rmaf.rst Outdated
Comment thread Lib/test/test_zoneinfo/test_zoneinfo.py Outdated
Co-authored-by: Stan Ulbrych <stan@python.org>