fix: write and read figure JSON as UTF-8 #5633

Open
LukeTheoJohnson wants to merge 1 commit into plotly:main from LukeTheoJohnson:fix-json-io-utf8-encoding
LukeTheoJohnson commented Jun 27, 2026

Problem

On platforms whose default text encoding is not UTF-8, pio.write_json(fig, path) raises UnicodeEncodeError when the figure's text contains a character the default codec cannot encode. On Windows the default is typically cp1252 (Western European locales):

import plotly.graph_objects as go
import plotly.io as pio

fig = go.Figure(layout_title_text="μ 中文")   # Greek mu + CJK
pio.write_json(fig, "fig.json")
# UnicodeEncodeError: 'charmap' codec can't encode character 'μ'
#                     in position ...: character maps to <undefined>

read_json has the matching bug: it reads the file without specifying an encoding, so a UTF-8 JSON file is decoded with the platform codec (cp1252 here). The text either comes back mangled or, for byte sequences that cp1252 leaves undefined, raises UnicodeDecodeError. Both calls work on macOS/Linux only because the default encoding there is already UTF-8.
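
Both failure modes can be reproduced without plotly or Windows by naming the codecs explicitly. This is only a sketch of the mechanism, assuming cp1252 as the platform codec; the "Á" example is an illustrative pick, not taken from the report.

```python
# Simulate the platform codec explicitly so this runs on any OS.
title = "μ 中文"  # Greek mu (U+03BC) + CJK, as in the reproduction above

# Write side: cp1252 has no mapping for these characters, so encoding raises.
try:
    title.encode("cp1252")
    write_error = None
except UnicodeEncodeError as exc:
    write_error = exc

# Read side, case 1: UTF-8 bytes decoded as cp1252 succeed but yield mojibake.
mangled = title.encode("utf-8").decode("cp1252")

# Read side, case 2: "Á" is b"\xc3\x81" in UTF-8, and Python's cp1252 codec
# leaves byte 0x81 undefined, so decoding raises instead of mangling.
try:
    "Á".encode("utf-8").decode("cp1252")
    read_error = None
except UnicodeDecodeError as exc:
    read_error = exc

print(type(write_error).__name__, repr(mangled), type(read_error).__name__)
```

The silent mojibake case is the more dangerous of the two on the read side, since nothing signals that the title changed.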

What actually triggers it

Two conditions both have to hold, which is why it is easy to miss:

  1. The text contains a character that is not in the platform codec. Latin-1 accents and common symbols are inside cp1252 — é, ö, °, the micro sign µ (U+00B5), the em-dash — so they do not raise. It takes a character outside cp1252, such as Greek μ (U+03BC), CJK (中文), or an emoji, to trip it.
  2. The active engine is orjson (the default when orjson is installed). orjson emits real UTF-8, so the string handed to write_text still contains the non-ASCII characters. The pure-Python json engine uses ensure_ascii=True, escaping everything to \uXXXX, so its output is pure ASCII — it never trips the codec on write and reads back cleanly.
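
The two conditions can be checked with the standard library alone. In this sketch, fits_cp1252 is a hypothetical helper, and json.dumps(..., ensure_ascii=False) stands in for orjson's unescaped output (an assumption made so the example needs no third-party install).

```python
import json


def fits_cp1252(text: str) -> bool:
    """Return True if every character in text can be encoded as cp1252."""
    try:
        text.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


# Condition 1: which characters the codec can represent.
inside = ["é", "ö", "°", "\u00b5", "\u2014"]     # accents, degree, micro sign, em dash
outside = ["\u03bc", "中文", "\U0001F600"]        # Greek mu, CJK, emoji

# Condition 2: whether the serialized string still contains non-ASCII text.
payload = {"title": "μ 中文"}
escaped = json.dumps(payload)                      # ensure_ascii=True is the default
unescaped = json.dumps(payload, ensure_ascii=False)  # stand-in for orjson output

print([fits_cp1252(c) for c in inside])
print([fits_cp1252(c) for c in outside])
print(escaped.isascii(), fits_cp1252(escaped), fits_cp1252(unescaped))
```

Only the unescaped string fails the cp1252 check, which matches the observation that the bug needs both a character outside the codec and an engine that does not escape it.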

Root cause

plotly/io/_json.py opened the file without an encoding on both sides:

  • write_json: path.write_text(json_str)
  • read_json: path.read_text()

write_html already does this correctly (path.write_text(html_str, "utf-8")), and the same class of bug was previously fixed for HTML output (#3898). The JSON I/O path was simply missed.

Fix

Pass "utf-8" explicitly in both write_json and read_json, matching write_html. RFC 8259 requires UTF-8 for JSON text exchanged between systems that are not part of a closed ecosystem.

- path.write_text(json_str)
+ path.write_text(json_str, "utf-8")
...
- json_str = path.read_text()
+ json_str = path.read_text("utf-8")
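
The effect of the change can be sketched with pathlib alone. The JSON string below is a hand-written stand-in for a serialized figure, not actual plotly output, and the temporary directory is just for illustration.

```python
import tempfile
from pathlib import Path

# Hand-written stand-in for a figure serialized without \uXXXX escaping.
json_str = '{"layout": {"title": {"text": "μ 中文"}}}'

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "fig.json"

    # Same call shape as the patched lines: encoding passed positionally.
    path.write_text(json_str, "utf-8")
    round_tripped = path.read_text("utf-8")

    # The bytes on disk are UTF-8 regardless of the platform default.
    raw_bytes = path.read_bytes()

print(round_tripped == json_str)
```

Because the encoding is explicit on both sides, the result no longer depends on locale.getpreferredencoding().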

Verification

On Windows (locale.getpreferredencoding(False) returns cp1252, orjson installed):

  • Before: write_json of a figure titled μ 中文 raises UnicodeEncodeError: 'charmap' codec can't encode character 'μ'.
  • After: the same figure writes, and read_json returns the original title unchanged.
  • Updated test_write_json_pathlib / test_read_json_from_pathlib assert the "utf-8" argument is passed; both fail on the unpatched source and pass with the fix.
  • tests/test_io/test_to_from_json.py otherwise shows only the pre-existing FigureWidget ImportErrors (anywidget not installed) — no new regressions.
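
The shape of the mock-based assertions can be sketched as follows. This is not the code from the plotly test suite: write_json_sketch and read_json_sketch are hypothetical stand-ins for the patched I/O calls, so the example runs without plotly installed.

```python
from pathlib import Path
from unittest.mock import MagicMock


def write_json_sketch(json_str, path):
    """Hypothetical stand-in for the patched write path."""
    path.write_text(json_str, "utf-8")


def read_json_sketch(path):
    """Hypothetical stand-in for the patched read path."""
    return path.read_text("utf-8")


def check_write_passes_utf8():
    # spec=Path restricts the mock to attributes that Path really has.
    filemock = MagicMock(spec=Path)
    write_json_sketch('{"a": 1}', filemock)
    filemock.write_text.assert_called_once_with('{"a": 1}', "utf-8")


def check_read_passes_utf8():
    filemock = MagicMock(spec=Path)
    filemock.read_text.return_value = '{"a": 1}'
    assert read_json_sketch(filemock) == '{"a": 1}'
    filemock.read_text.assert_called_once_with("utf-8")


check_write_passes_utf8()
check_read_passes_utf8()
```

An implementation that omits the encoding would call write_text with one argument and read_text with none, so assert_called_once_with would fail, which is the behaviour described for the unpatched source.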

Scope

Two-line source change plus test assertions. No behavior change on platforms that already default to UTF-8, or when the json engine is in use.

write_json wrote figure JSON with Path.write_text(json_str) and read_json
read it back with Path.read_text(), both omitting the encoding. On platforms
whose default text encoding is not UTF-8 (e.g. cp1252 on Windows), writing a
figure containing non-ASCII text raised UnicodeEncodeError and reading produced
mojibake. write_html already passes "utf-8" explicitly; apply the same to the
JSON I/O path so figures round-trip everywhere.

Update the existing pathlib mock tests to assert the UTF-8 encoding.
@LukeTheoJohnson LukeTheoJohnson force-pushed the fix-json-io-utf8-encoding branch from b6cbd44 to 1c182c4 Compare June 27, 2026 01:38
@LukeTheoJohnson LukeTheoJohnson marked this pull request as ready for review June 27, 2026 21:50