`json` --- JSON encoder and decoder¶

Source code: Lib/json/__init__.py

JSON (JavaScript Object Notation), specified by RFC 7159 (which obsoletes RFC 4627) and by ECMA-404, is a lightweight data interchange format inspired by JavaScript object literal syntax (although it is not a strict subset of JavaScript [1] ).

Warning

Be cautious when parsing JSON data from untrusted sources. A malicious JSON string may cause the decoder to consume considerable CPU and memory resources. Limiting the size of data to be parsed is recommended.
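One way to act on this advice is to check the length of the input before handing it to the decoder. The following is a minimal sketch: the helper loads_limited and the cap MAX_JSON_CHARS are hypothetical names chosen for illustration and are not part of the json module.

```python
import json

# Hypothetical cap; choose a value that suits your application.
MAX_JSON_CHARS = 1_000_000

def loads_limited(data, limit=MAX_JSON_CHARS):
    """Parse data only if its length (characters or bytes) is within limit."""
    if len(data) > limit:
        raise ValueError(f"JSON input too large: {len(data)} > {limit}")
    return json.loads(data)

small = loads_limited('{"ok": true}')
print(small)  # {'ok': True}
```

A length check of this kind bounds the work done by the decoder, but it does not by itself guard against deeply nested input.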

json exposes an API familiar to users of the standard library marshal and pickle modules.

Encoding basic Python object hierarchies:

>>> import json
>>> json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])
'["foo", {"bar": ["baz", null, 1.0, 2]}]'
>>> print(json.dumps("\"foo\bar"))
"\"foo\bar"
>>> print(json.dumps('\u1234'))
"\u1234"
>>> print(json.dumps('\\'))
"\\"
>>> print(json.dumps({"c": 0, "b": 0, "a": 0}, sort_keys=True))
{"a": 0, "b": 0, "c": 0}
>>> from io import StringIO
>>> io = StringIO()
>>> json.dump(['streaming API'], io)
>>> io.getvalue()
'["streaming API"]'

Compact encoding:

>>> import json
>>> json.dumps([1, 2, 3, {'4': 5, '6': 7}], separators=(',', ':'))
'[1,2,3,{"4":5,"6":7}]'

Pretty printing:

>>> import json
>>> print(json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4))
{
    "4": 5,
    "6": 7
}

Decoding JSON:

>>> import json
>>> json.loads('["foo", {"bar":["baz", null, 1.0, 2]}]')
['foo', {'bar': ['baz', None, 1.0, 2]}]
>>> json.loads('"\\"foo\\bar"')
'"foo\x08ar'
>>> from io import StringIO
>>> io = StringIO('["streaming API"]')
>>> json.load(io)
['streaming API']

Specializing JSON object decoding:

>>> import json
>>> def as_complex(dct):
...     if '__complex__' in dct:
...         return complex(dct['real'], dct['imag'])
...     return dct
...
>>> json.loads('{"__complex__": true, "real": 1, "imag": 2}',
...     object_hook=as_complex)
(1+2j)
>>> import decimal
>>> json.loads('1.1', parse_float=decimal.Decimal)
Decimal('1.1')

Extending JSONEncoder:

>>> import json
>>> class ComplexEncoder(json.JSONEncoder):
...     def default(self, obj):
...         if isinstance(obj, complex):
...             return [obj.real, obj.imag]
...         # Let the base class default method raise the TypeError
...         return super().default(obj)
...
>>> json.dumps(2 + 1j, cls=ComplexEncoder)
'[2.0, 1.0]'
>>> ComplexEncoder().encode(2 + 1j)
'[2.0, 1.0]'
>>> list(ComplexEncoder().iterencode(2 + 1j))
['[2.0', ', 1.0', ']']

Using json.tool from the shell to validate and pretty-print:

$ echo '{"json":"obj"}' | python -m json.tool
{
    "json": "obj"
}
$ echo '{1.2:3.4}' | python -m json.tool
Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

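See Command Line Interface for detailed documentation.

Note

JSON is a subset of YAML 1.2. The JSON produced by this module's default settings (in particular, the default separators value) is also a subset of YAML 1.0 and 1.1. This module can thus also be used as a YAML serializer.

Note

This module's encoders and decoders preserve input and output order by default. Order is only lost if the underlying containers are unordered.

Basic Usage¶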

json.dump(obj, fp, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)¶

Serialize obj as a JSON formatted stream to fp (a .write()-supporting file-like object) using this conversion table.

If skipkeys is true (default: False), then dict keys that are not of a basic type (str, int, float, bool, None) will be skipped instead of raising a TypeError.
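As a short illustration of skipkeys, the sketch below serializes a dictionary that has one unsupported (tuple) key. The sample data is made up for the example.

```python
import json

data = {"name": "spam", (1, 2): "tuple key"}

# With the default skipkeys=False, the tuple key raises TypeError.
try:
    json.dumps(data)
except TypeError as exc:
    error = str(exc)

# With skipkeys=True, the unsupported key is silently dropped.
skipped = json.dumps(data, skipkeys=True)
print(skipped)  # {"name": "spam"}
```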

The json module always produces str objects, not bytes objects. Therefore, fp.write() must support str input.

If ensure_ascii is true (the default), the output is guaranteed to have all incoming non-ASCII characters escaped. If ensure_ascii is false, these characters will be output as-is.
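The effect of ensure_ascii can be seen by serializing the same string both ways. This is a small sketch with an arbitrary sample string; both results decode back to the same value.

```python
import json

text = "café ☃"

escaped = json.dumps(text)                        # default: ensure_ascii=True
verbatim = json.dumps(text, ensure_ascii=False)   # non-ASCII characters kept

print(escaped)   # "caf\u00e9 \u2603"
print(verbatim)  # "café ☃"
```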

If check_circular is false (default: True), then the circular reference check for container types will be skipped and a circular reference will result in a RecursionError (or worse).
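With the default check_circular=True, a self-referencing container is detected and reported as a ValueError before any deep recursion occurs. A minimal sketch:

```python
import json

cycle = []
cycle.append(cycle)  # the list now contains itself

try:
    json.dumps(cycle)  # check_circular=True by default
except ValueError as exc:
    message = str(exc)

print(message)
```

Passing check_circular=False for the same object is not shown here, since it would recurse until the interpreter gives up.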

If allow_nan is false (default: True), then it will be a ValueError to serialize out of range float values (nan, inf, -inf) in strict compliance of the JSON specification. If allow_nan is true, their JavaScript equivalents (NaN, Infinity, -Infinity) will be used.

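If indent is a non-negative integer or string, then JSON array elements and object members will be pretty-printed with that indent level. An indent level of 0, negative, or "" will only insert newlines. None (the default) selects the most compact representation. Using a positive integer indent indents that many spaces per level. If indent is a string (such as "\t"), that string is used to indent each level.

Changed in version 3.2: Allow strings for indent in addition to integers.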
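The two less common indent forms, a string and 0, can be sketched as follows. The sample data is arbitrary.

```python
import json

data = {"a": [1, 2]}

# A string indent is repeated once per nesting level.
tabbed = json.dumps(data, indent="\t")

# An indent of 0 inserts newlines but no leading whitespace.
zero = json.dumps(data, indent=0)

print(tabbed)
print(zero)
```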

If specified, separators should be an (item_separator, key_separator) tuple. The default is (', ', ': ') if indent is None and (',', ': ') otherwise. To get the most compact JSON representation, you should specify (',', ':') to eliminate whitespace.

Changed in version 3.4: Use (',', ': ') as default if indent is not None.

If specified, default should be a function that gets called for objects that can't otherwise be serialized. It should return a JSON encodable version of the object or raise a TypeError. If not specified, TypeError is raised.
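A default function might look like the sketch below. The name encode_extra and the choice of representations (sorted lists for sets, ISO 8601 strings for dates) are assumptions made for illustration, not conventions defined by this module.

```python
import json
from datetime import date

def encode_extra(obj):
    """Hypothetical default hook: handle sets and dates, reject the rest."""
    if isinstance(obj, (set, frozenset)):
        return sorted(obj)
    if isinstance(obj, date):
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

result = json.dumps({"tags": {"b", "a"}, "day": date(2020, 1, 2)},
                    default=encode_extra)
print(result)  # {"tags": ["a", "b"], "day": "2020-01-02"}
```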

If sort_keys is true (default: False), then the output of dictionaries will be sorted by key.

To use a custom JSONEncoder subclass (e.g. one that overrides the default() method to serialize additional types), specify it with the cls kwarg; otherwise JSONEncoder is used.

Changed in version 3.6: All optional parameters are now keyword-only.

Note

Unlike pickle and marshal, JSON is not a framed protocol, so trying to serialize multiple objects with repeated calls to dump() using the same fp will result in an invalid JSON file.
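If several objects must share one stream, one common workaround is to add the framing yourself by writing one document per line (the convention often called JSON Lines). The sketch below assumes the default indent=None, so that each document contains no newlines of its own.

```python
import io
import json

records = [{"id": 1}, {"id": 2}]

# Write one JSON document per line.
buffer = io.StringIO()
for record in records:
    json.dump(record, buffer)
    buffer.write("\n")

# Read the documents back one line at a time.
buffer.seek(0)
decoded = [json.loads(line) for line in buffer if line.strip()]
print(decoded)  # [{'id': 1}, {'id': 2}]
```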

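json.dumps(obj, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)¶

Serialize obj to a JSON formatted str using this conversion table. The arguments have the same meaning as in dump().

Note

Keys in key/value pairs of JSON are always of the type str. When a dictionary is converted into JSON, all the keys of the dictionary are coerced to strings. As a result, if a dictionary is converted into JSON and then back into a dictionary, the dictionary may not equal the original one. That is, loads(dumps(x)) != x if x has non-string keys.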
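The key coercion described in this note can be observed directly. In this sketch the keys are converted back by hand, which only works because the caller knows they were integers.

```python
import json

original = {1: "one", 2: "two"}

round_tripped = json.loads(json.dumps(original))
print(round_tripped)  # {'1': 'one', '2': 'two'}

# The caller has to restore the key type explicitly.
restored = {int(key): value for key, value in round_tripped.items()}
```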

json.load(fp, *, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)¶

Deserialize fp (a .read()-supporting text file or binary file containing a JSON document) to a Python object using this conversion table.

object_hook is an optional function that will be called with the result of any object literal decoded (a dict). The return value of object_hook will be used instead of the dict. This feature can be used to implement custom decoders (e.g. JSON-RPC class hinting).

object_pairs_hook is an optional function that will be called with the result of any object literal decoded with an ordered list of pairs. The return value of object_pairs_hook will be used instead of the dict. This feature can be used to implement custom decoders. If object_hook is also defined, the object_pairs_hook takes priority.

Changed in version 3.1: Added support for object_pairs_hook.
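To see what object_pairs_hook receives, the sketch below passes list as the hook, so every JSON object comes back as its ordered list of (name, value) pairs. The second call supplies both hooks with placeholder lambdas to show which one takes priority.

```python
import json

doc = '{"b": 1, "a": {"c": 3}}'

# Each object is replaced by its ordered list of pairs, innermost first.
as_pairs = json.loads(doc, object_pairs_hook=list)
print(as_pairs)  # [('b', 1), ('a', [('c', 3)])]

# When both hooks are given, only object_pairs_hook is called.
winner = json.loads(
    doc,
    object_hook=lambda dct: "object_hook",
    object_pairs_hook=lambda pairs: "object_pairs_hook",
)
print(winner)  # object_pairs_hook
```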

parse_float, if specified, will be called with the string of every JSON float to be decoded. By default, this is equivalent to float(num_str). This can be used to use another datatype or parser for JSON floats (e.g. decimal.Decimal).

parse_int, if specified, will be called with the string of every JSON int to be decoded. By default, this is equivalent to int(num_str). This can be used to use another datatype or parser for JSON integers (e.g. float).

Changed in version 3.11: The default parse_int of int() now limits the maximum length of the integer string via the interpreter's integer string conversion length limitation to help avoid denial of service attacks.

parse_constant, if specified, will be called with one of the following strings: '-Infinity', 'Infinity', 'NaN'. This can be used to raise an exception if invalid JSON numbers are encountered.

Changed in version 3.1: parse_constant is no longer called on 'null', 'true', 'false'.

To use a custom JSONDecoder subclass, specify it with the cls kwarg; otherwise JSONDecoder is used. Additional keyword arguments will be passed to the constructor of the class.

If the data being deserialized is not a valid JSON document, a JSONDecodeError will be raised.

Changed in version 3.6: All optional parameters are now keyword-only.

Changed in version 3.6: fp can now be a binary file. The input encoding should be UTF-8, UTF-16 or UTF-32.
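Reading from a binary file can be sketched with io.BytesIO standing in for a real file opened in binary mode. The three encodings below are examples; the encoding is detected from the bytes themselves, so none is passed to load().

```python
import io
import json

document = {"name": "café"}
text = json.dumps(document, ensure_ascii=False)

decoded = {}
for encoding in ("utf-8", "utf-16-le", "utf-32-be"):
    binary_file = io.BytesIO(text.encode(encoding))
    decoded[encoding] = json.load(binary_file)

print(decoded["utf-16-le"])  # {'name': 'café'}
```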

json.loads(s, *, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)¶

Deserialize s (a str, bytes or bytearray instance containing a JSON document) to a Python object using this conversion table.

The other arguments have the same meaning as in load().

If the data being deserialized is not a valid JSON document, a JSONDecodeError will be raised.

Changed in version 3.6: s can now be of type bytes or bytearray. The input encoding should be UTF-8, UTF-16 or UTF-32.

Changed in version 3.9: The keyword argument encoding has been removed.

Encoders and Decoders¶

class json.JSONDecoder(*, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, strict=True, object_pairs_hook=None)¶

Simple JSON decoder.

Performs the following translations in decoding by default:

JSON	Python
object	dict
array	list
string	str
number (int)	int
number (real)	float
true	True
false	False
null	None

It also understands NaN, Infinity, and -Infinity as their corresponding float values, which is outside the JSON spec.

object_hook, if specified, will be called with the result of every JSON object decoded and its return value will be used in place of the given dict. This can be used to provide custom deserializations (e.g. to support JSON-RPC class hinting).

object_pairs_hook, if specified, will be called with the result of every JSON object decoded with an ordered list of pairs. The return value of object_pairs_hook will be used instead of the dict. This feature can be used to implement custom decoders. If object_hook is also defined, the object_pairs_hook takes priority.

Changed in version 3.1: Added support for object_pairs_hook.

parse_float, if specified, will be called with the string of every JSON float to be decoded. By default, this is equivalent to float(num_str). This can be used to use another datatype or parser for JSON floats (e.g. decimal.Decimal).

parse_int, if specified, will be called with the string of every JSON int to be decoded. By default, this is equivalent to int(num_str). This can be used to use another datatype or parser for JSON integers (e.g. float).

parse_constant, if specified, will be called with one of the following strings: '-Infinity', 'Infinity', 'NaN'. This can be used to raise an exception if invalid JSON numbers are encountered.

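If strict is false (True is the default), then control characters will be allowed inside strings. Control characters in this context are those with character codes in the 0-31 range, including '\t' (tab), '\n', '\r' and '\0'.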
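The difference strict makes can be sketched with a JSON string that contains a raw (unescaped) newline. The sample document is made up for the example.

```python
import json

# The \n here is a real newline character inside the JSON string literal.
doc = '"line one\nline two"'

try:
    json.JSONDecoder().decode(doc)  # strict=True by default
except json.JSONDecodeError as exc:
    strict_error = exc.msg

lenient = json.JSONDecoder(strict=False).decode(doc)
print(repr(lenient))  # 'line one\nline two'
```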

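If the data being deserialized is not a valid JSON document, a JSONDecodeError will be raised.

Changed in version 3.6: All parameters are now keyword-only.

decode(s)¶

Return the Python representation of s (a str instance containing a JSON document).

JSONDecodeError will be raised if the given JSON document is not valid.

raw_decode(s)¶

Decode a JSON document from s (a str beginning with a JSON document) and return a 2-tuple of the Python representation and the index in s where the document ended.

This can be used to decode a JSON document from a string that may have extraneous data at the end.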
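As a sketch of how raw_decode() might be used, the hypothetical helper iter_documents below walks through several JSON documents written back to back in one string. It strips whitespace itself because raw_decode() expects the document to start at the first character.

```python
import json

def iter_documents(text):
    """Yield each JSON document found back to back in text (hypothetical helper)."""
    decoder = json.JSONDecoder()
    text = text.lstrip()
    while text:
        obj, end = decoder.raw_decode(text)
        yield obj
        # Drop the decoded document and any whitespace before the next one.
        text = text[end:].lstrip()

obj, end = json.JSONDecoder().raw_decode('{"a": 1} trailing text')
print(obj, end)  # {'a': 1} 8

docs = list(iter_documents('{"a": 1} [2, 3] "x"'))
print(docs)  # [{'a': 1}, [2, 3], 'x']
```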

class json.JSONEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)¶

Extensible JSON encoder for Python data structures.

Supports the following objects and types by default:

Python	JSON
dict	object
list, tuple	array
str	string
int, float, int- and float-derived Enums	number
True	true
False	false
None	null

Changed in version 3.4: Added support for int- and float-derived Enum classes.

To extend this to recognize other objects, subclass and implement a default() method that returns a serializable object for o if possible; otherwise it should call the superclass implementation (to raise TypeError).

If skipkeys is false (the default), a TypeError will be raised when trying to encode keys that are not str, int, float, bool or None. If skipkeys is true, such items are simply skipped.

If ensure_ascii is true (the default), the output is guaranteed to have all incoming non-ASCII characters escaped. If ensure_ascii is false, these characters will be output as-is.

If check_circular is true (the default), then lists, dicts, and custom encoded objects will be checked for circular references during encoding to prevent an infinite recursion (which would cause a RecursionError). Otherwise, no such check takes place.

If allow_nan is true (the default), then NaN, Infinity, and -Infinity will be encoded as such. This behavior is not JSON specification compliant, but is consistent with most JavaScript based encoders and decoders. Otherwise, it will be a ValueError to encode such floats.

If sort_keys is true (default: False), then the output of dictionaries will be sorted by key; this is useful for regression tests, because it makes JSON serializations directly comparable from run to run.
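The regression-test use of sort_keys can be sketched by serializing two equal dictionaries that were built in different orders. The sample data is arbitrary.

```python
import json

first = {"a": 1, "b": 2}
second = {"b": 2, "a": 1}   # equal to first, different insertion order

encoder = json.JSONEncoder(sort_keys=True)

unsorted_match = json.dumps(first) == json.dumps(second)
sorted_match = encoder.encode(first) == encoder.encode(second)
print(unsorted_match, sorted_match)  # False True
```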

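If indent is a non-negative integer or string, then JSON array elements and object members will be pretty-printed with that indent level. An indent level of 0, negative, or "" will only insert newlines. None (the default) selects the most compact representation. Using a positive integer indent indents that many spaces per level. If indent is a string (such as "\t"), that string is used to indent each level.

Changed in version 3.2: Allow strings for indent in addition to integers.

If specified, separators should be an (item_separator, key_separator) tuple. The default is (', ', ': ') if indent is None and (',', ': ') otherwise. To get the most compact JSON representation, you should specify (',', ':') to eliminate whitespace.

Changed in version 3.4: Use (',', ': ') as default if indent is not None.

If specified, default should be a function that gets called for objects that can't otherwise be serialized. It should return a JSON encodable version of the object or raise a TypeError. If not specified, TypeError is raised.

Changed in version 3.6: All parameters are now keyword-only.

default(o)¶

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default() like this: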

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return super().default(o)

encode(o)¶

Return a JSON string representation of a Python data structure, o. For example:

>>> json.JSONEncoder().encode({"foo": ["bar", "baz"]})
'{"foo": ["bar", "baz"]}'

iterencode(o)¶

Encode the given object, o, and yield each string representation as available. For example:

for chunk in json.JSONEncoder().iterencode(bigobject):
    mysocket.write(chunk)

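Exceptions¶

exception json.JSONDecodeError(msg, doc, pos)¶

Subclass of ValueError with the following additional attributes:

msg¶: The unformatted error message.

doc¶: The JSON document being parsed.

pos¶: The start index of doc where parsing failed.

lineno¶: The line corresponding to pos.

colno¶: The column corresponding to pos.

New in version 3.5.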
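These attributes can be used to point at the offending part of the input. The following sketch parses a deliberately broken two-line document and keeps the exception for inspection; the sample text is made up.

```python
import json

bad = '{"a": 1,\n "b": }'   # the value for "b" is missing

try:
    json.loads(bad)
except json.JSONDecodeError as exc:
    caught = exc

# msg is the bare message; lineno, colno and pos locate the failure in doc.
print(f"{caught.msg} at line {caught.lineno}, column {caught.colno}")
print(repr(caught.doc[caught.pos:]))
```

Because JSONDecodeError is a subclass of ValueError, code that catches ValueError also handles it.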

Standard Compliance and Interoperability¶

The JSON format is specified by RFC 7159 and by ECMA-404. This section details this module's level of compliance with the RFC. For simplicity, JSONEncoder and JSONDecoder subclasses, and parameters other than those explicitly mentioned, are not considered.

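This module does not comply with the RFC in a strict fashion, implementing some extensions that are valid JavaScript but not valid JSON. In particular:

Infinite and NaN number values are accepted and output;
Repeated names within an object are accepted, and only the value of the last name-value pair is used.

Since the RFC permits RFC-compliant parsers to accept input texts that are not RFC-compliant, this module's deserializer is technically RFC-compliant under default settings.

Character Encodings¶

The RFC requires that JSON be represented using either UTF-8, UTF-16, or UTF-32, with UTF-8 being the recommended default for maximum interoperability.

As permitted, though not required, by the RFC, this module's serializer sets ensure_ascii=True by default, thus escaping the output so that the resulting strings only contain ASCII characters.

Other than the ensure_ascii parameter, this module is defined strictly in terms of conversion between Python objects and Unicode strings, and thus does not otherwise directly address the issue of character encodings.

The RFC prohibits adding a byte order mark (BOM) to the start of a JSON text, and this module's serializer does not add a BOM to its output. The RFC permits, but does not require, JSON deserializers to ignore an initial BOM in their input. This module's deserializer raises a ValueError when an initial BOM is present.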
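The BOM behavior for str input can be sketched as follows. Decoding the bytes with the utf-8-sig codec, which drops a leading BOM, is one possible way to prepare such input; it is shown here as an assumption about the caller's data rather than a recommendation from this module.

```python
import codecs
import json

payload = codecs.BOM_UTF8 + b'{"a": 1}'

# Decoding as plain UTF-8 keeps the BOM as a leading '\ufeff' character.
text_with_bom = payload.decode("utf-8")
try:
    json.loads(text_with_bom)
    rejected = False
except ValueError:
    rejected = True

# The utf-8-sig codec strips the BOM before the text reaches the decoder.
clean = json.loads(payload.decode("utf-8-sig"))
print(rejected, clean)  # True {'a': 1}
```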

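The RFC does not explicitly forbid JSON strings which contain byte sequences that don't correspond to valid Unicode characters (e.g. unpaired UTF-16 surrogates), but it does note that they may cause interoperability problems. By default, this module accepts and outputs (when present in the original str) code points for such sequences.

Infinite and NaN Number Values¶

The RFC does not permit the representation of infinite or NaN number values. Despite that, by default, this module accepts and outputs Infinity, -Infinity, and NaN as if they were valid JSON number literal values: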

>>> # Neither of these calls raises an exception, but the results are not valid JSON
>>> json.dumps(float('-inf'))
'-Infinity'
>>> json.dumps(float('nan'))
'NaN'
>>> # Same when deserializing
>>> json.loads('-Infinity')
-inf
>>> json.loads('NaN')
nan

In the serializer, the allow_nan parameter can be used to alter this behavior. In the deserializer, the parse_constant parameter can be used to alter this behavior.
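Both parameters can be used to enforce strict handling of these values, as in the sketch below. The function reject_constant is a hypothetical hook written for this example.

```python
import json

def reject_constant(name):
    """Hypothetical parse_constant hook that refuses NaN and the infinities."""
    raise ValueError(f"Invalid JSON constant: {name}")

# Serializer: allow_nan=False turns out-of-range floats into an error.
try:
    json.dumps(float("nan"), allow_nan=False)
    dump_rejected = False
except ValueError:
    dump_rejected = True

# Deserializer: the hook is called with the constant's name as a string.
try:
    json.loads("[NaN]", parse_constant=reject_constant)
    load_rejected = False
except ValueError:
    load_rejected = True

print(dump_rejected, load_rejected)  # True True
```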

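Repeated Names Within an Object¶

The RFC specifies that the names within a JSON object should be unique, but does not mandate how repeated names in JSON objects should be handled. By default, this module does not raise an exception; instead, it ignores all but the last name-value pair for a given name: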

>>> weird_json = '{"x": 1, "x": 2, "x": 3}'
>>> json.loads(weird_json)
{'x': 3}

The object_pairs_hook parameter can be used to alter this behavior.
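For instance, a hook can refuse repeated names instead of silently keeping the last one. The function reject_duplicates below is a hypothetical sketch of that idea.

```python
import json

def reject_duplicates(pairs):
    """Hypothetical object_pairs_hook that raises on a repeated name."""
    result = {}
    for name, value in pairs:
        if name in result:
            raise ValueError(f"Duplicate name in JSON object: {name!r}")
        result[name] = value
    return result

unique = json.loads('{"x": 1, "y": 2}', object_pairs_hook=reject_duplicates)

try:
    json.loads('{"x": 1, "x": 2, "x": 3}', object_pairs_hook=reject_duplicates)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(unique, duplicate_rejected)  # {'x': 1, 'y': 2} True
```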

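Top-level Non-Object, Non-Array Values¶

The old version of JSON specified by the obsoleted RFC 4627 required that the top-level value of a JSON text must be either a JSON object or array (Python dict or list), and could not be a JSON null, boolean, number, or string value. RFC 7159 removed that restriction, and this module does not and has never implemented that restriction in either its serializer or its deserializer.

Regardless, for maximum interoperability, you may wish to voluntarily adhere to the restriction yourself.

Implementation Limitations¶

Some JSON deserializer implementations may set limits on:

the size of accepted JSON texts
the maximum level of nesting of JSON objects and arrays
the range and precision of JSON numbers
the content and maximum length of JSON strings

This module does not impose any such limits beyond those of the relevant Python datatypes themselves or the Python interpreter itself.

When serializing to JSON, beware any such limitations in applications that may consume your JSON. In particular, it is common for JSON numbers to be deserialized into IEEE 754 double precision numbers and thus subject to that representation's range and precision limitations. This is especially relevant when serializing Python int values of extremely large magnitude, or when serializing instances of "exotic" numerical types such as decimal.Decimal.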
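The precision concern can be made concrete with a small sketch. Here float() stands in for a consumer that stores JSON numbers as IEEE 754 doubles, and sending a Decimal as a string via default=str is one assumed convention for preserving its digits, not something this module prescribes.

```python
import json
from decimal import Decimal

big = 2**53 + 1                  # not exactly representable as a double
encoded = json.dumps(big)        # Python writes every digit
as_double = float(encoded)       # what a double-based consumer would store
lost_precision = int(as_double) != big

# Decimal is not serializable by default; default=str sends it as a string.
price = json.dumps({"price": Decimal("0.10")}, default=str)

print(encoded, lost_precision)   # 9007199254740993 True
print(price)                     # {"price": "0.10"}
```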

Command Line Interface¶

Source code: Lib/json/tool.py

The json.tool module provides a simple command line interface to validate and pretty-print JSON objects.

If the optional infile and outfile arguments are not specified, sys.stdin and sys.stdout will be used respectively:

$ echo '{"json": "obj"}' | python -m json.tool
{
    "json": "obj"
}
$ echo '{1.2:3.4}' | python -m json.tool
Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

Changed in version 3.5: The output is now in the same order as the input. Use the --sort-keys option to sort the output of dictionaries alphabetically by key.

Command line options¶

infile¶

The JSON file to be validated or pretty-printed:

$ python -m json.tool mp_films.json
[
    {
        "title": "And Now for Something Completely Different",
        "year": 1971
    },
    {
        "title": "Monty Python and the Holy Grail",
        "year": 1975
    }
]

If infile is not specified, read from sys.stdin.

outfile¶: Write the output of the infile to the given outfile. Otherwise, write it to sys.stdout.

--sort-keys¶: Sort the output of dictionaries alphabetically by key.

New in version 3.5.
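The command line interface can also be driven from a script. The sketch below runs json.tool with --sort-keys in a child process and captures its output; using sys.executable to locate the interpreter is an assumption about the environment.

```python
import subprocess
import sys

# Feed a small document on stdin and collect the pretty-printed result.
result = subprocess.run(
    [sys.executable, "-m", "json.tool", "--sort-keys"],
    input='{"b": 1, "a": 2}',
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout)
```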

--no-ensure-ascii¶: Disable escaping of non-ASCII characters; see json.dumps() for more information.

New in version 3.9.

--json-lines¶: Parse every input line as a separate JSON object.

New in version 3.8.

--indent, --tab, --no-indent, --compact¶: Mutually exclusive options for whitespace control.

New in version 3.9.

-h, --help¶: Show the help message.

Footnotes
