jsonschema#

PyPI version Supported Python versions Build status ReadTheDocs status pre-commit.ci status Zenodo

jsonschema is an implementation of the JSON Schema specification for Python.

>>> from jsonschema import validate

>>> # A sample schema, like what we'd get from json.load()
>>> schema = {
...     "type" : "object",
...     "properties" : {
...         "price" : {"type" : "number"},
...         "name" : {"type" : "string"},
...     },
... }

>>> # If no exception is raised by validate(), the instance is valid.
>>> validate(instance={"name" : "Eggs", "price" : 34.99}, schema=schema)

>>> validate(
...     instance={"name" : "Eggs", "price" : "Invalid"}, schema=schema,
... )                                   
Traceback (most recent call last):
    ...
ValidationError: 'Invalid' is not of type 'number'

It can also be used from console:

$ jsonschema --instance sample.json sample.schema

Features#

Installation#

jsonschema is available on PyPI. You can install using pip:

$ pip install jsonschema

Running the Test Suite#

If you have tox installed (perhaps via pip install tox or your package manager), running tox in the directory of your source checkout will run jsonschema’s test suite on all of the versions of Python jsonschema supports. If you don’t have all of the versions that jsonschema is tested under, you’ll likely want to run using tox’s --skip-missing-interpreters option.

Of course you’re also free to just run the tests on a single version with your favorite test runner. The tests live in the jsonschema.tests package.

Benchmarks#

jsonschema’s benchmarks make use of pyperf. Running them can be done via:

$ tox -e perf

Community#

The JSON Schema specification has a Slack, with an invite link on its home page. Many folks knowledgeable on authoring schemas can be found there.

Otherwise, asking questions on Stack Overflow is another means of getting help if you’re stuck.

About#

I’m Julian Berman.

jsonschema is on GitHub.

Get in touch, via GitHub or otherwise, if you’ve got something to contribute, it’d be most welcome!

You can also generally find me on Libera (nick: Julian) in various channels, including #python.

If you feel overwhelmingly grateful, you can also sponsor me.

And for companies who appreciate jsonschema and its continued support and growth, jsonschema is also now supportable via TideLift.

Contents#

Schema Validation#

The Basics#

The simplest way to validate an instance under a given schema is to use the validate() function.

jsonschema.validate(instance, schema, cls=None, *args, **kwargs)[source]#

Validate an instance under the given schema.

>>> validate([2, 3, 4], {"maxItems": 2})
Traceback (most recent call last):
    ...
ValidationError: [2, 3, 4] is too long

validate() will first verify that the provided schema is itself valid, since not doing so can lead to less obvious error messages and fail in less obvious or consistent ways.

If you know you have a valid schema already, especially if you intend to validate multiple instances with the same schema, you likely would prefer using the Validator.validate method directly on a specific validator (e.g. Draft7Validator.validate).

Parameters:
  • instance – The instance to validate

  • schema – The schema to validate with

  • cls (Validator) – The class that will be used to validate the instance.

If the cls argument is not provided, two things will happen in accordance with the specification. First, if the schema has a $schema keyword containing a known meta-schema [1] then the proper validator will be used. The specification recommends that all schemas contain $schema properties for this reason. If no $schema property is found, the default validator class is the latest released draft.

Any other provided positional and keyword arguments will be passed on when instantiating the cls.

Raises:

Footnotes

The Validator Protocol#

jsonschema defines a protocol that all validator classes should adhere to.

class jsonschema.protocols.Validator(*args, **kwargs)[source]#

The protocol to which all validator classes should adhere.

Parameters:
  • schema – the schema that the validator object will validate with. It is assumed to be valid, and providing an invalid schema can lead to undefined behavior. See Validator.check_schema to validate a schema first.

  • resolver – an instance of jsonschema.RefResolver that will be used to resolve $ref properties (JSON references). If unprovided, one will be created.

  • format_checker – an instance of jsonschema.FormatChecker whose jsonschema.FormatChecker.conforms method will be called to check and see if instances conform to each format property present in the schema. If unprovided, no validation will be done for format. Certain formats require additional packages to be installed (ipv5, uri, color, date-time). The required packages can be found at the bottom of this page.

FORMAT_CHECKER: ClassVar[jsonschema.FormatChecker]#

A jsonschema.FormatChecker that will be used when validating format properties in JSON schemas.

META_SCHEMA: ClassVar[dict]#

An object representing the validator’s meta schema (the schema that describes valid schemas in the given version).

TYPE_CHECKER: ClassVar[jsonschema.TypeChecker]#

A jsonschema.TypeChecker that will be used when validating type keywords in JSON schemas.

VALIDATORS: ClassVar[dict]#

A mapping of validation keywords (strs) to functions that validate the keyword with that name. For more information see Creating or Extending Validator Classes.

classmethod check_schema(schema)[source]#

Validate the given schema against the validator’s META_SCHEMA.

Raises:

jsonschema.exceptions.SchemaError if the schema is invalid

Return type:

None

evolve(**kwargs)[source]#

Create a new validator like this one, but with given changes.

Preserves all other attributes, so can be used to e.g. create a validator with a different schema but with the same $ref resolution behavior.

>>> validator = Draft202012Validator({})
>>> validator.evolve(schema={"type": "number"})
Draft202012Validator(schema={'type': 'number'}, format_checker=None)

The returned object satisfies the validator protocol, but may not be of the same concrete class! In particular this occurs when a $ref occurs to a schema with a different $schema than this one (i.e. for a different draft).

>>> validator.evolve(
...     schema={"$schema": Draft7Validator.META_SCHEMA["$id"]}
... )
Draft7Validator(schema=..., format_checker=None)
Return type:

Validator

is_type(instance, type)[source]#

Check if the instance is of the given (JSON Schema) type.

Return type:

bool

Raises:

jsonschema.exceptions.UnknownType if type is not a known type.

is_valid(instance)[source]#

Check if the instance is valid under the current schema.

Return type:

bool

>>> schema = {"maxItems" : 2}
>>> Draft202012Validator(schema).is_valid([2, 3, 4])
False
iter_errors(instance)[source]#

Lazily yield each of the validation errors in the given instance.

Return type:

an collections.abc.Iterable of jsonschema.exceptions.ValidationErrors

>>> schema = {
...     "type" : "array",
...     "items" : {"enum" : [1, 2, 3]},
...     "maxItems" : 2,
... }
>>> v = Draft202012Validator(schema)
>>> for error in sorted(v.iter_errors([2, 3, 4]), key=str):
...     print(error.message)
4 is not one of [1, 2, 3]
[2, 3, 4] is too long
schema: dict | bool#

The schema that was passed in when initializing the object.

validate(instance)[source]#

Check if the instance is valid under the current schema.

Raises:

jsonschema.exceptions.ValidationError if the instance is invalid

>>> schema = {"maxItems" : 2}
>>> Draft202012Validator(schema).validate([2, 3, 4])
Traceback (most recent call last):
    ...
ValidationError: [2, 3, 4] is too long
Return type:

None

All of the versioned validators that are included with jsonschema adhere to the protocol, and implementers of validator classes that extend or complement the ones included should adhere to it as well. For more information see Creating or Extending Validator Classes.

Type Checking#

To handle JSON Schema’s type keyword, a Validator uses an associated TypeChecker. The type checker provides an immutable mapping between names of types and functions that can test if an instance is of that type. The defaults are suitable for most users - each of the versioned validators that are included with jsonschema have a TypeChecker that can correctly handle their respective versions.

See also

Validating With Additional Types

For an example of providing a custom type check.

class jsonschema.TypeChecker(type_checkers=pmap({}))[source]#

A type property checker.

A TypeChecker performs type checking for a Validator. Type checks to perform are updated using TypeChecker.redefine or TypeChecker.redefine_many and removed via TypeChecker.remove. Each of these return a new TypeChecker object.

Parameters:

type_checkers (dict) – The initial mapping of types to their checking functions.

is_type(instance, type)[source]#

Check if the instance is of the appropriate type.

Parameters:
  • instance (object) – The instance to check

  • type (str) – The name of the type that is expected.

Returns:

Whether it conformed.

Return type:

bool

Raises:

jsonschema.exceptions.UndefinedTypeCheck – if type is unknown to this object.

redefine(type, fn)[source]#

Produce a new checker with the given type redefined.

Parameters:
  • type (str) – The name of the type to check.

  • fn (collections.abc.Callable) – A function taking exactly two parameters - the type checker calling the function and the instance to check. The function should return true if instance is of this type and false otherwise.

Returns:

A new TypeChecker instance.

redefine_many(definitions=())[source]#

Produce a new checker with the given types redefined.

Parameters:

definitions (dict) – A dictionary mapping types to their checking functions.

Returns:

A new TypeChecker instance.

remove(*types)[source]#

Produce a new checker with the given types forgotten.

Parameters:

types (Iterable) – the names of the types to remove.

Returns:

A new TypeChecker instance

Raises:

jsonschema.exceptions.UndefinedTypeCheck – if any given type is unknown to this object

exception jsonschema.exceptions.UndefinedTypeCheck(type)[source]#

A type checker was asked to check a type it did not have registered.

Raised when trying to remove a type check that is not known to this TypeChecker, or when calling jsonschema.TypeChecker.is_type directly.

Validating With Additional Types#

Occasionally it can be useful to provide additional or alternate types when validating JSON Schema’s type keyword.

jsonschema tries to strike a balance between performance in the common case and generality. For instance, JSON Schema defines a number type, which can be validated with a schema such as {"type" : "number"}. By default, this will accept instances of Python numbers.Number. This includes in particular ints and floats, along with decimal.Decimal objects, complex numbers etc. For integer and object, however, rather than checking for numbers.Integral and collections.abc.Mapping, jsonschema simply checks for int and dict, since the more general instance checks can introduce significant slowdown, especially given how common validating these types are.

If you do want the generality, or just want to add a few specific additional types as being acceptable for a validator object, then you should update an existing TypeChecker or create a new one. You may then create a new Validator via jsonschema.validators.extend.

from jsonschema import validators

class MyInteger(object):
    pass

def is_my_int(checker, instance):
    return (
        Draft202012Validator.TYPE_CHECKER.is_type(instance, "number") or
        isinstance(instance, MyInteger)
    )

type_checker = Draft202012Validator.TYPE_CHECKER.redefine(
    "number", is_my_int,
)

CustomValidator = validators.extend(
    Draft202012Validator,
    type_checker=type_checker,
)
validator = CustomValidator(schema={"type" : "number"})
exception jsonschema.exceptions.UnknownType(type, instance, schema)[source]#

A validator was asked to validate an instance against an unknown type.

Versioned Validators#

jsonschema ships with validator classes for various versions of the JSON Schema specification. For details on the methods and attributes that each validator class provides see the Validator protocol, which each included validator class implements.

class jsonschema.Draft202012Validator(schema, resolver=None, format_checker=None)#
class jsonschema.Draft201909Validator(schema, resolver=None, format_checker=None)#
class jsonschema.Draft7Validator(schema, resolver=None, format_checker=None)#
class jsonschema.Draft6Validator(schema, resolver=None, format_checker=None)#
class jsonschema.Draft4Validator(schema, resolver=None, format_checker=None)#
class jsonschema.Draft3Validator(schema, resolver=None, format_checker=None)#

For example, if you wanted to validate a schema you created against the Draft 2020-12 meta-schema, you could use:

from jsonschema import Draft202012Validator

schema = {
    "$schema": Draft202012Validator.META_SCHEMA["$id"],

    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "email": {"type": "string"},
    },
    "required": ["email"]
}
Draft202012Validator.check_schema(schema)

Validating Formats#

JSON Schema defines the format keyword which can be used to check if primitive types (strings, numbers, booleans) conform to well-defined formats. By default, no validation is enforced, but optionally, validation can be enabled by hooking in a format-checking object into an Validator.

>>> validate("127.0.0.1", {"format" : "ipv4"})
>>> validate(
...     instance="-12",
...     schema={"format" : "ipv4"},
...     format_checker=draft202012_format_checker,
... )
Traceback (most recent call last):
    ...
ValidationError: "-12" is not a "ipv4"
class jsonschema.FormatChecker(formats=None)[source]#

A format property checker.

JSON Schema does not mandate that the format property actually do any validation. If validation is desired however, instances of this class can be hooked into validators to enable format validation.

FormatChecker objects always return True when asked about formats that they do not know how to validate.

To check a custom format using a function that takes an instance and returns a bool, use the FormatChecker.checks or FormatChecker.cls_checks decorators.

Parameters:

formats (Iterable) – The known formats to validate. This argument can be used to limit which formats will be used during validation.

checkers#

A mapping of currently known formats to tuple of functions that validate them and errors that should be caught. New checkers can be added and removed either per-instance or globally for all checkers using the FormatChecker.checks or FormatChecker.cls_checks decorators respectively.

classmethod cls_checks(format, raises=())[source]#

Register a decorated function as globally validating a new format.

Any instance created after this function is called will pick up the supplied checker.

Parameters:
  • format (str) – the format that the decorated function will check

  • raises (Exception) – the exception(s) raised by the decorated function when an invalid instance is found. The exception object will be accessible as the jsonschema.exceptions.ValidationError.cause attribute of the resulting validation error.

check(instance, format)[source]#

Check whether the instance conforms to the given format.

Parameters:
  • instance (any primitive type, i.e. str, number, bool) – The instance to check

  • format (str) – The format that instance should conform to

Raises:

FormatError – if the instance does not conform to format

Return type:

None

checks(format, raises=())[source]#

Register a decorated function as validating a new format.

Parameters:
  • format (str) – The format that the decorated function will check.

  • raises (Exception) –

    The exception(s) raised by the decorated function when an invalid instance is found.

    The exception object will be accessible as the jsonschema.exceptions.ValidationError.cause attribute of the resulting validation error.

Return type:

Callable[[TypeVar(_F, bound= Callable[[object], bool])], TypeVar(_F, bound= Callable[[object], bool])]

conforms(instance, format)[source]#

Check whether the instance conforms to the given format.

Parameters:
  • instance (any primitive type, i.e. str, number, bool) – The instance to check

  • format (str) – The format that instance should conform to

Returns:

whether it conformed

Return type:

bool

exception jsonschema.FormatError(message, cause=None)[source]#

Validating a format failed.

There are a number of default checkers that FormatCheckers know how to validate. Their names can be viewed by inspecting the FormatChecker.checkers attribute. Certain checkers will only be available if an appropriate package is available for use. The easiest way to ensure you have what is needed is to install jsonschema using the format or format_nongpl collection of optional dependencies – e.g.

$ pip install jsonschema[format]

which will install all of the below dependencies for all formats.

Or if you want to install MIT-license compatible dependencies only:

$ pip install jsonschema[format_nongpl]

The non-GPL extra is intended to not install any direct dependencies that are GPL (but that of course end-users should do their own verification). At the moment, it supports all the available checkers except for iri and iri-reference.

The more specific list of available checkers, along with their requirement (if any,) are listed below.

Note

If the following packages are not installed when using a checker that requires it, validation will succeed without throwing an error, as specified by the JSON Schema specification.

Checker

Notes

color

requires webcolors

date

date-time

requires rfc3339-validator

duration

requires isoduration

email

hostname

requires fqdn

idn-hostname

requires idna

ipv4

ipv6

OS must have socket.inet_pton function

iri

requires rfc3987

iri-reference

requires rfc3987

json-pointer

requires jsonpointer

regex

relative-json-pointer

requires jsonpointer

time

requires rfc3339-validator

uri

requires rfc3987 or rfc3986-validator

uri-reference

requires rfc3987 or rfc3986-validator

uri-template

requires uri-template

Note

Since in most cases “validating” an email address is an attempt instead to confirm that mail sent to it will deliver to a recipient, and that that recipient is the correct one the email is intended for, and since many valid email addresses are in many places incorrectly rejected, and many invalid email addresses are in many places incorrectly accepted, the email format keyword only provides a sanity check, not full rfc5322 validation.

The same applies to the idn-email format.

Handling Validation Errors#

When an invalid instance is encountered, a ValidationError will be raised or returned, depending on which method or function is used.

exception jsonschema.exceptions.ValidationError(message, validator=<unset>, path=(), cause=None, context=(), validator_value=<unset>, instance=<unset>, schema=<unset>, schema_path=(), parent=None, type_checker=<unset>)[source]#

An instance was invalid under a provided schema.

The information carried by an error roughly breaks down into:

What Happened

Why Did It Happen

What Was Being Validated

message

context

cause

instance

json_path

path

schema

schema_path

validator

validator_value

message#

A human readable message explaining the error.

validator#

The name of the failed keyword.

validator_value#

The associated value for the failed keyword in the schema.

schema#

The full schema that this error came from. This is potentially a subschema from within the schema that was passed in originally, or even an entirely different schema if a $ref was followed.

relative_schema_path#

A collections.deque containing the path to the failed keyword within the schema.

absolute_schema_path#

A collections.deque containing the path to the failed keyword within the schema, but always relative to the original schema as opposed to any subschema (i.e. the one originally passed into a validator class, not schema).

schema_path#

Same as relative_schema_path.

relative_path#

A collections.deque containing the path to the offending element within the instance. The deque can be empty if the error happened at the root of the instance.

absolute_path#

A collections.deque containing the path to the offending element within the instance. The absolute path is always relative to the original instance that was validated (i.e. the one passed into a validation method, not instance). The deque can be empty if the error happened at the root of the instance.

json_path#

A JSON path to the offending element within the instance.

path#

Same as relative_path.

instance#

The instance that was being validated. This will differ from the instance originally passed into validate if the validator object was in the process of validating a (possibly nested) element within the top-level instance. The path within the top-level instance (i.e. ValidationError.path) could be used to find this object, but it is provided for convenience.

context#

If the error was caused by errors in subschemas, the list of errors from the subschemas will be available on this property. The schema_path and path of these errors will be relative to the parent error.

cause#

If the error was caused by a non-validation error, the exception object will be here. Currently this is only used for the exception raised by a failed format checker in jsonschema.FormatChecker.check.

parent#

A validation error which this error is the context of. None if there wasn’t one.

In case an invalid schema itself is encountered, a SchemaError is raised.

exception jsonschema.exceptions.SchemaError(message, validator=<unset>, path=(), cause=None, context=(), validator_value=<unset>, instance=<unset>, schema=<unset>, schema_path=(), parent=None, type_checker=<unset>)[source]#

A schema was invalid under its corresponding metaschema.

The same attributes are present as for ValidationErrors.

These attributes can be clarified with a short example:

schema = {
    "items": {
        "anyOf": [
            {"type": "string", "maxLength": 2},
            {"type": "integer", "minimum": 5}
        ]
    }
}
instance = [{}, 3, "foo"]
v = Draft202012Validator(schema)
errors = sorted(v.iter_errors(instance), key=lambda e: e.path)

The error messages in this situation are not very helpful on their own.

for error in errors:
    print(error.message)

outputs:

{} is not valid under any of the given schemas
3 is not valid under any of the given schemas
'foo' is not valid under any of the given schemas

If we look at ValidationError.path on each of the errors, we can find out which elements in the instance correspond to each of the errors. In this example, ValidationError.path will have only one element, which will be the index in our list.

for error in errors:
    print(list(error.path))
[0]
[1]
[2]

Since our schema contained nested subschemas, it can be helpful to look at the specific part of the instance and subschema that caused each of the errors. This can be seen with the ValidationError.instance and ValidationError.schema attributes.

With keywords like anyOf, the ValidationError.context attribute can be used to see the sub-errors which caused the failure. Since these errors actually came from two separate subschemas, it can be helpful to look at the ValidationError.schema_path attribute as well to see where exactly in the schema each of these errors come from. In the case of sub-errors from the ValidationError.context attribute, this path will be relative to the ValidationError.schema_path of the parent error.

for error in errors:
    for suberror in sorted(error.context, key=lambda e: e.schema_path):
        print(list(suberror.schema_path), suberror.message, sep=", ")
[0, 'type'], {} is not of type 'string'
[1, 'type'], {} is not of type 'integer'
[0, 'type'], 3 is not of type 'string'
[1, 'minimum'], 3 is less than the minimum of 5
[0, 'maxLength'], 'foo' is too long
[1, 'type'], 'foo' is not of type 'integer'

The string representation of an error combines some of these attributes for easier debugging.

print(errors[1])
3 is not valid under any of the given schemas

Failed validating 'anyOf' in schema['items']:
    {'anyOf': [{'maxLength': 2, 'type': 'string'},
               {'minimum': 5, 'type': 'integer'}]}

On instance[1]:
    3

ErrorTrees#

If you want to programmatically query which validation keywords failed when validating a given instance, you may want to do so using jsonschema.exceptions.ErrorTree objects.

class jsonschema.exceptions.ErrorTree(errors=())[source]#

ErrorTrees make it easier to check which validations failed.

errors#

The mapping of validator keywords to the error objects (usually jsonschema.exceptions.ValidationErrors) at this level of the tree.

__contains__(index)[source]#

Check whether instance[index] has any errors.

__getitem__(index)[source]#

Retrieve the child tree one level down at the given index.

If the index is not in the instance that this tree corresponds to and is not known by this tree, whatever error would be raised by instance.__getitem__ will be propagated (usually this is some subclass of LookupError.

__init__(errors=())[source]#
__iter__()[source]#

Iterate (non-recursively) over the indices in the instance with errors.

__len__()[source]#

Return the total_errors.

__repr__()[source]#

Return repr(self).

__setitem__(index, value)[source]#

Add an error to the tree at the given index.

property total_errors#

The total number of errors in the entire tree, including children.

Consider the following example:

schema = {
    "type" : "array",
    "items" : {"type" : "number", "enum" : [1, 2, 3]},
    "minItems" : 3,
}
instance = ["spam", 2]

For clarity’s sake, the given instance has three errors under this schema:

v = Draft202012Validator(schema)
for error in sorted(v.iter_errors(["spam", 2]), key=str):
    print(error.message)
'spam' is not of type 'number'
'spam' is not one of [1, 2, 3]
['spam', 2] is too short

Let’s construct an jsonschema.exceptions.ErrorTree so that we can query the errors a bit more easily than by just iterating over the error objects.

tree = ErrorTree(v.iter_errors(instance))

As you can see, jsonschema.exceptions.ErrorTree takes an iterable of ValidationErrors when constructing a tree so you can directly pass it the return value of a validator object’s jsonschema.protocols.Validator.iter_errors method.

ErrorTrees support a number of useful operations. The first one we might want to perform is to check whether a given element in our instance failed validation. We do so using the in operator:

>>> 0 in tree
True

>>> 1 in tree
False

The interpretation here is that the 0th index into the instance ("spam") did have an error (in fact it had 2), while the 1th index (2) did not (i.e. it was valid).

If we want to see which errors a child had, we index into the tree and look at the ErrorTree.errors attribute.

>>> sorted(tree[0].errors)
['enum', 'type']

Here we see that the enum and type keywords failed for index 0. In fact ErrorTree.errors is a dict, whose values are the ValidationErrors, so we can get at those directly if we want them.

>>> print(tree[0].errors["type"].message)
'spam' is not of type 'number'

Of course this means that if we want to know if a given validation keyword failed for a given index, we check for its presence in ErrorTree.errors:

>>> "enum" in tree[0].errors
True

>>> "minimum" in tree[0].errors
False

Finally, if you were paying close enough attention, you’ll notice that we haven’t seen our minItems error appear anywhere yet. This is because minItems is an error that applies globally to the instance itself. So it appears in the root node of the tree.

>>> "minItems" in tree.errors
True

That’s all you need to know to use error trees.

To summarize, each tree contains child trees that can be accessed by indexing the tree to get the corresponding child tree for a given index into the instance. Each tree and child has a ErrorTree.errors attribute, a dict, that maps the failed validation keyword to the corresponding validation error.

best_match and relevance#

The best_match function is a simple but useful function for attempting to guess the most relevant error in a given bunch.

>>> from jsonschema import Draft202012Validator
>>> from jsonschema.exceptions import best_match

>>> schema = {
...     "type": "array",
...     "minItems": 3,
... }
>>> print(best_match(Draft202012Validator(schema).iter_errors(11)).message)
11 is not of type 'array'
jsonschema.exceptions.best_match(errors, key=<function by_relevance.<locals>.relevance>)[source]#

Try to find an error that appears to be the best match among given errors.

In general, errors that are higher up in the instance (i.e. for which ValidationError.path is shorter) are considered better matches, since they indicate “more” is wrong with the instance.

If the resulting match is either oneOf or anyOf, the opposite assumption is made – i.e. the deepest error is picked, since these keywords only need to match once, and any other errors may not be relevant.

Parameters:
  • errors (collections.abc.Iterable) – the errors to select from. Do not provide a mixture of errors from different validation attempts (i.e. from different instances or schemas), since it won’t produce sensical output.

  • key (collections.abc.Callable) – the key to use when sorting errors. See relevance and transitively by_relevance for more details (the default is to sort with the defaults of that function). Changing the default is only useful if you want to change the function that rates errors but still want the error context descent done by this function.

Returns:

the best matching error, or None if the iterable was empty

Note

This function is a heuristic. Its return value may change for a given set of inputs from version to version if better heuristics are added.

jsonschema.exceptions.relevance(validation_error)#

A key function that sorts errors based on heuristic relevance.

If you want to sort a bunch of errors entirely, you can use this function to do so. Using this function as a key to e.g. sorted or max will cause more relevant errors to be considered greater than less relevant ones.

Within the different validation keywords that can fail, this function considers anyOf and oneOf to be weak validation errors, and will sort them lower than other errors at the same level in the instance.

If you want to change the set of weak [or strong] validation keywords you can create a custom version of this function with by_relevance and provide a different set of each.

>>> schema = {
...     "properties": {
...         "name": {"type": "string"},
...         "phones": {
...             "properties": {
...                 "home": {"type": "string"}
...             },
...         },
...     },
... }
>>> instance = {"name": 123, "phones": {"home": [123]}}
>>> errors = Draft202012Validator(schema).iter_errors(instance)
>>> [
...     e.path[-1]
...     for e in sorted(errors, key=exceptions.relevance)
... ]
['home', 'name']
jsonschema.exceptions.by_relevance(weak=frozenset({'anyOf', 'oneOf'}), strong=frozenset({}))[source]#

Create a key function that can be used to sort errors by relevance.

Parameters:
  • weak (set) – a collection of validation keywords to consider to be “weak”. If there are two errors at the same level of the instance and one is in the set of weak validation keywords, the other error will take priority. By default, anyOf and oneOf are considered weak keywords and will be superseded by other same-level validation errors.

  • strong (set) – a collection of validation keywords to consider to be “strong”

Resolving JSON References#

class jsonschema.RefResolver(base_uri, referrer, store=(), cache_remote=True, handlers=(), urljoin_cache=None, remote_cache=None)[source]#

Resolve JSON References.

Parameters:
  • base_uri (str) – The URI of the referring document

  • referrer – The actual referring document

  • store (dict) – A mapping from URIs to documents to cache

  • cache_remote (bool) – Whether remote refs should be cached after first resolution

  • handlers (dict) – A mapping from URI schemes to functions that should be used to retrieve them

  • urljoin_cache (functools.lru_cache()) – A cache that will be used for caching the results of joining the resolution scope to subscopes.

  • remote_cache (functools.lru_cache()) – A cache that will be used for caching the results of resolved remote URLs.

cache_remote#

Whether remote refs should be cached after first resolution

Type:

bool

property base_uri#

Retrieve the current base URI, not including any fragment.

classmethod from_schema(schema, id_of=<function _id_of>, *args, **kwargs)[source]#

Construct a resolver from a JSON schema object.

Parameters:

schema – the referring schema

Returns:

RefResolver

in_scope(scope)[source]#

Temporarily enter the given scope for the duration of the context.

pop_scope()[source]#

Exit the most recent entered scope.

Treats further dereferences as being performed underneath the original scope.

Don’t call this method more times than push_scope has been called.

push_scope(scope)[source]#

Enter a given sub-scope.

Treats further dereferences as being performed underneath the given scope.

property resolution_scope#

Retrieve the current resolution scope.

resolve(ref)[source]#

Resolve the given reference.

resolve_fragment(document, fragment)[source]#

Resolve a fragment within the referenced document.

Parameters:
  • document – The referent document

  • fragment (str) – a URI fragment to resolve within it

resolve_from_url(url)[source]#

Resolve the given URL.

resolve_remote(uri)[source]#

Resolve a remote uri.

If called directly, does not check the store first, but after retrieving the document at the specified URI it will be saved in the store if cache_remote is True.

Note

If the requests library is present, jsonschema will use it to request the remote uri, so that the correct encoding is detected and used.

If it isn’t, or if the scheme of the uri is not http or https, UTF-8 is assumed.

Parameters:

uri (str) – The URI to resolve

Returns:

The retrieved document

resolving(ref)[source]#

Resolve the given ref and enter its resolution scope.

Exits the scope on exit of this context manager.

Parameters:

ref (str) – The reference to resolve

exception jsonschema.RefResolutionError(cause)[source]#

A ref could not be resolved.

A JSON reference failed to resolve.

Creating or Extending Validator Classes#

jsonschema.validators.create(meta_schema, validators=(), version=None, type_checker=TypeChecker(_type_checkers=pmap({'boolean': <function is_bool>, 'array': <function is_array>, 'integer': <function <lambda>>, 'number': <function is_number>, 'object': <function is_object>, 'string': <function is_string>, 'null': <function is_null>})), format_checker=<FormatChecker checkers=['date', 'email', 'idn-email', 'idn-hostname', 'ipv4', 'ipv6', 'regex', 'uuid']>, id_of=<function _id_of>, applicable_validators=operator.methodcaller('items'))[source]#

Create a new validator class.

Parameters:
  • meta_schema (collections.abc.Mapping) – the meta schema for the new validator class

  • validators (collections.abc.Mapping) –

    a mapping from names to callables, where each callable will validate the schema property with the given name.

    Each callable should take 4 arguments:

    1. a validator instance,

    2. the value of the property being validated within the instance

    3. the instance

    4. the schema

  • version (str) – an identifier for the version that this validator class will validate. If provided, the returned validator class will have its __name__ set to include the version, and also will have jsonschema.validators.validates automatically called for the given version.

  • type_checker (jsonschema.TypeChecker) –

    a type checker, used when applying the type keyword.

    If unprovided, a jsonschema.TypeChecker will be created with a set of default types typical of JSON Schema drafts.

  • format_checker (jsonschema.FormatChecker) –

    a format checker, used when applying the format keyword.

    If unprovided, a jsonschema.FormatChecker will be created with a set of default formats typical of JSON Schema drafts.

  • id_of (collections.abc.Callable) – A function that given a schema, returns its ID.

  • applicable_validators (collections.abc.Callable) – A function that given a schema, returns the list of applicable validators (validation keywords and callables) which will be used to validate the instance.

Returns:

a new jsonschema.protocols.Validator class

jsonschema.validators.extend(validator, validators=(), version=None, type_checker=None, format_checker=None)[source]#

Create a new validator class by extending an existing one.

Parameters:
  • validator (jsonschema.protocols.Validator) – an existing validator class

  • validators (collections.abc.Mapping) –

    a mapping of new validator callables to extend with, whose structure is as in create.

    Note

    Any validator callables with the same name as an existing one will (silently) replace the old validator callable entirely, effectively overriding any validation done in the “parent” validator class.

    If you wish to instead extend the behavior of a parent’s validator callable, delegate and call it directly in the new validator function by retrieving it using OldValidator.VALIDATORS["validation_keyword_name"].

  • version (str) – a version for the new validator class

  • type_checker (jsonschema.TypeChecker) –

    a type checker, used when applying the type keyword.

    If unprovided, the type checker of the extended jsonschema.protocols.Validator will be carried along.

  • format_checker (jsonschema.FormatChecker) –

    a format checker, used when applying the format keyword.

    If unprovided, the format checker of the extended jsonschema.protocols.Validator will be carried along.

Returns:

a new jsonschema.protocols.Validator class extending the one provided

Note

Meta Schemas

The new validator class will have its parent’s meta schema.

If you wish to change or extend the meta schema in the new validator class, modify META_SCHEMA directly on the returned class. Note that no implicit copying is done, so a copy should likely be made before modifying it, in order to not affect the old validator.

jsonschema.validators.validator_for(schema, default=<class 'jsonschema.validators.Draft202012Validator'>)[source]#

Retrieve the validator class appropriate for validating the given schema.

Uses the $schema keyword that should be present in the given schema to look up the appropriate validator class.

Parameters:
  • schema (collections.abc.Mapping or bool) – the schema to look at

  • default

    the default to return if the appropriate validator class cannot be determined.

    If unprovided, the default is to return the latest supported draft.

jsonschema.validators.validates(version)[source]#

Register the decorated validator for a version of the specification.

Registered validators and their meta schemas will be considered when parsing $schema keywords’ URIs.

Parameters:

version (str) – An identifier to use as the version’s name

Returns:

a class decorator to decorate the validator with the version

Return type:

collections.abc.Callable

Creating Validation Errors#

Any validating function that validates against a subschema should call descend, rather than iter_errors. If it recurses into the instance, or schema, it should pass one or both of the path or schema_path arguments to descend in order to properly maintain where in the instance or schema respectively the error occurred.

The Validator Protocol#

jsonschema defines a protocol, jsonschema.protocols.Validator which can be used in type annotations to describe the type of a validator object.

For full details, see The Validator Protocol.

Frequently Asked Questions#

My schema specifies format validation. Why do invalid instances seem valid?#

The format keyword can be a bit of a stumbling block for new users working with JSON Schema.

In a schema such as:

{"type": "string", "format": "date"}

JSON Schema specifications have historically differentiated between the format keyword and other keywords. In particular, the format keyword was specified to be informational as much as it may be used for validation.

In other words, for many use cases, schema authors may wish to use values for the format keyword but have no expectation they be validated alongside other required assertions in a schema.

Of course this does not represent all or even most use cases – many schema authors do wish to assert that instances conform fully, even to the specific format mentioned.

In drafts prior to draft2019-09, the decision on whether to automatically enable format validation was left up to validation implementations such as this one.

This library made the choice to leave it off by default, for two reasons:

  • for forward compatibility and implementation complexity reasons – if format validation were on by default, and a future draft of JSON Schema introduced a hard-to-implement format, either the implementation of that format would block releases of this library until it were implemented, or the behavior surrounding format would need to be even more complex than simply defaulting to be on. It therefore was safer to start with it off, and defend against the expectation that a given format would always automatically work.

  • given that a common use of JSON Schema is for portability across languages (and therefore implementations of JSON Schema), so that users be aware of this point itself regarding format validation, and therefore remember to check any other implementations they were using to ensure they too were explicitly enabled for format validation.

As of draft2019-09 however, the opt-out by default behavior mentioned here is now required for all validators.

Difficult as this may sound for new users, at this point it at least means they should expect the same behavior that has always been implemented here, across any other implementation they encounter.

See also

Draft 2019-09’s release notes on format

for upstream details on the behavior of format and how it has changed in draft2019-09

Validating Formats

for details on how to enable format validation

jsonschema.FormatChecker

the object which implements format validation

How do I configure a base URI for $ref resolution using local files?#

jsonschema supports loading schemas from the filesystem.

The most common mistake when configuring a RefResolver to retrieve schemas from the local filesystem is to give it a base URI which points to a directory, but forget to add a trailing slash.

For example, given a directory /tmp/foo/ with bar/schema.json within it, you should use something like:

from pathlib import Path

import jsonschema.validators

path = Path("/tmp/foo")
resolver = jsonschema.validators.RefResolver(
    base_uri=f"{path.as_uri()}/",
    referrer=True,
)
jsonschema.validate(
    instance={},
    schema={"$ref": "bar/schema.json"},
    resolver=resolver,
)

where note:

  • the base URI has a trailing slash, even though pathlib.PurePath.as_uri does not add it!

  • any relative refs are now given relative to the provided directory

If you forget the trailing slash, you’ll find references are resolved a directory too high.

You’re likely familiar with this behavior from your browser. If you visit a page at https://example.com/foo, then links on it like <a href="./bar"> take you to https://example.com/bar, not https://example.com/foo/bar. For this reason many sites will redirect https://example.com/foo to https://example.com/foo/, i.e. add the trailing slash, so that relative links on the page will keep the last path component.

There are, in summary, 2 ways to do this properly:

  • Remember to include a trailing slash, so your base URI is file:///foo/bar/ rather than file:///foo/bar, as shown above

  • Use a file within the directory as your base URI rather than the directory itself, i.e. file://foo/bar/baz.json, which will of course cause baz.json to be removed while resolving relative URIs

Why doesn’t my schema’s default property set the default on my instance?#

The basic answer is that the specification does not require that default actually do anything.

For an inkling as to why it doesn’t actually do anything, consider that none of the other keywords modify the instance either. More importantly, having default modify the instance can produce quite peculiar things. It’s perfectly valid (and perhaps even useful) to have a default that is not valid under the schema it lives in! So an instance modified by the default would pass validation the first time, but fail the second!

Still, filling in defaults is a thing that is useful. jsonschema allows you to define your own validator classes and callables, so you can easily create an jsonschema.protocols.Validator that does do default setting. Here’s some code to get you started. (In this code, we add the default properties to each object before the properties are validated, so the default values themselves will need to be valid under the schema.)

from jsonschema import Draft202012Validator, validators


def extend_with_default(validator_class):
    validate_properties = validator_class.VALIDATORS["properties"]

    def set_defaults(validator, properties, instance, schema):
        for property, subschema in properties.items():
            if "default" in subschema:
                instance.setdefault(property, subschema["default"])

        for error in validate_properties(
            validator, properties, instance, schema,
        ):
            yield error

    return validators.extend(
        validator_class, {"properties" : set_defaults},
    )


DefaultValidatingValidator = extend_with_default(Draft202012Validator)


# Example usage:
obj = {}
schema = {'properties': {'foo': {'default': 'bar'}}}
# Note jsonschema.validate(obj, schema, cls=DefaultValidatingValidator)
# will not work because the metaschema contains `default` keywords.
DefaultValidatingValidator(schema).validate(obj)
assert obj == {'foo': 'bar'}

See the above-linked document for more info on how this works, but basically, it just extends the properties keyword on a jsonschema.Draft202012Validator to then go ahead and update all the defaults.

Note

If you’re interested in a more interesting solution to a larger class of these types of transformations, keep an eye on Seep, which is an experimental data transformation and extraction library written on top of jsonschema.

Hint

The above code can provide default values for an entire object and all of its properties, but only if your schema provides a default value for the object itself, like so:

schema = {
    "type": "object",
    "properties": {
        "outer-object": {
            "type": "object",
            "properties" : {
                "inner-object": {
                    "type": "string",
                    "default": "INNER-DEFAULT"
                }
            },
            "default": {} # <-- MUST PROVIDE DEFAULT OBJECT
        }
    }
}

obj = {}
DefaultValidatingValidator(schema).validate(obj)
assert obj == {'outer-object': {'inner-object': 'INNER-DEFAULT'}}

…but if you don’t provide a default value for your object, then it won’t be instantiated at all, much less populated with default properties.

del schema["properties"]["outer-object"]["default"]
obj2 = {}
DefaultValidatingValidator(schema).validate(obj2)
assert obj2 == {} # whoops

How do jsonschema version numbers work?#

jsonschema tries to follow the Semantic Versioning specification.

This means broadly that no backwards-incompatible changes should be made in minor releases (and certainly not in dot releases).

The full picture requires defining what constitutes a backwards-incompatible change.

The following are simple examples of things considered public API, and therefore should not be changed without bumping a major version number:

  • module names and contents, when not marked private by Python convention (a single leading underscore)

  • function and object signature (parameter order and name)

The following are not considered public API and may change without notice:

  • the exact wording and contents of error messages; typical reasons to rely on this seem to involve downstream tests in packages using jsonschema. These use cases are encouraged to use the extensive introspection provided in jsonschema.exceptions.ValidationErrors instead to make meaningful assertions about what failed rather than relying on how what failed is explained to a human.

  • the order in which validation errors are returned or raised

  • the contents of the jsonschema.tests package

  • the contents of the jsonschema.benchmarks package

  • the specific non-zero error codes presented by the command line interface

  • the exact representation of errors presented by the command line interface, other than that errors represented by the plain outputter will be reported one per line

  • anything marked private

With the exception of the last two of those, flippant changes are avoided, but changes can and will be made if there is improvement to be had. Feel free to open an issue ticket if there is a specific issue or question worth raising.

Indices and tables#