Brett Graham edited this page Mar 29, 2024 · 21 revisions

This document describes proposed changes. Please see the changelog for changes in previously released versions and in the current development version. If you see changes you would like to discuss, please open a discussion (or issue) and, when it is resolved, update this document.

Changes not tied to a particular version

  • Organize unit tests in a structure that matches the code (either before or after the API clean-up in 3.x.x)
  • Release asdf-compression adding new block compression algorithms (zstd, lz4)
  • Move pytest plugin to a different repository (or remove it entirely)

Possible file format changes (in a future major release); these would require updating the standard

  • Store block index as binary object (to speed up parsing of files with a large number of blocks)
  • Use block offsets relative to start of blocks (instead of start of file)
  • Store asdf metadata (history, extensions, etc) in a separate yaml document to free up namespace for user tree keys
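
As a rough illustration of the first two ideas, the sketch below packs a block index as binary data with offsets stored relative to the first block. The layout (a big-endian uint64 count followed by one uint64 per block) and the function names are assumptions made for this example, not part of the ASDF standard or any proposal.

```python
import struct


def pack_block_index(absolute_offsets, first_block_offset):
    """Pack block offsets as big-endian uint64 values relative to the first block."""
    relative = [offset - first_block_offset for offset in absolute_offsets]
    # Hypothetical layout: a uint64 count followed by one uint64 per block.
    return struct.pack(f">Q{len(relative)}Q", len(relative), *relative)


def unpack_block_index(data, first_block_offset):
    """Recover absolute file offsets from the packed, relative index."""
    (count,) = struct.unpack_from(">Q", data, 0)
    relative = struct.unpack_from(f">{count}Q", data, 8)
    return [first_block_offset + offset for offset in relative]


offsets = [1024, 5120, 9216]
packed = pack_block_index(offsets, first_block_offset=1024)
restored = unpack_block_index(packed, first_block_offset=1024)
```

With relative offsets, content placed before the first block (for example the YAML tree) can grow or shrink without invalidating the index, since only `first_block_offset` changes.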

4.0.0

By default, do not memmap arrays.

Remove features deprecated in 3.x.x.

Switch to ASDF standard 1.6.0 as default.

Remove the pytest-asdf plugin.

Disable saving "base array" by default (hopefully with a deprecation in 3.x.x).

3.x.x

pytest-asdf

This plugin does not provide much utility (it reads examples from the schema) and is tightly coupled to pytest (pytest must be installed and used as the test runner to run these tests). It should be possible to replace the functionality with one or more functions in asdf.testing.helpers so that the plugin can be deprecated. See related issues/PRs:

Clean-up and new deprecations

Clean up the public API

There are a number of improvements that could be made to the public API including:

  • prefix non-public modules, etc with an underscore to make it more explicit that they are private
  • lay out a plan for top-level imports and deprecate anything we don't want to include at the top-level (e.g. Stream, IntegerType)
  • evaluate what parts of the public API are critical to asdf functionality and what should be provided by a different library (e.g. asdf.util.get_class_name)
  • add custom exceptions to replace generic exceptions and to replace exceptions that are 'passed through' from dependencies (e.g. pyyaml.RepresenterError)
  • should generic_io be private? Can it be removed/replaced?
  • are there redundant options/configuration settings? (see: https://github.com/asdf-format/asdf/pull/1477 https://github.com/asdf-format/asdf/pull/1476)
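
The custom-exception item above could look roughly like the following sketch. All names here (`AsdfError`, `AsdfSerializationError`, `ThirdPartyRepresenterError`, `dump`) are hypothetical stand-ins chosen for illustration; none of them is claimed to exist in asdf or its dependencies.

```python
class AsdfError(Exception):
    """Hypothetical base class for errors raised by the library."""


class AsdfSerializationError(AsdfError):
    """Hypothetical error raised when an object cannot be serialized."""


class ThirdPartyRepresenterError(Exception):
    """Stand-in for an exception raised by a dependency."""


def _dependency_dump(obj):
    # Stand-in for a dependency call that fails on unsupported types.
    if not isinstance(obj, (str, int, float, list, dict)):
        raise ThirdPartyRepresenterError(f"cannot represent {type(obj).__name__}")
    return repr(obj)


def dump(obj):
    """Translate the dependency's exception into a library-owned one."""
    try:
        return _dependency_dump(obj)
    except ThirdPartyRepresenterError as err:
        # "from err" keeps the original exception available as __cause__.
        raise AsdfSerializationError(str(err)) from err
```

Downstream code can then catch `AsdfError` without importing anything from the dependency, which makes it possible to swap the dependency later without breaking callers.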

Consider deprecating unused features or things that can now be implemented as extensions

For example:

  • can external blocks now be implemented as an extension?
  • can external array reference be implemented as an extension?
  • lz4 compression (this can be moved to an extension)

Expansion and cleanup of AsdfConfig

As AsdfConfig provides a centralized and flexible way to define various asdf options, we should investigate moving more options into AsdfConfig. This will likely result in too many options at a single level, so we should consider ways to organize, nest, or otherwise make these options easy to understand and use.
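
One possible way to nest options is sketched below using frozen dataclasses. The grouping and the option names (`ArrayOptions`, `ValidationOptions`, `compression`, `memmap`) are assumptions for illustration only and do not describe the current AsdfConfig.

```python
from dataclasses import dataclass, field, replace
from typing import Optional


@dataclass(frozen=True)
class ArrayOptions:
    # Hypothetical group of array-related options.
    compression: Optional[str] = None
    memmap: bool = False


@dataclass(frozen=True)
class ValidationOptions:
    # Hypothetical group of validation-related options.
    validate_on_read: bool = True


@dataclass(frozen=True)
class Config:
    array: ArrayOptions = field(default_factory=ArrayOptions)
    validation: ValidationOptions = field(default_factory=ValidationOptions)


default = Config()
# Derive a modified configuration without mutating the default one.
custom = replace(default, array=replace(default.array, compression="zstd"))
```

Grouping keeps each namespace small (`custom.array.compression` rather than one flat list of names), at the cost of slightly more verbose updates.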

New features

Consider some new features! These include:

  • super-dictionary access to the ASDF tree with per-node lazy-loading
  • partial block reading (for local and cloud-based non-chunked files)
  • migrate to a new jsonschema library (perhaps jsonschema-rs)
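
To make the per-node lazy-loading idea concrete, here is a minimal sketch of a read-only mapping that builds each node on first access. `LazyTree` and the loader callables are hypothetical; a real implementation would need to read nodes from the file rather than call in-memory functions.

```python
from collections.abc import Mapping


class LazyTree(Mapping):
    """Read-only mapping that builds each node the first time it is accessed."""

    def __init__(self, loaders):
        # loaders maps each key to a zero-argument callable that produces the node.
        self._loaders = dict(loaders)
        self._cache = {}

    def __getitem__(self, key):
        if key not in self._cache:
            # Raises KeyError for unknown keys, as a mapping should.
            self._cache[key] = self._loaders[key]()
        return self._cache[key]

    def __iter__(self):
        # Listing keys does not trigger any loading.
        return iter(self._loaders)

    def __len__(self):
        return len(self._loaders)


calls = []


def load_image():
    calls.append("image")  # record each real load
    return [[0, 1], [2, 3]]


tree = LazyTree({"image": load_image, "meta": lambda: {"author": "unknown"}})
```

Accessing `tree["image"]` twice calls `load_image` only once; iterating over the keys loads nothing.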

3.0.0

Changes to the public API

As this is a major version change, asdf 3.0 removes several deprecated features:

  • legacy extension API based on AsdfType
  • AsdfInFits
  • other deprecated features

In addition to the above removals, asdf 3.0 adds a few new features to the public API:

  • Converter block access
  • Converter deferral
  • Array storage option control to AsdfConfig

Internal changes

Internally, asdf 3.0 will include a major rewrite of ASDF block management code that is necessary to move NDArrayType to a Converter. This rewrite fixes many bugs in ASDF block reading and writing.

Additionally, asdf 2.15.1 vendored (included an internal copy of) jsonschema 4.17.3 to deal with jsonschema 4.18 dropping support for features required by asdf. To ease the transition for downstream packages, asdf 2.15.1 kept jsonschema as a dependency and attempted to use some exceptions from jsonschema (to allow downstream code that catches these errors to keep working). Asdf 3.0 will drop jsonschema as a dependency.

We also want to strongly consider mentioning in the 3.0 docs that we will be (or are at least considering) disabling memmapping as the default option when files are opened in version 4.0.

2.8.0

Summary

This release will remove the experimental subclass attribute serialization feature, add support for ASDF Standard 1.6.0, add a global configuration feature, and add new APIs for extending ASDF.

Changes

Remove support for automatic subclass attribute serialization for all ASDF Standard versions

The experimental subclass attribute serialization feature will be removed (and its supporting schema dropped from ASDF Standard 1.6.0).

ASDF Standard 1.6.0

The proposed roadmap for ASDF Standard 1.6.0 entails the following new requirements:

Global configuration

Introduce a configuration mechanism that will allow certain AsdfFile options (such as validate_on_read) to be set globally.

New tag handling API

The current ExtensionType API is complicated and difficult to reason about. We'll introduce a new simplified API for handling custom tags which will also be sufficiently flexible to handle the new requirements of ASDF Standard 1.6.0.

New extension API

The current AsdfExtension API does not include any kind of extension identifier, which means we end up describing the extension by Python class name in the ASDF file's metadata, which is not a portable solution. There is also no convenient way for an extension to express that there are different versions of itself, what the default version should be, and what versions of what tags are permissible under that version.

We'll introduce a new extension API with properties that supply this missing information (and also provide a list of tag handlers associated with that extension).
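
The kind of information such an API would need to carry can be sketched as follows. The class `ExtensionVersion`, its fields, and the example URIs are hypothetical and only illustrate an explicit identifier, multiple versions, a default version, and the tags permitted under each version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtensionVersion:
    extension_uri: str    # portable identifier written to file metadata
    tags: frozenset       # tag URIs permitted under this version
    handlers: tuple = ()  # tag handlers associated with this version


SHAPES_1_0 = ExtensionVersion(
    extension_uri="asdf://example.com/shapes/extensions/shapes-1.0.0",
    tags=frozenset({"asdf://example.com/shapes/tags/circle-1.0.0"}),
)
SHAPES_1_1 = ExtensionVersion(
    extension_uri="asdf://example.com/shapes/extensions/shapes-1.1.0",
    tags=frozenset(
        {
            "asdf://example.com/shapes/tags/circle-1.1.0",
            "asdf://example.com/shapes/tags/square-1.0.0",
        }
    ),
)

VERSIONS = (SHAPES_1_0, SHAPES_1_1)
DEFAULT_VERSION = SHAPES_1_1  # used when writing new files


def extensions_for_tag(tag):
    """Return the URIs of the extension versions under which the tag is permitted."""
    return [version.extension_uri for version in VERSIONS if tag in version.tags]
```

Because the identifier is a URI rather than a Python class name, a file's metadata can be interpreted by implementations in other languages.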

Resource access API

The current AsdfExtension API for retrieving schemas has some drawbacks:

  • It is not possible to list the schemas provided by the extension. The code that maps schema URIs to file paths will happily map to filenames that don't exist.
  • The schema content must be provided as a URL, which is an obstacle to storing schemas as package resources or writing schemas in the REPL during development.

We will introduce a new API for mapping schema URIs to schema content.
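
One shape this could take is a read-only mapping from schema URI to schema content, sketched below. `InMemoryResourceMapping` and the example URI are assumptions for illustration; the point is that keys can be listed, missing URIs fail immediately, and content can come from any source, including a string typed in the REPL.

```python
from collections.abc import Mapping


class InMemoryResourceMapping(Mapping):
    """Read-only mapping from schema URI to schema content (bytes)."""

    def __init__(self, resources):
        # Accept text for convenience and store it as UTF-8 bytes.
        self._resources = {
            uri: text.encode("utf-8") for uri, text in resources.items()
        }

    def __getitem__(self, uri):
        # Unknown URIs raise KeyError instead of mapping to a nonexistent file.
        return self._resources[uri]

    def __iter__(self):
        return iter(self._resources)

    def __len__(self):
        return len(self._resources)


SCHEMA_URI = "asdf://example.com/shapes/schemas/circle-1.0.0"
mapping = InMemoryResourceMapping(
    {SCHEMA_URI: "type: object\nproperties:\n  radius:\n    type: number\n"}
)
```

The same interface could be backed by package resources or a directory on disk without changing how callers look up schemas.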