Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quo vadis #1

Open
schlegelp opened this issue May 8, 2021 · 13 comments
Open

Quo vadis #1

schlegelp opened this issue May 8, 2021 · 13 comments

Comments

@schlegelp
Copy link
Owner

schlegelp commented May 8, 2021

Hi @clbarnes

I finished a first working version including some simple tests. Compared to the navis version:

  1. Moved a bunch of functions/modules around
  2. Renamed xform_brain -> xform_space and mirror_brain -> mirror_space
  3. Slimmed down the TemplateRegistry to bare minimum
  4. Already added source_space and target_space to all transforms because it made sense while working on the registry anyway

Re 2: for now, I dropped the distinction between bridging and mirror transforms and mirror/mirror_space are currently not working. In navis, mirroring requires a registered TemplateBrain (which is used to generate a Affine transform to flip the data) and an optional warping transform. Here, I was considering simply treating mirrors as a space of their own - e.g. you would have space1 and space1_mirr, and the transform from one to the other would minimally be just an Affine transform but could also be a TransformSequence. For mirror_space this would then be something along the lines of:

def mirror_space(x, template):
   # Check for mirror transform
   tr = registry.shortest_bridging_seq(source=template, target=f'{template}_mirr')
  
  return tr.xform(x)

I'm not entirely sold on this because (a) this is a rather opaque to the user and (b) won't work with just any hashable object. May need to go back to the registry tracking mirror vs bridging transforms.

Do you have thoughts on how to proceed from here? I'm primarily interested in preserving functionality required for navis but not at all married to how it's implemented so, from my end, I'd say have at it.

@schlegelp
Copy link
Owner Author

Some things to consider:

  • drop xform.xform() and xform.mirror(): currently they really only call sometransform.xform()
  • drop xform_space() and mirror_space(): all they do is call registry.shortest_bridging_sequence() followed by xform(); these functions would be better off in navis where the data is more complex than points

@clbarnes
Copy link
Collaborator

I think treating mirrors as any other transform is the way to go. A library/application downstream of a transformation library could certainly tighten the restrictions on what could be used as a space reference, so saying that navis only works with strings would be totally in keeping with what I was thinking about, at least.

Generating a mirror transform from a template brain would also be fine - it would just return an AffineTransform or a TransformSequence([AffineTransform, WarpingTransform]). As you say, I think navis would be the place for a global template registry and any free functions which depend on it.

You're right that more complex data, like meshes, would be another downstream use which it might be nice to provide some convenience functions for, as reflections would invert the triangles, and nonlinear transforms could invalidate the mesh.

@schlegelp
Copy link
Owner Author

Cool! So as brief road map before I start migrating navis to use this as backend for transforms:

  1. Drop xform() and mirror() functions
  2. Add your neat improvements (__add__, __call__, etc.)
  3. Switch from (N, dims) to (dims, N) format

Now or later:
4. Implement arbitrary dimensions
5. Extend README / write docs

I'll quickly do 1 later today and 2-4 is your baby?

Also some nice-to-have things off the top of my head:

  • a test transform for CMTK - I need to dig into how those are generated first

@clbarnes
Copy link
Collaborator

Switch from (N, dims) to (dims, N) format

I was actually just about to go the other way for transformnd - iteration and vectorisation is much easier in (N, dims) and it seems to be a more common way of handling coordinates. There's going to be a bunch of axis-moving and transposing either way and I am getting the sense that (N, dims) is more pythonic.

Not sure if it'll be any use but I also have a reflection transform class in the works where you can reflect your coordinates around a point, line, ... hyperplane (it infers orthogonal hyperplanes to get where you need to go, which I think is mathematically sound...).

@clbarnes
Copy link
Collaborator

clbarnes commented May 10, 2021

I think to get these even further down to basics, we could strip out the dataframe handling (which has opinions about dimension number and order) and let the caller do that - navis could just have one utility function like

import numpy as np
import pandas as pd

DIMS = ["x", "y", "z"]

def as_coords(x):
    if isinstance(x, pd.DataFrame):
        return x[DIMS].to_numpy()
    arr = np.asarray(x)
    if arr.ndim != 2 or arr.shape[1] != 3:
        raise ValueError("an informative message")
    return arr

Additionally, I think directionality is the responsibility of the caller or can be inferred by the space references, rather than having a "forward"/"reverse" variable (if that is necessary, could it be a boolean is_reversed?). I guess the H5 and possibly CMTK transforms might need to store their directionality but that can be an exception rather than a rule.

These minor wording changes etc. don't have to change in navis, which I appreciate is a bigger job - navis could have its own classes which inherit the current navis API and just contain an xform instance to do the work; then there could be a navis.XformWrapper which takes an arbitrary xform instance to give it that API.

@schlegelp
Copy link
Owner Author

FYI:

  • a036603 drops xform/mirror functions
  • 765f513 imports all transforms at top-level namespace
  • 261891d makes it so we don't actually instantiate a registry by default: the idea is that each downstream package will have its own but happy to revert if a global registry is preferable

@schlegelp
Copy link
Owner Author

Additionally, I think directionality is the responsibility of the caller or can be inferred by the space references, rather than having a "forward"/"reverse" variable (if that is necessary, could it be a boolean is_reversed?). I guess the H5 and possibly CMTK transforms might need to store their directionality but that can be an exception rather than a rule.

The forward/reverse argument was sort-of carried from nat.I agree that with source/target_space this is not strictly necessary anymore.

Not sure if it'll be any use but I also have a reflection transform class in the works where you can reflect your coordinates around a point, line, ... hyperplane (it infers orthogonal hyperplanes to get where you need to go, which I think is mathematically sound...).

I did notice that you had effectively subsets of the AffineTransform in transformnd. I think that's a good idea. Could even implement some fancy __add__ for affine transforms that can combine shear, mirror, scale, etc. transforms (should be a simple multiplication(?) of the affine matrix, right?)

@clbarnes
Copy link
Collaborator

drops xform/mirror functions

Cool!

imports all transforms at top-level namespace

I'd been trying to avoid this where possible - I suspect some transforms will bring in some hefty dependencies, so keeping transforms other than the real basics out of the top level keeps import times minimal until the user explicitly wants one of the other transforms. Additionally, with several transforms needing optional dependencies, if you keep them out of the top level import then you don't need to handle the importerror - if the caller explicitly imports that submodule, they get the importerror themselves which should be informative enough, so we get to ignore the case where it's not installed.

makes it so we don't actually instantiate a registry by default: the idea is that each downstream package will have its own but happy to revert if a global registry is preferable

I agree with not having a global registry; that makes more sense downstream packages with specific domains/ use cases.

I did notice that you had effectively subsets of the AffineTransform in transformnd.

Yes, separating the linear map from the translation - adding the extra coordinates for an augmented matrix is another copy which I'm sure would be trivial once benchmarked but given it was just a subclass away I thought I'd give the option of avoiding it.

I think that's a good idea. Could even implement some fancy add for affine transforms that can combine shear, mirror, scale, etc. transforms (should be a simple multiplication(?) of the affine matrix, right?)

Yes, I think that's right! I might be inclined towards overriding __matmul__ for that case to be doubly explicit about both the user and python knowing it's only affine transforms involved on both sides and to keep __add__ always returning a splittable TransformSequence. The associative/ not-commutative nature of affine transforms can be left up to the user.

I'd be inclined towards keeping the non-affine implementations of translation, mirror, and scale too, because they're easier to think about and put together in higher dimensionalities, although this does lean away from "one and only one way to do things". But they could always have methods to convert them into their affine form for supported dimensionalities, and having different mechanisms for composing affine (__matmul__) and non-affine transforms (__add__) means some value is added there.

@schlegelp
Copy link
Owner Author

I'd been trying to avoid this where possible - I suspect some transforms will bring in some hefty dependencies, so keeping transforms other than the real basics out of the top level keeps import times minimal until the user explicitly wants one of the other transforms.

I see your point. Avoiding imports means that we would need to leave transforms in individual submodules though. That in turn makes for opaque paths like from xform.transforms.elastix import ElastixTransform. We can cross that bridge when we get there but lazy or delayed imports for expensive dependencies might be an option?

@clbarnes
Copy link
Collaborator

clbarnes commented May 11, 2021

Are they opaque, or are they namespaces, "one honking great idea"? 😅 The transforms themselves could be promoted out of the transforms subpackage to remove one step, and non-transform utilities kept in a subpackage as it is now. Dependency-free utilities which might be useful downstream, like coordinate reshaping, could be imported at the top level, as could the base classes and sequences useful to people defining interfaces. So if you have a specific transform in mind, you go and get it from a specific place, but otherwise the general package is enough.

@clbarnes
Copy link
Collaborator

combine shear, mirror, scale, etc. transforms... overriding matmul

Played around with this in transformnd and have all of the affine operations available as alternate constructors, plus __matmul__ for combining them. Rotations are only implemented for 2 and 3D because despite a lot of googling I can't find any reference to doing them in higher D; but everything else should be ND.

https://github.com/clbarnes/transformnd/blob/main/transformnd/affine.py#L143

@schlegelp
Copy link
Owner Author

I like the class methods!

@clbarnes
Copy link
Collaborator

clbarnes commented Nov 18, 2021

I had another hack at transformnd. There was some housekeeping (docs and refactoring into submodules), some nitpicking (switching from + to | for composing transforms, and - to ~ for inverting transforms, to clarify the difference between those and maths on the coordinates themselves).

Something which might be of more interest is the adapters submodule: https://github.com/clbarnes/transformnd/tree/main/transformnd/adapters This provides an easy way to define how to transform non-ndarray objects. For example, if you had a bunch of dataframes where some columns had coordinates in them, you'd do

from transformnd.adapters.pd import DataFrameAdapter

adapter = DataFrameAdapter(list("xyz"))
transformed_df = adapter(some_transform, some_df)

If you had an object which had several attributes which needed transforming, potentially each with different adapters, you'd do

from transformnd.adapters import AttrAdapter, NullAdapter
from dataclasses import dataclass
import pandas as pd

@dataclass
class MyObject:
    pointcloud: np.ndarray
    treenodes: pd.DataFrame

adapter = AttrAdapter(pointcloud=NullAdapter(), treenodes=DataFrameAdapter(list("xyz")))
transformed_obj = adapter(some_transform, some_obj)

Adapters also have a .partial() method for creating a frozen adapter with some pre-filled arguments, which might be useful for storing on a transformable object where the 2nd argument will always be that object, or there are other arguments (AttrAdapter and DataFrameAdapter both take an in_place argument).

For the navis case, the combination of those two should make it reasonably accessible for each type to know how to transform itself given a generic transformation function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants