Update spec and explainer to clarify pose and transform semantics #569

klausw · 2019-03-23T00:29:11Z

This is intended to address issues #477, #565, and #567 which all involve areas where the spec isn't entirely clear about intended behavior.

klausw · 2019-03-23T00:40:12Z

FWIW, I think originOffset makes it difficult to explain reference spaces things clearly. An XRReferenceSpace basically consists of two XRSpaces that are transformed into each other by the originOffset, and it's hard to explain how that transform works when the two spaces don't have names. I've used the name "effective reference space" for the offset-modified externally visible space and "unmodified reference space" for the internal space corresponding to an identity originOffset, but that's clunky.

index.bs

update explainer to match updated spec, add more examples around teleporting Fix spelling for XRRigidTransform description inverse()->inverse in explainer Rebase

klausw · 2019-04-02T00:52:43Z

Rebased to pick up the inverse()=>inverse attribute change from 9ec29f5, and combined into a single commit.

toji · 2019-04-02T16:46:39Z

index.bs

@@ -681,7 +681,7 @@ Spaces {#spaces}
 XRSpace {#xrspace-interface}
 ------------------

-An {{XRSpace}} describes an entity that is tracked by the [=/XR device=]'s tracking systems. {{XRSpace}}s MAY NOT have a fixed spatial relationship to one another or to any given {{XRReferenceSpace}}. The transform between two {{XRSpace}} can be evaluated by calling an {{XRFrame}}'s {{XRFrame/getPose()}} method. The interface is intentionally opaque.
+An {{XRSpace}} describes an entity that is tracked by the [=/XR device=]'s tracking systems. It conceptually corresponds to a specific origin and orientation, but does not directly expose any numerical coordinate values. {{XRSpace}}s MAY NOT have a fixed spatial relationship to one another or to any given {{XRReferenceSpace}}. The current transform between two {{XRSpace}}s can be evaluated by calling an {{XRFrame}}'s {{XRFrame/getPose()}} method. The interface is intentionally opaque.


"current" is a hard concept to pin down in WebXR. This whole sentence could probably stand to be rephrased to clarify that. Something along the lines of:

"The transform between two {{XRSpace}}s at the time represented by an {{XRFrame}} can be evaluated by calling the frame's {{XRFrame/getPose()}} method."

Changed in 30ca9d2 . I've also rephrased the previous sentence to make it explicit that the spatial relationship can change even if there's no real-world movement. New text:

"The spacial relationships between {{XRSpace}}s or {{XRReferenceSpace}} change over time in response to movement of tracked entities, but can also change without real-world movement as a result of ongoing tracking system recalibration. The transform between two {{XRSpace}}s at the time represented by an {{XRFrame}} can be evaluated by calling the frame's's {{XRFrame/getPose()}} method."

Another reason for rephrasing that sentence was that I wanted to avoid the MAY NOT construction - that's not one of the rfc 2119 key terms, and it could be misinterpreted as disallowing fixed spacial relationships as opposed to warning not to assume them.

Manishearth · 2019-04-02T17:27:29Z

FWIW, I think originOffset makes it difficult to explain reference spaces things clearly

Filed #580

klausw · 2019-04-03T20:36:55Z

If I'm understanding it right, the consensus from the meeting was that we'll be revisiting originOffset as per @Manishearth's #580, but that shouldn't block us from proceeding with the bulk of the changes in this proposal. The originOffset part of it is localized to just one "unstable" paragraph

Reviewers, please let me know if you want me to make further changes, or if it's OK as-is.

Manishearth · 2019-04-03T20:49:35Z

Yeah, I would still like this pull request to land; it contains changes that are independent of originOffset, and the changes that depend on originOffset are still an unambiguous improvement, which at worst will be overwritten by #580.

Also explicitly mention that spacial relationships can change without any real-world movement due to ongoing tracking system recalibration.

klausw · 2019-04-19T18:35:41Z

index.bs

+  1. If |destinationSpace|'s [=XRSpace/session=] does not equal |session|, throw an {{InvalidStateError}} and abort these steps.
+  1. If |sourceSpace|'s [=XRSpace/session=] does not equal |session|, throw an {{InvalidStateError}} and abort these steps.
+  1. If |destinationSpace|'s pose cannot be determined relative to |sourceSpace|, return <code>null</code>
+  1. Return a new {{XRPose}} containing the {{XRRigidTransform}} from |sourceSpace| to |destinationSpace|. The transform's {{XRRigidTransform/position}} is the location of the |sourceSpace| origin in |destinationSpace| coordinates.


Does anyone have good naming suggestions here? "sourceSpace" and "destinationSpace" seemed like the best fit for describing the transform, but it's a bit clunky when talking about getPose.

Do any of these variants look better? I've added snippets how the naming would affect this paragraph and the XRRigidTransform definition below.

getPose(XRSpace sourceSpace, XRSpace destinationSpace); =>

Return a new {{XRPose}} containing the {{XRRigidTransform}} from |sourceSpace| to |destinationSpace|. The transform's {{XRRigidTransform/position}} is the location of the |sourceSpace| origin in |destinationSpace| coordinates.

An {{XRRigidTransform}} is a transform from a source {{XRSpace}} to a destination {{XRSpace}} described by ...

getPose(XRSpace space, XRSpace relativeTo); =>

Return a new {{XRPose}} containing the {{XRRigidTransform}} from |space| to |relativeTo|. The transform's {{XRRigidTransform/position}} is the location of the |space| origin in |relativeTo| coordinates.

An {{XRRigidTransform}} is a transform from an {{XRSpace}} to a relativeTo {{XRSpace}} described by ...

getPose(XRSpace targetSpace, XRSpace baseSpace); =>

Return a new {{XRPose}} containing the {{XRRigidTransform}} from |targetSpace| to |baseSpace|. The transform's {{XRRigidTransform/position}} is the location of the |targetSpace| origin in |baseSpace| coordinates.

An {{XRRigidTransform}} is a transform from a target {{XRSpace}} to a base {{XRSpace}} described by ...

getPose(XRSpace transformedSpace, XRSpace baseSpace); =>

Return a new {{XRPose}} containing the {{XRRigidTransform}} from |transformedSpace| to |baseSpace|. The transform's {{XRRigidTransform/position}} is the location of the |transformedSpace| origin in |baseSpace| coordinates.

An {{XRRigidTransform}} is a transform from a transformed {{XRSpace}} to a base {{XRSpace}} described by ... (This one sounds backwards)

@NellWaliczek please see ^^^, trying to get the naming sorted out in relation to your request from #589 (comment)

I hope to somewhat redundantly clarify getPose with some matrix math so that the confusion doesn't get too bad. I kinda like source destination.

Of the naming options above, I prefer targetSpace/baseSpace. Specifically, I worry that sourceSpace/destinationSpace would present a reversed mental model to the way in which getPose will be increasingly used in practice.

As we start to establish plane anchors, mesh anchors, face anchors, etc., there will be an increasing number of static and dynamic entities in WebXR that are tracked over time by XRSpace instances. To place these entities within their scene each frame, developers will use xrFrame.getPose(interestingEntitySpace, myReferenceSpace) and expect to get back the position and orientation of that entity within their engine's world coordinates (which are underwritten by myReferenceSpace), so they can update that entity's scene object transform in their engine. The mental model for getPose in particular will start to become "what's the position and orientation of this single entity in my reference space", moreso than "get me a model transform I can use to generally transform coordinates from this entity space into my reference space".

However, the above is ultimately just me arguing that we should preference naming that's intuitive for the position/orientation attributes on XRRigidTransform rather than the matrix attribute. In the end, @klausw had a key observation in #580 (comment) that applies here:

I'm starting to think that a core part of the confusion is that a tracked object's XRPose has-a XRRigidTransform, and the transform's position component corresponds to the tracked object's coordinates in reference space as expected, but a XRRigidTransform and its matrix are a transform from object space to reference space. That's internally consistent, but it's confusing when you supply a transform by itself as an argument (as in the constructor arg here) because it's easily interpreted as saying "please apply this transform" which would be backwards. In a way, OpenXR's approach of directly putting position + orientation on the pose objects helps avoid this ambiguity.

Ultimately, any naming pattern and/or conceptual framework we come up with that makes position/orientation make more sense will make matrix make less sense, and vice versa.

In OpenXR, we've managed to sidestep entire classes of confusion around the order or direction in which matrices are applied, whether they're pre-multiplied or post-multiplied, whether they are row-major/column-major, etc. by banishing matrices from the API altogether. Rather than expressing transform matrices "from" or "to" various spaces, all poses are expressed only as a position vector and orientation quaternion "in" a given space.

Even for view matrices, as discussed in #447, we found in OpenXR that any non-trivial engine ends up positioning a camera object within its scene in world coordinates anyway, and so even there it ends up being more convenient for developers to reason about the position and orientation of their view rather than getting a view matrix directly. You can see this pattern in OpenXR in xrLocateViews, which returns a set of XrView structs, each containing the position and orientation of a given view.

Specifically in the case of OpenXR reference spaces with a non-identity pose applied:

Each reference space type is specified in terms of how its "natural" origin behaves.

If you apply a non-identity pose when creating a reference space, that's specified as placing the origin of your reference space at that position and orientation "in" the "natural" reference space.

By defining the origin of any space as either a well-specified location in the physical world or a position and orientation within some other such space, all space origins have a clear binding to a place in the physical world. It then falls out without ambiguity how you "get" the "pose" of any one space "in" another.

If we ourselves continue to struggle with these naming and direction issues each week, despite being embedded in the details of the spec, perhaps it's the lesser evil to just remove matrix from XRRigidTransform? This will forcibly snap the developer's mental model to an unambiguous position and orientation approach throughout the API, and then we can design our naming to be consistent with that model.

I think the problem here isn't matrix, the problem is the use of the word transform itself.

I point this out here: https://github.com/immersive-web/editor-collab/pull/36#discussion_r278361579 , there are two ways one can look at a transform. Just talking about position/orientation as a "pose" is clear, however when you talk of the "transform" of a position/orientation that can mean opposite things.

If we can carefully clarify our use of the word "transform", talking about matrices isn't hard.

I feel like we should definitely try and keep matrix around for developer convenience.

klausw · 2019-04-26T20:11:33Z

Retiring this PR, please see the replacement #609 "Overhaul XRSpace, get(Viewer)Pose definitions". We can check separately to see if there's pieces in here we'd like to keep, i.e. for the explainer, but that would be a separate new PR if applicable.

toji · 2019-04-26T20:17:13Z

The primary thing that my PR doesn't capture, upon re-reading this, is the explainer changes. While it would be good to ensure that they're consistent, I'm inclined to view updating the explainer on topics like this as less critical, simply because the spec is authoritative and should be rigorous while the explainers are intended to be higher level.

toji added the agenda Request discussion in the next telecon/FTF label Mar 25, 2019

toji added this to the Next Working Draft milestone Mar 25, 2019

Manishearth reviewed Mar 28, 2019

View reviewed changes

index.bs Outdated Show resolved Hide resolved

Update spec to clarify pose and transform semantics

0615c13

update explainer to match updated spec, add more examples around teleporting Fix spelling for XRRigidTransform description inverse()->inverse in explainer Rebase

klausw force-pushed the r-poseupdate branch from e157b66 to 0615c13 Compare April 2, 2019 00:49

Fix an instance where getPose arguments were switched

a438109

toji reviewed Apr 2, 2019

View reviewed changes

Manishearth mentioned this pull request Apr 2, 2019

Should origin offset be a new reference space instead? #580

Closed

Rephrase "current transform" as suggested by @toji

30ca9d2

Also explicitly mention that spacial relationships can change without any real-world movement due to ongoing tracking system recalibration.

klausw force-pushed the r-poseupdate branch from 580f7b6 to 30ca9d2 Compare April 3, 2019 21:51

toji removed agenda Request discussion in the next telecon/FTF labels Apr 10, 2019

klausw mentioned this pull request Apr 16, 2019

Change originOffset w/ Reference Space constructor #589

Closed

klausw commented Apr 19, 2019

View reviewed changes

klausw closed this Apr 26, 2019

thetuvix mentioned this pull request May 16, 2019

Describe in detail the way used to obtain XRRay.matrix #582

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update spec and explainer to clarify pose and transform semantics #569

Update spec and explainer to clarify pose and transform semantics #569

klausw commented Mar 23, 2019

klausw commented Mar 23, 2019

klausw commented Apr 2, 2019

toji Apr 2, 2019

klausw Apr 3, 2019

klausw Apr 3, 2019

Manishearth commented Apr 2, 2019

klausw commented Apr 3, 2019

Manishearth commented Apr 3, 2019

klausw Apr 19, 2019

klausw Apr 19, 2019

Manishearth Apr 19, 2019

thetuvix Apr 23, 2019

Manishearth Apr 25, 2019

klausw commented Apr 26, 2019

toji commented Apr 26, 2019

Update spec and explainer to clarify pose and transform semantics #569

Update spec and explainer to clarify pose and transform semantics #569

Conversation

klausw commented Mar 23, 2019

klausw commented Mar 23, 2019

klausw commented Apr 2, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Manishearth commented Apr 2, 2019

klausw commented Apr 3, 2019

Manishearth commented Apr 3, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

klausw commented Apr 26, 2019

toji commented Apr 26, 2019