Clarify meaning of PTZ constraints presence #256

Closed
jan-ivar opened this issue Aug 31, 2020 · 30 comments · Fixed by #271
Labels: PTZ Pan-Tilt-Zoom

Comments

@jan-ivar
Member

This spec appears to assume any mention of pan, tilt or zoom will result in a PTZ device (& thus require ptz permission).

But that's not how the constraints algorithm works. E.g.

navigator.mediaDevices.getUserMedia({video: {height: {min: 1080}, pan: true, tilt: true, zoom: true}});

...is equivalent to

navigator.mediaDevices.getUserMedia({video: {height: {min: 1080}, pan: {}, tilt: {}, zoom: {}}});

...where {} has no impact on fitness distance, and may choose your 1080p camera over your low-res PTZ camera.

This seems unexpected, because a constraint value (even true) would usually carry weight, except as defined here.

You may also have granted the site more permission than necessary to access the device chosen.
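
The concern above can be reproduced with a minimal sketch of the fitness-distance computation. This is a simplified model rather than the full spec algorithm: `normalize`, `distance` and `totalDistance` are hypothetical helpers, and the two camera objects are invented settings dictionaries.

```javascript
// Simplified model of the constraints algorithm (an assumption, not spec text).
// `true` is normalized to an empty constraint, as the PTZ draft describes.
const normalize = (v) => (v === true ? {} : v);

// Distance of one constraint against one actual setting value.
function distance(constraint, actual) {
  const c = normalize(constraint);
  if ("min" in c && !(actual >= c.min)) return Infinity; // required and unmet
  if (!("ideal" in c)) return 0; // {} carries no weight
  if (actual === c.ideal) return 0;
  return Math.abs(actual - c.ideal) / Math.max(Math.abs(actual), Math.abs(c.ideal));
}

// Total distance is the sum over all constrained properties.
function totalDistance(constraints, settings) {
  return Object.entries(constraints)
    .reduce((sum, [name, c]) => sum + distance(c, settings[name]), 0);
}

const gumRequest = { height: { min: 1080 }, pan: true, tilt: true, zoom: true };
const hdSettings = { height: 1080 };                              // no PTZ
const ptzSettings = { height: 720, pan: 0, tilt: 0, zoom: 100 };  // low-res PTZ

const hdDistance = totalDistance(gumRequest, hdSettings);   // finite: a candidate
const ptzDistance = totalDistance(gumRequest, ptzSettings); // infinite: excluded
```

In this model the three PTZ members add nothing to either total, so only the `height` constraint decides the outcome.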

What's the model?

A related question is whether regular non-ptz cameras satisfy the following constraints:

navigator.mediaDevices.getUserMedia({video: {pan: 0, tilt: 0, zoom: 0}});

Is it yes, because non-ptz cameras point front and center without zoom or tilt?
Is it no, because non-ptz cameras shouldn't have read-only pan, tilt or zoom settings? The spec doesn't say.

It appears we've overloaded the presence or absence of pan, tilt or zoom constraints with meaning.

Options I see:

  1. Remove the true overload of pan, tilt and zoom constraints, and replace it with a ptz ConstrainBoolean.
  2. Hotfix the fitness distance algorithm to give weight to presence of constraints.
  3. Make true a first-class value of pan, tilt and zoom, and specify a new ConstrainULongOrBoolean.
@jan-ivar
Member Author

For context, this came up in #246 (comment).

@youennf
Contributor

youennf commented Sep 1, 2020

where {} has no impact on fitness distance, and may choose your 1080p camera over your low-res PTZ camera.

However we define constraints and distance, the user agent can select any of the devices in finalSet.
As per spec, the fitness distance MAY be used but user agent can use other sources.

Given PTZ is different in nature compared to other cameras in terms of permission, we could try to mandate that, in case the query asks for PTZ, UA is expected to prefer selecting PTZ devices first in finalSet.

  • Remove the true overload of pan, tilt and zoom constraints, and replace it with a ptz ConstrainBoolean.

That is my preference obviously. I would tend to leave 'pan', 'tilt' and 'zoom' to applyConstraints solely.
Then we could try getting consensus on whether an ideal deviceId constraint is above a PTZ constraint.

  • Hotfix the fitness distance algorithm to give weight to presence of constraints.

I do not see how this will guarantee to get what we want.

  • Make true a first-class value of pan, tilt and zoom, and specify a new ConstrainULongOrBoolean.

Isn't it investing in a model we know is too complex?

@eehakkin
Contributor

eehakkin commented Sep 1, 2020

  • Hotfix the fitness distance algorithm to give weight to presence of constraints.

I do not see how this will guarantee to get what we want.

The fitness distance algorithm could be hotfixed so that the distance is positive infinity if a PTZ constraint is present but the camera does not support that PTZ capability. In that case

navigator.mediaDevices.getUserMedia({video: {tilt: {}}});

would select a tilt capable camera or fail with OverconstrainedError,

navigator.mediaDevices.getUserMedia({video: {deviceId: {exact: '20983-20o198-109283-098-09812'}, tilt: {}}});

would select the device 20983-20o198-109283-098-09812 if it is a tilt-capable camera or fail with OverconstrainedError otherwise, and

navigator.mediaDevices.getUserMedia({video: {deviceId: {exact: '20983-20o198-109283-098-09812'}, advanced: [{tilt: {}}]}});

would always select the device 20983-20o198-109283-098-09812 (given that it is available).
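
A minimal sketch of how the hotfixed basic set might behave for the first two calls above. Everything here is hypothetical: the device list, the `capabilities` array, `satisfiesBasic`, `selectDevice`, and the plain `Error` standing in for the browser's `OverconstrainedError`.

```javascript
// Sketch of the proposed hotfix (hypothetical; not current spec behavior).
const PTZ_NAMES = ["pan", "tilt", "zoom"];

// Stand-in for the browser's OverconstrainedError.
function overconstrained(constraint) {
  const err = new Error(`Cannot satisfy constraint: ${constraint}`);
  err.name = "OverconstrainedError";
  return err;
}

// True when `device` survives the basic constraint set.
function satisfiesBasic(constraints, device) {
  return Object.entries(constraints).every(([name, value]) => {
    // Hotfix: mere presence of a PTZ constraint requires the capability.
    if (PTZ_NAMES.includes(name)) return device.capabilities.includes(name);
    if (name === "deviceId" && value && value.exact !== undefined) {
      return device.deviceId === value.exact;
    }
    return true; // other constraints are ignored in this sketch
  });
}

function selectDevice(constraints, devices) {
  const finalSet = devices.filter((d) => satisfiesBasic(constraints, d));
  if (finalSet.length === 0) throw overconstrained(Object.keys(constraints).join(", "));
  return finalSet[0];
}

const demoDevices = [
  { deviceId: "plain-cam", capabilities: [] },
  { deviceId: "tilt-cam", capabilities: ["tilt"] },
];

const chosen = selectDevice({ tilt: {} }, demoDevices); // the tilt-capable device
```

Asking for `plain-cam` by exact `deviceId` together with `tilt: {}` leaves an empty final set in this model, so `selectDevice` throws.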

@youennf
Contributor

youennf commented Sep 1, 2020

would select a tilt capable camera or fail with OverconstrainedError,

That is probably what we want to avoid.
If there is no PTZ camera, we probably want getUserMedia to resort to any other available camera (with a non PTZ prompt).

@eehakkin
Contributor

eehakkin commented Sep 1, 2020

navigator.mediaDevices.getUserMedia({video: {tilt: {}}});

would select a tilt capable camera or fail with OverconstrainedError,

That is probably what we want to avoid.

Not really. A web site should have an option to choose either way.

If there is no PTZ camera, we probably want getUserMedia to resort to any other available camera (with a non PTZ prompt).

That could be achieved with hotfixed fitness distance by

navigator.mediaDevices.getUserMedia({video: {advanced: [{tilt: {}}]}});

If there were no PTZ cameras, the advanced constraint {tilt: {}} would be dropped because its fitness distance would be positive infinity. As a result the only remaining constraint would be video thus any available camera would do.
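
The fallback described above could be sketched as follows. This is a simplified model of advanced-set handling; `applyAdvanced`, `needsCapability` and the device objects are hypothetical names introduced for illustration.

```javascript
// Advanced sets are tried in order; a set that no remaining candidate can
// satisfy is skipped rather than failing the request.
function applyAdvanced(candidates, advancedSets, satisfies) {
  let remaining = candidates;
  for (const set of advancedSets) {
    const narrowed = remaining.filter((device) => satisfies(set, device));
    if (narrowed.length > 0) remaining = narrowed;
  }
  return remaining;
}

// Hotfixed rule (hypothetical): a present PTZ constraint needs the capability.
const needsCapability = (set, device) =>
  Object.keys(set).every((name) => device.capabilities.includes(name));

const plainCam = { label: "plain", capabilities: [] };
const tiltingCam = { label: "tilting", capabilities: ["tilt"] };

// No PTZ camera attached: the advanced set is dropped, any camera will do.
const withoutPtz = applyAdvanced([plainCam], [{ tilt: {} }], needsCapability);
// A tilt-capable camera is attached: the advanced set narrows to it.
const withPtz = applyAdvanced([plainCam, tiltingCam], [{ tilt: {} }], needsCapability);
```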

@jan-ivar
Member Author

jan-ivar commented Sep 3, 2020

I think option 3 will give us what we want naturally.

Then pan: true like pan: 0 would give a fitness distance of 1 to all non-panable devices, based on¹

(actual == ideal) ? 0 : 1

Then we wouldn't need any special rules, which I think is a win.


1) For those with a sharp eye: yes we forgot to define distance for boolean constraints in mediacapture main, because it hasn't come up before. E.g. you can't really pick mics based on the echoCancellation constraint, because echo cancellation is done in software.
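
A small sketch of what option 3 might imply for an ideal `pan` value. It assumes that a device without a pan setting never equals the ideal, which is one of the questions this issue leaves open; `idealPanDistance` is a hypothetical helper.

```javascript
// Option 3 sketch: `true` is a first-class ideal value for pan.
// Assumption: a device without a pan setting never equals the ideal.
function idealPanDistance(ideal, settings) {
  if (!("pan" in settings)) return 1; // non-pannable: (actual == ideal) fails
  if (ideal === true) return 0;       // any pannable device matches `true`
  const actual = settings.pan;
  if (actual === ideal) return 0;
  // Numeric distance formula from mediacapture-main.
  return Math.abs(actual - ideal) / Math.max(Math.abs(actual), Math.abs(ideal));
}

const fixedWebcam = {};            // no pan setting
const pannableCam = { pan: 0 };    // pan currently at 0
```

Under these assumptions `pan: true` and `pan: 0` both penalize `fixedWebcam` by 1 and leave `pannableCam` at 0, without any PTZ-specific rule.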

@riju
Collaborator

riju commented Sep 4, 2020

Thanks @jan-ivar for the suggestions.

I am in favour of Option 3.
(double or boolean or ConstrainDoubleOrBoolean) pan;

IIUC for Option 1, there might be some confusion about whether
{pan:{exact:123}} is equal to {pan:{exact:123}, ptz:{exact:true}}; I feel the ptz ConstrainBoolean might confuse more than clarify.

Option 3 aligns the pan, tilt and zoom to work in the same way as the other properties regarding exact and ideal and we can do away with the advanced constraints.

{pan:true}
{pan:123}
{pan:{ideal:true}}
{pan:{ideal:123}}
{pan:{exact:true}}
{pan:{exact:123}}
{pan:{min:-123,max:123}}
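
The forms listed above reduce to a few shapes once bare values are expanded. The sketch below assumes option 3 is adopted and that a bare value is shorthand for an ideal, as it is for other getUserMedia constraints outside advanced sets; `normalizePan` is a hypothetical helper.

```javascript
// In getUserMedia, a bare value outside `advanced` is shorthand for {ideal: value}.
// Assumption: under option 3 this would also hold for a bare `true`.
function normalizePan(value) {
  const isBare = typeof value === "number" || typeof value === "boolean";
  return isBare ? { ideal: value } : value;
}

const fromBareTrue = normalizePan(true);               // same as {ideal: true}
const fromBareNumber = normalizePan(123);              // same as {ideal: 123}
const untouched = normalizePan({ min: -123, max: 123 }); // already a range
```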

We will experiment with the changes in Chrome's implementation and then send a PR for the spec changes.

@guidou

guidou commented Sep 8, 2020

I raised a similar concern when the boolean override was proposed since pan: {} is not different from having the property unconstrained in terms of the SelectSettings algorithm.
The issue was resolved by stating that the presence of the property (even if unconstrained) means that the PTZ capability is being requested.
To me, this was enough to have a meaning that was precise enough to produce an implementation.
In this case
{pan: true} means that PTZ is required and, when not available, gUM() fails with OverconstrainedError if used in the basic set (the set is ignored if used in an advanced set). The somewhat confusing part is that "pan: true" looks more like an ideal naked value if used in the basic set.
One proposal is to treat {pan: true} (and pan: {}) like an ideal value and make it mean "prefer" the PTZ capability, but not require it. This would mean that there is no way to require the PTZ capability, which is the intended result of the proposal.

Option 3 is similar to the current spec, except that it has more possibilities. I would interpret it as follows (unless some other explicit meaning is given):

  • pan: {ideal: true} would mean prefer a camera with pan (PTZ) capability
  • pan: {exact: true}, would mean require a camera with pan capability (OverconstrainedError if not possible in basic set, or ignore in advanced set)
  • pan: {ideal: false}, would mean prefer a camera without pan capability
  • pan: {exact: false}, would mean require a camera without pan capability
  • pan: {ideal: 10} would mean prefer a camera with pan capability and preferably set pan to 10.
  • pan: {exact: 10} would mean require a camera with pan capability and set pan to 10 (OverconstrainedError if not possible in basic set, ignore in advanced set).
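
The interpretation in the list above can be written down as a small function. This is a sketch only: `panConstraintDistance` and the `{min, max}` capability shape are hypothetical, and the numeric ideal case is simplified to "in range or not".

```javascript
// `device.pan` is a hypothetical {min, max} capability range, or undefined.
function panConstraintDistance(constraint, device) {
  const hasPan = device.pan !== undefined;
  const inRange = (v) => hasPan && v >= device.pan.min && v <= device.pan.max;

  if ("exact" in constraint) {
    const exact = constraint.exact;
    const ok = typeof exact === "boolean" ? exact === hasPan : inRange(exact);
    return ok ? 0 : Infinity; // Infinity: OverconstrainedError in the basic set
  }
  if ("ideal" in constraint) {
    const ideal = constraint.ideal;
    const ok = typeof ideal === "boolean" ? ideal === hasPan : inRange(ideal);
    return ok ? 0 : 1; // a preference only, never a hard failure
  }
  return 0; // empty constraint: no weight
}

const rangedPtzCam = { pan: { min: -3600, max: 3600 } };
const noPanCam = {};
```

For example, `panConstraintDistance({ exact: false }, rangedPtzCam)` is infinite, while `panConstraintDistance({ ideal: true }, noPanCam)` is a finite penalty.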

Not sure if the extra complexity of allowing pan to be used this way or in the usual DoubleConstraint way is worth it over using a separate constraint for the PTZ capability.

@jan-ivar
Member Author

jan-ivar commented Sep 9, 2020

To me, ... {pan: true} means that PTZ is required and, when not available, gUM() fails with OverconstrainedError if used in the basic set (the set is ignored if used in an advanced set)

I find no support in the spec for that interpretation.

The spec says: "A value of true is normalized to a value of empty ConstrainDouble", which I don't see being able to cause OverconstrainedError.

This would mean that there is no way to require the PTZ capability, which is the intended result of the proposal.

I see no way in the current spec to require PTZ capability without setting a value, e.g. pan: {exact: 3600} (it's not clear if pan: {exact: 0} does).

Option 3 would add that with e.g. {pan: {exact: true}}, modulo whatever we come up with in #246.

Not sure if the extra complexity of allowing pan to be used this way or in the usual DoubleConstraint way is worth it over using a separate constraint for the PTZ capability.

You're listing the natural outcomes of the existing constraints algorithm.

I think the complexity is exactly the same as having a separate boolean constraint, except we avoid some nonsense combos:

{video: {ptz: {exact: false}, pan: {min: 3600}}}
{video: {ptz: false, pan: 3600}} 

@guidou

guidou commented Sep 9, 2020

To me, ... {pan: true} means that PTZ is required and, when not available, gUM() fails with OverconstrainedError if used in the basic set (the set is ignored if used in an advanced set)

I find no support in the spec for that interpretation.

The spec says: " An empty ConstrainDoubleRange value implies no constraints but only a permission and capability request."
That's where my interpretation comes from. pan: {} means request PTZ capability. Interpreting not having the capability as failing to satisfy the request looks like a valid interpretation to me.

The spec says: "A value of true is normalized to a value of empty ConstrainDouble", which I don't see being able to cause OverconstrainedError.

See above.

This would mean that there is no way to require the PTZ capability, which is the intended result of the proposal.

I see no way in the current spec to require PTZ capability without setting a value, e.g. pan: {exact: 3600} (it's not clear if pan: {exact: 0} does).

IIRC, the result of a previous discussion was that the presence of the constraint (with any value, including empty/true) indicated the capability/permission request. As indicated above, the spec explicitly says: "An empty ConstrainDoubleRange value implies no constraints but only a permission and capability request."

Option 3 would add that with e.g. {pan: {exact: true}}, modulo whatever we come up with in #246.

Not sure if the extra complexity of allowing pan to be used this way or in the usual DoubleConstraint way is worth it over using a separate constraint for the PTZ capability.

You're listing the natural outcomes of the existing constraints algorithm.

Indeed.

I think the complexity is exactly the same as having a separate boolean constraint, except we avoid some nonsense combos:

{video: {ptz: {exact: false}, pan: {min: 3600}}}
{video: {ptz: false, pan: 3600}} 

Having to deal with the weird combos is part of the extra complexity. Having to deal with a range that is the union of all floating point numbers, true and false, where true and false have specific meanings that are not translations to a floating-point range, is probably going to be harder in terms of implementation than dealing with just the set of floating numbers or just {true,false}.

@youennf
Contributor

youennf commented Sep 9, 2020

Having to deal with the weird combos is part of the extra complexity. Having to deal with a range that is the union of all floating point numbers, true and false, where true and false have specific meanings that are not translations to a floating-point range, is probably going to be harder in terms of implementation than dealing with just the set of floating numbers or just {true,false}.

Agreed, this seems a recipe for bad interop.
I haven't seen any convincing use case for this level of expressiveness.
The only justification so far is that the current constraint model allows that, but we know we already have bad interop there, and I think this is one reason we haven't been able to tighten the selection algorithm.

Let's take { video : { pan : 10, width : 1920 } } as an example with a 720p PTZ camera and a 1080p non PTZ camera.
Which camera will be used? Say PTZ camera current pan value is 3600? or current pan value is 0?
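
To make the example concrete, here is a sketch that scores both cameras with the numeric fitness-distance formula from mediacapture-main. It does not settle the question: it assumes the PTZ camera can be driven to pan 10 (so its current pan value drops out), and it exposes the treatment of a missing pan setting as a parameter. The camera objects and `pickCamera` are hypothetical.

```javascript
// Numeric fitness distance from mediacapture-main.
function numericDistance(actual, ideal) {
  if (actual === ideal) return 0;
  return Math.abs(actual - ideal) / Math.max(Math.abs(actual), Math.abs(ideal));
}

// Best candidate settings per camera (hypothetical values).
// Assumption: the PTZ camera can be driven to pan 10, so its best candidate
// has pan === 10 regardless of where it currently points.
const ptz720 = { label: "720p PTZ", width: 1280, pan: 10 };
const plain1080 = { label: "1080p non-PTZ", width: 1920 };

// `missingPanDistance` is the open question: 0 if the constraint simply does
// not apply to a non-PTZ camera, 1 if a missing setting counts as a mismatch.
function pickCamera(missingPanDistance) {
  const score = (cam) =>
    numericDistance(cam.width, 1920) +
    ("pan" in cam ? numericDistance(cam.pan, 10) : missingPanDistance);
  return score(ptz720) < score(plain1080) ? ptz720.label : plain1080.label;
}
```

In this model the winner flips between `pickCamera(0)` and `pickCamera(1)`, which is why the treatment of a missing pan setting needs to be specified.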

We should drive the design with what the developer might want to express as part of the getUserMedia call and what the user might want to do in terms of selection.
The question we know of is: 'Give me a camera, with PTZ capabilities if possible'.
I think we can address that with a ptz boolean constraint and make it so that pan/tilt/zoom numerical constraints are considered for applyConstraints only.

@eehakkin
Contributor

eehakkin commented Sep 9, 2020

I find no support in the spec for that interpretation.

The spec says: " An empty ConstrainDoubleRange value implies no constraints but only a permission and capability request."

But that is just a note. If that meaning is to be retained, it should be spelled out in the main descriptions and not only in a note.

That's where my interpretation comes from. pan: {} means request PTZ capability. Interpreting not having the capability as failing to satisfy the request looks like a valid interpretation to me.

The spec says: "A value of true is normalized to a value of empty ConstrainDouble", which I don't see being able to cause OverconstrainedError.

See above.

That text does not say that, in this case, an empty ConstrainDouble also implies required hardware support for the property in the basic set, unlike for all non-PTZ properties.

In addition, in Chrome, {pan: {}} does not imply required hardware support for pan but only a permission request and a hardware support for some PTZ property (pan, tilt or zoom).

Option 3 would add that with e.g. {pan: {exact: true}}, modulo whatever we come up with in #246.

Not sure if the extra complexity of allowing pan to be used this way or in the usual DoubleConstraint way is worth it over using a separate constraint for the PTZ capability.

Option 3 would allow a request to prefer a pan and tilt capable camera over a zoom-only capable camera using

navigator.mediaDevices.getUserMedia({video: {pan: true, tilt: true}});

or to require a pan and tilt capable camera over a zoom-only capable camera using

navigator.mediaDevices.getUserMedia({video: {pan: {exact: true}, tilt: {exact: true}}});

The former is apparently not possible at the moment and the latter can be achieved only using

navigator.mediaDevices.getUserMedia({video: {
    pan: {min: -180.0 * 3600, ideal: 0.0, max: 180.0 * 3600},
    tilt: {min: -180.0 * 3600, ideal: 0.0, max: 180.0 * 3600}
    }});

although the idea behind the true semantics is to be able to avoid guessed defaults like {ideal: 0.0}.

With a separate constraint for the PTZ capability the former would remain impossible and the latter would remain the same or become

navigator.mediaDevices.getUserMedia({video: {
    pan: {min: -180.0 * 3600, ideal: 0.0, max: 180.0 * 3600},
    panTiltZoom: {exact: true},
    tilt: {min: -180.0 * 3600, ideal: 0.0, max: 180.0 * 3600}
    }});

With @youennf's suggestion to remove pan/tilt/zoom support from getUserMedia, the latter would also become impossible.

@jan-ivar
Member Author

jan-ivar commented Sep 9, 2020

But that is just a note. If that meaning is to be retained, it should be spelled out in the main descriptions and not only in a note.

Agree. Normative statements should never reside solely in notes. Notes clarify normative text or algorithms elsewhere, and should never directly contradict algorithms.

Here I see spec support for the permission part only, not "capability request" which appears undefined.

Having to deal with a range that is the union of all floating point numbers, true and false, where true and false have specific meanings that are not translations to a floating-point range is probably going to be harder in terms of implementation

Implementers come last in the priority of constituencies. I think we must prioritize preserving web compatible semantics for users (who come first).

Let's take { video : { pan : 10, width : 1920 } } as an example with a 720p PTZ camera and a 1080p non PTZ camera.
Which camera will be used? Say PTZ camera current pan value is 3600? or current pan value is 0?

This is a perfect example of why we defined the constraints algorithm in the first place.

What matters is not that I can intuit every corner case, but that it works the same in all browsers. Predictability trumps usefulness at the edges.

The only way to preserve web compatible semantics here is to strictly adhere to an algorithm. I think evidence of poor web compat points to browsers not following the algorithm.

@jan-ivar
Member Author

jan-ivar commented Sep 9, 2020

Just to show I'm not insensitive to the complexity burden on implementers, I've been trying to get rid of advanced since forever. I'd be happy to reopen an issue on that if there's support for it.

w3c/mediacapture-main#369
w3c/mediacapture-main#426
w3c/mediacapture-main#455

@youennf
Contributor

youennf commented Sep 9, 2020

We are deviating somehow from the initial issue, but I cannot resist... some answers inline.

This is a perfect example of why we defined the constraints algorithm in the first place.

The constraints algorithm tried to support a very high level of expressiveness and implementations fail to deliver.
Most of the constraints do not really help the user select the right device.

What matters is not that I can intuit every corner case, but that it works the same in all browsers. Predictability trumps usefulness at the edges.

I partially agree.
If the design was providing a lot of benefits on real use cases at the cost of complexity or being counterintuitive in edge cases, this would be fine.
Do you think this is the case?

The only hypothetical benefit I can see would be if we were able to get rid of enumerating all microphones and cameras, which we are very far from being able to do.

A tree-based approach for device selection seems more natural and would be much simpler.

I think evidence of poor web compat points to browsers not following the algorithm.

The spec should first define a precise and unambiguous algorithm. This is not yet the case.
Here are some examples:

  • Wording with regards to device selection is full of MAY: 'The User Agent MAY use the value of the computed "fitness distance" from the SelectSettings algorithm, or any other internally-available information about the devices, as an input to the selection algorithm'.
  • Handling of default devices is loosely defined: 'User Agents are encouraged to default to using the user's primary or system default device for kind (when possible).' How is it expected to combine the fitness distance and the default device?
  • Wording with regards to settings selection is using SHOULD and MAY, not MUST: 'Select one settings dictionary from candidates, and return it as the result of the SelectSettings algorithm. The UA SHOULD use the one with the smallest fitness distance, as calculated in step 3, but MAY prefer ones with resizeMode set to "none" over "crop-and-scale".'

I've been trying to get rid of advanced since forever. I'd be happy to reopen an issue on that if there's support for it.

I support this.

@guidou

guidou commented Sep 9, 2020

But that is just a note. If that meaning is to be retained, it should be spelled out in the main descriptions and not only in a note.

Then it should be moved to be normative. Without that note, pan:true being equivalent to pan: {} is totally worthless and has no meaning since {} otherwise means unconstrained, which is largely the same as not saying anything, which is what pan:false is supposed to mean. if true and false mean the same thing, then there is no point in having a boolean.
This fact is what motivated the discussion in issue #225 and I thought the consensus was that pan, tilt and zoom were going to be treated in a special way so as to ensure that the presence of any of them, even if unconstrained (as in {}) meant that the constraint set restricted candidates to those that could provide PTZ capability.

That text does not say that, in this case, an empty ConstrainDouble also implies required hardware support for the property in the basic set, unlike for all non-PTZ properties.

In addition, in Chrome, {pan: {}} does not imply required hardware support for pan but only a permission request and a hardware support for some PTZ property (pan, tilt or zoom).

Yes, it does. At least, that's what it did the last time I reviewed the code.

Option 3 would add that with e.g. {pan: {exact: true}}, modulo whatever we come up with in #246.

Not sure if the extra complexity of allowing pan to be used this way or in the usual DoubleConstraint way is worth it over using a separate constraint for the PTZ capability.

Option 3 would allow a request to prefer a pan and tilt capable camera over a zoom-only capable camera using

navigator.mediaDevices.getUserMedia({video: {pan: true, tilt: true}});

That is correct. However, in the discussions I have seen (and in the Chromium implementation) pan/tilt/zoom has been treated as a single hardware capability. If that is not the case, then this should be clarified in the spec and, if possible in practice, also in the Chromium implementation.

although the idea behind the true semantics is to be able to avoid guessed defaults like {ideal: 0.0}.

There are no

With a separate constraint for the PTZ capability the former would remain impossible and the latter would remain the same or become

navigator.mediaDevices.getUserMedia({video: {
    pan: {min: -180.0 * 3600, ideal: 0.0, max: 180.0 * 3600},
    panTiltZoom: {exact: true},
    tilt: {min: -180.0 * 3600, ideal: 0.0, max: 180.0 * 3600}
    }});

That is an issue only if pan, tilt and zoom are independent hardware capabilities and you use a single constraint. If they are indeed separate, you need three separate boolean constraints. This is orthogonal to the permission (there are lots of other constraints for the regular camera permission, just like there can be three constraints for a single PTZ permission).
Also, in practice, if a developer uses a numeric constraint in a request, the boolean one becomes unnecessary since the numeric one already implies a request for the capability.
The only purpose of the booleans (as I understood it from previous discussions), no matter how we implement it (separate constraints, syntactic sugar, or combined bool/double), is to allow selecting a PTZ-capable camera without forcing any movement to a particular setting. Is this not the case?

@jan-ivar
Member Author

jan-ivar commented Sep 9, 2020

I think evidence of poor web compat points to browsers not following the algorithm.

The spec should first define a precise and unambiguous algorithm. This is not yet the case.

@youennf The SelectSettings algorithm is precise if you follow it. I'm all for tightening the SHOULD to a MUST, if you think we gave user agents too much rope to not follow it.

In any case, it would seem adhering to the algorithm is the answer to web compat, not the problem.

in the discussions I have seen (and in the Chromium implementation) pan/tilt/zoom has been treated as a single hardware capability. If that is not the case, then this should be clarified in the spec and, if possible in practice, also in the Chromium implementation.

@guidou While implementations are free to make such assumptions, I see no need to cement them in the spec¹ or contort the API over them. It seems cleaner in the abstract for apps to ask for what they want based on their needs, and not make cross-feature assumptions.

navigator.mediaDevices.getUserMedia({video: {
    pan: {min: -180.0 * 3600, ideal: 0.0, max: 180.0 * 3600},
    panTiltZoom: {exact: true},
    tilt: {min: -180.0 * 3600, ideal: 0.0, max: 180.0 * 3600}
}});

@eehakkin I find the separate panTiltZoom boolean constraint above semantically redundant and confusing (is it needed? What happens if I omit it?) On ergonomics alone, option 3 seems a superior way to express the above²:

navigator.mediaDevices.getUserMedia({video: {pan: {exact: 0}, tilt: {exact: 0}}});

...or if I don't want to set values (for the same impact on fitness distance):

navigator.mediaDevices.getUserMedia({video: {pan: {exact: true}, tilt: {exact: true}}});

1) We combine them only in the permissions-space, where I think it makes sense since the trio shares similar privacy concerns.
2) Note this assumes we specify that 0 values are not satisfied by non-panable, non-tiltable cameras.

@guidou

guidou commented Sep 9, 2020

I think evidence of poor web compat points to browsers not following the algorithm.

The spec should first define a precise and unambiguous algorithm. This is not yet the case.

@youennf The SelectSettings algorithm is precise if you follow it. I'm all for tightening the SHOULD to a MUST, if you think we gave user agents too much rope to not follow it.

In any case, it would seem adhering to the algorithm is the answer to web compat, not the problem.

I think it's part of the answer. Another problem is that for requests that are not very constrained (e.g., unconstrained or wide min/max ranges) and without ideal values, implementations are allowed to select any setting that satisfies the constraints. Since lots of different values are allowed, browsers break this tie using different criteria. This is not bad, since different browsers can have very good reasons to choose different defaults, but it leads to different results.
Maybe the spec can suggest some implicit ideal values for some properties?
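
A sketch of the tie described above: with a wide range and no ideal, every in-range mode scores the same, and the result depends on whatever tie-break the browser applies. The mode list and both tie-break policies are hypothetical.

```javascript
// With only a wide min/max range and no ideal, every in-range mode has
// fitness distance 0, so the algorithm itself does not rank them.
function distanceForRange(actual, { min, max }) {
  return actual >= min && actual <= max ? 0 : Infinity;
}

const widthModes = [{ width: 640 }, { width: 1280 }, { width: 1920 }]; // hypothetical
const widthRange = { min: 320, max: 4096 };
const tied = widthModes.filter((m) => distanceForRange(m.width, widthRange) === 0);

// Two hypothetical tie-breaking policies a browser might choose.
const firstListed = tied[0].width;
const highestResolution = Math.max(...tied.map((m) => m.width));
```

Both policies conform to the algorithm in this model, yet they return different widths, which is the interop gap an implicit ideal value would close.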

in the discussions I have seen (and in the Chromium implementation) pan/tilt/zoom has been treated as a single hardware capability. If that is not the case, then this should be clarified in the spec and, if possible in practice, also in the Chromium implementation.

@guidou While implementations are free to make such assumptions, I see no need to cement them in the spec¹ or contort the API over them. It seems cleaner in the abstract for apps to ask for what they want based on their needs, and not make cross-feature assumptions.

I agree. But I was referring more to the point that having a separate single boolean constraint (e.g., ptzAbility or similar) wouldn't allow zoom-only, or pan-only, or tilt-only selection, to which the reply is that there should be separate constraints for that, and that this is orthogonal to the permission.
Anyway, I think all three approaches are valid (separate bool constraints, syntactic sugar, and combined bool-double constraints). Of the three, I think the separate constraints is probably the most confusing for users, the combined bool/double is the most expressive (as long as we specify true to mean having the capability and false as not having the capability), and the syntactic sugar is easier to implement, but with more potential for confusion (is true required or best effort?, what if I want the other?).
Based on the priority of constituencies you cited, I'm inclined to go for the combined bool/double constraint, provided the bool values are defined as I indicated.
The permissions considerations can be discussed separately in the spec and it shouldn't be particularly different from the normal camera permission. Simply, if SelectSettings selected a camera with any of the PTZ capabilities, and that capability was requested (by having any of pan, tilt, zoom present in a non-ignored constraint set in the request), the PTZ permission should be requested.
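
The permission rule proposed in the paragraph above might look like this as a predicate. It is a sketch of that proposal only; `shouldRequestPtzPermission`, `usedSets` and the `capabilities` array are hypothetical names.

```javascript
const PTZ_PROPS = ["pan", "tilt", "zoom"];

// `usedSets` holds the basic set plus any advanced sets that were not ignored.
// The PTZ permission is needed only if the selected device has a PTZ
// capability that one of those sets actually mentions.
function shouldRequestPtzPermission(usedSets, selectedDevice) {
  const requested = PTZ_PROPS.filter((p) => usedSets.some((set) => p in set));
  return requested.some((p) => selectedDevice.capabilities.includes(p));
}

const panTiltCam = { capabilities: ["pan", "tilt"] };
const ordinaryCam = { capabilities: [] };

const needsPrompt = shouldRequestPtzPermission([{ pan: true }], panTiltCam);
const noPrompt = shouldRequestPtzPermission([{ pan: true }], ordinaryCam);
```

With this rule a site that mentions `pan` but ends up with a non-PTZ camera is not granted more permission than the chosen device needs.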

@eehakkin I find the separate panTiltZoom boolean constraint above semantically redundant and confusing (is it needed? What happens if I omit it?) On ergonomics alone, option 3 seems a superior way to express the above²:

navigator.mediaDevices.getUserMedia({video: {pan: {exact: 0}, tilt: {exact: 0}}});

...or if I don't want to set values (for the same impact on fitness distance):

navigator.mediaDevices.getUserMedia({video: {pan: {exact: true}, tilt: {exact: true}}});
  1. We combine them only in the permissions-space, where I think it makes sense since the trio shares similar privacy concerns.

Agree

  2. Note this assumes we specify that 0 values are not satisfied by non-panable, non-tiltable cameras.

Agree that the spec should say this.

@eehakkin
Contributor

But that is just a note. If that meaning is to be retained, it should be spelled out in the main descriptions and not only in a note.

Then it should be moved to be normative. Without that note, pan:true being equivalent to pan: {} is totally worthless and has no meaning since {} otherwise means unconstrained, which is largely the same as not saying anything, which is what pan:false is supposed to mean. if true and false mean the same thing, then there is no point in having a boolean.

True and false do not mean the same thing. Both {pan: true} and {pan: {}} cause PTZ permission to be requested and pan to be exposed in track.getCapabilities() and in track.getSettings(), while {pan: false} does not.

From the spec: Any algorithm which uses a MediaTrackConstraintSet object and its pan dictionary member which exists after a possible normalization MUST request permission to use a PermissionDescriptor with its name member set to camera and its panTiltZoom member set to true, and, optionally, consider its deviceId member set to any appropriate device’s deviceId.
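
A sketch of how the quoted rule could be applied. It assumes that `false` is normalized away so the member no longer exists, and that tilt and zoom are handled the same way as the quoted pan text; `normalizeSet` and `permissionDescriptor` are hypothetical helpers, and `deviceId` handling is omitted.

```javascript
const PTZ_MEMBERS = ["pan", "tilt", "zoom"];

// Assumption: `false` is normalized away, so the member no longer exists,
// and `true` becomes an empty constraint.
function normalizeSet(constraintSet) {
  return Object.fromEntries(
    Object.entries(constraintSet)
      .filter(([name, value]) => !(PTZ_MEMBERS.includes(name) && value === false))
      .map(([name, value]) =>
        [name, PTZ_MEMBERS.includes(name) && value === true ? {} : value])
  );
}

// Builds the descriptor to request; panTiltZoom is set only when a PTZ
// member survives normalization.
function permissionDescriptor(constraintSet) {
  const normalized = normalizeSet(constraintSet);
  const wantsPtz = PTZ_MEMBERS.some((name) => name in normalized);
  return wantsPtz ? { name: "camera", panTiltZoom: true } : { name: "camera" };
}
```

Under these assumptions `{pan: true}` and `{pan: {}}` yield a descriptor with `panTiltZoom` set, while `{pan: false}` yields the plain camera descriptor.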

In addition, in Chrome, {pan: {}} does not imply required hardware support for pan but only a permission request and a hardware support for some PTZ property (pan, tilt or zoom).

Yes, it does. At least, that's what it did the last time I reviewed the code.

Does it? Below you say that in the Chromium implementation pan/tilt/zoom has been treated as a single hardware capability, so in essence Chrome implies only a permission request and hardware support for some PTZ property (pan, tilt or zoom).

Option 3 would add that with e.g. {pan: {exact: true}}, modulo whatever we come up with in #246.

Not sure if the extra complexity of allowing pan to be used this way or in the usual DoubleConstraint way is worth it over using a separate constraint for the PTZ capability.

Option 3 would allow a request to prefer a pan and tilt capable camera over a zoom-only capable camera using

navigator.mediaDevices.getUserMedia({video: {pan: true, tilt: true}});

That is correct. However, in the discussions I have seen (and in the Chromium implementation) pan/tilt/zoom has been treated as a single hardware capability. If that is not the case, then this should be clarified in the spec and, if possible in practice, also in the Chromium implementation.

True.

With a separate constraint for the PTZ capability the former would remain impossible and the latter would remain the same or become

navigator.mediaDevices.getUserMedia({video: {
    pan: {min: -180.0 * 3600, ideal: 0.0, max: 180.0 * 3600},
    panTiltZoom: {exact: true},
    tilt: {min: -180.0 * 3600, ideal: 0.0, max: 180.0 * 3600}
    }});

That is an issue only if pan,tilt and zoom are independent hardware capabilities and you use a single constraint. If they are indeed separate, you need three separate boolean constraints.

At least there are cameras that can zoom but cannot pan or tilt, so it might be useful to treat them as separate capabilities.
I have no data to compare the popularity of PTZ cameras, zoom cameras and normal cameras, though.

The only purpose of the booleans (as I understood it from previous discussions), no matter how we implement it (separate constraints, syntactic sugar, or combined bool/double), is to allow selecting a PTZ-capable camera without forcing any movement to a particular setting. Is this not the case?

In the first place, the purpose of the boolean is to allow requesting PTZ permission without forcing any movement to a particular setting.

Selection of a PTZ-capable camera is also very useful. I am just not sure if it makes sense to implement that in a way where only the presence of PTZ constraints in the basic set becomes a required capability constraint, while all other constraints in the basic set without an explicit 'exact', 'min' or 'max' are non-required constraints.

@youennf
Contributor

youennf commented Sep 10, 2020

@youennf The SelectSettings algorithm is precise if you follow it

I do not think the following step is precise enough: 'For every possible settings dictionary of copy compute its fitness distance'.
This is especially true in a context where not all values are independent: frame rate might be 60fps for small resolution but 30fps for high resolution. I fear the same might apply to pan, tilt, zoom and focusDistance, which is one of several reasons why we should refrain from using numerical values for PTZ device selection.

In general, this algorithm assumes computing the global minimum of a multivariable function.
This seems overkill. Just computing the domain of the function might be difficult to do interoperably.
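
To make the concern concrete, here is a small sketch of minimizing fitness distance over candidate settings whose values are not independent. The per-constraint formula for a numeric ideal value is the one from the Media Capture and Streams spec; the `modes` list and the function names are assumptions for illustration, and a real UA may consider many more candidate settings dictionaries (for example scaled resolutions).

```javascript
// Fitness distance for a numeric ideal value:
// 0 when equal, otherwise |actual - ideal| / max(|actual|, |ideal|).
function idealDistance(actual, ideal) {
  if (actual === ideal) return 0;
  return Math.abs(actual - ideal) / Math.max(Math.abs(actual), Math.abs(ideal));
}

// Hypothetical device modes: the frame rate depends on the resolution.
const modes = [
  { width: 640, height: 480, frameRate: 60 },
  { width: 1920, height: 1080, frameRate: 30 },
];

// Sums the per-constraint distances for one candidate settings dictionary.
function totalDistance(settings, ideals) {
  return Object.entries(ideals).reduce(
    (sum, [name, ideal]) => sum + idealDistance(settings[name], ideal),
    0
  );
}

// Picks the candidate with the smallest total distance (first one wins ties).
function selectSettings(candidates, ideals) {
  let best = null;
  let bestDistance = Infinity;
  for (const candidate of candidates) {
    const distance = totalDistance(candidate, ideals);
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  return best;
}
```

Asking for an ideal width of 1920 and an ideal frameRate of 60 cannot be fully satisfied by either mode here, so the result depends entirely on how the candidate set is enumerated and how the distances are summed.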

I agree it is good to define an algorithm to compute settings in an interoperable manner, but what is the motivation for such complexity? It seems we are trying to solve a problem that users do not care about.

The counter proposal that I hope to present next week is to select devices based on priorities of constraints. For cameras, I was thinking of something like PTZ > deviceId > groupId > facingMode > aspectRatio > width > height > frameRate. To break ties, use the order exposed by enumerateDevices so that default devices come first.
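
One possible reading of that counter proposal, as a sketch only: walk the constraints in priority order and narrow the candidate list at each step. The priority order is taken from the comment above; the function name, the `ptz` boolean property, the plain-object device descriptions and the exact-match comparison are all assumptions made for illustration.

```javascript
// Priority order from the comment above; "ptz" is a hypothetical boolean property.
const PRIORITY = ["ptz", "deviceId", "groupId", "facingMode",
                  "aspectRatio", "width", "height", "frameRate"];

// devices: camera descriptions (plain objects) in enumerateDevices() order.
// wanted: object mapping property names to desired values.
function selectByPriority(devices, wanted) {
  let candidates = devices;
  for (const name of PRIORITY) {
    if (!(name in wanted)) continue;
    const matching = candidates.filter((device) => device[name] === wanted[name]);
    // A constraint that no remaining candidate matches is skipped, not fatal.
    if (matching.length > 0) candidates = matching;
  }
  // Remaining ties are broken by the original (enumerateDevices) order.
  return candidates[0];
}
```

With this reading, a request for {ptz: true, width: 1920} would pick a low-resolution PTZ camera over a 1080p non-PTZ camera, because PTZ is considered first.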

@youennf
Contributor

youennf commented Sep 10, 2020

In the first place, the purpose of the boolean is to allow requesting PTZ permission without forcing any movement to a particular setting.

This seems like a natural and good default behavior. I think we should enforce it.
A web page can always use applyConstraints arbitrarily after getUserMedia resolves if it really has a good reason for that.
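
A minimal sketch of that two-step flow, assuming the {pan: true} form discussed in this issue and an advanced constraint set for the later move. The helper name `openCameraThenPan` is hypothetical, and `mediaDevices` is passed in as a parameter (in a page it would be navigator.mediaDevices) so the sketch does not depend on a browser global.

```javascript
// Hypothetical helper: open a camera without moving it, then pan explicitly.
async function openCameraThenPan(mediaDevices, panValue) {
  // pan: true asks for PTZ permission without requesting any movement.
  const stream = await mediaDevices.getUserMedia({ video: { pan: true } });
  const [track] = stream.getVideoTracks();
  // Only try to move the camera if pan is actually exposed on this track.
  if ("pan" in track.getCapabilities()) {
    await track.applyConstraints({ advanced: [{ pan: panValue }] });
  }
  return track;
}
```

The camera position is left untouched by getUserMedia itself; any movement is an explicit, separate step taken by the page.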

Selection of a PTZ-capable camera is also very useful.

I agree it is worth discussing whether ptz boolean values should be ideal only or can also be exact.
I do not see the benefits of allowing a web page to ask for a PTZ that must specifically support pan values between -100 and 67.

@beaufortfrancois
Contributor

I do not see the benefits of allowing a web page to ask for a PTZ that must specifically support pan values between -100 and 67.

Using known pan, tilt, and zoom values could help on systems where multiple PTZ camera devices are plugged in and you want the user to pick one rather than another.

@jan-ivar
Member Author

This seems overkill.

@youennf I don't think I care about that, since the value that constraints bring to this spec is their established user semantics.

I think trying to carve out exceptions to those semantics solely for ptz is what is sinking us, and why we're here opening issues.

While I appreciate the value of reducing functionality for test coverage, I think the constraints ship has sailed. Especially since we're not talking about a new API here, but extensions to the existing getUserMedia and applyConstraints APIs. It's more important to me that all constraints in the same dictionary behave the same, than looking for implementer corners to cut.

Challenging the core tenets of the constraints algorithm at this late stage also seems out of scope for this issue, even this spec. If you will: what seems overkill to me is reinventing constraints over this issue. If you'd still like to try, please open a separate issue, since I think we should be able to make progress here in parallel.

@jan-ivar
Member Author

Back to this issue, to restate the problem, I think this spec has...

A legitimate need to extend the model

While all cameras have frameRate, not all cameras have pan, so we need:

  1. A way to constrain for the presence of a feature (constrainable property).
  2. A clarification that values are not satisfied unless that feature is available (i.e. not "haha, all regular cameras are pan 0").

The consensus among those not challenging the whole constraints algorithm appears to be option 3 in the OP.

@jan-ivar
Member Author

Stated as a bug: we need to fix the way this spec has already embraced the true overload to produce the expected results.

@youennf
Contributor

youennf commented Sep 11, 2020

I think trying to carve out exceptions to those semantics solely for ptz is what is sinking us

PTZ is an exception in itself, given it requires a distinct permission and thus a distinct user decision.
It is somewhat similar to audio and video in that respect.
I'll try to make my points clearer during the interim meeting and have started to draft slides in https://docs.google.com/presentation/d/14Bzs1ia23Q5yGcuPLiQnHvkjjFi9ypXgPue94Q65UIQ/edit?ts=5f3ae9f6#slide=id.g98686538a1_0_0 for that purpose.

we need to fix the way this spec has already embraced the true overload to produce the expected results.

Let's check first whether the spec needs to use a true overload to produce results we want to achieve.

@jan-ivar
Member Author

@eehakkin @youennf Since I won't be able to make it to today's virtual interim, I've added this slide to the deck in case someone wants to present it if the discussion leads this way.

@youennf
Contributor

youennf commented Sep 15, 2020

@eehakkin, can you attend the virtual interim?
@jan-ivar, if you have some feedback on the permission generic issue, we can try to proxy it at the interim.

@riju added the PTZ Pan-Tilt-Zoom label Sep 29, 2020
@eehakkin
Contributor

Now that required image capture constraints are not allowed any more, I think that the original option 3 (ConstrainDoubleOrBoolean or (double or boolean or ConstrainDoubleRangeOrBooleanParameters)) is overly complex as I described in #257 (comment).

However, I still think that defining the effect of {pan: true, tilt: true} etc. on fitness distance is needed and useful. I know that @youennf has repeatedly said that he dislikes the use of fitness distance for device selection between PTZ cameras and non-PTZ cameras. But would you still agree that it would be useful that navigator.mediaDevices.getUserMedia({video: {pan: true, tilt: true}}) would prefer pan-tilt PTZ cameras over zoom-only PTZ cameras if the user grants the PTZ permission? If so then pan-tilt PTZ camera settings dictionaries should have better fitness i.e. lower fitness distance than zoom-only PTZ camera settings dictionaries.

So, how about we define a new typedef ConstrainDoubleOrCapability (bikeshed the name as you will) to be (double or boolean or ConstrainDoubleRange)? Booleans would then be allowed only as bare values. And then we could define that

  • If the settings dictionary's value for the constraint exists (that is to say the source device supports the relevant capability and the UA exposes it):
    • If the constraint is required (in the advanced constraint sets) and the constraint value is false, the settings dictionary's value for the constraint does not satisfy the constraint. Thus the fitness distance step 2 applies and the fitness distance is positive infinity.
    • If the constraint is required (in the advanced constraint sets) and the constraint value is true, the settings dictionary's value for the constraint satisfies the constraint. Thus the fitness distance step 2 does not apply.
    • If the ideal value is specified and is a boolean value, the actual value is true. Thus the fitness distance step 6 applies and the fitness distance is (actual == ideal) ? 0 : 1.
  • If the settings dictionary's value for the constraint does not exist (that is to say either the source device does not support the relevant capability or the UA does not expose it):
    • If the constraint is required (in the advanced constraint sets) and the constraint value is false, the settings dictionary's value for the constraint satisfies the constraint. Thus the fitness distance step 2 does not apply.
    • If the constraint is required (in the advanced constraint sets) and the constraint value is true, a double value or a ConstrainDoubleRange value (i.e. not false), the settings dictionary's value for the constraint does not satisfy the constraint. Thus the fitness distance step 2 applies and the fitness distance is positive infinity.
    • If the ideal value is specified and is a double value, there are no actual values and the fitness distance is 1.
    • If the ideal value is specified and is a boolean value, the actual value is false. Thus the fitness distance step 6 applies and the fitness distance is (actual == ideal) ? 0 : 1.
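
The rules above can be sketched roughly as follows, covering bare boolean and double values only (ConstrainDoubleRange is left out). The function names are hypothetical. Two details are assumptions rather than part of the proposal: a satisfied required constraint contributes a distance of 0, and a required double on an existing setting is treated as an exact match.

```javascript
// settings: candidate settings dictionary; a missing key means the capability
// is unsupported or not exposed by the UA.
// constraint: a bare boolean or a bare double.
// required: true when the constraint comes from an advanced constraint set.
function capabilityFitnessDistance(settings, name, constraint, required) {
  const exists = name in settings;
  if (typeof constraint === "boolean") {
    // For booleans, the "actual" value is simply whether the setting exists.
    if (exists === constraint) return 0;
    return required ? Infinity : 1;
  }
  // Bare double value.
  if (!exists) return required ? Infinity : 1;
  const actual = settings[name];
  if (actual === constraint) return 0;
  // Assumption: a required double must match exactly.
  if (required) return Infinity;
  return Math.abs(actual - constraint) / Math.max(Math.abs(actual), Math.abs(constraint));
}

// Sums the distances of all non-required (ideal) constraints for one candidate.
function totalIdealDistance(settings, constraints) {
  return Object.entries(constraints).reduce(
    (sum, [name, value]) => sum + capabilityFitnessDistance(settings, name, value, false),
    0
  );
}
```

Under these rules, {pan: true, tilt: true} gives a pan-tilt camera a total distance of 0 and a zoom-only camera a total distance of 2, so the pan-tilt camera is preferred.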

@beaufortfrancois
Contributor

For info, the recording of the WebRTC WG meeting at TPAC 2020 about this issue is available at https://youtu.be/zjZy8evtkkc?t=6271
