show "spiderweb" lines between auto-matched points in 2 overlapping images #2

Open
jywarren opened this issue May 13, 2014 · 16 comments


@jywarren
Member

If it's possible to identify matching interest points between two images on the fly (client side would be AWESOME), then as someone drags an image and it overlaps a neighboring image, the interface could try to find matches between the two images and draw spiderwebby red lines between possible matches, to help the user. It might even be possible to make those matches slightly "magnetic", if you know what I mean?
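A minimal sketch of what the "spiderweb" overlay could look like on an HTML5 canvas, assuming the match pairs (already in page coordinates) have been produced by whatever matcher we end up with; the `Match` type and `drawSpiderweb` name are hypothetical:

```typescript
// Hypothetical shape of a match between two overlapping images,
// with both endpoints already converted to page/canvas coordinates.
interface Match {
  x1: number; y1: number; // point in the dragged image
  x2: number; y2: number; // matching point in the neighboring image
}

// Draw translucent red "spiderweb" lines on an overlay canvas.
function drawSpiderweb(canvas: HTMLCanvasElement, matches: Match[]): void {
  const ctx = canvas.getContext("2d");
  if (!ctx) return;
  ctx.clearRect(0, 0, canvas.width, canvas.height);
  ctx.strokeStyle = "rgba(255, 0, 0, 0.6)";
  ctx.lineWidth = 1;
  for (const m of matches) {
    ctx.beginPath();
    ctx.moveTo(m.x1, m.y1);
    ctx.lineTo(m.x2, m.y2);
    ctx.stroke();
    // small dots at each endpoint so isolated matches are still visible
    ctx.fillStyle = "rgba(255, 0, 0, 0.8)";
    ctx.fillRect(m.x1 - 1, m.y1 - 1, 3, 3);
    ctx.fillRect(m.x2 - 1, m.y2 - 1, 3, 3);
  }
}
```

The "magnetic" behavior could then be a simple snap: when the average offset between matched endpoints drops below some threshold, translate the dragged image by that offset.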

@btbonval
Member

That's an intense calculation, but such things are well known and may even be written in JS already.

The question is whether that feature could be disabled on mobile devices. Or even better: enable it via a UI button so my netbook doesn't get hosed.

@erget

erget commented May 14, 2014

I've done some preliminary checking and haven't found a JS library that implements feature detection... I'd also be for enabling the feature explicitly, since it's a pretty big calculation.

@btbonval
Member

I looked for JS libs for feature detection and got the same result.

The closest I found was pixo-something-or-other. All the operations worked on a single image and they were in super early dev stages. However, they did say that they were leveraging HTML5 Canvas image handling, which exposes the raw pixel data.

It's more work, but we could find a handy published algorithm for image fitting, partial image matching, or partial image searching, then implement it in a JS library ourselves. Apparently there is a need.

The work of visualizing the match results would be on top of implementing the algorithm in the first place. :(


@jywarren
Member Author

This sounds like a cool Google Summer of Code project, or maybe an Outreach Program for Women project, since their next session is this coming winter. I also wonder if any of the face detection demos in canvas are using a feature detection-like approach.

@erget

erget commented May 14, 2014

This could be less challenging than it sounds. If we follow the typical workflow, it would go something like this:

Back end

  1. Find good points in both pictures, e.g. with SURF
  2. Pull the good points' descriptors
  3. Match points from IMG 1 to IMG 2
  4. Match points from IMG 2 to IMG 1
  5. Discard points where the match isn't the same in both directions
  6. Maybe filter, e.g. with RANSAC
  7. Bubble the matched points to the top

Front end

  1. Pass pictures to the back end, get point matches
  2. Draw lines connecting them in a loop
  3. Do anything else you want to do with the images

You could make the front end a bit smarter by having it reduce the number of points it shows, so your image isn't clogged with points. Also, you could perhaps have it do an initial best-guess transform when first run, which the user adjusts afterwards. It would probably be smart to only allow two images to be searched simultaneously so you don't freeze up the client, though. (A rough sketch of the matching steps is below.)
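A minimal sketch of steps 3-6, assuming some detector (SURF or otherwise) has already produced keypoints and fixed-length descriptors for each image; the `Feature` type, the brute-force matcher, and the translation-only RANSAC are simplifications for illustration, not a full implementation:

```typescript
// Hypothetical output of a feature detector: a point plus its descriptor.
interface Feature {
  x: number;
  y: number;
  desc: Float32Array; // fixed-length descriptor (e.g. 64 floats for SURF)
}

interface MatchPair { a: Feature; b: Feature; }

// Squared Euclidean distance between two descriptors.
function dist2(p: Float32Array, q: Float32Array): number {
  let s = 0;
  for (let i = 0; i < p.length; i++) { const d = p[i] - q[i]; s += d * d; }
  return s;
}

// Index of the nearest descriptor in `bs` to feature `a`.
function nearest(a: Feature, bs: Feature[]): number {
  let best = 0, bestD = Infinity;
  for (let j = 0; j < bs.length; j++) {
    const d = dist2(a.desc, bs[j].desc);
    if (d < bestD) { bestD = d; best = j; }
  }
  return best;
}

// Steps 3-5: match A->B and B->A, keep only mutual (symmetric) matches.
function symmetricMatches(as: Feature[], bs: Feature[]): MatchPair[] {
  const out: MatchPair[] = [];
  for (let i = 0; i < as.length; i++) {
    const j = nearest(as[i], bs);
    if (nearest(bs[j], as) === i) out.push({ a: as[i], b: bs[j] });
  }
  return out;
}

// Step 6, heavily simplified: RANSAC over a translation-only model,
// which is roughly what two overlapping, similarly scaled map images differ by.
function ransacTranslation(ms: MatchPair[], iters = 200, tol = 10): MatchPair[] {
  let best: MatchPair[] = [];
  if (ms.length === 0) return best;
  for (let k = 0; k < iters; k++) {
    const m = ms[Math.floor(Math.random() * ms.length)];
    const dx = m.b.x - m.a.x, dy = m.b.y - m.a.y;
    const inliers = ms.filter(
      (n) => Math.hypot(n.b.x - n.a.x - dx, n.b.y - n.a.y - dy) < tol
    );
    if (inliers.length > best.length) best = inliers;
  }
  return best;
}
```

A real version would use a proper homography model in the RANSAC step, but the symmetric matching and the inlier-counting loop are the shape of the workflow described above.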

@jywarren
Member Author

You could also do some sorting, like not showing matched points that are too close to one another, or only showing the ones near the overlap between the two images, or ranking how good the matches are, perhaps, if that's possible. And capping it at 3-5 or so? Some of these could make it run faster too.


@erget

erget commented May 14, 2014

I think initializing would be the most expensive part of the operation. Although you could reduce time there by capping the matches to a certain number of points, I wouldn't suggest it, for the simple reason that you have no guarantee that the points are really good. Let's say I find three points that have good descriptors. I have no guarantee that I'll be able to match them in the second image, or that my first match is correct. Effectively you'd have to search the second image completely anyway, and because you don't know beforehand where the overlap area is, it's hard to restrict the search area to that part.

You could get pretty close to your idea by searching both images, filtering for good descriptors, matching them and discarding outliers, and then looking for clusters and only showing matches from the middle of clusters. That would mean one initial search, some filtering, and then updating your spider webs for the remaining ones (perhaps with a defined maximum density, as in the sketch below) for everything else after the first guess. I think that could be doable for most browsers, as long as the users are okay with the longer start-up time.
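A minimal sketch of the "maximum density" idea, reusing the hypothetical MatchPair type from the earlier sketch: bucket match endpoints into a coarse grid over the dragged image and keep only the best match per cell, so the overlay never gets too dense. The scoring function is a placeholder; a real version might use the descriptor distance from the matcher.

```typescript
// Keep at most one match per grid cell of `cellPx` pixels, preferring
// matches with a lower score (e.g. descriptor distance).
function capDensity(
  ms: MatchPair[],
  score: (m: MatchPair) => number,
  cellPx = 100
): MatchPair[] {
  const best = new Map<string, MatchPair>();
  for (const m of ms) {
    const key = `${Math.floor(m.a.x / cellPx)},${Math.floor(m.a.y / cellPx)}`;
    const cur = best.get(key);
    if (!cur || score(m) < score(cur)) best.set(key, m);
  }
  return Array.from(best.values());
}
```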


@jywarren
Member Author

That makes sense. I think initialization could be done on a low-res version of the image (a common optimization in SfM), and if done client-side, it could use a second thread (web workers, or however that works?).

Part of limiting the number shown is a user interface issue -- lots and lots of spiderwebs could obscure other things and not add a lot. I also think we could potentially do the point finding at upload time, when the user is already expecting a little delay, and do the matching at drag time. Isn't matching cheaper than finding? We might even decide to store the features server-side for future use...
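A minimal sketch of the second-thread idea using a standard Web Worker, reusing the hypothetical Feature, MatchPair, symmetricMatches, ransacTranslation, and drawSpiderweb pieces from the earlier sketches; the file names, message shapes, and `overlayCanvas` variable are made up for illustration:

```typescript
// --- matcher.worker.ts (hypothetical worker script, runs off the main thread) ---
self.onmessage = (e: MessageEvent) => {
  const { featuresA, featuresB } = e.data;
  // heavy lifting happens here, so dragging stays smooth on the main thread
  const matches = ransacTranslation(symmetricMatches(featuresA, featuresB));
  (self as any).postMessage(matches);
};

// --- main thread ---
declare const overlayCanvas: HTMLCanvasElement; // assumed overlay element

const worker = new Worker("matcher.worker.js"); // hypothetical bundle name

worker.onmessage = (e: MessageEvent) => {
  // convert MatchPair[] into the overlay's line segments and redraw
  const lines = (e.data as MatchPair[]).map((m) => ({
    x1: m.a.x, y1: m.a.y, x2: m.b.x, y2: m.b.y,
  }));
  drawSpiderweb(overlayCanvas, lines);
};

// called while an image is being dragged, with features found at upload time
function onImageDragged(featuresA: Feature[], featuresB: Feature[]): void {
  worker.postMessage({ featuresA, featuresB });
}
```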


@erget

erget commented May 15, 2014

I like the idea of storing the matches server side. I'm not sure about matching being cheaper than finding though. It's still not absolutely clear to me what images are matched. Do we match all to all at the beginning to find optimal candidates? That would be really expensive. Do we do some kind of preliminary matching, e.g. based on something simple like histogram similarity? Or do we find and store points for all images and only match them when the user drags them within a certain proximity or explicitly requests they be matched?

Probably the last option would be less expensive, but it'll still not be cheap, since the matches are most likely found in a different order. If you match them unsorted you're looking at something like O(n^2), and if you somehow rank them first you've got the ranking operation. And then you still have to identify the clusters to display and warp the image for your first guess.

I have no experience with doing this kind of thing in the browser, and JS engines are getting a lot faster. Also, I don't want to over-optimize at the cost of quality when there's no need. But maybe it would be smarter to split this up more? Like have the client upload low-res key points, have the server match them and try to iteratively build the best conglomerate picture possible, and then pass that back to the client as pre-placed and pre-stretched images? Then let the client do spider webs on images that are moved when in close proximity.

I understand, though, that if we go that route it's almost as easy to just do the whole shebang with SfM and point clouds server-side -- I'm sure many users are interested in the third dimension as well, but I admit that it's a lot to ask from a server.

@btbonval
Member

If we could generate descriptors (ideally metrics that are linear in storage) for each image as soon as they were uploaded, then we'd have some precalculated descriptors cached for assisting the match algorithm later.

In most 3D shape matching systems, each object is converted into a linear descriptor of N elements as soon as it is uploaded. When it comes time to search that element against all others in the database, it's just a geometric distance between descriptors, which can be ranked. Smaller geometric distance between N-dimensional descriptors implies closer shapes.

I'm sure we could come up with (read: find research on) some kind of handy descriptor for 2D image matching. Preprocess the descriptor ahead of time, then perform client-side matching of the currently selected image against all other images whose descriptors have been loaded (likely those images which are currently on the map).

That's all very airy and not well defined. The devil is in the details.
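A minimal sketch of that ranking step, assuming each image already has a precomputed N-element global descriptor (how that descriptor is actually built is exactly the open research question here); the `ImageRecord` shape is hypothetical:

```typescript
// Hypothetical per-image record with a precomputed N-dimensional descriptor,
// computed once at upload time and cached.
interface ImageRecord {
  id: string;
  descriptor: Float32Array;
}

// Rank candidate images by Euclidean distance to the selected image's
// descriptor: smaller distance implies closer in appearance.
function rankCandidates(selected: ImageRecord, others: ImageRecord[]): ImageRecord[] {
  const dist = (p: Float32Array, q: Float32Array): number => {
    let s = 0;
    for (let i = 0; i < p.length; i++) { const d = p[i] - q[i]; s += d * d; }
    return Math.sqrt(s);
  };
  return [...others].sort(
    (a, b) => dist(selected.descriptor, a.descriptor) - dist(selected.descriptor, b.descriptor)
  );
}
```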


@btbonval
Member

Regardless of implementation, you could probably cull the search space to only those images which are currently in the viewport.

Why try to match the currently dragged image against images off the screen? Humans are pretty good at getting it near the right place at a high level, right?
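A minimal sketch of that culling, using plain bounding boxes in whatever coordinate space the map uses; the `Box` shape and the per-image bounds are assumptions rather than MapKnitter's actual data model:

```typescript
// Axis-aligned bounding box.
interface Box { left: number; top: number; right: number; bottom: number; }

function intersects(a: Box, b: Box): boolean {
  return a.left < b.right && a.right > b.left && a.top < b.bottom && a.bottom > b.top;
}

// Only consider images whose bounds overlap the current viewport.
function cullToViewport<T extends { bounds: Box }>(images: T[], viewport: Box): T[] {
  return images.filter((img) => intersects(img.bounds, viewport));
}
```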


@jywarren
Member Author

I think we could easily start by culling matching to only images which are within one image-width of one another... "relatively close" images. We could even cull by matching only points on edges facing one another, since spiderwebs might even be confusing if they try to link to the opposite side of an image. If point finding is precomputed at upload time, and we cull/constrain matching enough, I think this could be a pretty snappy real-time interface.
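A minimal sketch of the "within one image-width" cull, building on the hypothetical Box type and intersects() function above: pad the dragged image's bounds by its own width and height and reuse the same intersection test.

```typescript
// Keep only neighbors whose bounds fall within roughly one image-width
// (and one image-height) of the dragged image.
function cullToNeighbors<T extends { bounds: Box }>(dragged: Box, images: T[]): T[] {
  const width = dragged.right - dragged.left;
  const height = dragged.bottom - dragged.top;
  const padded: Box = {
    left: dragged.left - width,
    right: dragged.right + width,
    top: dragged.top - height,
    bottom: dragged.bottom + height,
  };
  return images.filter((img) => intersects(img.bounds, padded));
}
```

The "edges facing one another" idea could then be a second filter that keeps only keypoints lying in the half of each image nearest the other image's center.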


@erget

erget commented May 15, 2014

Agreed. My worries were mostly from the perspective of automated image placement and warping, but if the matches are provided as a user aid instead of being used to automate the whole process, it makes things a lot simpler.

@jywarren
Member Author

I also like it because it's a stepping stone towards more automation if we want it, and it provides great visual feedback on whether matching works and how efficient it is -- which can inform any further automation if we go that route.


@btbonval
Member

I think the "more automation" route is the 3D alignment and point cloud
generation approach, which can be projected down to a single top-down image
and manually or automatically knitted. But if you're adding pictures, you
can generate better point clouds. It wouldn't make sense to stitch pictures
so much as align a single 2D projection every time.

I really feel that the raison d'etre for MapKnitter is the
community-focused, hand driven approach. Otherwise we can dump the images
into Autodesk or whatever has been used to generate the 3D models.
-Bryan

