`contain_exactly` matcher spins forever, never finishing whatever it's supposed to be doing #1006

hakanai · 2017-08-02T04:03:39Z

For some reason, this example never completes:

describe 'something' do
  it 'fails this test' do
    actual_path_names = ["y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y2", "y3", "y4", "y5", "y5", "y5", "y6", "y6", "y6", "y7", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y10", "y10", "y10", "y11", "y11", "y11", "y11", "y12", "y12", "y12", "y13", "y13", "y13", "y14"]
    puts "actual_path_names.size = #{actual_path_names.size}"

    expected_path_names = ["y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y2", "y3", "y3", "y3", "y3", "y3", "y3", "y3", "y3", "y3", "y3", "y3", "y4", "y5", "y5", "y5", "y5", "y5", "y6", "y6", "y6", "y6", "y6", "y6", "y6", "y7", "y7", "y7", "y7", "y7", "y7", "y7", "y7", "y7", "y7", "y7", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "[x1, x2, x10, x26, ]", "y10", "y10", "y10", "y10", "y11", "y11", "y11", "y11", "y11", "y11", "y12", "y12", "y12", "y12", "y12", "y12", "y13", "y13", "y13", "y15", "y15", "y16", "y16", "y14"]
    puts "expected_path_names.size = #{expected_path_names.size}"

    expect(actual_path_names).to contain_exactly(*expected_path_names)
  end
end

RSpec v3.6.0
JRuby v9.1.6.0, MRI v2.4.0

The text was updated successfully, but these errors were encountered:

myronmarston · 2017-08-02T04:38:41Z

Thanks for the bug report!

Unfortunately, the large amount of flexibility provided by the contain_exactly matcher forces it to fall back to a slow brute force algorithm that compares every actual element against every expected element, to see if there is any way to pair up every actual element against exactly one expected element each. To understand why, please read the lengthy comment the explains the PairingsMaximizer. In your case, you have very large actual and expected arrays, and given that they don't match, it causes a very slow brute force search to find the best set of pairings that minimizes what is unpaired.

You can work around this by not using contain_exactly in this one case; instead you could do:

expect(actual_path_names.sort).to eq(expected_path_names.sort)

Of course, that's not ideal, but it should at least unblock you. Long term, I'd love for us to improve contain_exactly to not have this issue but I'm not sure of a way to do that. I think we might be able to improve it to at least not use the slow brute force approach when all the expected items are matching using only exact equality (as strings do), using some logic like this:

irb(main):001:0> a = "abc"
=> "abc"
irb(main):002:0> a.method(:==) == a.method(:===)
=> true

Here we can see that for a string, === is just an alias of == -- which means we don't need to bother with the slow brute force solution in this case. I don't have the time to work on that optimization now, but maybe someone else can take it up (would you be interested, @trejkaz?).

hakanai · 2017-08-02T04:44:19Z

You say large arrays, but my array was only about 200 elements in size... And if I randomly move some elements around, suddenly it's much, much faster again, even though only a couple of elements changed. It doesn't really make sense to me.

myronmarston · 2017-08-02T05:14:44Z

Can you paste the re-ordering that makes it faster?

…

Sent from my iPhone

On Aug 2, 2017, at 11:44 AM, Trejkaz (pen name) ***@***.***> wrote: You say large arrays, but my array was only about 200 elements in size... And if I randomly move some elements around, suddenly it's much, much faster again, even though only a couple of elements changed. It doesn't really make sense to me. — You are receiving this because you commented. Reply to this email directly, view it on GitHub, or mute the thread.

hakanai · 2017-08-02T05:29:16Z

Turns out I couldn't get it to happen from the original example. But my original test failure did fail instantly, so I converted that to the same style, and I get:

describe 'something' do
  it 'fails this test' do
    actual_path_names = ["y1", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y3", "y4", "y4", "y4", "y4", "y4", "y4", "y4", "y4", "y4", "y4", "y4", "y5", "y6", "y6", "y6", "y6", "y6", "y7", "y7", "y7", "y7", "y7", "y7", "y7", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y10", "y10", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y12", "y13", "y14", "y14", "y15", "y16", "y16", "y16", "y16", "y16", "y16", "y16", "y17", "y17", "y17", "y17", "y17", "y17", "y18", "y18", "y18", "y18", "y18", "y18", "y19", "y19", "y19", "y19", "y19", "y19", "y20", "y20", "y21", "y21", "y22"]
    puts "actual_path_names.size = #{actual_path_names.size}"

    expected_path_names = ["y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y3", "y4", "y5", "y6", "y6", "y6", "y7", "y7", "y7", "y8", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y16", "y16", "y16", "y17", "y17", "y17", "y17", "y18", "y18", "y18", "y19", "y19", "y19", "y22"]
    puts "expected_path_names.size = #{expected_path_names.size}"

    expect(actual_path_names).to contain_exactly(*expected_path_names)
  end
end

The sizes:

actual_path_names.size = 158
expected_path_names.size = 100

So I guess each list is half the size, but it completes much faster than 1/4 of the time. Actually, I haven't yet seen the other example complete. :)

So actually just looking at the lists, it seems like the main difference is that this one has a unique element at the front before the run of identical elements, whereas in the other example, it starts with a run of identical elements.

hakanai · 2017-08-02T05:39:54Z

Well, I ruled out that theory because I tried the original example with a "y0" at the front, and I still get a hang. So I have no idea what makes it run so much faster in one case and not the other.

hakanai · 2017-08-02T05:47:26Z

So this is interesting...

If:

    expected_path_names = ["y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y3", "y4", "y5", "y6", "y6", "y6", "y7", "y7", "y7", "y8", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y16", "y16", "y16", "y17", "y17", "y17", "y17", "y18", "y18", "y18", "y19", "y19", "y19", "y22"]

I get a hang with:

    actual_path_names = ["y0", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y1", "y2", "y3", "y4", "y5", "y5", "y5", "y6", "y6", "y6", "y7", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y10", "y10", "y10", "y11", "y11", "y11", "y11", "y12", "y12", "y12", "y13", "y13", "y13", "y14"]

But no hang with:

    actual_path_names = ["y1", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y2", "y3", "y4", "y4", "y4", "y4", "y4", "y4", "y4", "y4", "y4", "y4", "y4", "y5", "y6", "y6", "y6", "y6", "y6", "y7", "y7", "y7", "y7", "y7", "y7", "y7", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y8", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y9", "y10", "y10", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y11", "y12", "y13", "y14", "y14", "y15", "y16", "y16", "y16", "y16", "y16", "y16", "y16", "y17", "y17", "y17", "y17", "y17", "y17", "y18", "y18", "y18", "y18", "y18", "y18", "y19", "y19", "y19", "y19", "y19", "y19", "y20", "y20", "y21", "y21", "y22"]

Even though the latter is much longer. :)

myronmarston · 2017-08-02T10:11:26Z

Thanks, that's a useful example. I'll dig into it.

kaiwren · 2018-09-18T11:25:52Z

I'm not sure how much progress I can make here, but just in case it's useful to others solving this, I've move @trejkaz's example into a spec at https://github.com/kaiwren/rspec-expectations/tree/1006-contains-exactly-hangs (this isn't merge-able, just useful to get started)

mjayfrancis · 2018-09-21T12:48:18Z

The optimisation suggested in #1006 (comment) should solve this particular case as reported, but if anyone is motivated to improve the contain_exactly matcher in a more general way, the following may be of use:

The problem that PairingsMaximizer solves is finding a maximum cardinality matching for a bipartite graph. Replacing the current brute force algorithm with a solution such as the Hopcroft-Karp algorithm should give far better results in the general case.

@genehsu

Speed up the ContainExactly matcher by pre-emptively matching up corresponding elements in the expected and actual arrays. This addresses rspec#1006, rspec#1161. This PR is a collaboration between me and @genehsu based on a couple of our earlier PRs and discussion that resulted: 1) rspec#1325 2) rspec#1328 Co-authored-by: Gene Hsu (@genehsu)

@genehsu

Speed up the ContainExactly matcher by pre-emptively matching up corresponding elements in the expected and actual arrays. This addresses rspec#1006, rspec#1161. This PR is a collaboration between me and @genehsu based on a couple of our earlier PRs and discussion that resulted: 1) rspec#1325 2) rspec#1328 Co-authored-by: Gene Hsu (@genehsu)

@genehsu

Speed up the ContainExactly matcher by pre-emptively matching up corresponding elements in the expected and actual arrays. This addresses rspec#1006, rspec#1161. This PR is a collaboration between me and @genehsu based on a couple of our earlier PRs and discussion that resulted: 1) rspec#1325 2) rspec#1328 Co-authored-by: Gene Hsu (@genehsu)

bclayman-sq mentioned this issue Oct 12, 2021

Improve ContainExactly matcher speed when elements obey transitivity #1325

Closed

pirj mentioned this issue Oct 26, 2021

Speed up the ContainExactly matcher #1328

Closed

bclayman-sq mentioned this issue Oct 29, 2021

Speed up the ContainExactly matcher #1333

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`contain_exactly` matcher spins forever, never finishing whatever it's supposed to be doing #1006

`contain_exactly` matcher spins forever, never finishing whatever it's supposed to be doing #1006

hakanai commented Aug 2, 2017 •

edited

Loading

myronmarston commented Aug 2, 2017

hakanai commented Aug 2, 2017

myronmarston commented Aug 2, 2017 via email

hakanai commented Aug 2, 2017 •

edited

Loading

hakanai commented Aug 2, 2017

hakanai commented Aug 2, 2017

myronmarston commented Aug 2, 2017

kaiwren commented Sep 18, 2018

mjayfrancis commented Sep 21, 2018

contain_exactly matcher spins forever, never finishing whatever it's supposed to be doing #1006

contain_exactly matcher spins forever, never finishing whatever it's supposed to be doing #1006

Comments

hakanai commented Aug 2, 2017 • edited Loading

myronmarston commented Aug 2, 2017

hakanai commented Aug 2, 2017

myronmarston commented Aug 2, 2017 via email

hakanai commented Aug 2, 2017 • edited Loading

hakanai commented Aug 2, 2017

hakanai commented Aug 2, 2017

myronmarston commented Aug 2, 2017

kaiwren commented Sep 18, 2018

mjayfrancis commented Sep 21, 2018

`contain_exactly` matcher spins forever, never finishing whatever it's supposed to be doing #1006

`contain_exactly` matcher spins forever, never finishing whatever it's supposed to be doing #1006

hakanai commented Aug 2, 2017 •

edited

Loading

hakanai commented Aug 2, 2017 •

edited

Loading