Fix the O(N^2) bug of passColName and passRowName #1782

Merged 1 commit on May 30, 2024


metab0t
Contributor

@metab0t metab0t commented May 30, 2024

Reland #1780

I also simplified some logic here. When emplace detects a duplicate key, the iterator it returns already points to the existing key-value pair, so we can mark that value as duplicate directly instead of using the slow search-erase-insert combination.

Contributor Author

metab0t commented May 30, 2024

@jajhall CI tests pass. It is ready to be reviewed.

  assert(int(search->second) < int(this->name2index.size()));
- this->name2index.erase(search);
- this->name2index.insert({name[index], kHashIsDuplicate});
+ search->second = kHashIsDuplicate;
Member

@jajhall jajhall May 30, 2024


By deleting

this->name2index.erase(search);
this->name2index.insert({name[index], kHashIsDuplicate});

and only setting

search->second = kHashIsDuplicate;

(which has no effect, since search is immediately destroyed), the property of marking duplicate names is lost. Hence errors due to the same name being given to different columns are not trapped.

Contributor Author


emplace only inserts when the key is not already present. When the key is a duplicate, search points to the existing key-value pair, so assigning to search->second modifies the entry stored in the original map.

This is an example: https://godbolt.org/z/YsYYrKrPb
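The behaviour described above can also be sketched in a few lines. This is a minimal illustration under stated assumptions, not the HiGHS implementation: `addName` and `kDuplicate` are hypothetical names, with `kDuplicate` standing in for `kHashIsDuplicate`, and the map is assumed to be a plain `std::unordered_map<std::string, int>`.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical sentinel standing in for kHashIsDuplicate in the PR.
constexpr int kDuplicate = -1;

// Record name -> index. If the name is already present, mark the existing
// entry as a duplicate instead. Returns true when the name was new.
bool addName(std::unordered_map<std::string, int>& name2index,
             const std::string& name, int index) {
  // emplace returns a pair {iterator, inserted}. When the key already
  // exists, nothing is inserted and the iterator points at the existing
  // element.
  auto emplace_result = name2index.emplace(name, index);
  if (!emplace_result.second) {
    // Writing through the iterator changes the element stored in the map,
    // so no separate find/erase/insert is needed.
    emplace_result.first->second = kDuplicate;
  }
  return emplace_result.second;
}
```

Calling `addName` twice with the same name leaves a single entry for that name whose value is the sentinel, which is the marking the review comment was concerned about.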

Member

@jajhall jajhall May 30, 2024


Thanks, it did cross my mind that there was more to setting

search->second = kHashIsDuplicate;

than I gave you credit for, sorry.

As I said, I'm not fluent in the use of std::unordered_map :-)

Just spotted the & in

auto& search = emplace_result.first;

Now I see why setting

search->second = kHashIsDuplicate;

achieves the desired result
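As a side note on the `&`, here is a small sketch that assumes only standard `std::unordered_map` behaviour. The write reaches the map because the iterator refers to the element stored inside the map, so in this sketch a by-value copy of the iterator behaves the same way as the reference. Both function names are hypothetical.

```cpp
#include <string>
#include <unordered_map>

// Write through a reference to the iterator returned by emplace.
int writeThroughIteratorReference() {
  std::unordered_map<std::string, int> m{{"a", 0}};
  auto emplace_result = m.emplace("a", 5);  // key exists: nothing inserted
  auto& search = emplace_result.first;      // reference to the iterator
  search->second = -1;                      // modifies the element in m
  return m.at("a");
}

// Write through a by-value copy of the same iterator.
int writeThroughIteratorCopy() {
  std::unordered_map<std::string, int> m{{"a", 0}};
  auto emplace_result = m.emplace("a", 5);  // key exists: nothing inserted
  auto search = emplace_result.first;       // copy of the iterator, no '&'
  search->second = -1;                      // still modifies the element in m
  return m.at("a");
}
```

Both functions return the sentinel value written through the iterator, not the original `0` and not the `5` passed to `emplace`.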

Contributor Author


Never mind. I also learned about the meaning of return values of emplace today. It is a clever choice to use emplace to detect duplication.

// Find the original and mark it as duplicate
auto& search = emplace_result.first;
assert(int(search->second) < int(this->name2index.size()));
search->second = kHashIsDuplicate;
Member


Ditto

Member

@jajhall jajhall left a comment


This may pass the CI tests, but that's because there's no unit test to check that duplicate names are marked as such.

Contributor Author

metab0t commented May 30, 2024

This may pass the CI tests, but that's because there's no unit test to check that duplicate names are marked as such

status = highs.getColByName(col0_name, iCol);
REQUIRE(status == HighsStatus::kOk);
REQUIRE(iCol == 0);
status = highs.getRowByName(row0_name, iRow);
REQUIRE(status == HighsStatus::kOk);
REQUIRE(iRow == 0);
// Change name of column num_col/2 to be the same as column 0
REQUIRE(highs.getColName(0, name) == HighsStatus::kOk);
REQUIRE(name == col0_name);
iCol = lp.num_col_ / 2;
std::string iCol_name;
REQUIRE(highs.getColName(iCol, iCol_name) == HighsStatus::kOk);
REQUIRE(highs.passColName(iCol, col0_name) == HighsStatus::kOk);
// column num_col/2 is no longer called iCol_name
status = highs.getColByName(iCol_name, iCol);
REQUIRE(status == HighsStatus::kError);
status = highs.getColByName(col0_name, iCol);
REQUIRE(status == HighsStatus::kError);

Initially, col0_name is unique, so getColByName finds its corresponding column.

Once col0_name is also assigned to column num_col_ / 2, the name becomes a duplicate, and getColByName returns an error for it.

@jajhall jajhall merged commit e08e840 into ERGO-Code:latest May 30, 2024
Contributor Author

metab0t commented May 30, 2024

There is still one edge case: if a name is marked as duplicate and one of the columns carrying it is given a new name, the old name should not simply be deleted from the map, because other columns might still have that old name.

However, this is an edge case, and checking for it would require an $O(N)$ iteration through col_names_. In practice, I believe that few people will set the name of a variable twice.

If we want to allow multiple columns/rows to share a name, the complete solution is to map names to columns/rows with std::unordered_multimap (https://en.cppreference.com/w/cpp/container/unordered_multimap).
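To make the multimap suggestion concrete, here is a hedged sketch of how it might handle the rename edge case. It is not HiGHS code: `NameMap`, `indicesOf`, and `renameIndex` are hypothetical names, and the indices are plain `int` values chosen for illustration.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical name -> index map that allows several indices per name.
using NameMap = std::unordered_multimap<std::string, int>;

// All indices currently carrying `name`, sorted because the iteration
// order within equal_range is unspecified.
std::vector<int> indicesOf(const NameMap& names, const std::string& name) {
  std::vector<int> out;
  auto range = names.equal_range(name);
  for (auto it = range.first; it != range.second; ++it) {
    out.push_back(it->second);
  }
  std::sort(out.begin(), out.end());
  return out;
}

// Rename one index: remove only that index's entry under old_name, leaving
// any other indices that share old_name untouched, then add the new name.
void renameIndex(NameMap& names, const std::string& old_name,
                 const std::string& new_name, int index) {
  auto range = names.equal_range(old_name);
  for (auto it = range.first; it != range.second; ++it) {
    if (it->second == index) {
      names.erase(it);
      break;  // the erased iterator is invalid, so stop iterating
    }
  }
  names.emplace(new_name, index);
}
```

With this layout, a lookup by name is unambiguous exactly when `indicesOf` returns a single index, and renaming one column only touches the entries that share its old name instead of scanning every column name.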

Member

jajhall commented May 31, 2024

Indeed, there's a limit to making this idiot-proof
