MPP: update the state of building a hash table when createOnce throw exceptions #4202

Merged
merged 15 commits into from
Mar 14, 2022

Conversation

fzhedu
Contributor

@fzhedu fzhedu commented Mar 8, 2022

Signed-off-by: fzhedu [email protected]

What problem does this PR solve?

Issue Number: close #4195
Problem Summary:
Set FinishBuild to true when createOnce throws an exception.

What is changed and how it works?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Fix the issue that a query containing `JOIN` could hang if an error was encountered

@ti-chi-bot
Member

ti-chi-bot commented Mar 8, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • fuzhe1989
  • windtalker

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/needs-triage-completed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 8, 2022
@fzhedu
Contributor Author

fzhedu commented Mar 8, 2022

/run-all-tests

Comment on lines 153 to 159
std::stringstream log_msg;
log_msg << std::fixed << std::setprecision(3);
log_msg << (subquery.set ? "Creating set. " : "")
<< (subquery.join ? "Creating join. " : "") << (subquery.table ? "Filling temporary table. " : "") << " for task "
<< mpp_task_id.toString();

LOG_DEBUG(log, log_msg.rdbuf());

Contributor

please use fmt instead.

Contributor

Agreed, and you don't have to move the log part out of the try block.

Contributor Author

done

Comment on lines 283 to 285
log_msg << " throw exception: unknown error"
<< " In " << watch.elapsedSeconds() << " sec. ";
LOG_ERROR(log, log_msg.rdbuf());

Contributor

ditto, use LOG_FMT_ERROR instead.

Contributor Author

done

Comment on lines 271 to 279
catch (std::exception & e)
{
std::unique_lock<std::mutex> lock(exception_mutex);
log_msg << " throw exception: " << e.what() << " In " << watch.elapsedSeconds() << " sec. ";
LOG_ERROR(log, log_msg.rdbuf());
exception_from_workers.push_back(std::current_exception());
if (subquery.join)
subquery.join->setFinishBuildTable(true);
}

Contributor

I guess we no longer need to catch std::exception.

Contributor Author

Why not? We should push the exception and set the finish flag here.
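
To make the point of this thread concrete, here is a minimal sketch of why the finish flag has to be set on the error path. It is not the TiFlash code: `ToyJoin`, `runBuild`, and the condition-variable wait are hypothetical stand-ins, and it assumes the probe side blocks until the build side reports that it has finished. Only `setFinishBuildTable` and `exception_from_workers` echo names from the diff.

```cpp
#include <cassert>
#include <condition_variable>
#include <exception>
#include <mutex>
#include <stdexcept>
#include <thread>
#include <vector>

// Hypothetical stand-in for the join object; only the build-finished flag is modeled.
struct ToyJoin
{
    std::mutex mu;
    std::condition_variable cv;
    bool build_finished = false;

    void setFinishBuildTable(bool value)
    {
        {
            std::lock_guard<std::mutex> lock(mu);
            build_finished = value;
        }
        cv.notify_all();
    }

    // The probe side blocks here until the build side reports completion.
    void waitUntilBuildFinished()
    {
        std::unique_lock<std::mutex> lock(mu);
        cv.wait(lock, [this] { return build_finished; });
    }
};

// Runs one build worker and one probe worker.
// Returns true if the build worker recorded an exception.
bool runBuild(ToyJoin & join, bool should_fail)
{
    std::mutex exception_mutex;
    std::vector<std::exception_ptr> exception_from_workers;

    std::thread builder([&] {
        try
        {
            if (should_fail)
                throw std::runtime_error("build failed");
            join.setFinishBuildTable(true);
        }
        catch (...)
        {
            std::lock_guard<std::mutex> lock(exception_mutex);
            exception_from_workers.push_back(std::current_exception());
            // Without this call the probe thread below would wait forever.
            join.setFinishBuildTable(true);
        }
    });

    std::thread prober([&] { join.waitUntilBuildFinished(); });

    builder.join();
    prober.join();
    assert(join.build_finished); // both threads are done, so the flag must be set
    return !exception_from_workers.empty();
}
```

In this sketch, removing the `setFinishBuildTable(true)` call from the `catch` block makes `runBuild(join, true)` hang on `prober.join()`, which mirrors the symptom described in the release note.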

LOG_DEBUG(log, "Creat all tasks of " << mpp_task_id.toString() << " take " << watch.elapsedSeconds() << " sec with exception and rethrow the first, left " << exception_from_workers.size());
std::rethrow_exception(exception_from_workers.front());
}
else

Contributor

when will this branch happen?

Contributor Author

Just a safety check, in case of bugs.

Contributor Author

deleted

@sre-bot
Collaborator

sre-bot commented Mar 8, 2022

Coverage for changed files

Filename                                                              Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
/build/tics/dbms/src/DataStreams/CreatingSetsBlockInputStream.cpp         172               172     0.00%           9                 9     0.00%         197               197     0.00%         118               118     0.00%
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                                                     172               172     0.00%           9                 9     0.00%         197               197     0.00%         118               118     0.00%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
16777      9447             43.69%    188937  95760        49.32%

full coverage report (for internal network access only)

LOG_ERROR(log, log_msg.rdbuf());
exception_from_workers.push_back(std::current_exception());
if (subquery.join)
subquery.join->setFinishBuildTable(true);

Contributor

Why not add a flag in Join indicating whether the build succeeded, so that in the probe stage we can skip the actual probe if the build failed?
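
One way to read this suggestion is sketched below. This is only an illustration of the idea, not what the PR implements: `FlaggedJoin`, `BuildState`, `finishBuild`, and `probe` are hypothetical names, and the hash table is reduced to a plain vector of keys.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical three-state build status instead of a single "finished" boolean.
enum class BuildState
{
    InProgress,
    Succeeded,
    Failed
};

// Hypothetical stand-in for the join object.
struct FlaggedJoin
{
    BuildState state = BuildState::InProgress;
    std::vector<int> build_keys; // stands in for the real hash table

    void finishBuild(bool succeeded)
    {
        assert(state == BuildState::InProgress); // a build should finish only once
        state = succeeded ? BuildState::Succeeded : BuildState::Failed;
    }

    // Refuses to probe unless the build side completed successfully.
    bool probe(int key) const
    {
        if (state != BuildState::Succeeded)
            throw std::runtime_error("hash table was not built successfully");
        return std::find(build_keys.begin(), build_keys.end(), key) != build_keys.end();
    }
};
```

With a status like this, the probe side can fail fast after a broken build instead of probing a partially filled table.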

Contributor

How about calling IProfilingBlockInputStream.cancel for the other subqueries?

Contributor Author

> How about calling IProfilingBlockInputStream.cancel for other subqueries

That would break the current cancel process. Currently, TiFlash first reports the error to TiDB, then TiDB sends the kill command, and finally TiFlash carries out the cancel.

@windtalker
Contributor

Please add a test for this.

@windtalker windtalker added needs-cherry-pick-release-5.0 PR which needs to be cherry-picked to release-5.0 needs-cherry-pick-release-5.1 PR which needs to be cherry-picked to release-5.1 needs-cherry-pick-release-5.2 PR which needs to be cherry-picked to release-5.2 needs-cherry-pick-release-5.3 Type: Need cherry pick to release-5.3 needs-cherry-pick-release-5.4 Should cherry pick this PR to release-5.4 branch. and removed affects-5.0 affects-5.1 affects-5.2 affects-5.3 affects-5.4 labels Mar 9, 2022
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created: #4268.

ti-chi-bot pushed a commit to ti-chi-bot/tiflash that referenced this pull request Mar 14, 2022
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created: #4269.

@ti-chi-bot
Member

In response to a cherrypick label: new pull request created: #4270.

JaySon-Huang pushed a commit to JaySon-Huang/tiflash that referenced this pull request Mar 17, 2022
windtalker pushed a commit that referenced this pull request Apr 1, 2022
@VelocityLight VelocityLight added cherry-pick-approved Cherry pick PR approved by release team. do-not-merge/cherry-pick-not-approved and removed cherry-pick-approved Cherry pick PR approved by release team. do-not-merge/cherry-pick-not-approved labels Apr 10, 2022
fzhedu added a commit to fzhedu/tiflash that referenced this pull request Apr 14, 2022
fzhedu added a commit that referenced this pull request Apr 14, 2022
ti-chi-bot added a commit that referenced this pull request Apr 14, 2022
@ti-chi-bot ti-chi-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Apr 22, 2022
@sre-bot
Collaborator

sre-bot commented Apr 22, 2022

Coverage for changed files

Filename                                         Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Common/FailPoint.cpp                                 416                91    78.12%           6                 0   100.00%          56                 8    85.71%         138                58    57.97%
DataStreams/CreatingSetsBlockInputStream.cpp         186               186     0.00%           9                 9     0.00%         178               178     0.00%         118               118     0.00%
Interpreters/Join.cpp                               1276              1276     0.00%          79                79     0.00%        1410              1410     0.00%         856               856     0.00%
Interpreters/Join.h                                   16                16     0.00%          16                16     0.00%          16                16     0.00%           0                 0         -
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                               1894              1569    17.16%         110               104     5.45%        1660              1612     2.89%        1112              1032     7.19%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
16907      9490             43.87%    190400  96266        49.44%

full coverage report (for internal network access only)

Labels
  • needs-cherry-pick-release-5.0 PR which needs to be cherry-picked to release-5.0
  • needs-cherry-pick-release-5.1 PR which needs to be cherry-picked to release-5.1
  • needs-cherry-pick-release-5.2 PR which needs to be cherry-picked to release-5.2
  • needs-cherry-pick-release-5.3 Type: Need cherry pick to release-5.3
  • needs-cherry-pick-release-5.4 Should cherry pick this PR to release-5.4 branch.
  • release-note Denotes a PR that will be considered when it comes time to generate release notes.
  • size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
  • status/can-merge Indicates a PR has been approved by a committer.
  • status/LGT2 Indicates that a PR has LGTM 2.
  • type/bugfix This PR fixes a bug.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

a query is hanged when one of subqueries of CreatingSet throw exceptions
7 participants