Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: improve error handling during ClickHouse query type mismatch while filtering session replays #24300

Conversation

namit-chandwani
Copy link
Contributor

@namit-chandwani namit-chandwani commented Aug 9, 2024

Linked Issues

Problem

  • Filtering recordings by person properties fails when string inputs are used for numeric data types.
  • Improving error handling for type mismatches in queries is essential to provide users with clear feedback and guidance for resolving such issues.

Changes

  • Backend:

    • posthog/errors.py:

      • Updated the CLICKHOUSE_SPECIFIC_ERROR_LOOKUP dictionary to include the TYPE_MISMATCH error with a descriptive message guiding users to correct the data types used in their filters.
    • posthog/session_recordings/session_recording_api.py:

      • Modified the list method within SessionRecordingViewSet to handle ExposedCHQueryError and ExposedHogQLError exceptions by raising a ValidationError with an appropriate error message and status code (400: Bad Request).
  • Frontend:

    • No changes involved

These changes improve the robustness of error handling for ClickHouse queries, offering users clearer and more actionable error messages, especially in cases where there is a type mismatch in filter parameters during session recording queries.

Does this work well for both Cloud and self-hosted?

  • Yes, it will work well for both Cloud and self-hosted PostHog instances.

How did you test this code?

  • Manual Testing:

    • Manually tested filtering recordings by person properties using both correct and incorrect data types.
    • Ensured that appropriate error messages are displayed when type mismatches occur.
  • Screen Recording of Testing:

PostHog.CH.Error.Handling.mov

Type of change:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactoring (altering code without changing its external behaviour)
  • Documentation change
  • Other

Checklist:

  • Development completed
  • Comments added to code (wherever necessary)
  • Documentation updated (if applicable)
  • Added tests [Unit tests/Integration tests] (if required)
  • Tested changes locally

Follow-up tasks (if any):

  • None

Additional Comments

  • None

@daibhin daibhin requested a review from a team August 12, 2024 10:22
@daibhin
Copy link
Contributor

daibhin commented Aug 12, 2024

Looks good to me but going to tag the Product Analytics team for a review because they will be much more familiar with the HogQL code

posthog/errors.py Outdated Show resolved Hide resolved
posthog/hogql/query.py Outdated Show resolved Hide resolved
@namit-chandwani
Copy link
Contributor Author

Hey @daibhin @webjunkie, thanks for your insightful reviews!

I’ve pushed the required changes now. Please take a look when you get a chance

53: ErrorCodeMeta("TYPE_MISMATCH", user_safe=True),
53: ErrorCodeMeta(
"TYPE_MISMATCH",
user_safe="One or more property values in your filters are in an incorrect format."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to stay as boolean.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

user_safe: bool | str = False
"""Whether this error code is safe to show to the user and couldn't be caught at HogQL level.
If a string is set, it will be used as the error message instead of the ClickHouse one.
"""

posthog/posthog/errors.py

Lines 303 to 306 in bb997bc

241: ErrorCodeMeta(
"MEMORY_LIMIT_EXCEEDED",
user_safe="Query exceeds memory limits. Try reducing its scope by changing the time range.",
),

@webjunkie I observed that user_safe is being assigned a string value in a few other scenarios, which made me think this might be the ideal way to handle it. I'll go ahead and change user_safe to a boolean as suggested

Can you please let me know the best way to provide a more descriptive error message in that case?
Should I handle this by overriding the message in the SessionRecordingViewSet where the ValidationError is raised, or is there another approach you'd recommend?

TIA

CC: @daibhin

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we will want to capture the error on the frontend and display it there. If you look at the frontend/src/scenes/session-recordings/playlist/sessionRecordingsPlaylistLogic.ts file you should be able to add a new listener for loadSessionRecordingsFailure and maybe call lemonToast.error with the user safe string

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thanks! Will give it a try

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we will want to capture the error on the frontend and display it there. If you look at the frontend/src/scenes/session-recordings/playlist/sessionRecordingsPlaylistLogic.ts file you should be able to add a new listener for loadSessionRecordingsFailure and maybe call lemonToast.error with the user safe string

Have pushed this change now via commit: b9e1257

try:
return list_recordings_response(filter, request, self.get_serializer_context())
except (ExposedHogQLError, ExposedCHQueryError) as e:
raise ValidationError(str(e), getattr(e, "code_name", None))
Copy link
Contributor Author

@namit-chandwani namit-chandwani Aug 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept the section that raises ValidationError because I believe ExposedHogQLError and ExposedCHQueryError are client-side errors. Therefore, a 4xx HTTP status code, particularly 400 (bad request), seems more fitting than the 500 (internal server error) status code that was originally being used.

Please let me know if this assumption is incorrect and if I should adjust or remove this implementation

Comment on lines +366 to +376
try {
response = await api.recordings.list(params)
} catch (error: any) {
captureException(error)
actions.loadSessionRecordingsFailure(error.message, error)

return {
has_next: false,
results: [],
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the exception isn't handled in this code section, it will be processed by the loadersPlugin in initKea.ts. This will trigger the onFailure function, which already includes error handling with lemonToast.error, resulting in a duplicate error toast.

For reference, here’s the relevant code:

if (errorMessage) {
lemonToast.error(`${identifierToHuman(actionKey)} failed: ${errorMessage}`)
}

I’ve added exception handling above to prevent the duplication of error toasts.

@webjunkie webjunkie enabled auto-merge (squash) August 28, 2024 09:18
@posthog-bot
Copy link
Contributor

This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in another week. If you want to permanentely keep it open, use the waiting label.

@Twixes Twixes self-requested a review September 6, 2024 15:33
@posthog-bot posthog-bot removed the stale label Sep 9, 2024
@posthog-bot
Copy link
Contributor

This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in another week. If you want to permanentely keep it open, use the waiting label.

if (errorObject?.data?.type === 'validation_error' && errorObject?.data?.code === 'type_mismatch') {
lemonToast.error(
'One or more property values in your filters are in an incorrect format.' +
' Please check and adjust them to match the required data types.'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the perfect thing here would be for the toast to link to the data management properties page... iirc we can set a button on the toast so the person has an onward journey

(even better if it's to the specific property)

@posthog-bot posthog-bot removed the stale label Sep 23, 2024
Copy link
Collaborator

@Twixes Twixes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Just some PR checks are red – can you resolve this @namit-chandwani?

@posthog-bot
Copy link
Contributor

This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in another week. If you want to permanentely keep it open, use the waiting label.

@posthog-bot
Copy link
Contributor

This PR was closed due to lack of activity. Feel free to reopen if it's still relevant.

@posthog-bot posthog-bot closed this Oct 8, 2024
auto-merge was automatically disabled October 8, 2024 07:31

Pull request was closed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Filter recordings by person properties that are Numeric but using Strings fail
6 participants