fix(library): add error field truncating #231

ecrupper · 2022-01-31T19:40:45Z

On rare occasions, the error generated by a step exceeds our database limit of 500 characters. When this happens, the build hangs and does not complete. This PR makes sure we don't set the error field to a value we can't support in the database.

codecov · 2022-01-31T19:41:59Z

Codecov Report

Merging #231 (1ecd2a6) into master (00b58d8) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #231   +/-   ##
=======================================
  Coverage   97.06%   97.06%           
=======================================
  Files          53       53           
  Lines        5794     5797    +3     
=======================================
+ Hits         5624     5627    +3     
  Misses        125      125           
  Partials       45       45

| Impacted Files | Coverage Δ |
|---|---|
| library/hook.go | `100.00% <ø> (ø)` |
| database/build.go | `100.00% <100.00%> (ø)` |

cognifloyd

Which part of the error is most likely to include the useful bits? In Python, I always want the end of the traceback. I'm still too new to Go to say for sure.

What if you took the first and last 250 characters, dropping everything in the middle?

ecrupper · 2022-01-31T20:56:31Z

> Which part of the error is most likely to include the useful bits? In Python, I always want the end of the traceback. I'm still too new to Go to say for sure.
>
> What if you took the first and last 250 characters, dropping everything in the middle?

I like this idea. Most errors contain the vital information in the beginning or end.
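
For illustration, here is a minimal sketch of the "first and last 250 characters" idea. The helper name `truncateMiddle` and the constant `errorLimit` are hypothetical, not names from this PR, and the sketch counts runes (characters) rather than bytes so that a multi-byte character is never split.

```go
package main

import (
	"fmt"
	"strings"
)

// errorLimit mirrors the 500-character database limit from the PR description.
// The name is an assumption made for this example.
const errorLimit = 500

// truncateMiddle is a hypothetical helper. It keeps the first and last
// limit/2 characters of s and drops everything in between.
func truncateMiddle(s string, limit int) string {
	r := []rune(s)
	if len(r) <= limit {
		return s
	}
	half := limit / 2
	return string(r[:half]) + string(r[len(r)-half:])
}

func main() {
	// A made-up error with useful text at both ends and noise in the middle.
	msg := "start: " + strings.Repeat("x", 600) + " :end"
	out := truncateMiddle(msg, errorLimit)
	fmt.Println(len([]rune(out)))
	fmt.Println(strings.HasPrefix(out, "start: "), strings.HasSuffix(out, " :end"))
}
```

With an even limit the result is exactly `limit` characters long; with an odd limit this version returns one character fewer, which is still within the limit.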

kneal · 2022-02-01T17:21:49Z

Also, I wonder whether, in addition to enforcing a character limit, we should just support more characters for the field in the database. I don't think it would be super harmful to allow 1000 characters.

GregoryDosh · 2022-02-02T16:01:16Z

Is it possible to move the truncation code to the "last responsible moment", so that any libraries or other bits of code that want to use the full error can, but when we go to place the data into the database we only save the important bits? I don't have context on whether you've already explored that, but it could reduce the code duplication and avoid truncating before we need to.

Otherwise it looks good and I don't see any obvious issues. Thanks for your contribution!
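
A rough sketch of what "last responsible moment" could look like, under stated assumptions: `Build` here is a simplified stand-in rather than the real database type, `save` is a placeholder for the persistence call, and `maxErrorLength` reuses the 500 limit from the description. The point is only that cropping happens on a copy at the database boundary, so the in-memory value keeps the full error.

```go
package main

import (
	"database/sql"
	"fmt"
	"strings"
)

// maxErrorLength is the assumed column limit, taken from the PR description.
const maxErrorLength = 500

// Build is a simplified, hypothetical stand-in for the database build type.
type Build struct {
	Error sql.NullString
}

// Crop returns a copy of b whose Error fits the column.
// The receiver is passed by value, so the caller's Build is untouched.
func (b Build) Crop() Build {
	if len(b.Error.String) > maxErrorLength {
		half := maxErrorLength / 2
		s := b.Error.String
		b.Error = sql.NullString{String: s[:half] + s[len(s)-half:], Valid: true}
	}
	return b
}

// save is a placeholder for the persistence call. It crops right before
// "storing" and reports the length that would be written.
func save(b Build) int {
	return len(b.Crop().Error.String)
}

func main() {
	full := Build{Error: sql.NullString{String: strings.Repeat("e", 800), Valid: true}}
	fmt.Println(save(full))             // length that would be stored
	fmt.Println(len(full.Error.String)) // in-memory value is unchanged
}
```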

GregoryDosh · 2022-02-02T17:00:42Z

constants/limit.go

@@ -24,6 +24,9 @@ const (
 	// BuildTimeoutDefault defines the default value in minutes for repo build timeout.
 	BuildTimeoutDefault = 30

+	// ErrorLimit defines the maximum size in characters for resource error fields.


Oh, one small thing, this comment is helpful, but it doesn't let anyone know what happens if the limit is exceeded. They'd have to dig into the code to understand that half of the beginning and half of the end are kept. It might be beneficial for future folks looking in the codebase to understand the intent of that and what happens if it gets exceeded.
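
One possible shape for such a comment, as a sketch only. The wording is hypothetical, and the value 500 is assumed from the PR description because the hunk above does not show it.

```go
package main

import "fmt"

// ErrorLimit defines the maximum size in characters for resource error fields.
//
// Errors longer than ErrorLimit are not rejected. They are shortened so that
// only the beginning and the end of the message are kept (roughly half of
// ErrorLimit each) and the middle is dropped.
const ErrorLimit = 500 // assumed value, not shown in the hunk above

func main() {
	fmt.Println("characters kept from each end:", ErrorLimit/2)
}
```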

ecrupper · 2022-02-02T21:53:26Z

I relocated these changes to be within the crop function of the build database object. It seems like that's where it belongs. I decided to leave out hook, step, and service for now. When trying to reproduce the error from the issue with just the build crop updated, it seemed to work just fine.

GregoryDosh

Nice catch! This all looks good to me.

jbrockopp · 2022-02-04T21:08:41Z

database/build.go

@@ -27,6 +27,8 @@ const (
 	maxTitleLength = 1000
 	// Maximum message field length.
 	maxMessageLength = 2000
+	// Maximum error field length.
+	maxErrorLength = 500


I agree with @kneal 👍

We should explore expanding the number of chars we allow for the error field across all resources.

Doubling it from 500 -> 1000 sounds like a fair place to start 😃

wass3r · 2022-02-04

It's currently a somewhat rare occurrence (afaik), so handling the outliers this way instead of having the outliers shape the data size seems fine to me. Do we expect this to be more commonplace?

kneal · 2022-02-07

My personal view is that I don't think that field has ever changed in size. So maybe this shows that errors are growing in size. I also don't think 1000 is that big of a concern, and it would save us from having to parse errors and, hopefully, from cutting off the important part. If we had to go beyond 1000, then I think truncating could be useful.

I'm also not sure whether we use error wrapping everywhere in the worker/runtime, which would make these errors longer. Depending on how errors are handled, or how we want them handled long term in that codebase, 1000 might be the norm, but I honestly don't know. The error: field in the build resource is entirely for admins because that's where we populate a platform error. So cutting off the data really only hurts ourselves when we need richer research.

> We should explore expanding the number of chars we allow for the error field across all resources.

@jbrockopp probably not a bad idea since I don't think that's ever been done. I assume most of these char lengths are their original defaults.

wass3r · 2022-02-07

Good callout on error wrapping! In other words, it has a good chance of becoming larger. I'm good with bumping to 1000 then 👍🏼

ecrupper · 2022-02-07

So if we bump the limit to 1000, do we still want to crop the value if it exceeds that limit? Or do we have another idea for an approach to that situation?

kneal · 2022-02-07

The only other idea I have would be compression if we wanna keep the field small but keep the full error messages.

ecrupper · 2022-02-07

Would you support compressing the other fields currently being processed by the crop method? (message and title)

kneal · 2022-02-07

I don't know the answer to that one. I think for error it might make sense to compress because that field could be quite variable depending on how the error wrapping works.

Title and Message might make sense to crop or inherit because those come from the attached SCM system, and I would assume GitHub has limits on those. So it might make sense to just mirror those limits.
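
To make the compression idea concrete, here is a minimal sketch using the standard library. Nothing in this thread specifies an algorithm or encoding, so zlib plus base64 (to keep the value storable in a text column) is an assumption, and `compress` and `decompress` are hypothetical helper names.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"encoding/base64"
	"fmt"
	"io"
	"strings"
)

// compress zlib-compresses s and base64-encodes the result so it can be
// stored in a text column.
func compress(s string) (string, error) {
	var buf bytes.Buffer
	w := zlib.NewWriter(&buf)
	if _, err := w.Write([]byte(s)); err != nil {
		return "", err
	}
	// Close flushes the remaining compressed data into buf.
	if err := w.Close(); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf.Bytes()), nil
}

// decompress reverses compress and returns the original error text.
func decompress(s string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return "", err
	}
	r, err := zlib.NewReader(bytes.NewReader(raw))
	if err != nil {
		return "", err
	}
	defer r.Close()
	out, err := io.ReadAll(r)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	// A repetitive, wrapped error message made up for this example.
	msg := strings.Repeat("unable to run step: context deadline exceeded: ", 40)
	enc, err := compress(msg)
	if err != nil {
		panic(err)
	}
	dec, err := decompress(enc)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(msg), len(enc), dec == msg)
}
```

One trade-off worth noting: a compressed value can still exceed the column limit for a sufficiently long or incompressible error, so a length check would remain necessary, and the stored value is no longer readable without decoding.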

jbrockopp · 2022-02-08

@ecrupper

> So if we bump the limit to 1000, do we still want to crop the value if it exceeds that limit? Or do we have another idea for an approach to that situation?

I'm on board with the idea of a "simple" cropping strategy similar to title and message.

In the future, if this becomes a problem with the 1000 char limit, then we should introduce compression.

cc: @kneal @wass3r
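
A sketch of what a "simple" crop could look like, assuming it means keeping only the leading characters up to the limit. The thread does not show how title and message are actually cropped, so the helper `crop` and its behavior are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// maxErrorLength is the proposed new limit from this discussion.
const maxErrorLength = 1000

// crop is a hypothetical helper that keeps at most limit bytes from the
// start of s and drops the rest.
func crop(s string, limit int) string {
	if len(s) > limit {
		return s[:limit]
	}
	return s
}

func main() {
	long := strings.Repeat("e", 1500)
	fmt.Println(len(crop(long, maxErrorLength)))
	fmt.Println(crop("short error", maxErrorLength))
}
```

Compared with keeping both ends, this loses the tail of the message, which is the trade-off raised earlier in the thread.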

jbrockopp · 2022-02-04T21:11:26Z

database/build.go

+	if len(b.Error.String) > maxErrorLength {
+		front := maxErrorLength - (maxErrorLength / 2)
+		end := len(b.Error.String) - (maxErrorLength / 2)
+		str := b.Error.String[:front] + b.Error.String[end:]
+		b.Error = sql.NullString{String: str, Valid: true}
+	}
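
As a worked check of the index arithmetic in the hunk above, the sketch below applies the same two expressions to a plain string. `cropEnds` is a hypothetical name, and the limit is passed as a parameter so that an odd limit can be tried as well.

```go
package main

import (
	"fmt"
	"strings"
)

// cropEnds reproduces the index arithmetic from the hunk for a plain string.
func cropEnds(s string, limit int) string {
	if len(s) <= limit {
		return s
	}
	front := limit - (limit / 2) // bytes kept from the start (gets the extra byte when limit is odd)
	end := len(s) - (limit / 2)  // index where the kept tail begins
	return s[:front] + s[end:]
}

func main() {
	s := strings.Repeat("a", 600) + strings.Repeat("z", 600) // 1200 bytes

	// limit 500: front = 250, end = 950, so 250 + 250 bytes are kept.
	fmt.Println(len(cropEnds(s, 500)))

	// limit 501: front = 251, end = 950, so 251 + 250 bytes are kept.
	fmt.Println(len(cropEnds(s, 501)))
}
```

Because Go slices strings by byte index, a cut point that lands inside a multi-byte UTF-8 character would leave an invalid sequence at the seam. That only matters if error messages can contain non-ASCII text.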


Are we sure we're taking the right approach here? 😅

Personally, I'd rather start with increasing the number of characters before going down this route 👍

If we feel that's insufficient, then we should use compression for the error field.

ecrupper · 2022-02-07

I understand not wanting to process string logic like this, but I am curious as to why the message and title fields get cropped in a similar fashion?

Also, I do wonder if it may be more efficient to handle larger errors by referencing the log from which it came rather than attempt to store some truncated version.

kneal · 2022-02-07

To your point, I think we might put the errors in STDOUT, but I can't be sure about that. A big reason we added the error field was to track platform errors that occur in the runtime and have them uniquely identified, as opposed to the user error in a container.

In the past, having the field exposed here also made it easier for admins to do their own support because they didn't have to go digging through logs to find the error or crash log. They could just see the information directly in the build and troubleshoot that way.

jbrockopp

LGTM

fix(library): add error field truncating

9e99b03

ecrupper requested a review from a team as a code owner January 31, 2022 19:40

ecrupper self-assigned this Jan 31, 2022

cognifloyd reviewed Jan 31, 2022

make truncating capture both ends

528e33c

make truncating completely dynamic with error limit

1e22bbd

GregoryDosh reviewed Feb 2, 2022

change implementation to db package

53ec15a

GregoryDosh previously approved these changes Feb 2, 2022

jbrockopp requested changes Feb 4, 2022

change max error to 1000 and simplify crop

e3620c7

ecrupper dismissed GregoryDosh’s stale review via e3620c7 February 10, 2022 19:49

ecrupper mentioned this pull request Feb 10, 2022

fix(build)!: increase error limit to 1000 go-vela/server#584

Merged

Merge branch 'master' into fix/error-field-limiting

1ecd2a6

jbrockopp approved these changes Feb 11, 2022

wass3r approved these changes Feb 16, 2022

ecrupper merged commit 0ee0f1e into master Feb 16, 2022

ecrupper deleted the fix/error-field-limiting branch February 16, 2022 16:08
