Switch over error types to use `thiserror` #124

jmeggitt · 2023-09-12T21:37:29Z

This pull request reviews the error types being used and converts them over to use the thiserror crate. I would like to note, though, that the conversion involved many subjective decisions.

Changes

Remove OneIoError and FilterError variants from ParserError. My reasoning for this was that these errors can only occur when BgpkitParser is being initialized and can instead be deferred to the caller at that time.
Change add_filter from accepting strings which require parsing to accepting Filters. For variants which accepted a wider range of string values (such as TsStart and TsEnd), I added constructors to Filter which parsed a given variant from a string.
Update RecordIterator and ElemIterator so errors now get propagated to the caller. They iterate over Result<MrtRecord, ParserError> and Result<BgpElem, ParserError> respectively.
Errors are no longer ignored. Previously, some errors were logged but then ignored. This made it difficult to determine whether all records were actually being parsed and received correctly (especially in the case of the iterators). If an error causes an iterator to exit prematurely, that error will be returned to the user before None is returned on the next iteration.
Remove ParserErrorWithBytes. My reasoning for this is that if the user wants the bytes related to the error, then they can call parse_common_header and parse_mrt_body themselves. This should give them greater flexibility over the data while removing the overhead involved with ParserErrorWithBytes.
Add a new integration test which simply reads a sample RIB dump and a sample update dump to verify that no errors occur during parsing.
Switch from using Buf trait (get_u8, get_u16, etc.) and Bytes functions (advance and split_to) directly to using ReadUtils. The motivation for this is that the Buf and Bytes functions panic upon running out of data. While not strictly necessary due to extra precautions taken while parsing, this switch should protect against unforeseen panics occurring.
Remove BgpModelsError. It only had a single variant which wrapped another error so I decided to remove it for simplicity.
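
The iterator behaviour described above (yield the error once, then return None) can be sketched as follows. This is a minimal, std-only illustration, not the code from this PR: `DemoError`, `DemoRecordIter`, and the length-prefixed record format are hypothetical stand-ins for `ParserError`, `RecordIterator`, and MRT records.

```rust
use std::fmt;

/// Hypothetical stand-in for the crate's `ParserError`.
#[derive(Debug, PartialEq)]
pub enum DemoError {
    Truncated { needed: usize, remaining: usize },
}

impl fmt::Display for DemoError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DemoError::Truncated { needed, remaining } => {
                write!(f, "needed {} bytes but only {} remain", needed, remaining)
            }
        }
    }
}

impl std::error::Error for DemoError {}

/// Iterates over toy records: one length byte followed by that many payload bytes.
pub struct DemoRecordIter<'a> {
    data: &'a [u8],
    failed: bool,
}

impl<'a> DemoRecordIter<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        Self { data, failed: false }
    }
}

impl<'a> Iterator for DemoRecordIter<'a> {
    type Item = Result<&'a [u8], DemoError>;

    fn next(&mut self) -> Option<Self::Item> {
        // After a fatal error (or at end of input) the iterator stays finished.
        if self.failed || self.data.is_empty() {
            return None;
        }
        let data: &'a [u8] = self.data;
        let len = data[0] as usize;
        let rest = &data[1..];
        if rest.len() < len {
            // Surface the error to the caller once instead of logging and dropping it.
            self.failed = true;
            return Some(Err(DemoError::Truncated { needed: len, remaining: rest.len() }));
        }
        let (record, tail) = rest.split_at(len);
        self.data = tail;
        Some(Ok(record))
    }
}
```

With this shape, a caller can tell a clean end of input (None after only Ok items) apart from a premature exit (an Err item immediately before None).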

Notes

While I did convert RisliveError to use thiserror, I did not review it with nearly as much scrutiny as ParserError. Much of the error handling involves parsing messages and most of that should really be implemented as part of the serde deserialize. This would remove the need for many of the error variants, but would take a bit of work to implement.
I left a number of TODOs in the code in places where I was unsure how to proceed. This largely includes warnings which could potentially be converted to hard errors.
I need to finish adding documentation to the ParserError variants.


jmeggitt · 2023-09-12T21:38:25Z

Blocked by #123

digizeph

Left a few comments. Overall, it looks great!

digizeph · 2023-09-17T21:35:31Z

src/parser/bgp/attributes/mod.rs

+                // TODO: Should it be treated as a raw attribute instead?
+                _ => Err(ParserError::UnsupportedAttributeType(attr_type)),


Unsupported is fine in my opinion. We can add raw attribute support later.

I think you already implemented raw attribute support. This line appears to execute only when the parser finds an attribute type it recognizes (one that has an enum variant) but does not parse.

How about?

Suggested change

-                // TODO: Should it be treated as a raw attribute instead?
-                _ => Err(ParserError::UnsupportedAttributeType(attr_type)),
+                _ => Ok(AttributeValue::Unknown(AttrRaw{ attr_type, bytes: Vec::from(attr_data) }))

We could probably introduce a new AttributeValue type Unsupported(AttrRaw), but I feel that's a bit unnecessary.
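
For reference, the fallback being suggested can be sketched in isolation. The types below are simplified, hypothetical stand-ins (the real `AttributeValue` has many more variants and the real attribute type is an enum rather than a bare `u8`); only the shape of the catch-all arm mirrors the suggestion.

```rust
/// Simplified stand-in for the crate's raw attribute container.
#[derive(Debug, PartialEq)]
pub struct AttrRaw {
    pub attr_type: u8,
    pub bytes: Vec<u8>,
}

/// Simplified stand-in with a single parsed variant plus the raw fallback.
#[derive(Debug, PartialEq)]
pub enum AttributeValue {
    Origin(u8),
    Unknown(AttrRaw),
}

/// Parses the attributes this sketch understands and keeps everything else as raw bytes.
pub fn parse_attr_value(attr_type: u8, attr_data: &[u8]) -> AttributeValue {
    match (attr_type, attr_data) {
        // ORIGIN is type code 1 and carries a single byte.
        (1, [origin]) => AttributeValue::Origin(*origin),
        // Anything else (including a malformed ORIGIN in this toy version)
        // is preserved as raw bytes instead of failing the whole message.
        _ => AttributeValue::Unknown(AttrRaw {
            attr_type,
            bytes: attr_data.to_vec(),
        }),
    }
}
```

The trade-off is that recognized-but-unparsed attributes become indistinguishable from truly unknown ones, which is what a separate Unsupported(AttrRaw) variant would address.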

digizeph · 2023-09-17T21:39:40Z

src/parser/bgp/attributes/mod.rs

+                    // TODO: Is this correct? If we don't have enough bytes, split_to would panic.
+                    // it's ok to have errors when reading partial bytes
+                    warn!("PARTIAL: {}", e);


This should be fine. However, if I recall correctly, the previous version would not panic because it tracks the number of bytes it has available to read. We should probably double-check to make sure we don't panic in this case.

You are right, it does not panic. However, it propagates the error that there are not enough bytes remaining before it reaches this code.

I don't have a good idea on this issue. My suggestion is to keep this note here and come back to this if we see panics later on with the PARTIAL warning.
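
To illustrate why checked reads avoid the panic discussed here, below is a minimal std-only sketch of the idea behind ReadUtils. The trait `CheckedRead`, its method names, and the `NotEnoughBytes` error are hypothetical and are not the crate's actual API; the point is only that every read verifies the remaining length and returns an error instead of panicking.

```rust
/// Hypothetical error returned when a read would run past the end of the buffer.
#[derive(Debug, PartialEq)]
pub struct NotEnoughBytes {
    pub needed: usize,
    pub remaining: usize,
}

/// Hypothetical checked-read trait: each method either consumes bytes or
/// returns an error and leaves the buffer untouched.
pub trait CheckedRead<'a> {
    fn try_split_to(&mut self, n: usize) -> Result<&'a [u8], NotEnoughBytes>;

    fn try_read_u8(&mut self) -> Result<u8, NotEnoughBytes> {
        Ok(self.try_split_to(1)?[0])
    }

    /// Reads a big-endian (network order) u16.
    fn try_read_u16(&mut self) -> Result<u16, NotEnoughBytes> {
        let b = self.try_split_to(2)?;
        Ok(u16::from_be_bytes([b[0], b[1]]))
    }
}

impl<'a> CheckedRead<'a> for &'a [u8] {
    fn try_split_to(&mut self, n: usize) -> Result<&'a [u8], NotEnoughBytes> {
        let data: &'a [u8] = *self;
        if data.len() < n {
            return Err(NotEnoughBytes { needed: n, remaining: data.len() });
        }
        let (head, tail) = data.split_at(n);
        // Advance the cursor only after the length check has passed.
        *self = tail;
        Ok(head)
    }
}
```

Because the length check happens before any bytes are consumed, a truncated attribute surfaces as an error at the read that ran short, which matches the observation above that the error is propagated before the PARTIAL branch is reached.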

digizeph · 2023-09-17T21:40:32Z

src/parser/bgp/messages.rs

    }

+    // TODO: Why do we sometimes change our length estimate?


Need more context for this question.

My confusion comes from this piece of code. Shouldn't the total size always be greater than or equal to the length of the message?

let bgp_msg_length = if (length as usize) > total_size { total_size - 19 } else { length as usize - 19 };

Could this be related to partial attributes being cut off?

There is no guarantee that the number of bytes available matches the number of bytes encoded in the MRT message.

length is read from the MRT message:

let length = data.read_u16()?;

It is entirely possible that the value does not match the MRT message size we actually have, so the check is there to identify potential data issues. Because we have already read all the bytes of the given MRT record into message at this point, the length mismatch does not affect the parsing of the following records, which is why the current version emits warnings instead of hard errors.
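
A small sketch of the same length computation, written so that the mismatch is reported to the caller rather than silently absorbed. The function name, the `LengthCheck` enum, and the use of `checked_sub` are assumptions for illustration; the constant 19 is the fixed BGP header size (16-byte marker, 2-byte length, 1-byte type) that the snippet above subtracts.

```rust
/// Size of the fixed BGP message header: 16-byte marker, 2-byte length, 1-byte type.
const BGP_HEADER_LEN: usize = 19;

/// Hypothetical result type describing how the body length was derived.
#[derive(Debug, PartialEq)]
pub enum LengthCheck {
    /// The declared length fits in the available bytes and was used as-is.
    FromHeader(usize),
    /// The header claims more bytes than are available, so the body was clamped.
    Clamped { body_len: usize, declared: usize, available: usize },
}

/// Returns None when either length is too small to contain a BGP header,
/// which the plain subtraction in the original snippet would not guard against.
pub fn bgp_body_length(declared: u16, total_size: usize) -> Option<LengthCheck> {
    let declared = declared as usize;
    if declared > total_size {
        let body_len = total_size.checked_sub(BGP_HEADER_LEN)?;
        Some(LengthCheck::Clamped { body_len, declared, available: total_size })
    } else {
        let body_len = declared.checked_sub(BGP_HEADER_LEN)?;
        Some(LengthCheck::FromHeader(body_len))
    }
}
```

Returning the `Clamped` case explicitly lets the caller decide whether a mismatch should be a warning or a hard error, instead of baking that policy into the length calculation.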

digizeph · 2023-09-17T21:42:12Z

src/parser/bgp/messages.rs


    if data.remaining() != bgp_msg_length {
+        // TODO: Why is this not a hard error?


It should be a hard error. We should also make sure we can continue parsing other messages after this (it probably does that).

When I change this to be an error, it fails parser::bmp::tests::test_route_monitoring (ParseError(InconsistentFieldLength { name: "BGP message length", expected: 109, found: 110 })). Do you have any ideas why this happens? All of the other tests passed, including the integration tests I added which parse through an entire update file and RIB dump to check for errors.

Honestly, I don't have many ideas about why this happens. The test was created by dumping some raw BMP messages from the RouteViews BMP stream. This might indicate issues with the BMP parsing that we were previously unaware of. Could you check whether the real-time-routeviews-kafka-openbmp example runs successfully with your updated code? That example pulls live data from the RouteViews Kafka BMP stream.

digizeph · 2023-09-17T21:42:49Z

src/parser/bgp/messages.rs


    // let pos_end = input.position() + opt_params_len as u64;
    if input.remaining() != opt_params_len as usize {
+        // TODO: This seems like it should become a hard error


Yes. But it shouldn't stop the parsing of messages after this.

We could add the same error that you introduced previously (InconsistentFieldLength) here.
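
One way to make the mismatch a hard error without stopping later messages is to return the error per message and let the caller keep going. The sketch below is std-only and hypothetical: `DemoParserError` only mimics the InconsistentFieldLength variant shown in the test failure above, and `ensure_len` and `check_all` are illustrative helpers, not functions from the crate.

```rust
/// Hypothetical error type mirroring the shape of the variant quoted above.
#[derive(Debug, PartialEq)]
pub enum DemoParserError {
    InconsistentFieldLength {
        name: &'static str,
        expected: usize,
        found: usize,
    },
}

/// Turns a length mismatch into a hard error for the current message.
pub fn ensure_len(
    name: &'static str,
    expected: usize,
    found: usize,
) -> Result<(), DemoParserError> {
    if expected == found {
        Ok(())
    } else {
        Err(DemoParserError::InconsistentFieldLength { name, expected, found })
    }
}

/// Checks every (declared length, body) pair independently, so a bad message
/// yields an Err entry but the messages after it are still processed.
pub fn check_all(messages: &[(usize, &[u8])]) -> Vec<Result<usize, DemoParserError>> {
    messages
        .iter()
        .map(|(declared, body)| {
            ensure_len("BGP message length", *declared, body.len()).map(|()| body.len())
        })
        .collect()
}
```

This only works because each message's bytes are delimited before the check runs, so a failure inside one message cannot desynchronize the reader for the next one.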

digizeph · 2023-09-17T21:44:01Z

src/parser/bgp/messages.rs

@@ -196,6 +189,7 @@ fn read_nlri(
        return Ok(vec![]);
    }
    if length == 1 {
+        // TODO: Should this become a hard error?


Yes. But it shouldn't stop the parsing of messages after this.

digizeph · 2023-09-17T21:46:28Z

src/parser/mrt/messages/table_dump_v2_message.rs


    let peer_index = input.read_u16()?;
    let originated_time = input.read_u32()?;
    if add_path {
+        // TODO: Why is this value unused?


I don't exactly remember why it has path_id here, but the read_nlri_prefix function will read the path_id and override it anyway, so it is ignored here.

digizeph · 2023-09-17T21:46:53Z

src/parser/mrt/mrt_record.rs


    let microsecond_timestamp = match &entry_type {
        EntryType::BGP4MP_ET => {
+            // TODO: Error if length < 4
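
A possible shape for the check this TODO asks for, as a std-only sketch. It assumes, per my reading of RFC 6396, that the header's length field for BGP4MP_ET records includes the 4-byte microsecond timestamp. The function name and the `TruncatedField` struct are hypothetical (the PR's TruncatedField is a ParserError variant whose fields may differ).

```rust
/// Hypothetical error describing a field that does not fit in the remaining bytes.
#[derive(Debug, PartialEq)]
pub struct TruncatedField {
    pub name: &'static str,
    pub needed: usize,
    pub remaining: usize,
}

/// Reads the microsecond timestamp of an extended-timestamp record.
/// Returns the timestamp and the message length left after removing it.
pub fn split_microseconds(length: u32, body: &[u8]) -> Result<(u32, u32), TruncatedField> {
    // Both the declared length and the actual buffer must cover the 4-byte field;
    // otherwise `length - 4` would underflow or the indexing below would panic.
    if length < 4 || body.len() < 4 {
        return Err(TruncatedField {
            name: "microsecond timestamp",
            needed: 4,
            remaining: (length as usize).min(body.len()),
        });
    }
    let micros = u32::from_be_bytes([body[0], body[1], body[2], body[3]]);
    Ok((micros, length - 4))
}
```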


jmeggitt added 11 commits September 6, 2023 10:02

Defer errors that only occur during initialization to user

084f1cd

Update RecordIterator and ElemIterator to propogate errors to the user

d7a9b74

Replace ParserError::Unsupported with UnsupportedMrtType and Unsuppor…

71f6f2a

…tedAttributeType

Replace IoNotEnoughBytes and TrunatedError with TruncatedField

4fad940

Replace ParseError variant of ParserError

8217a73

Add integration test to check no errors occur while parsing

b35bdf2

Error if attribute does not consume full attribute length

35bee4f

Convert some log warnings to errors

fea6797

Remove BgpModelsError

62f2e40

Refactor ParserBmpError to use thiserror

3cb0d7f

Refactor ParserRisliveError to use thiserror

c052183

jmeggitt mentioned this pull request Sep 12, 2023

Performance Improvements #125

Draft

Merge branch 'main' of github.com:jmeggitt/bgpkit-parser into thiserror

0cb83b5

digizeph reviewed Sep 17, 2023


digizeph Sep 17, 2023

jmeggitt Sep 28, 2023 (edited)

digizeph Oct 2, 2023
