Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Brim hides the string value "(empty)" #832

Closed
mikesbrown opened this issue May 28, 2020 · 3 comments · Fixed by #1603
Closed

Brim hides the string value "(empty)" #832

mikesbrown opened this issue May 28, 2020 · 3 comments · Fixed by #1603
Labels
bug Something isn't working good first issue

Comments

@mikesbrown
Copy link
Contributor

For #828, I thought of any string Brim might treat specially. "(empty)" is one of them. Given a toy data set containing a string value of "(empty)", Brim will not render the string. This is true of both scalars and container members.

Consider empty.log, which looks like:

#separator \x09
#set_separator	,
#empty_field	(empty)
#unset_field	-
#path	string
#fields	ts	id	scalar	record.scalar	record.set	vector
#types	time	count	string	string	set[string]	vector[string]
1582646597.838527	15	(empty)	-	-	-
1582646597.838527	30	-	(empty)	-	-
1582646597.838527	44	-	-	(empty)	-
1582646597.838527	48	-	-	",',*,;,(empty),\xe2\x88\xab\xc2\xa3\xc5\x93\xc3\x9f\xc3\xbc\xe2\x84\xa2	-
1582646597.838527	49	-	-	-	(empty)
1582646597.838527	53	-	-	-	(empty),",',;,\xe2\x88\xab\xc2\xa3\xc5\x93\xc3\x9f\xc3\xbc\xe2\x84\xa2,*

It contains strings with the value (empty) and also empty fields, using the default Zeek empty_field (empty).

Here's the TZNG, and notice the only time (empty) appears is as string values. Empty fields in TZNG are [] (on lines where id=44 or id=49).

#0:record[_path:string,ts:time,id:uint64,scalar:bstring,record:record[scalar:bstring,set:set[bstring]],vector:array[bstring]]
0:[string;1582646597.838527;15;(empty);[-;-;]-;]
0:[string;1582646597.838527;30;-;[(empty);-;]-;]
0:[string;1582646597.838527;44;-;[-;[]]-;]
0:[string;1582646597.838527;48;-;[-;[";';*;\x3b;(empty);∫£œßü™;]]-;]
0:[string;1582646597.838527;49;-;[-;-;][]]
0:[string;1582646597.838527;53;-;[-;-;][(empty);";';\x3b;∫£œßü™;*;]]

This special (empty) handling seems to be a pre-ZNG relic. There's nothing special about (empty) in the ZNG spec. If Brim is reading ZNG, then it doesn't need to special-case the string (empty) anymore.

Here's what happens in the app:

image

"(empty)" is found in string searches, but is not displayed.

@philrz
Copy link
Contributor

philrz commented May 31, 2020

@mikesbrown: You've definitely stumbled onto something legit here. However, having worked with some of these special Zeek values in the past, I did a bit more tinkering and have a bit more color to add. Maybe this should turn into more than one issue.

This is somewhat foggy territory since the Zeek TSV format lacks a formal specification (other than the implicit "the code's behavior is the spec"). But based on black box testing, I think we can make some conclusions.

The important one is that, if we assume Zeek TSV files are ones that could be generated by Zeek itself, I don't think (empty) is a valid value for a logged string type. Consider the following Zeek script:

module Empty;

export {
    redef enum Log::ID += { LOG };

    type Info: record {
        ts:            time &log;
        my_str:        string &log;
        };
    }

event zeek_init()
    {
    Log::create_stream(Empty::LOG, [$columns=Empty::Info, $path="empty"]);

    Log::write( Empty::LOG, [$ts=network_time(),
                             $my_str="(empty)"]);
    }

Run through Zeek v3.1.3, it produces the following output empty.log:

#separator \x09
#set_separator	,
#empty_field	(empty)
#unset_field	-
#path	empty
#open	2020-05-31-15-41-52
#fields	ts	my_str
#types	time	string
1590964912.290580	\x28empty)
#close	2020-05-31-15-41-52

That is, since (empty) is a reserved word to be used specifically for the contexts you identified regarding empty set and vector types, when appearing as a value of a string type, Zeek does the right thing and escapes it.

Your issue is still very much legit since when zq renders it as TZNG, it strips away any special escaping and turns it back into (empty). That makes sense because the value has no special meaning outside the Zeek TSV format, but it does still trigger the issue you cited where the Brim app ends up hiding them.

$ zq -t empty.log 
#0:record[_path:string,ts:time,my_str:bstring]
0:[empty;1590964912.29058;(empty);]

Going back to the topic of what's legit in Zeek TSV, when zq is asked to print a string of this value in Zeek format, it should probably mimic Zeek's behavior and escape it. Right now it's just printing it as-is and therefore creating output that would not have come from Zeek itself.

$ zq -t empty.log | zq -f zeek -
#separator \x09
#set_separator	,
#empty_field	(empty)
#unset_field	-
#path	empty
#fields	ts	my_str
#types	time	string
1590964912.290580	(empty)

mikesbrown added a commit that referenced this issue Jun 1, 2020
In #832 @philrz explains why these should be changed
mikesbrown added a commit that referenced this issue Jun 1, 2020
In #832 @philrz explains why these should be changed
@mikesbrown
Copy link
Contributor Author

Run through Zeek v3.1.3, it produces the following output empty.log:

#separator \x09
#set_separator	,
#empty_field	(empty)
#unset_field	-
#path	empty
#open	2020-05-31-15-41-52
#fields	ts	my_str
#types	time	string
1590964912.290580	\x28empty)
#close	2020-05-31-15-41-52

Thanks @philrz . I fixed this in #840. I realize it was an oversight not to include you as a reviewer in #828 .

@philrz
Copy link
Contributor

philrz commented Aug 30, 2022

I've verified that this issue is no longer with us. I got curious and did a binary search which confirmed that it went away as of commit f7dabb8, which was associated with #1603. Specifically, it appears that #1603 (comment) did the trick. The screenshot below shows Brim at that commit successfully displaying this log that contains (empty):

#separator \x09
#set_separator	,
#empty_field	(empty)
#unset_field	-
#path	empty
#open	2020-05-31-15-41-52
#fields	ts	my_str
#types	time	string
1590964912.290580	(empty)
#close	2020-05-31-15-41-52

image

@philrz philrz closed this as completed Aug 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants