Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLI polish and fixes for deployment #44

Closed
4 tasks done
tegefaulkes opened this issue Oct 25, 2023 · 18 comments · Fixed by MatrixAI/Polykey#611 or #50
Closed
4 tasks done

CLI polish and fixes for deployment #44

tegefaulkes opened this issue Oct 25, 2023 · 18 comments · Fixed by MatrixAI/Polykey#611 or #50
Assignees
Labels
development Standard development r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management

Comments

@tegefaulkes
Copy link
Contributor

tegefaulkes commented Oct 25, 2023

Specification

While testing hole punching with the testnet, we found a few issues with the CLI.

  1. Nodes Get All is throwing an error
  2. We are getting a weird output format for nodes connections what is the 0 and what is this NA for?
  3. I think nodes connections is better suited for dictionary format - list format or table format (can you check which ones done robustly)
  4. The output changes sometimes, to just join the NA after the ip address, so no port is available
  5. Summary - the nodes connections command is just very non-deterministic
  6. Secret leak is occurring on agent service, there should be no authorisation metadata see --format=json
  7. What's usageCount

Automatic Encoding

For these two formats, encoding is applied by default so that special characters will not mess up the padding for any given output.

  • dict format
    • will automatically encase any string in a \"\" and apply encoding to the input
  • table format
    • will automatically encase any string in a \"\" and apply encoding to the input

Manual Encoding with Utility Functions

For any other format passed into outputFormatter, there will be no encoding of escapable characters by default.

It is up to the user to use the following functions in src/utils/utils.ts to encode parts of their input:

/**
 * This function:
 *
 * 1. Keeps regular spaces, only ' ', as they are.
 * 2. Converts \\n \\r \\t to escaped versions, \\\\n \\\\r and \\\\t.
 * 3. Converts other control characters to their Unicode escape sequences.\
 * 4. Converts ' \` " to escaped versions, \\\\' \\\\\` and \\\\"
 *
 * Unless you're using this in a `JSON.stringify` replacer, you probably want to use {@link encodeEscapedWrapped} instead.
 */
function encodeEscaped(str: string): string
/**
 * This function:
 * 1. Keeps regular spaces, only ' ', as they are.
 * 2. Converts \\n \\r \\t to escaped versions, \\\\n \\\\r and \\\\t.
 * 3. Converts other control characters to their Unicode escape sequences.
 * 4. Converts ' \` " to escaped versions, \\\\' \\\\\` and \\\\"
 * 5. Wraps the whole thing in `""` if any characters have been encoded.
 */
function encodeEscapedWrapped(str: string): string

To use this:

const v = `test ${encodeEscapedWrapped("\u0000\"\n")}` // returns '"\\u0000\\"\\n"'

Note that any encoded strings must be encased in "" anyways. This is so that when the user copy-pastes the output as the input of another command, or pipes the output to another command, those escaped characters will be collapsed correctly.

Additional context

Tasks

  • 1. Fix up nodes getall command, it's throwing an error relating to host not being defined.
  • 2. Review and fix up the nodes connections command output formatting. The output is very unclear.
  • 3. client call metadata is being included in the nodes connections --format json output. This should not be the case.
  • 4. agent status command outputs too much information. We should strip out the key and certificate information.
@tegefaulkes tegefaulkes added the development Standard development label Oct 25, 2023
@tegefaulkes tegefaulkes self-assigned this Oct 25, 2023
@tegefaulkes
Copy link
Contributor Author

tegefaulkes commented Oct 25, 2023

The NodesGetAll handler needs to be fixed. I did a monkey patch while testing and it needs to be applied to Polykey.

// client/handlers/NodesgetAll.ts

for await (const [index, bucket] of nodeGraph.getBuckets()) {
            for (const [id, info] of bucket) {
                const encodedId = nodesUtils.encodeNodeId(id);
                // For every node in every bucket, add it to our message
                if (ctx.signal.aborted)
                    throw ctx.signal.reason;
                yield {
                    bucketIndex: index,
                    nodeIdEncoded: encodedId,
                    host: info.address.host,
                    port: info.address.port,
                };
            }
        }

@tegefaulkes
Copy link
Contributor Author

Looking at the tests, we're missing tests for CommandGetAll.ts and CommandConnections.ts

@addievo addievo mentioned this issue Oct 25, 2023
11 tasks
@CMCDragonkai
Copy link
Member

Some of this overlaps with issues already created in #40. @addievo please review them and add in the related issues to this.

@addievo
Copy link
Contributor

addievo commented Oct 26, 2023

About

Fix up nodes getall command, it's throwing an error relating to host not being defined.

The NodesGetAll handler needs to be fixed. I did a monkey patch while testing and it needs to be applied to Polykey.

// client/handlers/NodesgetAll.ts

for await (const [index, bucket] of nodeGraph.getBuckets()) {
            for (const [id, info] of bucket) {
                const encodedId = nodesUtils.encodeNodeId(id);
                // For every node in every bucket, add it to our message
                if (ctx.signal.aborted)
                    throw ctx.signal.reason;
                yield {
                    bucketIndex: index,
                    nodeIdEncoded: encodedId,
                    host: info.address.host,
                    port: info.address.port,
                };
            }
        }

@tegefaulkes did you push this up?

@tegefaulkes
Copy link
Contributor Author

No, It needs to be done in polykey.

@tegefaulkes
Copy link
Contributor Author

We also need to create tests for the nodes getall and nodes connections commands.

@addievo
Copy link
Contributor

addievo commented Oct 27, 2023

Issues still to be resolved :

nodesConnection is is sometimes lising the IP as as a ipv6 mapped ipv4 address. This needs to be looked into.

@CMCDragonkai
Copy link
Member

Don't cross out any of these tasks.

@CMCDragonkai CMCDragonkai assigned addievo and unassigned tegefaulkes Oct 30, 2023
@CMCDragonkai
Copy link
Member

@addievo If you want, make another issue for 3., but make sure to spec it out.

@addievo
Copy link
Contributor

addievo commented Oct 31, 2023

  1. nodesConnection is is sometimes lising the IP as as a ipv6 mapped ipv4 address. This needs to be looked into.

A new issue has been created regarding this in Polykey

MatrixAI/Polykey#614

Removing the same from this issue.

@addievo addievo closed this as completed Oct 31, 2023
@addievo
Copy link
Contributor

addievo commented Oct 31, 2023

Overview of what was fixed:

  1. nodes getall was fixed and the correct output is now displayed. Although, this was fixed in Polykey, in Fix nodesGetAll handler Polykey#611

  2. nodes connections now uses the new modified table format with TSV and stream based padding. Also customisable labels can be added to all table layouts, and rows can be numbered with the flick of a boolean. The outputs look like so now:

image

  1. nodes connections with --format json was outputting Auth token and sig tokens, this was definitely not required and a bug, now this is sanitised in JSON.

image

  1. agent status used to have PEM and JWK, which was unnecessary in status, as you can just use keys cert to output the same, consequently they were stripped from agent status.

image

@amydevs
Copy link
Member

amydevs commented Nov 9, 2023

the spec has been moved here from #50

@CMCDragonkai
Copy link
Member

What if there's other kinds non-printable characters? Usually when quoted in JS, they appear as slash unicodes. Also I was wondering if we could have just used the standard console.log and captured it's output somehow. JS seems to have native quoting when it comes to strings. Did you have a look?

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Nov 9, 2023

Some things to consider here if it hasn't been properly considered:

  1. PK commands are likely to be combined in various ways in the shell. This means things like cmd $(pk ...), cmd <(pk ...), pk ... | cmd, cmd | pk ..., pk ... > ./blah, pk ... <<<blahblah. This means at any point in time where this occurs, we don't actually want to quote things unless the expected output in the case of dict and table fundamentally requires disambiguation between the usage of \n and \t formatting characters.
  2. There is only 3 output formats that make use of formatting characters, and that's list, dict and table. In all 3 cases, they \n, and for dict and table they use \t. Our table format should use the Linear TSV specification: http://dataprotocols.org/linear-tsv/. The dict format however does not an equivalent, and we just use \t to separate between keys and values, and to pad the keys, and each line is then terminated with a \n. Quotation is needed for the keys and values if they contain control or non-printable characters.
  3. Beware of unicode non-printable characters, they tend to appear like \u134908. However a question must be answered here, how do we deal with these? In some cases it make sense to always print them with the \u quotation, otherwise they are not printable in any sense.
  4. List output only uses \n as delimiter. What if the list value itself has a \n internally? Then it likely needs to be quoted too. We could use "\n" in that scenario. Consider that when printing out the list of certificates, the certificate PEM string itself may already have \n in it. When printing, do we print one line at a time for each certificate? It would look strange. Do we just print the \n and thus realise that on the input side, there's no guarantee that 1 line is 1 field.

The main issue I'm wrestling with here is that when quotation is used, it impacts piping, composition, as in it complicates it requiring the input side to realise how to deal with the quoted form.

I think the main problem is that we have a CLI with a STDOUT that could be going to the terminal - an interactive thing, or to non-interactive outputs.

We can actually detect if the output fd is interactive or non-interactive. See: https://github.com/sindresorhus/is-interactive and we can make decisions here.


I have a feeling we will need to revisit this.

@CMCDragonkai
Copy link
Member

I think we should study the linear TSV specification and see if it can apply to the dict output. Because I can imagine a dict output is the equivalent of a 2-column table output. And if linear TSV works for the table output, it should work for the dict output too.

@CMCDragonkai
Copy link
Member

My other issue is dealing with \n inside fields. Sure we can quote it to make it easier, but will it affect the readability in some cases? Like seeing all the certificate chain in PEM format.

There might be case-by-case exceptions?

@CMCDragonkai
Copy link
Member

The UI/UX of terminal commands is its own field of expertise, it needs to be considered carefully since it's our initial usecase and targetted to developers and system operators.

@CMCDragonkai
Copy link
Member

These points needs to be turned into a new issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
development Standard development r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management
4 participants