Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

main: use seperate stream for control characters #4

Merged
merged 10 commits into from
May 21, 2024

Conversation

mofosyne
Copy link

Stab at this idea for ggerganov#6923

One thing you could do is print the special tokens out of band to file descriptor 3. Then if a shell script doesn't want them, it could either pass a flag to disable them, or simply say 3>/dev/null.

Copy link

@mrdomino mrdomino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not 100% sure about the overall approach yet, but the implementation looks basically sound aside from the comments.

examples/main/main.cpp Outdated Show resolved Hide resolved
examples/main/main.cpp Show resolved Hide resolved
examples/main/main.cpp Show resolved Hide resolved
examples/main/main.cpp Outdated Show resolved Hide resolved
examples/main/main.cpp Outdated Show resolved Hide resolved
examples/main/main.cpp Outdated Show resolved Hide resolved
@mofosyne
Copy link
Author

Fixed @mrdomino ready for you to test

@mofosyne mofosyne requested a review from mrdomino May 20, 2024 14:45
examples/main/main.cpp Outdated Show resolved Hide resolved
common/common.cpp Outdated Show resolved Hide resolved
if (!params.conversation && sparams.grammar.empty())
{
// Stream Control Token To Standard Output as long as we are not in a conversation or grammar output
fprintf(stdout, "%s", token_str.c_str());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stdout is buffered, so if you are going to be using it, you will need to make sure fflush(stdout) is called before every write to the token fd. Otherwise the tokens will not interleave correctly.

You could also switch to calling write on fd 1 and make the output fully unbuffered, but I think this would be suboptimal.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added fflush(stdout) before every fprintf() now

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why wouldn't you flush after each print?

Copy link

@mrdomino mrdomino May 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jart my theory was that you'd want to take advantage of the buffering. If you don't care about buffering output then absolutely, or even setbuf(stdout, NULL) at the start of the program.

ETA: the other thing is that if you have anything else doing anything to stdout it can mess with the interleaving... it just seemed simpler to me to constrain it to where you were doing the token write

examples/main/main.cpp Outdated Show resolved Hide resolved
@mofosyne mofosyne force-pushed the gramma-token-seperate-stream branch from 6c3890f to 5032f18 Compare May 20, 2024 18:48
@mofosyne mofosyne requested a review from mrdomino May 20, 2024 18:57
@mofosyne
Copy link
Author

@mrdomino thanks for the command string you used to test.

This took a while, but I realised now that the C file open and bash are using the same file descriptor... so we really actually needed to check right at the start of the program to fully check if the file descriptor was already claimed by bash.

Anyway this is the full test process to prove that this is behaving as I was expecting, you may want to probe it a bit further, but I'm a bit more confident now that I know what I'm doing now here.

cmake -B build -DCMAKE_BUILD_TYPE=Debug
cmake --build build

echo "== Expect Control Token To Shared Console 3>&1 =="
./build/bin/main --hf-repo TheBloke/phi-2-GGUF --hf-file phi-2.Q6_K.gguf --grammar 'root ::= "yes" | "no"' --temp 0 -c 0 --no-display-prompt --log-disable -p "<|user|>
Say yes
<|assistant|>" 2>/dev/null 3>&1
echo
echo "== Expect No Control Token To Console because 3>/dev/null =="
./build/bin/main --hf-repo TheBloke/phi-2-GGUF --hf-file phi-2.Q6_K.gguf --grammar 'root ::= "yes" | "no"' --temp 0 -c 0 --no-display-prompt --log-disable -p "<|user|>
Say yes
<|assistant|>" 2>/dev/null 3>/dev/null
echo
echo "== Expect No Control Token To Console because 3>&- =="
./build/bin/main --hf-repo TheBloke/phi-2-GGUF --hf-file phi-2.Q6_K.gguf --grammar 'root ::= "yes" | "no"' --temp 0 -c 0 --no-display-prompt --log-disable -p "<|user|>
Say yes
<|assistant|>" 2>/dev/null 3>&-
echo
echo == Expect No Control Token To Console as we are still in grammar mode ==
./build/bin/main --hf-repo TheBloke/phi-2-GGUF --hf-file phi-2.Q6_K.gguf --grammar 'root ::= "yes" | "no"' --temp 0 -c 0 --no-display-prompt --log-disable -p "<|user|>
Say yes
<|assistant|>" 2>/dev/null
echo
echo == Expect Control Token To Console as we are in normal completion mode ==
./build/bin/main --hf-repo TheBloke/phi-2-GGUF --hf-file phi-2.Q6_K.gguf --temp 0 -c 0 --no-display-prompt --log-disable -p "<|user|>
Hi
<|assistant|>" 2>/dev/null 3>&1
echo

Which would result in this output

== Expect Control Token To Shared Console 3>&1 ==
yes<|endoftext|>
== Expect No Control Token To Console because 3>/dev/null ==
yes
== Expect No Control Token To Console because 3>&- ==
yes
== Expect No Control Token To Console as we are still in grammar mode ==
yes
== Expect Control Token To Console as we are in normal completion mode ==


I'm glad you found the information helpful. If you have any more questions or need further assistance, feel free to ask.

Best regards,
<|assistant|>
<|endoftext|>

@mofosyne mofosyne requested a review from jart May 21, 2024 05:19
examples/main/main.cpp Outdated Show resolved Hide resolved
examples/main/main.cpp Outdated Show resolved Hide resolved
examples/main/main.cpp Outdated Show resolved Hide resolved
if (!params.conversation && sparams.grammar.empty())
{
// Stream Control Token To Standard Output as long as we are not in a conversation or grammar output
fprintf(stdout, "%s", token_str.c_str());
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why wouldn't you flush after each print?

common/common.cpp Outdated Show resolved Hide resolved
@mofosyne mofosyne force-pushed the gramma-token-seperate-stream branch from 86b9e96 to 8f76ba5 Compare May 21, 2024 06:06
@mofosyne mofosyne requested a review from jart May 21, 2024 06:09
#ifndef _MSC_VER
if (control_token_file_descriptor_is_attached) {
// Stream Control Token To Special Token Output. Useful for debugging control token behaviour
(void)! write(SPECIAL_FILENO, token_str.c_str(), token_str.length());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extraneous ! here?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also I'm not sure why you'd have (void) before write but not fprintf

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

intentional or it breaks make command

(void) write(SPECIAL_FILENO, token_str.c_str(), token_str.length());
examples/main/main.cpp: In function ‘int main(int, char**)’:
examples/main/main.cpp:767:37: warning: ignoring return value of ‘ssize_t write(int, const void*, size_t)’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
  767 |                         (void) write(SPECIAL_FILENO, token_str.c_str(), token_str.length());
      |                                ~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, apparently gcc and glibc have collaborated to do an extremely dumb thing. fine, keep it.

Copy link
Author

@mofosyne mofosyne May 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Context: https://bugs.llvm.org/show_bug.cgi?id=51228
Context: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

A possible appproach according to @mrdomino

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-result"
write(...);
#pragma GCC diagnostic pop

@mofosyne mofosyne requested a review from mrdomino May 21, 2024 06:39
Copy link

@mrdomino mrdomino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@jart jart merged commit 178d2b3 into jart:grammar-token May 21, 2024
3 of 6 checks passed
@mofosyne mofosyne deleted the gramma-token-seperate-stream branch May 21, 2024 15:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants