Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jq hangs with 100% CPU #1033

Closed
yurivict opened this issue Dec 3, 2015 · 13 comments
Closed

jq hangs with 100% CPU #1033

yurivict opened this issue Dec 3, 2015 · 13 comments
Assignees
Milestone

Comments

@yurivict
Copy link

yurivict commented Dec 3, 2015

This script hangs

#!/bin/sh

jq -s '((.[0]["packageDependencies"] | with_entries({key:(.key), value:(.value|gsub("(.*)";"\\1"; "x"))})) | {packageDependencies:.})' package1.json

when run on this package1.json

{
  "packageDependencies": {
    "language-yaml": "0.24.1"
  }
}

Since jq is supposed to work in a stream fashion, I don't think it should ever enter the endless cycle.

jq-1.5 on FreeBSD-10.2

@nicowilliams
Copy link
Contributor

The local function mysub in the implementation of sub recurses forever. There's also a bug in the filtering out of the g regexp flag in the subg local function of sub.

@yurivict
Copy link
Author

yurivict commented Dec 3, 2015

It also doesn't look like the group replacement tags like "\1" ever work.

@pkoppstein
Copy link
Contributor

@yurivict - See https://github.com/stedolan/jq/wiki/FAQ#support-for-regular-expressions regarding "named captures". It was never the intention to support "\1" (which as you note would have to be "\\1").

@nicowilliams
Copy link
Contributor

@pkoppstein Please review this diff:

diff --git a/src/builtin.jq b/src/builtin.jq
index 4add681..5703e0a 100644
--- a/src/builtin.jq
+++ b/src/builtin.jq
@@ -125,7 +125,7 @@ def sub($re; s):
 #
 # If s contains capture variables, then create a capture object and pipe it to s
 def sub($re; s; flags):
-  def subg: explode | select(. != 103) | implode;
+  def subg: [explode[] | select(. != 103)] | implode;
 # "fla" should be flags with all occurrences of g removed; gs should be non-nil if flags has a g
   def sub1(fla; gs):
     def mysub:
@@ -139,7 +139,7 @@ def sub($re; s; flags):
             ({}; . + $pair)
         | $in[0:$edit.offset]
           + s
-          + ($in[$len:] | if gs then mysub else . end)
+          + ($in[$len:] | select(length>0) | if gs then mysub else . end)
         end ;
     mysub ;
     (flags | index("g")) as $gs

The first hunk is clear: without this fix subg is the identity function for the inputs that it gets.

The second hunk adds a termination condition to the recursion in mysub.

@nicowilliams nicowilliams self-assigned this Dec 3, 2015
@nicowilliams
Copy link
Contributor

@yurivict What @pkoppstein is saying is that you need to use named capture groups.

@yurivict
Copy link
Author

yurivict commented Dec 3, 2015

Ok, thanks, I got it.

@nicowilliams
Copy link
Contributor

@wtlangford How hard would it be to support numbered backrefs?

@yurivict
Copy link
Author

yurivict commented Dec 3, 2015

I think backrefs are more common and familiar.

@pkoppstein
Copy link
Contributor

@nicowilliams wrote:

The second hunk adds a termination condition to the recursion in mysub.

Yes, but your version will zap the result. The following passes the test:

("" | gsub( "(.*)"; "";  "x") )

(amongst others):

 ($in[$len:] | if length > 0 and gs then mysub else . end)

@yurivict - Regarding "\\1" -- that doesn't really fit in with jq very well. E.g., jq string interpolation is built around \(...).

@pkoppstein
Copy link
Contributor

@yurivict - In case you're not aware that one can easily override jq-defined builtins by simply using 'def', here is a version which addresses the problems:

def sub($re; s; flags):
    def subg: [explode[] | select(. != 103)] | implode;
    # "fla" should be flags with all occurrences of g removed; gs should be non-nil if flags has a g
    def sub1(fla; gs):
      def mysub:
        . as $in
        | [match($re; fla)]
        | if length == 0 then $in
          else .[0] as $edit
          | ($edit | .offset + .length) as $len
            # create the "capture" object:
          | reduce ( $edit | .captures | .[] | select(.name != null) | { (.name) : .string } ) as $pair
              ({}; . + $pair)
          | $in[0:$edit.offset]
            + s
            + ($in[$len:] | if length > 0 and gs then mysub else . end)
          end ;
      mysub ;

    (flags | index("g")) as $gs
    | (flags | if $gs then subg else . end) as $fla
    | sub1($fla; $gs);

@yurivict
Copy link
Author

yurivict commented Dec 5, 2015

Thanks! I wasn't aware.

@yurivict yurivict closed this as completed Dec 5, 2015
@yurivict yurivict reopened this Dec 5, 2015
@nicowilliams
Copy link
Contributor

Right, this stays open till we fix it.

@nicowilliams
Copy link
Contributor

Fixed by @pkoppstein with ad8d1a8.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants