Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add Bash trap framework basics, refactored stack trace #8527

Merged
merged 2 commits into from
Jun 22, 2016

Conversation

stevekuznetsov
Copy link
Contributor

This will begin the series of pulls that hopefully will bring in the trap framework.

@Miciah please (re-)review the Bash bits
@smarterclayton please review the last commit

@@ -29,7 +29,8 @@ fi

# Build the primary client/server for all platforms
OS_BUILD_PLATFORMS=("${platforms[@]}")
os::build::build_binaries "${OS_CROSS_COMPILE_TARGETS[@]}"
# Create a sub-shell so that we don't pollute the outer environment
( os::build::build_binaries "${OS_CROSS_COMPILE_TARGETS[@]}" )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not make build_binaries itself run its work in a subshell, rather than propagating up and out?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My question was, also, if we have a work-around for the $GOBIN thing for cross-compiles, why not use that work-around for everything and never leak a changed $GOBIN, thereby not needing the subshell at all?

@openshift-bot openshift-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 18, 2016
@stevekuznetsov
Copy link
Contributor Author

@smarterclayton addressed comments.

New stack trace:

$ hack/build-go.sh 
++ Building go targets for linux/amd64: cmd/openshift cmd/oc
# github.com/openshift/origin/pkg/cmd/util
pkg/cmd/util/env.go:16: syntax error: unexpected Int, expecting (
pkg/cmd/util/env.go:25: Env redeclared in this block
    previous declaration at pkg/cmd/util/env.go:16
[ERROR] PID 19547: hack/common.sh:311: `GOOS=${platform%/*} GOARCH=${platform##*/} go install "${goflags[@]:+${goflags[@]}}" -ldflags "${version_ldflags}" "${nonstatics[@]}"` exited with status 2.
[INFO]      Stack Trace: 
[INFO]        1: hack/common.sh:311: `GOOS=${platform%/*} GOARCH=${platform##*/} go install "${goflags[@]:+${goflags[@]}}" -ldflags "${version_ldflags}" "${nonstatics[@]}"`
[INFO]        2: hack/common.sh:255: os::build::internal::build_binaries
[INFO]        3: hack/build-go.sh:36: os::build::build_binaries
[INFO]   Exiting with code 2.
[ERROR] PID 19521: hack/common.sh:253: `( os::build::internal::build_binaries "$@" )` exited with status 2.
[INFO]      Stack Trace: 
[INFO]        1: hack/common.sh:253: `( os::build::internal::build_binaries "$@" )`
[INFO]        2: hack/build-go.sh:36: os::build::build_binaries
[INFO]   Exiting with code 2.

Old stack trace:

$ hack/build-go.sh 
++ Building go targets for linux/amd64: cmd/openshift cmd/oc
# github.com/openshift/origin/pkg/cmd/util
pkg/cmd/util/env.go:16: syntax error: unexpected Int, expecting (
pkg/cmd/util/env.go:25: Env redeclared in this block
    previous declaration at pkg/cmd/util/env.go:16
!!! Error in hack/../hack/common.sh:301
    'GOOS=${platform%/*} GOARCH=${platform##*/} go install "${goflags[@]:+${goflags[@]}}" -ldflags "${version_ldflags}" "${nonstatics[@]}"' exited with status 2
Call stack:
    1: hack/../hack/common.sh:301 os::build::build_binaries(...)
    2: hack/build-go.sh:32 main(...)
Exiting with status 1
!!! Error in hack/../hack/common.sh:253
    '( os::build::setup_env; local version_ldflags; version_ldflags=$(os::build::ldflags); local goflags; eval "goflags=(${OS_GOFLAGS:-})"; local arg; for arg in "$@";
do
    if [[ "${arg}" == -* ]]; then
        goflags+=("${arg}");
    fi;
done; os::build::export_targets "$@"; local -a nonstatics=(); local -a tests=(); for binary in "${binaries[@]}";
do
    if [[ "${binary}" =~ ".test"$ ]]; then
        tests+=($binary);
    else
        nonstatics+=($binary);
    fi;
done; local host_platform=$(os::build::host_platform); local platform; for platform in "${platforms[@]}";
do
    echo "++ Building go targets for ${platform}:" "${targets[@]}"; mkdir -p "${OS_OUTPUT_BINPATH}/${platform}"; if [[ $platform == $host_platform ]]; then
        export GOBIN="${OS_OUTPUT_BINPATH}/${platform}";
    else
        unset GOBIN;
    fi; if [[ ${#nonstatics[@]} -gt 0 ]]; then
        GOOS=${platform%/*} GOARCH=${platform##*/} go install "${goflags[@]:+${goflags[@]}}" -ldflags "${version_ldflags}" "${nonstatics[@]}"; if [[ $platform != $host_platform ]]; then
            local platform_src="/${platform//\//_}"; mv "${OS_TARGET_BIN}/${platform_src}/"* "${OS_OUTPUT_BINPATH}/${platform}/";
        fi;
    fi; for test in "${tests[@]:+${tests[@]}}";
    do
        local outfile="${OS_OUTPUT_BINPATH}/${platform}/$(basename ${test})"; GOOS=${platform%/*} GOARCH=${platform##*/} go test -c -o "${outfile}" "${goflags[@]:+${goflags[@]}}" -ldflags "${version_ldflags}" "$(dirname ${test})";
    done;
done )' exited with status 1
Call stack:
    1: hack/../hack/common.sh:253 os::build::build_binaries(...)
    2: hack/build-go.sh:32 main(...)
Exiting with status 1

@stevekuznetsov
Copy link
Contributor Author

[test]me

@openshift-bot openshift-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 18, 2016
@stevekuznetsov
Copy link
Contributor Author

Flake on #7429

re[test]

@smarterclayton
Copy link
Contributor

Conformance flake is #8491

On Mon, Apr 18, 2016 at 2:25 PM, OpenShift Bot [email protected]
wrote:

continuous-integration/openshift-jenkins/test FAILURE (
https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/3067/)


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#8527 (comment)

local bash_source="$( os::util::repository_relative_path "${BASH_SOURCE[$i]}" )"
if [[ -z "${preamble_finished-}" ]]; then
preamble_finished=true
os::log::error "PID ${BASHPID:-$$}: ${bash_source}:${BASH_LINENO[$i-1]}: \`${last_command}\` exited with status ${return_code}." >&2
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Miciah I understand ${BASHPID:-$$} isn't technically a loss-less default, but I'm not particularly interested in making this nice for people using 12-year-old-Bash, so instead of a clear indication that each stacktrace is for a failed process, they'll just always see the parent PID.

@stevekuznetsov
Copy link
Contributor Author

integration flake: #8573

re[test]

@stevekuznetsov
Copy link
Contributor Author

@Miciah were you able to take a look? If not, @eparis do you mind doing a review for this?

local i
for (( i = stack_begin_index; i < ${#BASH_SOURCE[@]}; i++ )); do
local bash_source="$( os::util::repository_relative_path "${BASH_SOURCE[$i]}" )"
if [[ -z "${preamble_finished-}" ]]; then
Copy link
Member

@eparis eparis Apr 27, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing : ? (it doesn't actually matter I guess, but it is more consistent)

( os::build::internal::build_binaries "$@" )
}

# Build binaries targets specified. Should always be run in a sub-shell so we don't leak GOBIN
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like "binaries targets" here and on line 261 should be "binary targets" instead.

@Miciah
Copy link
Contributor

Miciah commented Apr 27, 2016

Nothing uses os::util::trap::init, nothing uses os::util::trap::init_exit except for os::util::trap::init, and nothing sets START_TIME. I take it users will come in future commits?

source "${OS_ROOT}/hack/lib/util/trap.sh"
source "${OS_ROOT}/hack/lib/log/stacktrace.sh"

trap os::test::junit::reconcile_output EXIT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a fix that is not directly related to the other changes, or am I mistaken?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good eyes - I just caught this while I was fixing preambles.

@Miciah
Copy link
Contributor

Miciah commented Apr 27, 2016

Reviewed, commented on some minor things. Sorry for the delay!

@stevekuznetsov
Copy link
Contributor Author

AWS Infra flake:

The instance never became "ready" in AWS.

re[test]

@stevekuznetsov
Copy link
Contributor Author

re[test]

This changes the prior implementation of the stacktrace to use the
new stacktrace implementation as well as the newly-implemented Bash
trap framework for the ERR signal.

In order to make the stacktrace output more useful, especially for
someone attempting to re-run the commands used, we evaluate all of
the variables in the $BASH_COMMAND as best we can. We will not be
able to ever correctly evaluate positional variables ($1, $2, $@)
or variables created in the local scopes of the trap handler or in
the stacktrace function such as $return_code, $last_command, or
$errexit_set. For this reason, we print both $BASH_COMMAND and our
attempt to substitute variables.

Signed-off-by: Steve Kuznetsov <[email protected]>
By changing the amount of code that is literally contained
in the sub-shell `()` braces, we allow the stacktrace that
is generated to contain much less text on any one line, so
that it is read-able and useful. Furthermore, by naming the
positional variables we pass around we allow the new stack
trace code to correctly evaluate them and show us what went
wrong, exactly.

Signed-off-by: Steve Kuznetsov <[email protected]>
@stevekuznetsov
Copy link
Contributor Author

@smarterclayton this is ready
On Jun 15, 2016 5:50 PM, "OpenShift Bot" [email protected] wrote:

continuous-integration/openshift-jenkins/test SUCCESS (
https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/4941/)


You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
#8527 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AG_ScmgzbUOB93vrblkzslv4RJoVrbNtks5qMHOtgaJpZM4IIccN
.

@stevekuznetsov
Copy link
Contributor Author

stevekuznetsov commented Jun 16, 2016

Well, there was a green run before. Not sure what kicked this one.

e2e flake:

FAILURE after 0.309s: test/end-to-end/core.sh:351: executing 'docker pull gcr.io/google_containers/pause' expecting success: the command returned the wrong error code
Standard output from the command:
Using default tag: latest
Trying to pull repository gcr.io/google_containers/pause ... 

Standard error from the command:
name invalid: Requested repository does not match bearer token resource: google_containers/pause

test-cmd flake:

FAILURE after 0.214s: test/cmd/router.sh:32: executing 'oadm router --dry-run --host-network=false --stats-port=1937 -o yaml' expecting failure and text 'hostPort: 1937': the output content test failed
Standard output from the command:
...
          - containerPort: 1937
            hostPort: 1937
...

Standard error from the command:
error: router could not be created; service account "router" is not allowed to access host ports on nodes, grant access with oadm policy add-scc-to-user hostnetwork -z router

re[test]

@stevekuznetsov
Copy link
Contributor Author

Looks like a new flake in conformance and integration:

curl: (7) Failed connect to copr.fedorainfracloud.org:443; Connection refused

#9379

re[test]

@stevekuznetsov
Copy link
Contributor Author

Extended failures:

test-cmd:

FAILURE after 0.415s: test/cmd/admin.sh:308: executing 'oadm registry --dry-run' expecting failure and text 'does not exist': the output content test failed
There was no output from the command.
Standard error from the command:
error: error getting client: Get https://127.0.0.1:28443/api: dial tcp 127.0.0.1:28443: getsockopt: connection refused

@smarterclayton you seen this before? ^^

re[test]

@stevekuznetsov
Copy link
Contributor Author

@mfojtik @smarterclayton this looks like a rebase issue?

I0616 16:05:52.554723   19712 meta.go:46] Calling Accessor on non-internal object: *extensions.DaemonSetList
panic: interface conversion: interface {} is cache.DeletedFinalStateUnknown, not *extensions.DaemonSet

goroutine 5893 [running]:
panic(0x35dca60, 0xc82c065e80)
    /usr/lib/golang/src/runtime/panic.go:481 +0x3e6
k8s.io/kubernetes/pkg/controller/daemon.NewDaemonSetsController.func5(0x3611500, 0xc8274fe840)
    /data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/daemon/controller.go:172 +0x18e
k8s.io/kubernetes/pkg/controller/framework.ResourceEventHandlerFuncs.OnDelete(0xc8250a00a0, 0xc8250a00b0, 0xc8250a00c0, 0x3611500, 0xc8274fe840)
    /data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/framework/controller.go:184 +0x3a
k8s.io/kubernetes/pkg/controller/framework.(*ResourceEventHandlerFuncs).OnDelete(0xc82785aea0, 0x3611500, 0xc8274fe840)
    <autogenerated>:25 +0xb5
k8s.io/kubernetes/pkg/controller/framework.NewInformer.func1(0x36116c0, 0xc8274feca0, 0x0, 0x0)
    /data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/framework/controller.go:254 +0x56d
k8s.io/kubernetes/pkg/controller/framework.(*Controller).processLoop(0xc8271d90a0)
    /data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/framework/controller.go:128 +0x6f
k8s.io/kubernetes/pkg/controller/framework.(*Controller).(k8s.io/kubernetes/pkg/controller/framework.processLoop)-fm()
    /data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/framework/controller.go:103 +0x20
k8s.io/kubernetes/pkg/util/wait.JitterUntil.func1(0xc8280fff78)
    /data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/util/wait/wait.go:66 +0x4b
k8s.io/kubernetes/pkg/util/wait.JitterUntil(0xc8280fff78, 0x3b9aca00, 0x0, 0xc82005ae40)
    /data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/util/wait/wait.go:67 +0x68
k8s.io/kubernetes/pkg/util/wait.Until(0xc8280fff78, 0x3b9aca00, 0xc82005ae40)
    /data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/util/wait/wait.go:47 +0x3e
k8s.io/kubernetes/pkg/controller/framework.(*Controller).Run(0xc8271d90a0, 0xc82005ae40)
    /data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/framework/controller.go:103 +0x1c2
created by k8s.io/kubernetes/pkg/controller/daemon.(*DaemonSetsController).Run
    /data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/daemon/controller.go:224 +0xa6

@mfojtik
Copy link
Contributor

mfojtik commented Jun 16, 2016

it seems so, label it as post rebase, I will have a look tomorrow.

Sent from my phone

On 16 Jun 2016, at 23:42, Steve Kuznetsov [email protected] wrote:

@mfojtik https://github.com/mfojtik @smarterclayton
https://github.com/smarterclayton this looks like a rebase issue?

I0616 16:05:52.554723 19712 meta.go:46] Calling Accessor on
non-internal object: *extensions.DaemonSetList
panic: interface conversion: interface {} is
cache.DeletedFinalStateUnknown, not *extensions.DaemonSet

goroutine 5893 [running]:
panic(0x35dca60, 0xc82c065e80)
/usr/lib/golang/src/runtime/panic.go:481
+0x3e6k8s.io/kubernetes/pkg/controller/daemon.NewDaemonSetsController.func5(0x3611500,
0xc8274fe840)
/data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/daemon/controller.go:172
+0x18ek8s.io/kubernetes/pkg/controller/framework.ResourceEventHandlerFuncs.OnDelete(0xc8250a00a0,
0xc8250a00b0, 0xc8250a00c0, 0x3611500, 0xc8274fe840)
/data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/framework/controller.go:184
+0x3ak8s.io/kubernetes/pkg/controller/framework.(_ResourceEventHandlerFuncs).OnDelete(0xc82785aea0,
0x3611500, 0xc8274fe840)
:25
+0xb5k8s.io/kubernetes/pkg/controller/framework.NewInformer.func1(0x36116c0,
0xc8274feca0, 0x0, 0x0)
/data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/framework/controller.go:254
+0x56dk8s.io/kubernetes/pkg/controller/framework.(_Controller).processLoop(0xc8271d90a0)
/data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/framework/controller.go:128
+0x6fk8s.io/kubernetes/pkg/controller/framework.(_Controller).(k8s.io/kubernetes/pkg/controller/framework.processLoop)-fm()
/data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/framework/controller.go:103
+0x20k8s.io/kubernetes/pkg/util/wait.JitterUntil.func1(0xc8280fff78)
/data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/util/wait/wait.go:66
+0x4bk8s.io/kubernetes/pkg/util/wait.JitterUntil(0xc8280fff78,
0x3b9aca00, 0x0, 0xc82005ae40)
/data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/util/wait/wait.go:67
+0x68k8s.io/kubernetes/pkg/util/wait.Until(0xc8280fff78, 0x3b9aca00,
0xc82005ae40)
/data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/util/wait/wait.go:47
+0x3ek8s.io/kubernetes/pkg/controller/framework.(_Controller).Run(0xc8271d90a0,
0xc82005ae40)
/data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/framework/controller.go:103
+0x1c2
created by k8s.io/kubernetes/pkg/controller/daemon.(*DaemonSetsController).Run
/data/src/github.com/openshift/origin/Godeps/_workspace/src/k8s.io/kubernetes/pkg/controller/daemon/controller.go:224
+0xa6


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#8527 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AACsaMRifV5pIprQ26l1PfW3r2PNOulNks5qMcMvgaJpZM4IIccN
.

@stevekuznetsov
Copy link
Contributor Author

re[test]

@stevekuznetsov
Copy link
Contributor Author

flake 2x #8571

re[test]

@stevekuznetsov
Copy link
Contributor Author

Hit #8571 again

re[test]

@stevekuznetsov
Copy link
Contributor Author

hit #9457
re[test]

@stevekuznetsov
Copy link
Contributor Author

hit #5448

re[test]

@openshift-bot
Copy link
Contributor

Evaluated for origin test up to 62e2e0a

@openshift-bot
Copy link
Contributor

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/5194/)

@smarterclayton
Copy link
Contributor

LGTM [merge]

@openshift-bot
Copy link
Contributor

Evaluated for origin merge up to 62e2e0a

@openshift-bot
Copy link
Contributor

openshift-bot commented Jun 22, 2016

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/5194/) (Image: devenv-rhel7_4437)

@stevekuznetsov
Copy link
Contributor Author

hit #5448
@smarterclayton please re-tag

@openshift-bot openshift-bot merged commit 96e291c into openshift:master Jun 22, 2016
@stevekuznetsov
Copy link
Contributor Author

I never thought the day would come.
🎆 🎊 🎉 🎆 🎊 🎉 🎆 🎊 🎉 🎆 🎊 🎉

@mfojtik
Copy link
Contributor

mfojtik commented Jun 23, 2016

@stevekuznetsov I know that feeling, congrats!

@rhcarvalho
Copy link
Contributor

@stevekuznetsov
Copy link
Contributor Author

@rhcarvalho good lord ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants