Feature: coredump implementation (#22)
* feat(cli): add command implementation draft for createdump tool

Signed-off-by: ArtemTrofimushkin <[email protected]>

* chore(cli): tune some docs

Signed-off-by: ArtemTrofimushkin <[email protected]>

* chore(cli): refactor examples for cli commands

Signed-off-by: ArtemTrofimushkin <[email protected]>

* feat(*): add additional path to $PATH variable for dumper

Signed-off-by: ArtemTrofimushkin <[email protected]>

* feat(*): add flags & args formatting for createdump

Signed-off-by: ArtemTrofimushkin <[email protected]>

* chore(tests): implement unit test for createdump

Signed-off-by: ArtemTrofimushkin <[email protected]>

* chore(tests): implement integration test for createdump

Signed-off-by: ArtemTrofimushkin <[email protected]>

* feat(dumper): add command for createdump

Signed-off-by: ArtemTrofimushkin <[email protected]>

* feat(dumper): add terminationmessagepolicy for worker pod

Signed-off-by: ArtemTrofimushkin <[email protected]>

* chore(flags): remove unused method

Signed-off-by: ArtemTrofimushkin <[email protected]>

* chore(*): rename method to better describe intention

Signed-off-by: ArtemTrofimushkin <[email protected]>

* chore(*): remove Formatter interface & update usages for more clear args formatting

Signed-off-by: ArtemTrofimushkin <[email protected]>

* chore(tests): fix broken integration tests

Signed-off-by: ArtemTrofimushkin <[email protected]>

* chore(*): add command to makefile to build all components

Signed-off-by: ArtemTrofimushkin <[email protected]>

* chore(flags): split interfaces & better naming

Signed-off-by: ArtemTrofimushkin <[email protected]>

* chore(flags): make implementation private

Signed-off-by: ArtemTrofimushkin <[email protected]>

* chore(flags): rename method & drop aux interface

Signed-off-by: ArtemTrofimushkin <[email protected]>

* chore(flags): remove code duplication for createdump

Signed-off-by: ArtemTrofimushkin <[email protected]>

* chore(flags): replace action with formatargstype enum

Signed-off-by: ArtemTrofimushkin <[email protected]>

* chore(flags): split file

Signed-off-by: ArtemTrofimushkin <[email protected]>

* feat(flags): use different flags for tool and binary in createdump

Signed-off-by: ArtemTrofimushkin <[email protected]>

* chore(tests): update unit tests for args formatting

Signed-off-by: ArtemTrofimushkin <[email protected]>

* feat(flags): add IsPrivileged option & override for createdump command

Signed-off-by: ArtemTrofimushkin <[email protected]>

* feat(cli): add privileged options to job spec

Signed-off-by: ArtemTrofimushkin <[email protected]>

* chore(tests): implement unit tests on full job spec

Signed-off-by: ArtemTrofimushkin <[email protected]>

* feat(dumper): add privileged options support

Signed-off-by: ArtemTrofimushkin <[email protected]>

* chore(*): rename create_dump -> core_dump & corresponding tool

Signed-off-by: ArtemTrofimushkin <[email protected]>

* feat(flag): add coredump type option

Signed-off-by: ArtemTrofimushkin <[email protected]>

* chore(tests): tidy tests & add coredump type tests

Signed-off-by: ArtemTrofimushkin <[email protected]>

* chore(tests): add custom type for coredump

Signed-off-by: ArtemTrofimushkin <[email protected]>

* docs(*): generate docs

Signed-off-by: ArtemTrofimushkin <[email protected]>

* docs(*): update README

Signed-off-by: ArtemTrofimushkin <[email protected]>

* fix: bump dumper image version

Signed-off-by: ArtemTrofimushkin <[email protected]>

* doc(*): small fixes in readme

Signed-off-by: ArtemTrofimushkin <[email protected]>
ArtemTrofimushkin authored Mar 17, 2022
1 parent fbca829 commit 4c04363
Showing 38 changed files with 931 additions and 629 deletions.
4 changes: 4 additions & 0 deletions Makefile
@@ -23,6 +23,9 @@ lint:
tidy:
go mod tidy -v

+.PHONY: build
+build: build-cli build-dumper

.PHONY: build-cli
build-cli:
go build -v -o ./cli/bin/kubectl-shovel ./cli
@@ -65,6 +68,7 @@ help:
@echo " ${YELLOW}doc ${RESET} Run doc generation"
@echo " ${YELLOW}lint ${RESET} Run linters via golangci-lint"
@echo " ${YELLOW}tidy ${RESET} Run tidy for go module to remove unused dependencies"
+@echo " ${YELLOW}build ${RESET} Build all components"
@echo " ${YELLOW}build-cli ${RESET} Build cli component of shovel"
@echo " ${YELLOW}build-dumper ${RESET} Build dumper component of shovel"
@echo " ${YELLOW}setup ${RESET} Setup local environment. Create kind cluster"
21 changes: 14 additions & 7 deletions README.md
@@ -13,6 +13,7 @@ At the moment the following diagnostic tools are supported:
* `dotnet-gcdump`
* `dotnet-trace`
* `dotnet-dump`
+* `createdump`

Inspired by [`kubectl-flame`](https://github.com/VerizonMedia/kubectl-flame).

@@ -49,10 +50,16 @@ Or trace:
kubectl shovel trace --pod-name pod-name-74df554df7-qldq7 -o ./trace.nettrace
```

-Or get full memory dump:
+Or get full managed memory dump:

```shell
-kubectl shovel dump --pod-name pod-name-74df554df7-qldq7 -o ./memory.dump --type full
+kubectl shovel dump --pod-name pod-name-74df554df7-qldq7 -o ./memory.dump --type Full
```

+Or get full (managed and unmanaged) memory dump with [createdump](https://github.com/dotnet/runtime/blob/main/docs/design/coreclr/botr/xplat-minidump-generation.md) utility:
+
+```shell
+kubectl shovel coredump --pod-name pod-name-74df554df7-qldq7 -o ./coredump.dump --type Full
+```

Most of dotnet tools flags supported as well to use, e.g `--duration` and `--format` for `trace`.
@@ -67,7 +74,7 @@ So it requires permissions to get pods and create jobs and allowance to mount `/

To run all kinds of checks and generators please use:

-```
+```bash
make prepare
```

@@ -79,17 +86,17 @@ make prepare

### Testing

-#### Unit tests:
+#### Unit tests

-```
+```bash
make test-unit
```

#### Integration tests

> kind-clusters use containerd as container runtime, so functionality with docker-runtime won't be covered.
-* Integration tests require running kind-cluster. You can create it with `kind create cluster`. Also you can specify some version for cluster: `kind create cluster --image=kindest/node:<version>`, e.g v1.19.1 version.
+* Integration tests require running kind-cluster. You can create it with `make setup`. Also you can specify some version for cluster: `kind create cluster --image=kindest/node:<version>`, e.g v1.19.1 version.
* Then run integration tests with `make test-integration`. It will:
* Build docker image for dumper
* Upload it to kind-cluster
@@ -98,6 +105,6 @@ make test-unit

#### All in one

-```
+```bash
make test
```
2 changes: 2 additions & 0 deletions cli/cmd/command_builder.go
@@ -81,6 +81,8 @@ func (options *CommonOptions) GetFlags() *pflag.FlagSet {
return fs
}

+// NewCommandBuilder returns *CommandBuilder instance with specified factory flags.DotnetToolFactory
+// that responsible for creation of any available flags.DotnetTool
func NewCommandBuilder(factory flags.DotnetToolFactory) *CommandBuilder {
tool := factory()

57 changes: 31 additions & 26 deletions cli/cmd/commands.go
@@ -15,56 +15,61 @@ var (
"\tkubectl shovel %[1]s --pod-name my-app-65c4fc589c-gznql -o ./myapp.%[1]s\n\n" +
"Also use `-n`/`--namespace` if your pod is not in current context's namespace:\n\n" +
"\tkubectl shovel %[1]s --pod-name my-app-65c4fc589c-gznql -n default"
+descriptionTemplate = "This subcommand will run %s tool for running in k8s application.\n" +
+"Result will be saved locally (or on host) so you'll be able to analyze it with appropriate instruments.\n" +
+"Tool specific additional arguments are also supported.\n" +
+"You can find more info about this tool by the following links:\n\n" +
+"\t* %s\n" +
+"\t* %s"
)

// NewGCDumpCommand return command that start dumper with dotnet-gcdump tool
func NewGCDumpCommand() *cobra.Command {
builder := NewCommandBuilder(flags.NewDotnetGCDump)
return builder.Build(
"Get dotnet-gcdump results",
-"This subcommand will run dotnet-gcdump tool for running in k8s application.\n"+
-"Result will be saved locally so you'll be able to analyze it with appropriate tools.\n"+
-"You can find more info about dotnet-gcdump tool by the following links:\n\n"+
-"\t* https://devblogs.microsoft.com/dotnet/collecting-and-analyzing-memory-dumps/\n"+
-"\t* https://docs.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-gcdump",
+fmt.Sprintf(descriptionTemplate,
+builder.Tool(),
+"https://devblogs.microsoft.com/dotnet/collecting-and-analyzing-memory-dumps",
+"https://docs.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-gcdump"),
fmt.Sprintf(examplesTemplate, builder.Tool()),
)
}

// NewTraceCommand return command that start dumper with dotnet-trace tool
-// revive:disable:line-length-limit, This is an extended description
func NewTraceCommand() *cobra.Command {
builder := NewCommandBuilder(flags.NewDotnetTrace)
return builder.Build(
"Get dotnet-trace results",
-"This subcommand will capture runtime events with dotnet-trace tool for running in k8s application.\n"+
-"Result will be saved locally in nettrace format so you'll be able to convert it and analyze with appropriate tools.\n"+
-"You can find more info about dotnet-trace tool by the following links:\n\n"+
-"\t* https://github.com/dotnet/diagnostics/blob/master/documentation/dotnet-trace-instructions.md\n"+
-"\t* https://docs.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-trace",
-fmt.Sprintf(examplesTemplate, builder.Tool())+"\n\n"+
-"Use `--duration` to define duration of trace to 30 seconds:\n\n"+
-"\tkubectl shovel trace --pod-name my-app-65c4fc589c-gznql -o ./myapp.trace --duration 30s\n\n"+
-"Use `--format` to specify Speedscope format:\n\n"+
-"\tkubectl shovel trace --pod-name my-app-65c4fc589c-gznql -o ./myapp.trace --format Speedscope\n\n"+
-"And then you can analyze it with https://www.speedscope.app/\n"+
-"Or convert any other format to speedscope format with:\n\n"+
-"\tdotnet trace convert myapp.trace --format Speedscope",
+fmt.Sprintf(descriptionTemplate,
+builder.Tool(),
+"https://github.com/dotnet/diagnostics/blob/master/documentation/dotnet-trace-instructions.md",
+"https://docs.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-trace"),
+fmt.Sprintf(examplesTemplate, builder.Tool()),
)
}

-// revive:enable:line-length-limit
-
// NewDumpCommand return command that start dumper with dotnet-dump tool
func NewDumpCommand() *cobra.Command {
builder := NewCommandBuilder(flags.NewDotnetDump)
return builder.Build(
"Get dotnet-dump results",
-"This subcommand will run dotnet-dump tool for running in k8s application.\n"+
-"Result will be saved locally so you'll be able to analyze it with appropriate tools.\n"+
-"You can find more info about dotnet-dump tool by the following links:\n\n"+
-"\t* https://docs.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-dump\n"+
-"\t* https://docs.microsoft.com/en-us/dotnet/core/diagnostics/debug-linux-dumps\n",
+fmt.Sprintf(descriptionTemplate,
+builder.Tool(),
+"https://docs.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-dump",
+"https://docs.microsoft.com/en-us/dotnet/core/diagnostics/debug-linux-dumps"),
fmt.Sprintf(examplesTemplate, builder.Tool()),
)
}

+// NewCoreDumpCommand return command that start full process dump with createdump tool
+func NewCoreDumpCommand() *cobra.Command {
+builder := NewCommandBuilder(flags.NewCoreDump)
+return builder.Build(
+"Get full process dump results",
+fmt.Sprintf(descriptionTemplate,
+builder.Tool(),
+"https://docs.microsoft.com/en-us/dotnet/core/diagnostics/debug-linux-dumps#core-dumps-with-createdump",
+"https://github.com/dotnet/runtime/blob/main/docs/design/coreclr/botr/xplat-minidump-generation.md"),
+fmt.Sprintf(examplesTemplate, builder.Tool()))
+}
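
Note on the templating in `cli/cmd/commands.go`: every command now fills one shared `descriptionTemplate` with a tool name and two links, and `examplesTemplate` reuses a single argument several times through the indexed verb `%[1]s`. A minimal sketch of that mechanism follows. The template texts are shortened stand-ins, `describe` and `examples` are hypothetical helpers, and the URLs are placeholders, none of which appear in the commit.

```go
package main

import "fmt"

// Shortened stand-ins for the templates in cli/cmd/commands.go.
// The wording is abbreviated for illustration only.
const (
	examplesTemplate    = "\tkubectl shovel %[1]s --pod-name my-app -o ./myapp.%[1]s"
	descriptionTemplate = "This subcommand will run %s tool.\n" +
		"More info:\n\n" +
		"\t* %s\n" +
		"\t* %s"
)

// describe fills the description template with a tool name and two links.
func describe(tool, firstLink, secondLink string) string {
	return fmt.Sprintf(descriptionTemplate, tool, firstLink, secondLink)
}

// examples passes one argument; %[1]s lets the template use it twice.
func examples(tool string) string {
	return fmt.Sprintf(examplesTemplate, tool)
}

func main() {
	// Placeholder links, not the ones used by the real commands.
	fmt.Println(describe("coredump", "https://example.com/a", "https://example.com/b"))
	fmt.Println(examples("coredump"))
}
```

The generated docs later in this commit read "run coredump tool" and "run dump tool", which suggests `builder.Tool()` yields the subcommand name rather than the name of the underlying dotnet tool.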
16 changes: 12 additions & 4 deletions cli/cmd/launch.go
@@ -34,13 +34,14 @@ func (cb *CommandBuilder) args(pod *kubernetes.PodInfo, container *kubernetes.Co
args.AppendKey("store-output-on-host")
}

-return args.
+args.
Append("container-name", container.Name).
Append("pod-name", pod.Name).
Append("pod-namespace", pod.Namespace).
-AppendCommand(cb.tool.ToolName()).
-AppendFrom(cb.tool).
-Get()
+AppendRaw(cb.tool.ToolName())
+cb.tool.FormatArgs(args, flags.FormatArgsTypeTool)
+
+return args.Get()
}

func (cb *CommandBuilder) copyOutput(pod *kubernetes.PodInfo, output string) error {
@@ -100,6 +101,13 @@ func (cb *CommandBuilder) launch() error {
jobSpec.WithHostTmpVolume(cb.CommonOptions.OutputHostPath)
}

+// additional spec for privileged tool command
+if cb.tool.IsPrivileged() {
+jobSpec.
+WithPrivilegedOptions().
+WithHostProcVolume()
+}

fmt.Printf("Spawning diagnostics job with command:\n%s\n", strings.Join(jobSpec.Args, " "))
if err := cb.kube.RunJob(jobSpec); err != nil {
return errors.Wrap(err, "Failed to spawn diagnostics job")
1 change: 1 addition & 0 deletions cli/cmd/root.go
@@ -21,6 +21,7 @@ func NewShovelCommand() *cobra.Command {
cmd.AddCommand(NewGCDumpCommand())
cmd.AddCommand(NewTraceCommand())
cmd.AddCommand(NewDumpCommand())
+cmd.AddCommand(NewCoreDumpCommand())

return cmd
}
1 change: 1 addition & 0 deletions cli/docs/kubectl-shovel.md
@@ -11,6 +11,7 @@ Get diagnostics from running in k8s dotnet application
### SEE ALSO

* [kubectl-shovel completion](kubectl-shovel_completion.md) - generate the autocompletion script for the specified shell
+* [kubectl-shovel coredump](kubectl-shovel_coredump.md) - Get full process dump results
* [kubectl-shovel dump](kubectl-shovel_dump.md) - Get dotnet-dump results
* [kubectl-shovel gcdump](kubectl-shovel_gcdump.md) - Get dotnet-gcdump results
* [kubectl-shovel trace](kubectl-shovel_trace.md) - Get dotnet-trace results
74 changes: 74 additions & 0 deletions cli/docs/kubectl-shovel_coredump.md
@@ -0,0 +1,74 @@
## kubectl-shovel coredump

Get full process dump results

### Synopsis

This subcommand will run coredump tool for running in k8s application.
Result will be saved locally (or on host) so you'll be able to analyze it with appropriate instruments.
Tool specific additional arguments are also supported.
You can find more info about this tool by the following links:

* https://docs.microsoft.com/en-us/dotnet/core/diagnostics/debug-linux-dumps#core-dumps-with-createdump
* https://github.com/dotnet/runtime/blob/main/docs/design/coreclr/botr/xplat-minidump-generation.md

```
kubectl-shovel coredump [flags]
```

### Examples

```
The only required flag is `--pod-name`. So you can use it like this:
kubectl shovel coredump --pod-name my-app-65c4fc589c-gznql
Use `-o`/`--output` to define name of dump file:
kubectl shovel coredump --pod-name my-app-65c4fc589c-gznql -o ./myapp.coredump
Also use `-n`/`--namespace` if your pod is not in current context's namespace:
kubectl shovel coredump --pod-name my-app-65c4fc589c-gznql -n default
```

### Options

```
--as string Username to impersonate for the operation. User could be a regular user or a service account in a namespace.
--as-group stringArray Group to impersonate for the operation, this flag can be repeated to specify multiple groups.
--as-uid string UID to impersonate for the operation.
--cache-dir string Default cache directory (default "/home/user/.kube/cache")
--certificate-authority string Path to a cert file for the certificate authority
--client-certificate string Path to a client certificate file for TLS
--client-key string Path to a client key file for TLS
--cluster string The name of the kubeconfig cluster to use
-c, --container string Target container in pod. Required if pod run multiple containers
--context string The name of the kubeconfig context to use
-h, --help help for coredump
--image string Image of dumper to use for job (default "dodopizza/kubectl-shovel-dumper:undefined")
--insecure-skip-tls-verify If true, the server's certificate will not be checked for validity. This will make your HTTPS connections insecure
--kubeconfig string Path to the kubeconfig file to use for CLI requests.
-n, --namespace string If present, the namespace scope for this CLI request
-o, --output string Output file (default "./output.coredump")
--output-host-path string Host folder, where will be stored artifact (default "/tmp/kubectl-shovel")
--pod-name string Target pod
-p, --process-id int The process ID to collect the trace from (default 1)
--request-timeout string The length of time to wait before giving up on a single server request. Non-zero values should contain a corresponding time unit (e.g. 1s, 2m, 3h). A value of zero means don't timeout requests. (default "0")
-s, --server string The address and port of the Kubernetes API server
-t, --store-output-on-host Store output on node instead of downloading it locally
--tls-server-name string Server name to use for server certificate validation. If it is not provided, the hostname used to contact the server is used
--token string Bearer token for authentication to the API server
--type type The kinds of information that are collected from process. Supported types:
Full, Heap, Mini, Triage
Full - A full core dump with all process memory
Heap - A process dump with heap
Mini - A normal minidump with some process information
Triage - A small dump containing minimal information (default Full)
--user string The name of the kubeconfig user to use
```

### SEE ALSO

* [kubectl-shovel](kubectl-shovel.md) - Get diagnostics from running in k8s dotnet application
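
Note on the `--type` option documented above: it accepts only `Full`, `Heap`, `Mini`, and `Triage`, with `Full` as the default. A minimal sketch of a restricted flag of that kind follows, using the standard library `flag.Value` interface rather than the `pflag` value the CLI itself is built on. `DumpType`, `supportedTypes`, and `parseType` are hypothetical names, and case-sensitive matching is an assumption.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
	"strings"
)

// DumpType is a hypothetical flag value restricted to the documented kinds.
type DumpType string

// supportedTypes lists the values shown in the generated help text.
var supportedTypes = []string{"Full", "Heap", "Mini", "Triage"}

// String returns the current value; it is part of flag.Value.
func (t *DumpType) String() string {
	if t == nil {
		return ""
	}
	return string(*t)
}

// Set accepts only supported values; matching is case-sensitive here,
// which is an assumption about the real tool.
func (t *DumpType) Set(value string) error {
	for _, s := range supportedTypes {
		if s == value {
			*t = DumpType(value)
			return nil
		}
	}
	return fmt.Errorf("unsupported dump type %q, expected one of: %s",
		value, strings.Join(supportedTypes, ", "))
}

// parseType parses args and returns the selected type, defaulting to Full.
func parseType(args []string) (DumpType, error) {
	dumpType := DumpType("Full")
	fs := flag.NewFlagSet("coredump", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage output quiet in this sketch
	fs.Var(&dumpType, "type", "kind of information collected from the process")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return dumpType, nil
}

func main() {
	dumpType, err := parseType(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Println("collecting dump of type", dumpType)
}
```

With `pflag`, the value type also has to implement a `Type() string` method, and its return value is used as the placeholder in the help output, which is likely where the `--type type` rendering above comes from.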

8 changes: 4 additions & 4 deletions cli/docs/kubectl-shovel_dump.md
@@ -4,14 +4,14 @@ Get dotnet-dump results

### Synopsis

-This subcommand will run dotnet-dump tool for running in k8s application.
-Result will be saved locally so you'll be able to analyze it with appropriate tools.
-You can find more info about dotnet-dump tool by the following links:
+This subcommand will run dump tool for running in k8s application.
+Result will be saved locally (or on host) so you'll be able to analyze it with appropriate instruments.
+Tool specific additional arguments are also supported.
+You can find more info about this tool by the following links:

* https://docs.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-dump
* https://docs.microsoft.com/en-us/dotnet/core/diagnostics/debug-linux-dumps


```
kubectl-shovel dump [flags]
```
9 changes: 5 additions & 4 deletions cli/docs/kubectl-shovel_gcdump.md
@@ -4,11 +4,12 @@ Get dotnet-gcdump results

### Synopsis

-This subcommand will run dotnet-gcdump tool for running in k8s application.
-Result will be saved locally so you'll be able to analyze it with appropriate tools.
-You can find more info about dotnet-gcdump tool by the following links:
+This subcommand will run gcdump tool for running in k8s application.
+Result will be saved locally (or on host) so you'll be able to analyze it with appropriate instruments.
+Tool specific additional arguments are also supported.
+You can find more info about this tool by the following links:

-* https://devblogs.microsoft.com/dotnet/collecting-and-analyzing-memory-dumps/
+* https://devblogs.microsoft.com/dotnet/collecting-and-analyzing-memory-dumps
* https://docs.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-gcdump

```
20 changes: 4 additions & 16 deletions cli/docs/kubectl-shovel_trace.md
@@ -4,9 +4,10 @@ Get dotnet-trace results

### Synopsis

-This subcommand will capture runtime events with dotnet-trace tool for running in k8s application.
-Result will be saved locally in nettrace format so you'll be able to convert it and analyze with appropriate tools.
-You can find more info about dotnet-trace tool by the following links:
+This subcommand will run trace tool for running in k8s application.
+Result will be saved locally (or on host) so you'll be able to analyze it with appropriate instruments.
+Tool specific additional arguments are also supported.
+You can find more info about this tool by the following links:

* https://github.com/dotnet/diagnostics/blob/master/documentation/dotnet-trace-instructions.md
* https://docs.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-trace
@@ -29,19 +30,6 @@ Use `-o`/`--output` to define name of dump file:
Also use `-n`/`--namespace` if your pod is not in current context's namespace:
kubectl shovel trace --pod-name my-app-65c4fc589c-gznql -n default
-Use `--duration` to define duration of trace to 30 seconds:
-kubectl shovel trace --pod-name my-app-65c4fc589c-gznql -o ./myapp.trace --duration 30s
-Use `--format` to specify Speedscope format:
-kubectl shovel trace --pod-name my-app-65c4fc589c-gznql -o ./myapp.trace --format Speedscope
-And then you can analyze it with https://www.speedscope.app/
-Or convert any other format to speedscope format with:
-dotnet trace convert myapp.trace --format Speedscope
```

### Options
5 changes: 3 additions & 2 deletions dumper/Dockerfile
@@ -4,10 +4,11 @@ RUN dotnet tool install -g dotnet-gcdump && \
dotnet tool install -g dotnet-trace && \
dotnet tool install -g dotnet-dump

-FROM mcr.microsoft.com/dotnet/runtime:6.0.2-focal
+FROM mcr.microsoft.com/dotnet/runtime:6.0.3-focal

ARG DOTNET_TOOLS_PATH="/root/.dotnet/tools"
-ENV PATH="${PATH}:${DOTNET_TOOLS_PATH}"
+ARG DOTNET_RUNTIME_PATH="/usr/share/dotnet/shared/Microsoft.NETCore.App/6.0.3"
+ENV PATH="${PATH}:${DOTNET_TOOLS_PATH}:${DOTNET_RUNTIME_PATH}"

WORKDIR /app
COPY --from=tools-install ${DOTNET_TOOLS_PATH} ${DOTNET_TOOLS_PATH}
5 changes: 5 additions & 0 deletions dumper/cmd/commands.go
@@ -20,3 +20,8 @@ func NewTraceCommand(commonOptions *CommonOptions) *cobra.Command {
func NewDumpCommand(commonOptions *CommonOptions) *cobra.Command {
return NewCommandBuilder(commonOptions, flags.NewDotnetDump).Build()
}

+// NewCoreDumpCommand return command that perform createdump on target process
+func NewCoreDumpCommand(commonOptions *CommonOptions) *cobra.Command {
+return NewCommandBuilder(commonOptions, flags.NewCoreDump).Build()
+}