Copilot commented on code in PR #142:
URL:
https://github.com/apache/skywalking-infra-e2e/pull/142#discussion_r3004887271
##########
internal/components/collector/compose.go:
##########
@@ -123,6 +130,41 @@ func collectContainerInspect(outputDir, service,
containerID string) error {
return nil
}
+// expandContainerGlob expands a glob pattern inside a Docker container.
+// If the path has no glob characters it is returned as-is.
+func expandContainerGlob(containerID, service, pattern string) ([]string,
error) {
+ if !containsGlob(pattern) {
+ return []string{pattern}, nil
+ }
+
+ if err := validateGlobPattern(pattern); err != nil {
+ return nil, err
+ }
+
+ cmd := fmt.Sprintf("docker exec %s sh -c 'ls -d -- %s 2>/dev/null ||
true'", containerID, pattern)
+ stdout, stderr, err := util.ExecuteCommand(cmd)
+ if err != nil {
+ logger.Log.Warnf("failed to expand glob %s in service %s: %v,
stderr: %s", pattern, service, err, stderr)
+ return nil, fmt.Errorf("glob expansion failed for %s: %v",
pattern, err)
Review Comment:
On `docker exec` failure, the returned error omits `stderr`. Elsewhere in
this package (e.g., `docker inspect`, `docker cp`) errors include stderr, which
is important for diagnosing missing shells/permissions inside the container.
Consider returning stderr in the error in addition to logging it.
```suggestion
return nil, fmt.Errorf("glob expansion failed for %s: %v,
stderr: %s", pattern, err, stderr)
```
##########
internal/components/collector/collector_test.go:
##########
@@ -122,6 +122,74 @@ func TestListPods_ResourceFormat(t *testing.T) {
}
}
+func TestContainsGlob(t *testing.T) {
+ tests := []struct {
+ path string
+ want bool
+ }{
+ {"/skywalking/logs/", false},
+ {"/tmp/dump.hprof", false},
+ {"/skywalking/logs*", true},
+ {"/tmp/*.hprof", true},
+ {"/tmp/dump-[0-9].hprof", true},
+ {"/var/log/?oo", true},
+ }
+ for _, tt := range tests {
+ t.Run(tt.path, func(t *testing.T) {
+ if got := containsGlob(tt.path); got != tt.want {
+ t.Errorf("containsGlob(%q) = %v, want %v",
tt.path, got, tt.want)
+ }
+ })
+ }
+}
+
+func TestValidateGlobPattern(t *testing.T) {
+ tests := []struct {
+ pattern string
+ wantErr bool
+ }{
+ {"/skywalking/logs*", false},
+ {"/tmp/*.hprof", false},
+ {"/tmp/dump-[0-9].hprof", false},
+ {"/var/log/app-?.log", false},
+ {"/path with spaces/*", false},
Review Comment:
This test asserts patterns with spaces (e.g. `/path with spaces/*`) are
valid, but the implementation builds unquoted `sh -c`, `kubectl cp`, and
`docker cp` command strings, so any expanded path containing spaces will be
split and fail (and may unintentionally include multiple paths). Either adjust
the implementation to correctly handle spaces end-to-end, or change the
validation/test expectation to reject spaces.
```suggestion
{"/path with spaces/*", true},
```
##########
internal/components/collector/kind.go:
##########
@@ -151,6 +159,65 @@ func collectPodDescribe(kubeConfigPath, outputDir,
namespace, podName string) er
return nil
}
+// containsGlob reports whether the path contains glob metacharacters.
+func containsGlob(path string) bool {
+ return strings.ContainsAny(path, "*?[")
+}
+
+// validPathPattern matches paths that contain only safe characters for shell
interpolation.
+// Allowed: alphanumeric, /, ., -, _, *, ?, [, ], and space.
+var validPathPattern = regexp.MustCompile(`^[a-zA-Z0-9/_.*?\[\]\- ]+$`)
+
+// validateGlobPattern checks that a glob pattern contains only safe characters
+// to prevent shell injection when interpolated into sh -c commands.
+func validateGlobPattern(pattern string) error {
+ if !validPathPattern.MatchString(pattern) {
+ return fmt.Errorf("glob pattern %q contains unsupported
characters", pattern)
+ }
+ return nil
+}
+
+// expandPodGlob expands a glob pattern inside a pod. If the path has no glob
+// characters it is returned as-is. Otherwise kubectl exec runs sh to expand
+// the pattern and returns the matched paths.
+func expandPodGlob(kubeConfigPath, namespace, podName, container, pattern
string) ([]string, error) {
+ if !containsGlob(pattern) {
+ return []string{pattern}, nil
+ }
+
+ if err := validateGlobPattern(pattern); err != nil {
+ return nil, err
+ }
+
+ cmd := fmt.Sprintf("kubectl --kubeconfig %s -n %s exec %s",
kubeConfigPath, namespace, podName)
+ if container != "" {
+ cmd += fmt.Sprintf(" -c %s", container)
+ }
+ cmd += fmt.Sprintf(" -- sh -c 'ls -d -- %s 2>/dev/null || true'",
pattern)
Review Comment:
`expandPodGlob` interpolates `kubeConfigPath`, `namespace`, `podName`, and
`container` into a command string executed via `util.ExecuteCommand` (bash).
These values aren't shell-escaped/quoted, so spaces or shell metacharacters can
break the command or lead to command injection (pod names/namespace come from
the cluster; kubeconfig path comes from config). Prefer executing kubectl with
arguments (no shell), or apply robust shell escaping to each interpolated value.
##########
internal/components/collector/kind.go:
##########
@@ -151,6 +159,65 @@ func collectPodDescribe(kubeConfigPath, outputDir,
namespace, podName string) er
return nil
}
+// containsGlob reports whether the path contains glob metacharacters.
+func containsGlob(path string) bool {
+ return strings.ContainsAny(path, "*?[")
Review Comment:
`containsGlob` treats any path containing `[` as a glob. That changes
behavior for literal bracket paths (e.g. `/var/log/app[1].log`) which
previously would have been copied as-is, contradicting the PR claim that
non-glob paths are unchanged. Consider only treating `[` as a glob when a
matching `]` exists (and ideally contains at least one non-`]` char), or
provide/allow an escaping mechanism for literal `[`/`]`.
```suggestion
// Fast path: standard glob wildcards.
if strings.ContainsAny(path, "*?") {
return true
}
// Handle bracket expressions like [0-9] more carefully so that literal
// '[' characters (e.g. in "app[1].log") are not treated as globs.
for i := 0; i < len(path); i++ {
if path[i] != '[' {
continue
}
// Find the next closing ']'.
for j := i + 1; j < len(path); j++ {
if path[j] != ']' {
continue
}
// There is a ']' after '[', now check that there is at
least one
// non-']' character between them to indicate a real
character
// class, not just "[]" or a malformed sequence.
for k := i + 1; k < j; k++ {
if path[k] != ']' {
return true
}
}
// Found "[]" or a sequence of only ']'s; continue
searching
// for another potential bracket expression.
break
}
}
return false
```
##########
internal/components/collector/kind.go:
##########
@@ -151,6 +159,65 @@ func collectPodDescribe(kubeConfigPath, outputDir,
namespace, podName string) er
return nil
}
+// containsGlob reports whether the path contains glob metacharacters.
+func containsGlob(path string) bool {
+ return strings.ContainsAny(path, "*?[")
+}
+
+// validPathPattern matches paths that contain only safe characters for shell
interpolation.
+// Allowed: alphanumeric, /, ., -, _, *, ?, [, ], and space.
+var validPathPattern = regexp.MustCompile(`^[a-zA-Z0-9/_.*?\[\]\- ]+$`)
+
+// validateGlobPattern checks that a glob pattern contains only safe characters
+// to prevent shell injection when interpolated into sh -c commands.
+func validateGlobPattern(pattern string) error {
+ if !validPathPattern.MatchString(pattern) {
+ return fmt.Errorf("glob pattern %q contains unsupported
characters", pattern)
+ }
+ return nil
+}
+
+// expandPodGlob expands a glob pattern inside a pod. If the path has no glob
+// characters it is returned as-is. Otherwise kubectl exec runs sh to expand
+// the pattern and returns the matched paths.
+func expandPodGlob(kubeConfigPath, namespace, podName, container, pattern
string) ([]string, error) {
+ if !containsGlob(pattern) {
+ return []string{pattern}, nil
+ }
+
+ if err := validateGlobPattern(pattern); err != nil {
+ return nil, err
+ }
+
+ cmd := fmt.Sprintf("kubectl --kubeconfig %s -n %s exec %s",
kubeConfigPath, namespace, podName)
+ if container != "" {
+ cmd += fmt.Sprintf(" -c %s", container)
+ }
+ cmd += fmt.Sprintf(" -- sh -c 'ls -d -- %s 2>/dev/null || true'",
pattern)
+
+ stdout, stderr, err := util.ExecuteCommand(cmd)
+ if err != nil {
+ logger.Log.Warnf("failed to expand glob %s in pod %s/%s: %v,
stderr: %s", pattern, namespace, podName, err, stderr)
+ return nil, fmt.Errorf("glob expansion failed for %s: %v",
pattern, err)
Review Comment:
On `kubectl exec` failure, the returned error drops `stderr` even though
other collector helpers include it in the error (not just in logs). Including
`stderr` in the returned error would make the aggregated "some files failed to
collect" message actionable when logs aren't available.
```suggestion
return nil, fmt.Errorf("glob expansion failed for %s: %v,
stderr: %s", pattern, err, stderr)
```
##########
internal/components/collector/kind.go:
##########
@@ -151,6 +159,65 @@ func collectPodDescribe(kubeConfigPath, outputDir,
namespace, podName string) er
return nil
}
+// containsGlob reports whether the path contains glob metacharacters.
+func containsGlob(path string) bool {
+ return strings.ContainsAny(path, "*?[")
+}
+
+// validPathPattern matches paths that contain only safe characters for shell
interpolation.
+// Allowed: alphanumeric, /, ., -, _, *, ?, [, ], and space.
+var validPathPattern = regexp.MustCompile(`^[a-zA-Z0-9/_.*?\[\]\- ]+$`)
Review Comment:
`validPathPattern` allows spaces, but both glob expansion (`ls -d --
<pattern>`) and subsequent copy commands (`kubectl cp` / `docker cp`) build
unquoted shell command strings. Any matched path containing spaces will be
split into multiple arguments, causing expansion/copy to fail and potentially
allowing multiple unintended patterns/paths to be passed. Either disallow
spaces in patterns/paths, or implement proper shell escaping / avoid the shell
by using exec with argv.
```suggestion
// Allowed: alphanumeric, /, ., -, _, *, ?, [, ].
var validPathPattern = regexp.MustCompile(`^[a-zA-Z0-9/_.*?\[\]\-]+$`)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]