USHIFT-7187: C2CC Dual Stack support & tests #6954
|
@pmtk: This pull request references USHIFT-7187 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: pmtk
The full list of commands accepted by this bot can be found here.
The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID:
⛔ Files ignored due to path filters (1)
📒 Files selected for processing (23)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (20)
Walkthrough
C2CC config, controller logic, shared test harnesses, and Robot suites were updated to represent remote next hops as dual-stack family-aware lists and to validate routing, reconciliation, connectivity, and cleanup across IPv4 and IPv6 paths.
Changes
Dual-stack C2CC support
Estimated code review effort: 4 (Complex) | ~60 minutes
Possibly related PRs
Suggested labels:
🚥 Pre-merge checks | ✅ 14 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (14 passed)
✨ Finishing Touches
🧪 Generate unit tests (beta)
Warning
There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.
🔧 golangci-lint (2.12.2)
level=warning msg="The linter 'gomodguard' is deprecated (since v2.12.0) due to: new major version. Replaced by gomodguard_v2."
... [truncated 31032 characters] ...
elet: is replaced in go.mod, but not marked as replaced in vendor/modules.txt
k8s.io/metrics: is replaced in go.mod, but not marked as replaced in vendor/modules.txt
k8s.io/mount-utils: is replaced in go.mod, but not marked as replaced in vendor/modules.txt
k8s.io/pod-security-admission: is replaced in go.mod, but not marked as replaced in vendor/modules.txt
k8s.io/sample-apiserver: is replaced in go.mod, but not marked as replaced in vendor/modules.txt
k8s.io/sample-cli-plugin: is replaced in go.mod, but not marked as replaced in vendor/modules.txt
k8s.io/sample-controller: is replaced in go.mod, but not marked as replaced in vendor/modules.txt

To ignore the vendor directory, use -mod=readonly or -mod=mod.
To sync the vendor directory, run:
    go mod vendor
|
|
/test ? |
|
/test e2e-aws-tests-bootc-c2cc e2e-aws-tests-bootc-c2cc-arm |
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
test/suites/c2cc/disruptive.robot (1)
58-62: 🩺 Stability & Availability | 🟠 Major | 🏗️ Heavy lift
Populate NIC teardown state before the outage step.
Both NIC-outage flows still assign `${DISABLED_VM}`/`@{DISABLED_IFACES}` only after `Disable All NICs For VM` returns. If that keyword fails mid-way, teardown has no interface list and `Restore NICs And Reconnect` cannot recover the VM. Please move the state capture ahead of the disruptive call or make the keyword set teardown state on failure paths too.
Based on learnings: "setting `${DISABLED_VM}` before calling `Disable All NICs For VM` is not sufficient for recovery... `@{DISABLED_IFACES}` will remain empty... populate reliably even when `Disable All NICs For VM` errors."
Also applies to: 97-101
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/suites/c2cc/disruptive.robot` around lines 58 - 62, Populate the teardown state before calling Disable All NICs For VM so recovery can still run if that keyword fails. In the disruptive flow using ${DISABLED_VM}, @{DISABLED_IFACES}, and Restore NICs And Reconnect, make sure the VM name and interface list are captured even on failure paths, either by assigning them before the NIC-disable step or by having Disable All NICs For VM set the teardown variables itself. Apply the same fix to the other NIC-outage flow mentioned in the comment.
Source: Learnings
🧹 Nitpick comments (3)
pkg/config/c2cc_test.go (1)
149-209: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Add coverage for the two new `parseRemoteClusters` branches.
The migration is thorough, but two new validation paths in `c2cc.go` have no test:
- empty `NextHop` → "nextHop must not be empty" (c2cc.go:186-188)
- two same-family hops → "multiple IPv4/IPv6 addresses (max 1 per family)" (c2cc.go:200-202)
💚 Suggested cases
    {
        name: "empty NextHop",
        cfg: mkC2CCConfig(C2CC{
            RemoteClusters: []RemoteCluster{{
                NextHop:        []string{},
                ClusterNetwork: []string{"10.45.0.0/16"},
                ServiceNetwork: []string{"10.46.0.0/16"},
            }},
        }),
        expectErr: true,
        errMsg:    "nextHop must not be empty",
    },
    {
        name: "two IPv4 nextHops",
        cfg: mkC2CCConfig(C2CC{
            RemoteClusters: []RemoteCluster{{
                NextHop:        []string{"10.100.0.2", "10.100.0.3"},
                ClusterNetwork: []string{"10.45.0.0/16"},
                ServiceNetwork: []string{"10.46.0.0/16"},
            }},
        }),
        expectErr: true,
        errMsg:    "max 1 per family",
    },
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/config/c2cc_test.go` around lines 149 - 209, Add test coverage in the c2cc validation table for the new parseRemoteClusters branches in c2cc.go by adding cases in c2cc_test.go for an empty NextHop and for duplicate same-family hops. Use mkC2CCConfig with RemoteClusters to verify the empty NextHop returns the “nextHop must not be empty” error, and a case like two IPv4 next hops to assert the “multiple IPv4/IPv6 addresses (max 1 per family)” message.
pkg/controllers/c2cc/helpers_test.go (1)
43-56: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Use netlink family constants instead of magic numbers.
The `2`/`10` literals must stay in lockstep with whatever `ipFamilyOf`/`PrimaryNextHop` key the map by. Referencing `netlink.FAMILY_V4`/`netlink.FAMILY_V6` (already a dependency) makes the test self-documenting and immune to constant drift.
♻️ Proposed change
    -	family := 2 // FAMILY_V4
    -	if ip.To4() == nil {
    -		family = 10 // FAMILY_V6
    -	}
    +	family := netlink.FAMILY_V4
    +	if ip.To4() == nil {
    +		family = netlink.FAMILY_V6
    +	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/controllers/c2cc/helpers_test.go` around lines 43 - 56, The helper parseNextHops currently uses hardcoded family values, which should be replaced with the netlink family constants so the test stays aligned with ipFamilyOf/PrimaryNextHop behavior. Update parseNextHops in helpers_test.go to key the map using netlink.FAMILY_V4 and netlink.FAMILY_V6 instead of 2 and 10, keeping the logic the same but making the test self-documenting and resilient to constant drift.
pkg/controllers/c2cc/routes.go (1)
22-24: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value
Parallel slices + map are redundant and risk a duplicate-CIDR collision.
`desiredDstKeys[i]` is always `desiredDsts[i].String()`, so `desiredGWs` (keyed by the same string) just maps back to a gateway you could store as a parallel `[]net.IP`. More importantly, if two resolved entries contribute the same CIDR string, the map collapses to a single gateway while both slice entries survive, so both routes then resolve to the last-written gateway. A small struct (`{dst, key, gw}`) avoids the desync and the collision class entirely.
Also applies to: 37-47, 56-59, 88-89
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/controllers/c2cc/routes.go` around lines 22 - 24, The route state in c2cc routes is split across parallel slices plus a gateway map, which can desynchronize and collapse duplicate CIDRs. Refactor the data model around the existing route-handling logic in routes.go so each entry keeps its destination, derived key, and gateway together in a single struct instead of relying on desiredDsts, desiredDstKeys, and desiredGWs separately. Update the code paths that populate, compare, and consume these values to use the new struct consistently so duplicate CIDR resolutions remain distinct and each route retains its correct gateway.
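The collision described in this comment can be sketched in a few lines of Go. This is only an illustration under assumptions: `desiredRoute` and `buildRoutes` are hypothetical names, not identifiers from `routes.go`, and the input pairs are made up. It shows that a map keyed by CIDR string keeps one gateway for a repeated CIDR, while a slice of structs keeps every destination with its own gateway.

```go
package main

import (
	"fmt"
	"net"
)

// desiredRoute is a hypothetical struct that keeps a destination, its string
// key, and its gateway together, as the review comment suggests.
type desiredRoute struct {
	dst *net.IPNet
	key string
	gw  net.IP
}

// buildRoutes turns (CIDR, gateway) pairs into desiredRoute entries. Entries
// with the same CIDR stay distinct because nothing is keyed by the CIDR.
func buildRoutes(entries [][2]string) ([]desiredRoute, error) {
	routes := make([]desiredRoute, 0, len(entries))
	for _, e := range entries {
		_, dst, err := net.ParseCIDR(e[0])
		if err != nil {
			return nil, err
		}
		gw := net.ParseIP(e[1])
		if gw == nil {
			return nil, fmt.Errorf("invalid gateway %q", e[1])
		}
		routes = append(routes, desiredRoute{dst: dst, key: dst.String(), gw: gw})
	}
	return routes, nil
}

func main() {
	entries := [][2]string{
		{"10.45.0.0/16", "192.168.1.10"},
		{"10.45.0.0/16", "192.168.1.20"}, // same CIDR contributed twice
	}

	// Map keyed by CIDR string: the second write replaces the first gateway.
	gwByKey := make(map[string]string)
	for _, e := range entries {
		gwByKey[e[0]] = e[1]
	}
	fmt.Println("gateways kept by map:", len(gwByKey))

	// Slice of structs: each entry retains its own gateway.
	routes, err := buildRoutes(entries)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, r := range routes {
		fmt.Printf("%s via %s\n", r.key, r.gw)
	}
}
```

Whether duplicate CIDRs should be preserved or rejected outright is a design decision for the controller; the sketch only demonstrates that the struct form does not lose information silently.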
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 47cfa155-ce92-47b9-ba43-bf984328fd88
⛔ Files ignored due to path filters (1)
- etcd/vendor/github.com/openshift/microshift/pkg/config/c2cc.go is excluded by !**/vendor/**
📒 Files selected for processing (22)
- cmd/generate-config/config/config-openapi-spec.json
- docs/user/howto_config.md
- packaging/microshift/config.yaml
- pkg/config/c2cc.go
- pkg/config/c2cc_test.go
- pkg/controllers/c2cc/healthcheck.go
- pkg/controllers/c2cc/healthcheck_test.go
- pkg/controllers/c2cc/helpers_test.go
- pkg/controllers/c2cc/ovn.go
- pkg/controllers/c2cc/ovn_test.go
- pkg/controllers/c2cc/routes.go
- test/assets/c2cc/hello-microshift.yaml
- test/bin/c2cc_common.sh
- test/resources/c2cc.resource
- test/scenarios-bootc/c2cc/el102-src@c2cc-dual-stack.sh
- test/scenarios-bootc/c2cc/el98-src@c2cc-dual-stack-v6.sh
- test/suites/c2cc/cleanup.robot
- test/suites/c2cc/connectivity.robot
- test/suites/c2cc/disruptive.robot
- test/suites/c2cc/infrastructure.robot
- test/suites/c2cc/probe.robot
- test/suites/c2cc/reconciliation.robot
|
/test e2e-aws-tests-bootc-c2cc e2e-aws-tests-bootc-c2cc-arm |
|
/test e2e-aws-tests-bootc-c2cc e2e-aws-tests-bootc-c2cc-arm |
|
/test e2e-aws-tests-bootc-c2cc e2e-aws-tests-bootc-c2cc-arm |
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/scenarios-bootc/c2cc/el102-src@c2cc-dual-stack.sh`:
- Around line 1-6: The script header is not using the repo-standard bash
prologue, so update the top of this shell script to use the required bash
shebang and enable strict mode before any sourcing. Make the change in the
script’s startup section near the existing source of c2cc_common.sh, keeping the
helper import after the prologue so failures are caught early.
In `@test/scenarios-bootc/c2cc/el98-src@c2cc-dual-stack-v6.sh`:
- Around line 1-6: The script is missing the repo-standard bash prologue and
currently uses the wrong shebang, so update this scenario script to match the
shell baseline by switching to the standard bash shebang and enabling strict
mode with set -euo pipefail before sourcing c2cc_common.sh. Keep the source line
unchanged, and ensure the prologue is placed at the top of the file so all
scenario steps run under the expected shell settings.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 6d1ee6e4-08ca-483b-8ea6-70fa92f964c3
📒 Files selected for processing (10)
- test/assets/c2cc/hello-microshift.yaml
- test/resources/c2cc.resource
- test/scenarios-bootc/c2cc/el102-src@c2cc-dual-stack.sh
- test/scenarios-bootc/c2cc/el98-src@c2cc-dual-stack-v6.sh
- test/suites/c2cc/cleanup.robot
- test/suites/c2cc/connectivity.robot
- test/suites/c2cc/disruptive.robot
- test/suites/c2cc/infrastructure.robot
- test/suites/c2cc/probe.robot
- test/suites/c2cc/reconciliation.robot
✅ Files skipped from review due to trivial changes (1)
- test/suites/c2cc/connectivity.robot
🚧 Files skipped from review as they are similar to previous changes (7)
- test/assets/c2cc/hello-microshift.yaml
- test/suites/c2cc/disruptive.robot
- test/suites/c2cc/probe.robot
- test/suites/c2cc/cleanup.robot
- test/suites/c2cc/infrastructure.robot
- test/suites/c2cc/reconciliation.robot
- test/resources/c2cc.resource
agullon
left a comment
Review Findings
1. get_host_ipv6() can return empty string on failure silently
test/bin/c2cc_common.sh:35-43
Two issues in this function:
Dead code: host_ip is fetched on line 38 but never used — run_command_on_vm takes ${host} (the hostname), not ${host_ip}.
Silent empty return: If the VM has no global IPv6 address, grep -oP fails inside the remote shell, but the exit code is swallowed by the | tail -1 | tr -d '\r' pipeline. The function returns exit 0 with an empty string. The || return 1 guards at the call sites (lines 134-136, 283-284) never trigger.
The configure_c2cc_hosts path is safe (guarded by -n checks), but c2cc_run_tests would inject --variable HOST2_IPV6: (empty) into Robot Framework without error — which could cause confusing test behavior in dual-stack scenarios if the VM hasn't gotten its IPv6 address yet.
Suggested fix:
    get_host_ipv6() {
        local host=$1
        local ipv6
        ipv6=$(run_command_on_vm "${host}" \
            "ip -6 addr show scope global | grep -oP '(?<=inet6 )([0-9a-f:]+)' | head -1" \
            | tail -1 | tr -d '\r')
        if [[ -z "${ipv6}" ]]; then
            echo "failed to get ${host} IPv6 address" >&2
            return 1
        fi
        echo "${ipv6}"
    }
2. OVN and Linux route unit tests are single-stack only
pkg/controllers/c2cc/ovn_test.go and pkg/controllers/c2cc/routes.go
All OVN route tests use IPv4-only NextHops (map[int]net.IP{2: net.ParseIP("192.168.1.10")}). The new family-aware gateway selection logic:
    gw, ok := rc.NextHopForFamily(ipFamilyOf(cidr))
    if !ok {
        continue
    }
is untested at the unit level for dual-stack input. A test with both FAMILY_V4 and FAMILY_V6 entries in NextHops and dual-stack CIDRs would verify that each route gets the correct family-matched gateway. Similarly, routes.go has no direct unit tests for the family-matching logic.
Not blocking — the Robot Framework integration tests cover this end-to-end — but unit-level coverage would catch regressions earlier.
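As a rough illustration of the dual-stack check requested in finding 2, the sketch below pairs each CIDR with the next hop of the same family. It is written under assumptions: `remoteCluster`, `newRemoteCluster`, and `gatewaysFor` are hypothetical helpers, `ipFamilyOf` and `NextHopForFamily` are re-implemented here only from how the review quotes them, and the family keys 2 and 10 mirror the values used in the PR's fixtures.

```go
package main

import (
	"fmt"
	"net"
)

// Family keys as they appear in the quoted fixtures (2 for IPv4, 10 for IPv6).
const (
	familyV4 = 2
	familyV6 = 10
)

// remoteCluster is a hypothetical stand-in holding one next hop per family.
type remoteCluster struct {
	NextHops map[int]net.IP
}

// NextHopForFamily returns the next hop configured for the given family.
func (rc remoteCluster) NextHopForFamily(family int) (net.IP, bool) {
	gw, ok := rc.NextHops[family]
	return gw, ok
}

// ipFamilyOf reports the address family of a parsed CIDR.
func ipFamilyOf(cidr *net.IPNet) int {
	if cidr.IP.To4() != nil {
		return familyV4
	}
	return familyV6
}

// newRemoteCluster builds a remoteCluster from textual next hops, ignoring
// strings that do not parse as IP addresses.
func newRemoteCluster(hops ...string) remoteCluster {
	rc := remoteCluster{NextHops: make(map[int]net.IP)}
	for _, h := range hops {
		ip := net.ParseIP(h)
		if ip == nil {
			continue
		}
		if ip.To4() != nil {
			rc.NextHops[familyV4] = ip
		} else {
			rc.NextHops[familyV6] = ip
		}
	}
	return rc
}

// gatewaysFor maps each CIDR to the same-family next hop and skips CIDRs
// whose family has no next hop, matching the quoted `continue` branch.
func gatewaysFor(rc remoteCluster, cidrs []string) (map[string]string, error) {
	out := make(map[string]string)
	for _, c := range cidrs {
		_, ipnet, err := net.ParseCIDR(c)
		if err != nil {
			return nil, err
		}
		gw, ok := rc.NextHopForFamily(ipFamilyOf(ipnet))
		if !ok {
			continue
		}
		out[ipnet.String()] = gw.String()
	}
	return out, nil
}

func main() {
	rc := newRemoteCluster("192.168.1.10", "fd01::10")
	gws, err := gatewaysFor(rc, []string{"10.45.0.0/16", "fd02::/48"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for cidr, gw := range gws {
		fmt.Printf("%s via %s\n", cidr, gw)
	}
}
```

A unit test in the real package would presumably assert against the OVN or netlink route objects rather than a plain map, but the family-matching expectation is the same.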
🧹 Nitpick comments (1)
pkg/controllers/c2cc/ovn_test.go (1)
15-15: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Magic numbers `2`/`10` for address family keys hurt readability.
`NextHops` map keys `2` and `10` match Linux `syscall.AF_INET`/`AF_INET6` values but appear as unexplained magic numbers throughout every fixture. Using the named constants (or local aliases) makes intent obvious and prevents transcription errors when new tests are added.
♻️ Suggested fix
    -	NextHops: map[int]net.IP{
    -		2:  net.ParseIP("192.168.1.10"),
    -		10: net.ParseIP("fd01::10"),
    -	},
    +	NextHops: map[int]net.IP{
    +		syscall.AF_INET:  net.ParseIP("192.168.1.10"),
    +		syscall.AF_INET6: net.ParseIP("fd01::10"),
    +	},
Also applies to: 37-42, 72-72, 87-90, 126-129, 172-172
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/controllers/c2cc/ovn_test.go` at line 15, The NextHops map in the c2cc OVN test fixtures is using unexplained magic numbers for address-family keys, making the intent hard to read and easy to mistype. Replace the literal keys in the affected test data with the corresponding named constants or local aliases for IPv4/IPv6 family values, and apply the same change consistently across all fixture blocks in ovn_test.go so the usage is clear wherever NextHops is populated.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: c0c554c2-fd99-4a44-b5ce-c0ad1031f4c0
📒 Files selected for processing (18)
- pkg/controllers/c2cc/healthcheck.go
- pkg/controllers/c2cc/healthcheck_test.go
- pkg/controllers/c2cc/helpers_test.go
- pkg/controllers/c2cc/ovn.go
- pkg/controllers/c2cc/ovn_test.go
- pkg/controllers/c2cc/routes.go
- pkg/controllers/c2cc/routes_test.go
- test/assets/c2cc/hello-microshift.yaml
- test/bin/c2cc_common.sh
- test/resources/c2cc.resource
- test/scenarios-bootc/c2cc/el102-src@c2cc-dual-stack.sh
- test/scenarios-bootc/c2cc/el98-src@c2cc-dual-stack-v6.sh
- test/suites/c2cc/cleanup.robot
- test/suites/c2cc/connectivity.robot
- test/suites/c2cc/disruptive.robot
- test/suites/c2cc/infrastructure.robot
- test/suites/c2cc/probe.robot
- test/suites/c2cc/reconciliation.robot
🚧 Files skipped from review as they are similar to previous changes (14)
- test/suites/c2cc/disruptive.robot
- pkg/controllers/c2cc/healthcheck.go
- test/suites/c2cc/connectivity.robot
- test/assets/c2cc/hello-microshift.yaml
- pkg/controllers/c2cc/helpers_test.go
- pkg/controllers/c2cc/ovn.go
- pkg/controllers/c2cc/healthcheck_test.go
- pkg/controllers/c2cc/routes.go
- test/suites/c2cc/probe.robot
- test/suites/c2cc/cleanup.robot
- test/suites/c2cc/infrastructure.robot
- test/suites/c2cc/reconciliation.robot
- test/bin/c2cc_common.sh
- test/resources/c2cc.resource
|
/retest |
|
@pmtk: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary by CodeRabbit
New Features
`nextHop` entries, enabling separate IPv4/IPv6 gateways for dual-stack clusters.
Bug Fixes
`nextHop` lists, rejects invalid/duplicate family-specific entries, and verifies next-hop coverage per CIDR family.
Documentation
`config.yaml` examples to represent `nextHop` as an array.
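The `nextHop` list rules mentioned in this summary and in the earlier review (non-empty list, at most one address per family) can be sketched as follows. This is a minimal illustration, not the code in `pkg/config/c2cc.go`: `validateNextHops` is a hypothetical name, and the error strings only approximate the messages quoted in the review.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// validateNextHops is a hypothetical stand-in for the nextHop checks: the
// list must not be empty, every entry must parse as an IP address, and each
// address family may appear at most once.
func validateNextHops(nextHops []string) error {
	if len(nextHops) == 0 {
		return errors.New("nextHop must not be empty")
	}
	var seenV4, seenV6 bool
	for _, hop := range nextHops {
		ip := net.ParseIP(hop)
		if ip == nil {
			return fmt.Errorf("nextHop %q is not a valid IP address", hop)
		}
		if ip.To4() != nil {
			if seenV4 {
				return errors.New("multiple IPv4 addresses (max 1 per family)")
			}
			seenV4 = true
			continue
		}
		if seenV6 {
			return errors.New("multiple IPv6 addresses (max 1 per family)")
		}
		seenV6 = true
	}
	return nil
}

func main() {
	for _, hops := range [][]string{
		{},
		{"10.100.0.2", "10.100.0.3"},
		{"10.100.0.2", "fd01::2"},
	} {
		fmt.Printf("%v -> %v\n", hops, validateNextHops(hops))
	}
}
```

The per-CIDR coverage check (every configured CIDR family needing a matching next hop) is left out here; it would sit on top of this validation once the CIDR lists are parsed.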