Disable UTs for APB Temporarily. #3741

Merged
jcaamano merged 1 commit into ovn-org:master from tssurya:disable-apb-uts-temporarily
Jul 5, 2023

Conversation

tssurya
Member

@tssurya tssurya commented Jul 4, 2023

- What this PR does and why is it needed
Disable the UTs for APB temporarily, until the feature
is stabilized.

Reason: the UTs are very flaky. Each PR needs a
minimum of 3 close/reopen cycles, plus luck, for CI
to pass far enough to even run the e2e's. At this stage, where other features
are trying to get in before the deadline, this process is painful.
It would be nice if at least a retest flag were available; we don't know what
the consequences of frequently closing and reopening PRs on GitHub are
for bigger feature PRs. Too much time is being spent trying to get CI
to pass.

cc @jordigilh : WDYT? I think you and @npinaeva are working to get it all fixed,
but that may take 2 more weeks to merge; meanwhile I'd like to unblock CI.
cc @jcaamano

@coveralls

coveralls commented Jul 4, 2023

Coverage Status

coverage: 53.392% (-0.05%) from 53.444% when pulling fc5ca9b on tssurya:disable-apb-uts-temporarily into 6870575 on ovn-org:master.

@tssurya
Member Author

tssurya commented Jul 4, 2023

I see unidling tests flaking:

2023-07-04T06:11:41.2605215Z • [SLOW TEST:420.658 seconds]
2023-07-04T06:11:41.2607221Z e2e control plane
2023-07-04T06:11:41.2608646Z /home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/e2e.go:563
2023-07-04T06:11:41.2609128Z   test node readiness according to its defaults interface MTU size
2023-07-04T06:11:41.2610642Z   /home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/e2e.go:770
2023-07-04T06:11:41.2612006Z     should get node not ready with a too small MTU
2023-07-04T06:11:41.2613470Z     /home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/e2e.go:804
2023-07-04T06:11:41.2615310Z ------------------------------
2023-07-04T06:11:41.2615844Z Unidling With annotated service
2023-07-04T06:11:41.2620414Z   Should connect to an unidled backend at the first attempt
2023-07-04T06:11:41.2621087Z   /home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/unidling.go:206
2023-07-04T06:11:41.2621448Z [BeforeEach] Unidling
2023-07-04T06:11:41.2621870Z   /home/runner/go/pkg/mod/k8s.io/[email protected]/test/e2e/framework/framework.go:187
2023-07-04T06:11:41.2622280Z STEP: Creating a kubernetes client
2023-07-04T06:11:41.2622614Z Jul  4 06:11:41.260: INFO: >>> kubeConfig: /home/runner/ovn.conf
2023-07-04T06:11:41.2623219Z STEP: Building a namespace api object, basename unidling
2023-07-04T06:11:41.2839965Z STEP: Waiting for a default service account to be provisioned in namespace
2023-07-04T06:11:41.2864341Z STEP: Waiting for kube-root-ca.crt to be provisioned in namespace
2023-07-04T06:11:41.2889349Z [BeforeEach] Unidling
2023-07-04T06:11:41.2889978Z   /home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/unidling.go:50
2023-07-04T06:11:41.2890321Z [BeforeEach] With annotated service
2023-07-04T06:11:41.2890826Z   /home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/unidling.go:130
2023-07-04T06:11:41.2940942Z STEP: creating an annotated service with no endpoints and idle annotation
2023-07-04T06:11:41.3155215Z STEP: creating execpod-noendpoints on node ovn-control-plane
2023-07-04T06:11:41.3156927Z Jul  4 06:11:41.315: INFO: Creating new exec pod
2023-07-04T06:12:19.4039438Z Jul  4 06:12:19.403: INFO: waiting up to 30s to connect to testserviceqarmj:80
2023-07-04T06:12:19.4097239Z [It] Should connect to an unidled backend at the first attempt
2023-07-04T06:12:19.4098066Z   /home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/unidling.go:206
2023-07-04T06:12:19.4683709Z Jul  4 06:12:19.467: INFO: Running '/usr/local/bin/kubectl --server=https://127.0.0.1:41547 --kubeconfig=/home/runner/ovn.conf --namespace=unidling-7839 exec execpod-noendpointsfwk7s -- /bin/sh -x -c /agnhost connect --timeout=10s testserviceqarmj:80'
2023-07-04T06:12:20.4758761Z Jul  4 06:12:20.463: INFO: rc: 1
2023-07-04T06:12:20.4776969Z Jul  4 06:12:20.463: INFO: checking service with cmd "/agnhost connect --timeout=10s testserviceqarmj:80" from pod execpod-noendpointsfwk7s in ns unidling-7839 returned stdout:  stderr: + /agnhost connect '--timeout=10s' testserviceqarmj:80
2023-07-04T06:12:20.4788956Z DNS: lookup testserviceqarmj on 10.96.0.10:53: read udp 10.244.2.12:46943->10.96.0.10:53: read: connection refused
2023-07-04T06:12:20.4789698Z command terminated with exit code 1
2023-07-04T06:12:20.4789872Z 
2023-07-04T06:12:20.4790145Z Jul  4 06:12:20.463: FAIL: Expected
2023-07-04T06:12:20.4798542Z     <e2e.serviceStatus>: 2
2023-07-04T06:12:20.4801180Z to equal
2023-07-04T06:12:20.4817579Z     <e2e.serviceStatus>: 0
2023-07-04T06:12:20.4821970Z 
2023-07-04T06:12:20.4823777Z Full Stack Trace
2023-07-04T06:12:20.4824273Z github.com/ovn-org/ovn-kubernetes/test/e2e.glob..func28.3.6()
2023-07-04T06:12:20.4824845Z 	/home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/unidling.go:228 +0x223
2023-07-04T06:12:20.4825278Z github.com/onsi/ginkgo/internal/leafnodes.(*runner).runSync(0xc000c5d2f0?)
2023-07-04T06:12:20.4826070Z 	/home/runner/go/pkg/mod/github.com/onsi/[email protected]/internal/leafnodes/runner.go:113 +0xb1
2023-07-04T06:12:20.4826672Z github.com/onsi/ginkgo/internal/leafnodes.(*runner).run(0xc000c5d5a0?)
2023-07-04T06:12:20.4827525Z 	/home/runner/go/pkg/mod/github.com/onsi/[email protected]/internal/leafnodes/runner.go:64 +0x125
2023-07-04T06:12:20.4828065Z github.com/onsi/ginkgo/internal/leafnodes.(*ItNode).Run(0x53?)
2023-07-04T06:12:20.4828885Z 	/home/runner/go/pkg/mod/github.com/onsi/[email protected]/internal/leafnodes/it_node.go:26 +0x7b
2023-07-04T06:12:20.4829688Z github.com/onsi/ginkgo/internal/spec.(*Spec).runSample(0xc000ac2ff0, 0xc000c5d968?, {0x1fddec0, 0xc00007e900})
2023-07-04T06:12:20.4830192Z 	/home/runner/go/pkg/mod/github.com/onsi/[email protected]/internal/spec/spec.go:215 +0x2a9
2023-07-04T06:12:20.4836986Z github.com/onsi/ginkgo/internal/spec.(*Spec).Run(0xc000ac2ff0, {0x1fddec0, 0xc00007e900})
2023-07-04T06:12:20.4838164Z 	/home/runner/go/pkg/mod/github.com/onsi/[email protected]/internal/spec/spec.go:138 +0xf2
2023-07-04T06:12:20.4846869Z github.com/onsi/ginkgo/internal/specrunner.(*SpecRunner).runSpec(0xc000210160, 0xc000ac2ff0)
2023-07-04T06:12:20.4847497Z 	/home/runner/go/pkg/mod/github.com/onsi/[email protected]/internal/specrunner/spec_runner.go:200 +0xf1
2023-07-04T06:12:20.4848050Z github.com/onsi/ginkgo/internal/specrunner.(*SpecRunner).runSpecs(0xc000210160)
2023-07-04T06:12:20.4848650Z 	/home/runner/go/pkg/mod/github.com/onsi/[email protected]/internal/specrunner/spec_runner.go:170 +0x1b6
2023-07-04T06:12:20.4849183Z github.com/onsi/ginkgo/internal/specrunner.(*SpecRunner).Run(0xc000210160)
2023-07-04T06:12:20.4849931Z 	/home/runner/go/pkg/mod/github.com/onsi/[email protected]/internal/specrunner/spec_runner.go:66 +0xc5
2023-07-04T06:12:20.4850541Z github.com/onsi/ginkgo/internal/suite.(*Suite).Run(0xc0001109a0, {0x7fcf81a531f8, 0xc0002fcd00}, {0x1cf4fde, 0x9}, {0xc00007c700, 0x2, 0x2}, {0x1ff8300, 0xc00007e900}, ...)
2023-07-04T06:12:20.4851078Z 	/home/runner/go/pkg/mod/github.com/onsi/[email protected]/internal/suite/suite.go:79 +0x4e5
2023-07-04T06:12:20.4851683Z github.com/onsi/ginkgo.runSpecsWithCustomReporters({0x1fe07c0?, 0xc0002fcd00}, {0x1cf4fde, 0x9}, {0xc00007c6e0, 0x2, 0x1d0e230?})
2023-07-04T06:12:20.4852272Z 	/home/runner/go/pkg/mod/github.com/onsi/[email protected]/ginkgo_dsl.go:245 +0x189
2023-07-04T06:12:20.4852877Z github.com/onsi/ginkgo.RunSpecsWithDefaultAndCustomReporters({0x1fe07c0, 0xc0002fcd00}, {0x1cf4fde, 0x9}, {0xc00006b5a0, 0x1, 0x1})
2023-07-04T06:12:20.4853489Z 	/home/runner/go/pkg/mod/github.com/onsi/[email protected]/ginkgo_dsl.go:228 +0x1be
2023-07-04T06:12:20.4854109Z github.com/ovn-org/ovn-kubernetes/test/e2e.TestE2e(0x0?)
2023-07-04T06:12:20.4854728Z 	/home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/e2e_suite_test.go:71 +0x2c5
2023-07-04T06:12:20.4855174Z testing.tRunner(0xc0002fcd00, 0x1e15678)
2023-07-04T06:12:20.4856002Z 	/opt/hostedtoolcache/go/1.19.6/x64/src/testing/testing.go:1446 +0x10b
2023-07-04T06:12:20.4856408Z created by testing.(*T).Run
2023-07-04T06:12:20.4857243Z 	/opt/hostedtoolcache/go/1.19.6/x64/src/testing/testing.go:1493 +0x35f
2023-07-04T06:12:20.4857647Z [JustAfterEach] Unidling
2023-07-04T06:12:20.4858579Z   /home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/util.go:903
2023-07-04T06:12:20.4859138Z STEP: creating a backend pod for the service testserviceqarmj
2023-07-04T06:12:20.7951858Z Jul  4 06:12:20.793: INFO: The status of Pod pod-backend is Pending, waiting for it to be Running (with Ready = true)
2023-07-04T06:12:22.3184436Z [AfterEach] Unidling
2023-07-04T06:12:22.3184972Z   /home/runner/go/pkg/mod/k8s.io/[email protected]/test/e2e/framework/framework.go:188
2023-07-04T06:12:22.3185671Z STEP: Collecting events from namespace "unidling-7839".
2023-07-04T06:12:22.3239920Z STEP: Found 6 events.
2023-07-04T06:54:48.2950647Z 
2023-07-04T06:54:48.2951730Z Summarizing 2 Failures:
2023-07-04T06:54:48.2952695Z 
2023-07-04T06:54:48.2955161Z [Fail] Unidling With annotated service [It] Should connect to an unidled backend at the first attempt
2023-07-04T06:54:48.2956736Z /home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/unidling.go:228
2023-07-04T06:54:48.2958320Z 
2023-07-04T06:54:48.2965533Z [Panic!] Unidling [BeforeEach] With annotated service Should connect to an unidled backend at the first attempt
2023-07-04T06:54:48.2968966Z /opt/hostedtoolcache/go/1.19.6/x64/src/runtime/panic.go:884
2023-07-04T06:54:48.2972764Z 
2023-07-04T06:54:48.2975518Z Ran 86 of 228 Specs in 3725.603 seconds
2023-07-04T06:54:48.2978349Z FAIL! -- 85 Passed | 1 Failed | 0 Flaked | 0 Pending | 142 Skipped
2023-07-04T06:54:48.2981415Z 
2023-07-04T06:54:48.2981749Z You're using deprecated Ginkgo functionality:

If I see a second one, I am going to open an issue: https://github.com/ovn-org/ovn-kubernetes/actions/runs/5451066748/jobs/9917243947?pr=3741

@jordigilh
Contributor

jordigilh commented Jul 4, 2023

May I suggest an alternative: use XDescribe

var _ = ginkgo.XDescribe("OVN for APB External Route Operations", func() {

instead of commenting out the code?

This option still exits with code 0, and it reports the tests as pending rather than gone:

Ran 282 of 313 Specs in 540.808 seconds
SUCCESS! -- 282 Passed | 0 Failed | 31 Pending | 0 Skipped
PASS

Rather than:

Ran 313 of 313 Specs in 568.421 seconds
SUCCESS! -- 313 Passed | 0 Failed | 0 Pending | 0 Skipped
PASS

when running all the tests (assuming none is commented out).

@tssurya
Member Author

tssurya commented Jul 4, 2023

Yes, as discussed in the meeting, I will use XDescribe.

@tssurya tssurya force-pushed the disable-apb-uts-temporarily branch from 658639b to fc5ca9b Compare July 4, 2023 16:15
@tssurya tssurya closed this Jul 5, 2023
@tssurya tssurya reopened this Jul 5, 2023
@tssurya
Member Author

tssurya commented Jul 5, 2023

@jcaamano : this one is good once CI passes.

@jcaamano
Contributor

jcaamano commented Jul 5, 2023

@tssurya should the failing tests be disabled?

[Fail] External Gateway With Admin Policy Based External Route CRs e2e multiple external gateway stale conntrack entry deletion validation Static Hop: Should validate conntrack entry deletion for TCP/UDP traffic via multiple external gateways a.k.a ECMP routes [It] IPV4 udp 
/home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/external_gateways.go:2642

[Fail] External Gateway With Admin Policy Based External Route CRs e2e multiple external gateway stale conntrack entry deletion validation Static Hop: Should validate conntrack entry deletion for TCP/UDP traffic via multiple external gateways a.k.a ECMP routes [It] IPV4 udp 
/home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/external_gateways.go:1403

[Fail] External Gateway With Admin Policy Based External Route CRs e2e multiple external gateway stale conntrack entry deletion validation Static Hop: Should validate conntrack entry deletion for TCP/UDP traffic via multiple external gateways a.k.a ECMP routes [It] IPV4 tcp 
/home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/external_gateways.go:2642

@tssurya
Member Author

tssurya commented Jul 5, 2023

@jcaamano : I think those are fine. If the e2e's are failing, it's a separate lane anyway, so people can ignore the whole lane. The UTs are the ones blocking other PRs. cc @npinaeva @jordigilh : saving the e2e failure link here in case it's useful: https://github.com/ovn-org/ovn-kubernetes/actions/runs/5462546420/jobs/9943298187?pr=3741

@jcaamano jcaamano force-pushed the disable-apb-uts-temporarily branch from fc5ca9b to d5d3506 Compare July 5, 2023 12:29
Disable UTs for APB Temporarily till the feature
is stabilized.

Reason: UTs are very flaky. Each PR is needed a
minimum of 3 close/open combinations and luck for CI
to pass to even run e2e's. At this stage where other
features are trying to get in before the deadline this
process is painful.

Signed-off-by: Surya Seetharaman <[email protected]>
@jcaamano jcaamano force-pushed the disable-apb-uts-temporarily branch from d5d3506 to 4a24833 Compare July 5, 2023 14:13
@tssurya
Member Author

tssurya commented Jul 5, 2023

Wait, why is this PR's CI running so many times?
Ah, I guess it's because other things are merging and we have to rebase before merge? Is that a golden rule even if there are no conflicts with master?

@jcaamano
Contributor

jcaamano commented Jul 5, 2023

Wait, why is this PR's CI running so many times? Ah, I guess it's because other things are merging and we have to rebase before merge? Is that a golden rule even if there are no conflicts with master?

I don't have permissions to do anything else.

@jcaamano jcaamano merged commit 926a1dc into ovn-org:master Jul 5, 2023
25 checks passed
@tssurya
Member Author

tssurya commented Jul 5, 2023

I don't have permissions to do anything else.

ahhh :(

thanks @jcaamano for getting to this!

4 participants