Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(B)ANP: Add num_anp && num_banp metrics #4239

Merged
merged 1 commit into from
Apr 2, 2024

Conversation

tssurya
Copy link
Member

@tssurya tssurya commented Mar 25, 2024

This commits adds two metrics to:

  1. count number of admin network policies and (between 0 and 100)
  2. number of baseline admin network policies in a cluster (0 or 1)

My goal with these two metrics are:

  1. understand how the user is using this feature
  2. how many admin network policies have they created (I plan to do future PRs for more fine-grained information)
  3. is the user using baseline admin network policy in their cluster?

So its also for the user but more so for us.

I am starting out with the metrics stuff for ANP/BANP
PTAL @martinkennelly at this one for starters. I think this is simple enough to not need the monitoring team's purview but future PRs will need their help. WDYT?

Copy link

netlify bot commented Mar 25, 2024

Deploy Preview for subtle-torrone-bb0c84 ready!

Name Link
🔨 Latest commit a9b6bb2
🔍 Latest deploy log https://app.netlify.com/sites/subtle-torrone-bb0c84/deploys/66049ad06d0871000889261f
😎 Deploy Preview https://deploy-preview-4239--subtle-torrone-bb0c84.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

Copy link

netlify bot commented Mar 25, 2024

Deploy Preview for resilient-beignet-2c8d1c ready!

Name Link
🔨 Latest commit f4b87fa
🔍 Latest deploy log https://app.netlify.com/sites/resilient-beignet-2c8d1c/deploys/66018ed08c6ca70008c2762d
😎 Deploy Preview https://deploy-preview-4239--resilient-beignet-2c8d1c.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@tssurya tssurya force-pushed the add-anp-banp-number-count-metrics branch from f4b87fa to 155b72b Compare March 25, 2024 18:38
@coveralls
Copy link

coveralls commented Mar 25, 2024

Coverage Status

coverage: 52.499% (+0.009%) from 52.49%
when pulling 684ae18 on tssurya:add-anp-banp-number-count-metrics
into 0f47b72 on ovn-org:master.

go-controller/pkg/metrics/ovnkube_controller.go Outdated Show resolved Hide resolved
@@ -269,6 +269,23 @@ var metricEgressFirewallCount = prometheus.NewGauge(prometheus.GaugeOpts{
Help: "The number of egress firewall policies",
})

/** AdminNetworkPolicyMetrics Begin**/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whats your goal with these metrics? what do you hope to achieve and how do you want the user to use them?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great question which I should have put in the PR description,
My goal with these two metrics are:

  1. understand how the user is using this feature
  2. how many admin network policies have they created (I plan to do future PRs for more fine-grained information)
  3. is the user using baseline admin network policy in their cluster?

So its also for the user but more so for us.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you plan to create any alerts based on these metrics? For example, if theres a limit for any of the CRs, you can put a warning when they are closed to the threshold.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I remember reading that theres a limit on at least one of the CRs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah there is a limit on ANP being max at 100, but TBH, I cannot fathom anyone using more than 10 :D so perhaps alerts is definitely good advice in case someone abuses this (as is always case with all features!)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API enforces this anyway right - so alert might not be needed.

Copy link
Member Author

@tssurya tssurya Mar 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah no API doesn't enforce it (API says max 1000), its an ovnkube thing where I restrict them being able to do only 100 priorities so 100 ANPs, alert might be needed here actually (future PR/CNO).

go-controller/pkg/metrics/ovnkube_controller.go Outdated Show resolved Hide resolved
@tssurya tssurya force-pushed the add-anp-banp-number-count-metrics branch from 155b72b to 684ae18 Compare March 26, 2024 19:24
@tssurya
Copy link
Member Author

tssurya commented Mar 27, 2024

flake is tracked at : #4242

This commits adds the metrics to count
number of admin network policies and number of
baseline admin network policies in a cluster

Signed-off-by: Surya Seetharaman <[email protected]>
@tssurya tssurya force-pushed the add-anp-banp-number-count-metrics branch from 684ae18 to a9b6bb2 Compare March 27, 2024 22:16
@tssurya
Copy link
Member Author

tssurya commented Mar 28, 2024

unrelated flake in UT:

2024-03-27T22:37:58.9886045Z �[91m�[1m[Fail] �[0m�[90mOVN master EgressIP Operations �[0m�[0mOn node UPDATE �[0m�[90m[secondary host network] should perform proper OVN transactions when namespace and pod is created after node egress label switch �[0m�[91m�[1m[It] interconnect enabled; node1 in global and node2 in remote zones �[0m

saving link for debugging: https://github.com/ovn-org/ovn-kubernetes/actions/runs/8459367536/job/23175556649?pr=4239

@trozet trozet merged commit 9ce3872 into ovn-org:master Apr 2, 2024
41 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Implement Admin Network Policy API in OVNK
4 participants