xdmod ticket #33271 aggregate_supremm.sh runs to completion, but no data appears in the supremm section of xdmod #227

Open
rob-baron opened this issue Aug 15, 2023 · 1 comment

Comments

@rob-baron
Contributor

Robert Bartlett Baron, reported about 1 month ago
The script "aggregate_supremm.sh" ran to completion.

However, I still do not see any data appearing in the supremm section within the xdmod GUI.

Any suggestions as to what to check next?
Robert Bartlett Baron, said 28 days ago
We have modeled our OpenShift cluster as an HPC cluster, so the individual pods show up as jobs. A pod usually uses a couple hundred millicpu, that is, about 0.200 CPU. When this gets shredded, it appears as 0 CPU in the jobs table, because the jobs table's CPU column only holds an integer value.

Unfortunately, none of these jobs get transferred to the supremm table when aggregate_supremm.sh runs.

Should we report CPU in millicpu (that is, 200), which could be stored as an integer value? If so, how would it be reported by supremm so that the units are in CPU (that is, the values from the jobs table divided by 1000)?
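
To illustrate the truncation and the proposed scaling, here is a minimal Python sketch. It is not XDMoD code: the function names are hypothetical, and it assumes that the integer column behaves like taking the floor of the reported value.

```python
import math


def to_integer_cpu(cpu: float) -> int:
    # Assumption: storing a fractional CPU value in an integer column
    # behaves like taking the floor of the value.
    return math.floor(cpu)


def to_millicpu(cpu: float) -> int:
    # Scale to millicpu first, then round so float noise cannot shift the result.
    return round(cpu * 1000)


def from_millicpu(millicpu: int) -> float:
    # Convert back to whole-CPU units for reporting.
    return millicpu / 1000


pod_cpu = 0.200
print(to_integer_cpu(pod_cpu))  # 0: the pod's usage is lost
print(to_millicpu(pod_cpu))     # 200: fits in an integer column
print(from_millicpu(200))       # 0.2: back in CPU units
```

The conversion back to CPU units would have to happen wherever the value is displayed, and this thread does not establish where that would be in supremm.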

Is there a better way to do this?
Robert Bartlett Baron

@rob-baron
Contributor Author

rob-baron commented Sep 4, 2023

Conner Saeli, said 7 days ago
Ticket: https://help.xdmod.org/support/tickets/33271

Hi Robert,

Just to clarify, are you referring to the "cores" or "cores_avail" in modw_supremm.job?

The "cores" column for the job table in supremm is taken from the "processor_count" column in modw.job_tasks. "processor_count" is also stored as an integer. My follow-up questions for you are:
Is the value for "processor_count" in modw.job_tasks for your instance of XDMoD stored in millicpus? Is this number an integer or a float?
Are you able to properly view Job information?
On the other hand, the "cores_avail" column is populated from performance data that is available for individual cores. For example, if a job requests 4 cores, but there is performance data for only 3 cores, then "cores" would be 4 and "cores_avail" would be 3. I do not know how PCP reports fractional CPUs in OpenShift, so I cannot provide any insight into how to use this information in supremm.

Thanks,
Conner Saeli

Robert Bartlett Baron, said less than a minute ago
Conner,

Thank you for responding to me after 2 months. To work around the lack of support for cloud computing, specifically the lack of support for Kubernetes, we created a PCP report to contain equivalent information.

When we checked the results, we found that processor_count was reporting the floor of the value in the log file. As I hadn't heard back from you, I created a test set of log files, multiplied the ncpu field by 1000, and made it an int. After shredding and ingesting, the processor_count field was stored as an integer, though the technical unit would be millicores.
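
The log rewrite described above could look roughly like the following Python sketch. The thread does not say which log format was used, so the CSV layout and the `job_id` column here are assumptions made purely for illustration; only the `ncpu` field name comes from the description.

```python
import csv
import io


def scale_ncpu_to_millicores(text: str, field: str = "ncpu") -> str:
    # Hypothetical helper: reads CSV log text, multiplies the given field
    # by 1000, and writes it back out as an integer number of millicores.
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[field] = str(round(float(row[field]) * 1000))
        writer.writerow(row)
    return out.getvalue()


# Assumed sample data, not taken from a real log file.
sample = "job_id,ncpu\npod-a,0.2\npod-b,1.5\n"
print(scale_ncpu_to_millicores(sample))
```

With values like these, every record carries an integer ncpu, so nothing is truncated to 0 when the value lands in an integer column.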

And yes, the job information is viewable.

So both "processor_count" and "cores_avail" need to be integers. I'm assuming that if both are expressed in millicores, aggregate_supremm will compute the ratio processor_count/cores_avail, which would be unitless.
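
The arithmetic behind that assumption can be checked with a short Python sketch. Whether aggregate_supremm actually uses the two columns only as a ratio is not confirmed in this thread; the function name and the sample numbers are hypothetical.

```python
from fractions import Fraction

MILLI = 1000


def core_ratio(processor_count: int, cores_avail: int) -> Fraction:
    # Exact ratio of the two integer columns, avoiding float rounding.
    return Fraction(processor_count, cores_avail)


in_cores = core_ratio(4, 3)
in_millicores = core_ratio(4 * MILLI, 3 * MILLI)
print(in_cores == in_millicores)  # True: the factor of 1000 cancels

# A quantity that uses one column on its own does not cancel the factor.
hours = 2
print(4 * hours)          # 8 core-hours
print(4 * MILLI * hours)  # 8000 millicore-hours
```

So the ratio is unaffected by the change of unit, but any value derived from processor_count alone would come out 1000 times larger and would need to be divided by 1000 before being reported in CPU units.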

As my last day at BU is 06-SEP-2023 (in 2 days) feel free to close this ticket.

Thank you,

Robert Baron
