Actmon crashing bug. #2689

Open
dgarnier opened this issue Jan 13, 2024 · 2 comments
Labels
branch/alpha: This is present on or relates to the alpha branch
bug: An unexpected problem or unintended behavior
tool/actions: Relates to the action tools (actions, actmon, actlog)

Comments

@dgarnier
Contributor

Affiliation
OpenStar

Version(s) Affected
latest-alpha a

Platform
Ubuntu 22.04

Describe the bug
Dispatching an action will cause the action monitor to crash.

To Reproduce
Steps to reproduce the behavior:

actmon -m JUNIOR_MONITOR &
mdstcl
TCL> set tree junior /shot=240112003
TCL> dispatch/build
TCL> dispatch/phase/monitor=JUNIOR_MONITOR init

Expected behavior
Action monitor would fill with completed or completing actions.

Likely cause

It's pretty clear that ServerSendMessage calls the AST with only its first argument, in servershr/Job.h:

if (callback_done)
  callback_done(j->callback_param);

but it's defined with a dummy first argument and a second argument in actions/actlogp.h:

static void MessageAst(void *dummy __attribute__((unused)), char *reply)

I did try making this change, and it fixes the crash, but I'm not getting results to show up in actmon. Did this just die of bit rot at some stage?
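A minimal, self-contained sketch of the mismatch described above, not MDSplus code. Only the `callback_done(j->callback_param)` call shape and the two-argument handler shape come from the snippets quoted in this issue. `job_t`, `run_done_callback`, `message_ast`, `message_ast_adapter`, and `last_reply` are hypothetical names, and the sketch assumes that the single parameter is the reply text, which has not been verified against the real sources.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the job record. Only the member names
   callback_param and callback_done mirror the quoted Job.h snippet. */
typedef struct {
  void *callback_param;
  void (*callback_done)(void *);
} job_t;

/* Records the last reply so the effect of the callback is observable. */
static char last_reply[64];

/* Same shape as the handler quoted from actlogp.h:
   an unused first argument, the reply as second argument. */
static void message_ast(void *dummy, char *reply)
{
  (void)dummy;
  snprintf(last_reply, sizeof(last_reply), "%s", reply);
}

/* Hypothetical adapter with the one-argument shape the caller uses.
   ASSUMPTION: the single parameter carries the reply text. */
static void message_ast_adapter(void *param)
{
  assert(param != NULL);
  message_ast(NULL, (char *)param);
}

/* Mirrors the quoted call site: the callback gets exactly one argument.
   Registering message_ast directly here would need a cast and would
   leave its 'reply' parameter undefined at call time. */
static void run_done_callback(job_t *j)
{
  if (j->callback_done)
    j->callback_done(j->callback_param);
}
```

Registering `message_ast_adapter` as `callback_done` and calling `run_done_callback` copies the parameter into `last_reply`. A two-argument handler reached through a one-argument call reads a second argument that was never passed, which would be consistent with the crash.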

@dgarnier dgarnier added the bug An unexpected problem or unintended behavior label Jan 13, 2024
@mwinkel-dev mwinkel-dev added tool/actions Relates to the action tools (actions, actmon, actlog) branch/alpha This is present on or relates to the alpha branch labels Jan 13, 2024
@zack-vii
Contributor

Hi there,
I did a quick lookup on the callback and I think the caller should be found here.

static void event_ast(void *astprm, int msglen __attribute__((unused)), char *msg)

So the signature seems to be right. You can try actlog instead of actmon for a terminal-based variant of actmon (no X required). Maybe running it in gdb will point you in the right direction. If I remember correctly, it is crucial that actmon/actlog have access to the action nodes, and hence the shot files, in order to resolve the tree paths. There was a way to fall back to model files instead, I think. Also, it may fail if it missed the newshot and the subsequent build of the action tree.

@dgarnier
Contributor Author

dgarnier commented Feb 3, 2024

Hi Timo,
Yes, you are right: event_ast uses the correct prototype. The problem was that I was using the "old method" of an mdsip-based monitor server, which goes through ServerSendMessage, and that is where the problem lies. Based on your point, the fix shouldn't go in actlogp.h but either in another wrapper in ServerMonitorCheckin.c or in ServerSendMessage.c. As far as I can tell, ServerSendMessage is only ever called with ast = NULL or from this one call, so it is probably ServerSendMessage that should be corrected, ideally with a proper prototype in servershrp.h. So this is a bug, but it is lower on my priority list since I have a workaround. In the meantime, I have other issues with servershr.
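A rough sketch of what a "proper prototype" could buy here, under stated assumptions. The real ServerSendMessage signature is not shown in this thread, so `send_message_sketch`, `message_ast_t`, `record_reply`, and `seen_reply` are all hypothetical, and the "ack:" reply text is invented for illustration. The point is only that a typed callback parameter lets the compiler flag a handler with the wrong argument list, and lets the sender pass both arguments.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type. The two-argument shape follows the
   handler quoted from actlogp.h (parameter first, reply second). */
typedef void (*message_ast_t)(void *astparam, char *reply);

/* Hypothetical stand-in for the sender, NOT the real ServerSendMessage.
   Because 'ast' is typed, passing a handler with a different argument
   list draws an incompatible-pointer diagnostic at compile time. */
static int send_message_sketch(const char *text, message_ast_t ast,
                               void *astparam)
{
  char reply[64];
  assert(text != NULL);
  /* Invented reply format, for illustration only. */
  snprintf(reply, sizeof(reply), "ack: %s", text);
  if (ast != NULL)          /* ast = NULL stays a valid "no callback" case */
    ast(astparam, reply);   /* both arguments are passed explicitly */
  return 1;
}

/* Example handler matching message_ast_t; stores the reply it receives. */
static char seen_reply[64];

static void record_reply(void *astparam, char *reply)
{
  (void)astparam;
  snprintf(seen_reply, sizeof(seen_reply), "%s", reply);
}
```

Calling `send_message_sketch("init", record_reply, NULL)` stores the reply in `seen_reply`, while passing `NULL` for the callback skips the call entirely, matching the two call patterns described above.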
