Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Zinc server becomes unresponsive when the image is saved while a ‘system’ process is running #101

Open
Rinzwind opened this issue Sep 27, 2022 · 4 comments

Comments

@Rinzwind
Copy link

In the following example, the second #get: message signals a ConnectionTimedOut:

ZnServer startDefaultOn: 1701.
ZnClient new get: 'http://localhost:1701'.
LibC system: 'sleep 2 &'.
Smalltalk snapshot: true andQuit: false.
(Delay forSeconds: 3) wait.
ZnClient new get: 'http://localhost:1701'.

This seems to be because saving the image causes the Zinc server to try to set up a new socket through #initializeServerSocket, while the previous socket remains open in the child process created by the ‘system’ call.

Version information:

  • Image: Pharo-11.0.0+build.224.sha.6fcd6c1aaf965912b63ffc2ef2059d10cf57c744 (64 Bit)
  • VM: Pharo 9.0.18 built on Sep 2 2022 16:54:02 Compiler: 4.2.1 Compatible Apple LLVM 11.0.3 (clang-1103.0.32.29)
  • OS: macOS 11.7
@Rinzwind
Copy link
Author

I think the issue can be avoided by setting the FD_CLOEXEC flag on the socket file descriptor, but I’m not sure how to do that through Socket.

See the C program given below, which can be compiled without and with SET_CLOEXEC defined:

$ cc -o test1 test.c; cc -o test2 -DSET_CLOEXEC test.c

The exit status of test1 is 4:

$ ./test1; echo $?
4

While the exit status of test2 is 0:

$ ./test2; echo $?
0

test.c:

#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <fcntl.h>

int csocket() {
	int sfd = socket(AF_INET, SOCK_STREAM, 0);
#ifdef SET_CLOEXEC
	if (fcntl(sfd, F_SETFD, FD_CLOEXEC) == -1) return -1;
#endif
	if (sfd == -1) return -1;
	struct sockaddr_in sai;
	memset(&sai, 0, sizeof(sai));
	sai.sin_family = AF_INET;
	sai.sin_port = htons(1701);
	if (bind(sfd, (struct sockaddr *)&sai, sizeof(sai)) == -1) return -1;
	if (listen(sfd, 1) == -1) return -1;
	return sfd;
}

int main() {
	int sfd1 = csocket();
	if (sfd1 == -1) return 1;
	if (system("sleep 1 &") != 0) return 2;
	if (close(sfd1) == -1) return 3;
	int sfd2 = csocket();
	if (sfd2 == -1) return 4;
	return 0;
}

@Rinzwind
Copy link
Author

Related issue with Socket: pharo-project/pharo#11723

@Rinzwind
Copy link
Author

The following obviously needs a cleaner implementation, but it shows that the issue can be avoided by setting FD_CLOEXEC on the Zinc server’s socket, the second #get: message no longer signals an error:

"Platform-dependent constants, values given are for macOS 11.7 (Intel):"
offsetPrivateSocketPtrInSQSocketStruct := 8.
offsetSInPrivateSocketStruct := 0.
sizeOfInt := 4.
valueFD_SET := 2.
valueFD_CLOEXEC := 1.

ZnServer startDefaultOn: 1701.
LibC uniqueInstance
	fcntl: ((ZnServer default serverSocket socketHandle
		pointerAtOffset: offsetPrivateSocketPtrInSQSocketStruct)
			integerAt: offsetSInPrivateSocketStruct + 1 size: sizeOfInt signed: true)
	cmd: valueFD_SET
	arg: valueFD_CLOEXEC.
ZnClient new get: 'http://localhost:1701'.
LibC system: 'sleep 2 &'.
Smalltalk snapshot: true andQuit: false.
(Delay forSeconds: 3) wait.
ZnClient new get: 'http://localhost:1701'.

With LibC>>#fcntl:cmd:arg: implemented as:

fcntl: fildes cmd: cmd arg: arg

 	^ self ffiCall: #(int fcntl(int fildes, int cmd, int arg))

@Rinzwind
Copy link
Author

It’s maybe worth knowing that I mostly encounter(ed) this issue while using PTerm: in the example, the statement LibC system: 'sleep 2 &' can be replaced by TerminalEmulator openBash to get the same issue (with the same solution).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant