
after vfork and closing zipos file, __isfdkind is incorrect #978

Closed
G4Vi opened this issue Dec 3, 2023 · 12 comments · Fixed by #984

Comments

G4Vi (Collaborator) commented Dec 3, 2023

I made a test:

TEST(zipos, close_after_vfork) {
  if (!IsLinux()) return;
  int zip_fd = open("/zip/life.elf", O_RDONLY);
  ASSERT_NE(-1, zip_fd);
  SPAWN(vfork);
  EXPECT_EQ(1, __isfdkind(zip_fd, kFdZip));
  close(zip_fd);
  int fd = open("/etc/hosts", O_RDONLY);
  ASSERT_NE(-1, fd);
  EXPECT_EQ(0, __isfdkind(fd, kFdZip));
  exit(0);
  EXITS(0);
}

The test fails on EXPECT_EQ(0, __isfdkind(fd, kFdZip)).

@mrdomino could you please take a look? I'm trying to revive #888 and maintain vfork compat if possible.

mrdomino (Collaborator) commented Dec 3, 2023

@jart what should happen here? Is it safe to lock g_fds after vfork? I'm thinking maybe if __vforked then we don't touch refcounts but do clear g_fds?

mrdomino (Collaborator) commented Dec 3, 2023

(No, that probably won't work as such. I'll think about it some more.)

jart (Owner) commented Dec 3, 2023

I would imagine that it is safe to lock a mutex after vfork, provided you unlock it before calling execve().
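
A minimal sketch of that pattern, assuming a hypothetical g_fds_lock mutex and child_exec() helper (neither is Cosmopolitan's actual API): the vfork'd child may take the lock, but it has to drop it before execve(), because the parent resumes in the same memory and would otherwise find the mutex held.

#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t g_fds_lock = PTHREAD_MUTEX_INITIALIZER;

static void child_exec(const char *prog, char *const argv[],
                       char *const envp[]) {
  // Runs in the vfork'd child, which shares memory with the parent.
  pthread_mutex_lock(&g_fds_lock);
  /* ...touch shared fd bookkeeping here... */
  pthread_mutex_unlock(&g_fds_lock);  // must happen before execve()
  execve(prog, argv, envp);
  _exit(127);  // only reached if execve() failed
}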

mrdomino (Collaborator) commented Dec 3, 2023

Ok, took a bit to look at this.

Looks like the problem is straightforward. close() does not call __releasefd() if __vforked.

The comment says that it wants to avoid locking so close() doesn't need to block signals.

So one proposal is: if __vforked, have close() block signals, lock fds, and call __releasefd().

Unfortunately, this does not work: master...0bc5e63 dies with an nsync panic, "attempt to nsync_mu_unlock() an nsync_mu not held in write mode".

I also don't know that much about vfork; maybe the performance implications make that solution undesirable.
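
For reference, the attempted change was roughly of this shape (a reconstruction from the description above, not the actual master...0bc5e63 commit; __fds_lock()/__fds_unlock() are stand-ins for whatever guards g_fds):

static int close_impl_vforked(int fd) {
  // Proposal: block signals so a handler can't observe the lock held,
  // take the fd-table lock, close, and release the g_fds slot.
  sigset_t all, old;
  sigfillset(&all);
  sigprocmask(SIG_BLOCK, &all, &old);
  __fds_lock();
  int rc = sys_close(fd);
  __releasefd(fd);
  __fds_unlock();
  sigprocmask(SIG_SETMASK, &old, 0);
  return rc;
}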

jart (Owner) commented Dec 4, 2023

The only thing close() should do when vfork'd is call sys_close(fd). There shouldn't be any need for mutexes or signal masking. This issue is happening because the vfork()'d child process appears to be calling __zipos_close(), which I assume modifies the zipos-related data structures in memory. Since the vfork()'d child shares its address space with the parent process (including all of the parent's threads), once the child dies (after calling execve) and the parent thread resumes execution, the file descriptor is still open in the parent, but the parent's zipos memory says it's closed.
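
In other words, something along these lines (a sketch of the rule being described here, not the actual Cosmopolitan source):

int close(int fd) {
  if (__vforked) {
    // The child shares the parent's memory, so it must not touch g_fds
    // or zipos bookkeeping; only the raw system call is performed.
    return sys_close(fd);
  }
  int rc = sys_close(fd);  // normal path: close, then update shared state
  __releasefd(fd);         // release the g_fds slot (zipos handling elided)
  return rc;
}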

jart (Owner) commented Dec 4, 2023

Speaking of which, I've just realized another issue we need to solve. The __vforked variable needs to be expanded to hold the tid or something, because when vfork() is called, the only thread that suspends execution in the parent process is the one that called vfork(). An issue most likely exists right now where, during that window, unrelated threads in the parent process mistakenly go into __vforked mode. The __vforked == 1 condition is only intended to be observed by the child process.
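
One possible shape of that change, purely as an illustration (the real fix may differ): make __vforked hold a tid instead of a boolean, have the child record its own tid once vfork() returns 0, and have everyone else compare against their own tid.

extern int __vforked;  // today a boolean; proposed: 0 or the child's tid

static bool __in_vforked_child(void) {
  // Unrelated parent threads have different tids, so only the vfork'd
  // child itself observes "vfork mode".
  return __vforked && __vforked == gettid();
}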

G4Vi (Collaborator, Author) commented Dec 4, 2023

This is no longer a blocker for the fexecve work. I read up on vfork and realized the issue was caused by vfork-unsafe test setup, not by the fexecve implementation. I rewrote the test to perform the setup before vforking, and it's working again.

mrdomino (Collaborator) commented Dec 5, 2023

I changed the code to not call __zipos_close if __vforked, but the test case is failing. The test case looks wrong to me now: it shouldn't be opening a new file in the child, right?

jart closed this as completed in #984 on Dec 5, 2023
jart (Owner) commented Dec 5, 2023

Calling something like open("/dev/null", O_RDWR) after vfork() is not an unreasonable thing to do, in which case the only thing open() should do is call sys_open().

mrdomino (Collaborator) commented Dec 5, 2023

Actually, I just checked, and my commit does not do anything. __zipos_close already checks for !__vforked prior to freeing the handle.

It looks like, in the vfork'd child, open() is already only calling sys_open() and close() is already only calling sys_close().

I wonder if __isfdkind should have a special case for __vforked that returns 0?
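
As a sketch, that special case might look like the following (assuming g_fds keeps a per-descriptor kind field, as described earlier in the thread; not the actual source):

bool __isfdkind(int fd, int kind) {
  if (__vforked) return false;  // g_fds may be stale inside a vfork'd child
  return fd >= 0 && fd < g_fds.n && g_fds.p[fd].kind == kind;
}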

jart (Owner) commented Dec 5, 2023

Yeah I see it now. The test in the original comment is definitely wrong. It doesn't make sense to white-box test g_fds memory from within the vfork'd child. The right thing to do here is to test against the public API. You can do that by hard-coding the file descriptor numbers into your test. This is a GOOD thing.

TEST(zipos, close_after_vfork) {
  ASSERT_SYS(0, 3, open("/zip/life.elf", O_RDONLY));
  SPAWN(vfork);
  ASSERT_SYS(0, 0, close(3));
  ASSERT_SYS(0, 3, open("/etc/hosts", O_RDONLY));
  ASSERT_SYS(0, 0, close(3));
  ASSERT_SYS(EBADF, -1, close(3));
  EXITS(0);
  ASSERT_SYS(0, 0, close(3));
  ASSERT_SYS(EBADF, -1, close(3));
}

That test would much more closely conform to cosmo's best practices for system call testing.

G4Vi (Collaborator, Author) commented Dec 5, 2023

Yeah I see it now. The test in the original comment is definitely wrong. It doesn't make sense to white-box test g_fds memory from within the vfork'd child. The right thing to do here is to test against the public API. You can do that by hard-coding the file descriptor numbers into your test. This is a GOOD thing.
....

That test would much more closely conform to cosmo's best practices for system call testing.

Agreed! The original test was an example of an issue with a change to the internal API. There's no guarantee, nor necessarily any need, for the internal API to maintain a particular behavior, so it didn't really demonstrate a bug.
