Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[💡 FEATURE REQUEST]: Improve clarity of worker errors when starting a server #2020

Open
Kaspiman opened this issue Oct 6, 2024 · 14 comments
Assignees
Labels
C-feature-accepted Category: Feature discussed and accepted P-logs Plugin: logs

Comments

@Kaspiman
Copy link

Kaspiman commented Oct 6, 2024

Plugin

Server

I have an idea!

When my newbie was getting used to the RR, some problems arose that were difficult to diagnose.

Let's say i have typical RR config for production usage for Logs plugin:

logs:
    level: warn
    mode: production

and also a broken worker with a syntax error (for example).

Then the output will only include the EOF error without any details.

handle_serve_command: Function call error:
2024-10-06T18:27:36.449763889Z 	serve error from the plugin *http.Plugin stopping execution, error: static_pool_allocate_workers: WorkerAllocate: EOF

A newbie has questions: why the server won't start and what is EOF?

The problem is that errors from the worker only come from the debug level. In production mode, it is difficult to immediately understand what happened. EOF is too unclear a reason. Also, an error in the EOF text was sent to the browser without any details.

What if RR write a more meaningful error message right away? It seems that the worker startup error should be more understandable and the error level should be.

Another small question. Do RR really use STDERR to exchange data frames?
https://docs.roadrunner.dev/docs/php-worker/worker#communication-methods

@Kaspiman Kaspiman added the C-feature-request Category: feature requested, but need to be discussed label Oct 6, 2024
@rustatian
Copy link
Member

rustatian commented Oct 6, 2024

Hey @Kaspiman 👋
Unfortunately, it is not possible to determine the exact error, I understand that EOF is too general error, but usually, it points to some error with the PHP file, e.g.: no such file for example or some garbage data in the process pipes which prevents RR to allocate a PHP process. The case is, RR does not know about PHP at all. From the RR perspective, it is just a child process and if the runtime can't provide additional information, RR unfortunately can't obtain such an info by itself somewhere.

Also, an error in the EOF text was sent to the browser without any details.

This is kind of strange, because EOF usually indicates a startup problem. When RR is running, this is highly unlikely to see this error. If you don't move the worker and send a ./rr reset command.

Another small question. Do RR really use STDERR to exchange data frames?

RR uses stdin/stdout to communicate with the PHP process. STDERR used to obtain logs/diagnostic information from the PHP process.

@rustatian
Copy link
Member

rustatian commented Oct 6, 2024

It would be nice if you attach to the ticket some examples of PHP workers + actual error messages + desired error messages + .rr.yaml configurations.

@rustatian rustatian moved this to 📋 Backlog in Jira 😄 Oct 6, 2024
@Kaspiman
Copy link
Author

Kaspiman commented Oct 7, 2024

Use Case №1:

Any broken php worker, for example with syntax error.

http:
    pool:
        num_workers: 1

logs:
    level: warn

Actual error:

[INFO] RoadRunner server started; version: 2024.2.1, buildtime: 2024-09-12T16:25:02+0000
[INFO] sdnotify: not notified
handle_serve_command: Function call error:
serve error from the plugin *http.Plugin stopping execution, error: static_pool_allocate_workers: WorkerAllocate: EOF

Desired error message with error level: Failed to allocate worker, check application logs with stacktrace from PHP.

Now this trace is visible only from the info level.

Logs with info level and broken php worker:

[INFO] RoadRunner server started; version: 2024.2.1, buildtime: 2024-09-12T16:25:02+0000
[INFO] sdnotify: not notified
INFO    server          Warning: Undefined variable $worker in /var/www/html/worker.php on line 20
 Fatal error: Uncaught TypeError: Spiral\RoadRunner\Http\PSR7Worker::__construct(): Argument #1 ($worker) must be of type Spiral\RoadRunner\WorkerInterface, null given, called in /var/www/html/worker.php on line 20 and defined in /var/www/html/vendor/spiral/roadrunner-http/src/PSR7Worker.php:42
 Stack trace:
 #0 /var/www/html/worker.php(20): Spiral\RoadRunner\Http\PSR7Worker->__construct(NULL, Object(Nyholm\Psr7\Factory\Psr17Factory), Object(Nyholm\Psr7\Factory\Psr17Factory), Object(Nyholm\Psr7\Factory\Psr17Factory))
 #1 {main}
   thrown in /var/www/html/vendor/spiral/roadrunner-http/src/PSR7Worker.php on line 42
 
 handle_serve_command: Function call error:
     serve error from the plugin *http.Plugin stopping execution, error: static_pool_allocate_workers: WorkerAllocate: EOF

Basically, we need to log the worker output at the error level unless otherwise stated.


Use Case №2:

debug = true and any broken php worker, for example with syntax error.

logs:
    level: warn

http:
    pool:
        debug: true

Logs:

[INFO] RoadRunner server started; version: 2024.2.1, buildtime: 2024-09-12T16:25:02+0000
[INFO] sdnotify: not notified
ERROR	http        	execute	{"start": "2024-10-07T07:32:12+0000", "elapsed": 9, "error": "EOF"}
ERROR	http        	execute	{"start": "2024-10-07T07:32:12+0000", "elapsed": 9, "error": "EOF"}

The server has started, and when processing one request, two EOF errors are written to the logs.

Error 500 and EOF message in browser.

Desired error message in logs with error level:Failed to allocate worker, check application logs with stacktrace from PHP.

Desired error message in browser: Failed to allocate worker, check application logs with/without (idk) stacktrace from PHP.

@Kaspiman
Copy link
Author

Kaspiman commented Oct 7, 2024

Is it possible to rework this kind of error?

Typical error:

 handle_serve_command: Function call error:
     serve error from the plugin *http.Plugin stopping execution, error: static_pool_allocate_workers: WorkerAllocate: EOF

It contains a lot of information that is more useful to the developer than to the user.

Possible rework:

Error from the <PluginName> plugin, message: Failed to allocate worker, check application logs, code: static_pool_allocate_workers, stacktrace: ...

@rustatian
Copy link
Member

Ok, so, if I understand correctly, the problem is in log level, because under the info log level you may clearly see what is the actual error.
The error about Function call comes from the Endure, but definitely it is possible to add some link to docs or guideline of what to do next to see the error. Will add this to 2024.3 milestone.

@rustatian rustatian added C-feature-accepted Category: Feature discussed and accepted P-logs Plugin: logs and removed C-feature-request Category: feature requested, but need to be discussed labels Oct 7, 2024
@rustatian rustatian added this to the v2024.3 milestone Oct 7, 2024
@rustatian rustatian moved this from 📋 Backlog to 🔖 Ready in Jira 😄 Oct 7, 2024
@rustatian
Copy link
Member

The actual problem here, is that we don't have a log level from PHP. Thus, all worker-related messages (RR can't distinguish between a just message and an error message, from the RR POV all that is text) are logged under the info log level. And when you use warn, obviously, all info messages are not displayed (and this is the problem).

A real solution I want to see here is a some configuration exchange from the PHP worker side to RR. Like in the Temporal plugin. RR on the init stage pings all workers, so it would be nice to exchange a configuration info between RR <-> PHP. Logging configuration is the first and obvious step.

@Kaspiman
Copy link
Author

Kaspiman commented Oct 7, 2024

Is it possible to read all logs from an application with an error level if a specific level is not specified? This will also help in other cases, for example, when the output is erroneously sent to stdout.

@rustatian
Copy link
Member

Is it possible to read all logs from an application with an error level if a specific level is not specified?

This is not a proper solution to a problem. Worker's log level should be passed from the PHP Worker, otherwise - that particular worker should be treated as invalid and re-created.
If you would like to collaborate and create a PHP part, I would be happy to provide a protocol specs and RR implementation.

@Kaspiman
Copy link
Author

Kaspiman commented Oct 7, 2024

This problem may not only be in the worker. What if someone has their own worker, and not the one offered by Spiral?

What if it is a script launched in the Service plugin or in the server.on_init section? We won't be able to set up a worker there, it simply doesn't exist.

I think it's possible to interpret any output from the worker as an error. This will force people to catch errors and use the Logs plugin.

@rustatian
Copy link
Member

We do not support third-party workers. But all needs should be covered by the well-defined protocol. And from this point it doesn't actually matters, who created a worker.

I think it's possible to interpret any output from the worker as an error

This is possible, but logs are a very sensitive topic. And this topic should be well-discussed.

@Kaspiman
Copy link
Author

Kaspiman commented Oct 7, 2024

So, let's discuss with community. Discord or https://github.com/orgs/roadrunner-server/discussions?

@rustatian
Copy link
Member

You may create both, discussion on the GitHub and create a topic in the Discord forum.
image

@Kaspiman
Copy link
Author

The actual problem here, is that we don't have a log level from PHP. Thus, all worker-related messages (RR can't distinguish between a just message and an error message, from the RR POV all that is text) are logged under the info log level. And when you use warn, obviously, all info messages are not displayed (and this is the problem).

A real solution I want to see here is a some configuration exchange from the PHP worker side to RR. Like in the Temporal plugin. RR on the init stage pings all workers, so it would be nice to exchange a configuration info between RR <-> PHP. Logging configuration is the first and obvious step.

Could you please provide more accurate description of that exchange? What should do PHP worker in startup phrase?

@rustatian
Copy link
Member

Just a structure with some fields. Like version, desired log level and that's it at the moment I guess. You may take a look at the temporal PHP worker and worker info structure it sends to Go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-feature-accepted Category: Feature discussed and accepted P-logs Plugin: logs
Projects
Status: 🔖 Ready
Development

No branches or pull requests

2 participants