Trino python client doesn't respect forwarded HTTP header #494

Open
1 task done
FPGAwesome opened this issue Nov 5, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@FPGAwesome

Expected behavior

When the Trino coordinator has http-server.process-forwarded=true set, connections over HTTP can be assumed to be secured by a load balancer or proxy and should be allowed.

Actual behavior

This client setting prevents such connections from occurring:

if self._http_scheme == constants.HTTP:

Steps To Reproduce

Connecting to Trino over HTTP with any authentication method raises the error, with or without the headers in place.

Log output

No response

Operating System

Windows

Trino Python client version

0.330.0

Trino Server version

464

Python version

3.11

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
@hashhar
Member

hashhar commented Nov 20, 2024

Good point. The check should additionally check X-Forwarded-Proto there I guess.

To be sure we should check the JDBC driver for what it does.
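A minimal sketch of what such a relaxed check could look like, assuming the client would be willing to trust an X-Forwarded-Proto header supplied by the caller (whether that is appropriate is exactly what this thread is debating). The function names and the `has_auth` parameter are hypothetical and not part of trino-python-client; only the error message comes from the report above.

```python
from typing import Mapping, Optional

HTTP = "http"
HTTPS = "https"


def forwarded_proto(http_headers: Optional[Mapping[str, str]]) -> Optional[str]:
    """Return the lower-cased X-Forwarded-Proto value, or None if absent.

    Header names are matched case-insensitively.
    """
    for name, value in (http_headers or {}).items():
        if name.lower() == "x-forwarded-proto":
            return value.strip().lower()
    return None


def check_auth_allowed(
    http_scheme: str,
    http_headers: Optional[Mapping[str, str]] = None,
    has_auth: bool = True,
) -> None:
    """Hypothetical variant of the client-side check (not the library's code).

    Raises ValueError when authentication is requested over plain HTTP,
    unless the caller declares the original protocol as HTTPS.
    """
    if not has_auth or http_scheme == HTTPS:
        return
    if forwarded_proto(http_headers) == HTTPS:
        return
    raise ValueError("cannot use authentication with HTTP")
```

With this sketch, `check_auth_allowed("http", {"X-Forwarded-Proto": "https"})` passes, while `check_auth_allowed("http")` still raises the same error the client raises today.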

@hashhar hashhar added the bug Something isn't working label Nov 20, 2024
@hashhar
Member

hashhar commented Nov 27, 2024

Hmmmm, wait a sec. Isn't this a problem only when you connect directly to Trino?
If you are going through the LB then the http_scheme needs to be HTTPS since the connection to LB MUST BE SECURE if that's terminating the TLS.

If you connect directly to a Trino coordinator where HTTPS is terminated by LB then yes I can see this check causing a problem.

Either way we should check how the JDBC driver handles this case.

@FPGAwesome
Author

So my understanding is that when trino is operating under a service mesh which terminates mTLS automatically, you can make a direct http connection but it would be secured. The issue I ran into using the library is the direct connection uses http but it is technically secured, however I can't indicate to the trino client that this is the case. Apologies if I'm misunderstanding something in the networking, this is just from my understanding of service meshes.

@hashhar
Member

hashhar commented Nov 27, 2024

you can make a direct http connection but it would be secured

Do you mean a direct connection to the Trino coordinator from the Python client in the above statement? If yes then I don't see how using mTLS on the LB makes it secured.

If your LB is using mTLS (and terminating it) then the client which connects to LB will need to present a client certificate and connect over HTTPS (there's no mTLS without HTTPS). The LB will then use HTTP to talk to Trino itself but that's not really relevant for the trino-python-client.

Basically the options are:

  1. Client + client-cert --- HTTPS (mTLS) ---> LB --- HTTP ---> Trino (TLS terminated on LB)
  2. Client + client-cert --- HTTPS (mTLS) ---> LB --- HTTPS ---> Trino (pointless, double TLS)
  3. Client --- HTTPS ---> Trino (will work assuming https is enabled on Trino even though you bypass the LB)
  4. Client --- HTTP ---> Trino (will not work since insecure and connection was not passed through from the LB)

@lihonosov

agreeing with @FPGAwesome: the following works with the raw/custom client but does not work with the Trino Python client due to this issue:
[Image: diagram, not preserved]

@hashhar
Member

hashhar commented Nov 27, 2024

can you please share a snippet of how the DB-API connection is being created?

Also please share some details from the failure (stack traces or logs? trino config?).

I'm able to connect to Trino clusters behind a LB terminating TLS using password auth.

Note that in your diagram the "internal app" connecting to Trino over HTTP is considered insecure and hence will not work. However the outside clients will be able to connect when going through the LB which terminates the TLS.

http-server.process-forwarded=true instructs Trino to treat HTTP as safe IF AND ONLY IF the connection was forwarded from something that terminated TLS. This is identified using the X-Forwarded-Proto header and is obviously not true for the Python app you are using inside the mesh since it's not being forwarded and is sending credentials in plaintext over HTTP.
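The rule described above can be written down as a small decision function. This is only a simplified model of the behaviour as stated in this comment, not Trino's actual implementation; the function name and parameters are made up for illustration.

```python
from typing import Mapping


def coordinator_treats_as_secure(
    scheme: str,
    headers: Mapping[str, str],
    process_forwarded: bool,
) -> bool:
    """Simplified model: is an incoming request considered secure?

    scheme            -- protocol of the connection that reaches the coordinator
    headers           -- request headers as received by the coordinator
    process_forwarded -- value of http-server.process-forwarded
    """
    if scheme == "https":
        return True
    if not process_forwarded:
        # Forwarded headers are ignored, so plain HTTP stays insecure.
        return False
    for name, value in headers.items():
        if name.lower() == "x-forwarded-proto":
            # HTTP counts as safe only if the forwarder reports HTTPS.
            return value.strip().lower() == "https"
    return False
```

Under this model, a plain HTTP request without X-Forwarded-Proto (the "internal app" case) is rejected even when process_forwarded is true, while the same request relayed by a TLS-terminating LB that sets the header is accepted.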

Also are you able to use the trino-cli or JDBC driver to access a Trino coordinator which accepts only terminated TLS while bypassing the LB?

@lihonosov

@hashhar please find my answers below:

can you please share a snippet of how the DB-API connection is being created?

import trino

# `request` is assumed to be the incoming HTTP request handled by the app
with trino.dbapi.connect(
    host="https://trino-url",  # HTTPS is required: cannot use authentication with HTTP
    port=443,
    auth=trino.auth.BasicAuthentication("admin", "secret"),
    http_headers={
        "X-Trino-Transaction-Id": request.headers["Transaction-Id"],
        "X-Forwarded-Proto": request.headers["X-Forwarded-Proto"],
    },
    session_properties={"query_max_run_time": request.headers["Timeout"]},
) as connection:
    cursor = connection.cursor()
    cursor.execute("SELECT 1")

I'm thinking of switching to HTTP to avoid extra network hops to the ALB, as all this traffic is internal, and the initial request to the app is proxied through the ALB with TLS. Does it make sense?

Also please share some details from the failure (stack traces or logs? trino config?).

If I change https:443 to http:8080 in the code above, I will get the following error: cannot use authentication with HTTP

I'm able to connect to Trino clusters behind a LB terminating TLS using password auth.

Yes, it works if I use my own implementation (POST and GET requests), but the trino-python-client library fails

Note that in your diagram the "internal app" connecting to Trino over HTTP is considered insecure and hence will not work. However the outside clients will be able to connect when going through the LB which terminates the TLS.

In this case, the 'internal app' is behind a load balancer with TLS and is not exposed to the public

http-server.process-forwarded=true instructs Trino to treat HTTP as safe IF AND ONLY IF the connection was forwarded from something that terminated TLS. This is identified using the X-Forwarded-Proto header and is obviously not true for the Python app you are using inside the mesh since it's not being forwarded and is sending credentials in plaintext over HTTP.

Why? 🤔 The requests to the app are coming from the ALB and are being proxied to the app

Also are you able to use the trino-cli or JDBC driver to access a Trino coordinator which accepts only terminated TLS while bypassing the LB?

I set up port forwarding to a coordinator that only accepts terminated TLS, bypassed the load balancer, and tried to connect using the JDBC driver. However, it fails with a 403 error (because there is no equivalent to X-Forwarded-Proto in JDBC properties?):

Error starting query at http://localhost:8080/v1/statement returned an invalid response: JsonResponse{statusCode=403, headers={content-length=[60], content-type=[text/plain], date=[Mon, 02 Dec 2024 22:37:11 GMT], vary=[Accept-Encoding]}, hasValue=false} [Error: Error 403 Forbidden: Authentication over HTTP is not enabled].

However, raw POST requests with the X-Forwarded-Proto header and GET requests work
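For reference, a raw request of the kind described here could be built roughly as follows. This is a sketch using only the standard library, not the code actually used in this report; the helper name, host, and credentials are placeholders, and the request is only constructed, not sent.

```python
import base64
import urllib.request


def build_statement_request(
    base_url: str,
    sql: str,
    user: str,
    password: str,
    forwarded_proto: str = "https",
) -> urllib.request.Request:
    """Build a POST to /v1/statement with Basic auth and X-Forwarded-Proto."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        # Declares the protocol of the original, TLS-terminated connection.
        "X-Forwarded-Proto": forwarded_proto,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/statement",
        data=sql.encode("utf-8"),  # the request body is the SQL text
        headers=headers,
        method="POST",
    )


# Placeholder values; nothing is sent until the request is opened.
statement_request = build_statement_request(
    "http://localhost:8080", "SELECT 1", "admin", "secret"
)
```

Passing `statement_request` to `urllib.request.urlopen` would submit the query; the JSON response's `nextUri` is then polled with GET requests until the results are complete.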

I'll take another look later, as I'm not sure if I understood your questions correctly and might be missing something, but I hope my answers help clarify things

Development

No branches or pull requests

3 participants