Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed webhook setup for work with revers proxy #197

Merged
merged 2 commits into from
Apr 1, 2024
Merged

Fixed webhook setup for work with revers proxy #197

merged 2 commits into from
Apr 1, 2024

Conversation

kvazimoda24
Copy link
Contributor

No description provided.

@J-Rios
Copy link
Owner

J-Rios commented Mar 30, 2024

Hi, thanks for the contribution :)

I get an idea of why there is an issue for setup the Bot to be used through a reverse proxy, but please is a good practice to give an explanation on the Pull Request decription for current issue and your proposed solution to allow better understand of it.

Also, some change requests:

  • Update README file with information on how to setup the Bot to use a Proxy.

  • Instead of modifying current webhook settings to use a full URL, could you please just create a new setting parameter, something like "PROXY_URL" (in both, settings.py and constants.py), default to "None" value, and in case this variable is set, use it instead of the other settings to setup the webhook_url variable (in tlg_app_run() function of join_captcha_bot.py).

Something like:

imagen

Regards.

@kvazimoda24
Copy link
Contributor Author

According to the documentation of the python-telegram-bot library, there is a parameter webhook_url — this is a string that is completely transmitted to the Telegram servers, and at this address the Telegram server will pull our bot.
The remaining parameters concern only our bot and are not transmitted to the Telegram servers (with the exception of secret_token)
And at first glance, it seems that these parameters are somehow connected, but when working through a reverse proxy, it turns out that both the port and the path may differ. The hostname is not used anywhere at all.
Also, in the code that was before my edit, the webhook_url used the token of the bot itself, which is unsafe, because this token will go to the reverse proxy server log.

Based on what was described above, I decided to simply duplicate the library parameters, because it's concise and logical.
We indicate which link to send to the Telegram server, indicate to the bot on which port to wait for the connection and which path to use. If we launch the bot without a reverse proxy, then we simply need to manually enter the port and path in WEBHOOK_URL. It seems to me more logical than creating separate parameters PROXY_URL and PROXY_URL_PATH, which work in a completely unobvious way. Perhaps, to make it more clear, it is worth finalizing the documentation. Your option seems overcomplicated and illogical to me.

@J-Rios
Copy link
Owner

J-Rios commented Mar 31, 2024

Well the problem is that your changes modifies and breaks how the user setup the Bot for Webhook when no proxy is used, this is why I prefer to just check with an if statement if the PROXY_URL variable is set, so in that case we apply your suggestion of single setting for the full URL, otherwise (for no proxy usage) we kept the same procedure and settings variables.

Also forgot to question: did you test your solution, and were you able to receive Telegram events using an intermidiate proxy?

@J-Rios
Copy link
Owner

J-Rios commented Mar 31, 2024

mmm, I think I get a bit more of your point, taking a look into intenal PTB (python-telegram-bot) library implementation it seems that there is 2 ways for setting up the webhook:

A. Only using "listen", "port" and "url_path" arguments (that will intenally create the "web_url" in the way "https://listen:port/url_path").

B. Using "web_url", "listen", "port" and "url_path" arguments (web_url won't be created by python-telegram-bot; and this allow to setup the proxy address here).

So your proposal is to use option B, so "web_url" will be the Proxy URL were Telegram Server will talk with it, then "listen" and "port" values are from the system that runs the Bot and not the Proxy, is that right?

@kvazimoda24
Copy link
Contributor Author

Also forgot to question: did you test your solution, and were you able to receive Telegram events using an intermidiate proxy?

Yes, this code is working for me right now. Public bot @kvazimoda24_capcha_bot

@kvazimoda24
Copy link
Contributor Author

I want to point out that it is not safe to use a bot token in an address. We definitely need to get rid of this. And this change will already break compatibility. And since we are breaking compatibility, then it’s normal to do it, and not these half measures. Launching a bot is not something simple, like copying a file from a flash drive to a computer. And this requires a certain competence from a person. We have a saying in Russia: “Make a product that any idiot can use, and only an idiot will use it.” Sometimes I think you think the people who use this bot are fools who can't understand the purpose of a couple of options. :)
Actually, I don’t see a problem in making an adequate change, even if it breaks backward compatibility. But the behavior of the code will now be predictable. Before my edits, until I looked into the code myself, I did not know that the bot was shining its token in the webhook_url. I understand that the names of the variables in the config may not be obvious. Therefore, I propose to refine this point. For example, named the variable WEBHOOK_LISTEN_PORT. And, of course, finalize the documentation to describe this in as much detail as possible. We can also describe in Changelog why we had to break backward compatibility. That this is done for the sake of safety and predictable behavior.

@kvazimoda24
Copy link
Contributor Author

My logic in this matter is simple: we set a link that we will transfer to the Telegram server (webhook_url), and setting which port to listen to and which path (url_path and port).
If we work without a reverse proxy, then we duplicate the port and path in webhook_url.
If with a reverse proxy, the values ​​may vary.
And further describe in the documentation that using a token as a url_path is an extremely bad idea.

@kvazimoda24
Copy link
Contributor Author

And I'll add more. I use a lot of different software. These are JetBrains YouTrack, and NextCloud, and Passwork... And everywhere there is an analogue of the webhook_url setting, which tells the application how it “looks” from the client side when working through a reverse proxy. There are also separate setting which port to listen to. All this means is that I didn’t come up with such a setup scheme out of thin air. This is common practice.

@J-Rios
Copy link
Owner

J-Rios commented Apr 1, 2024

So your proposal is to use option B, so "web_url" will be the Proxy URL were Telegram Server will talk with it, then "listen" and "port" values are from the system that runs the Bot and not the Proxy, is that right?

In that case the best solution will be to keep all the next settings and setup the webhook as follow:

app.run_webhook(
    listen=CONST["WEBHOOK_HOST"],
    port=CONST["WEBHOOK_PORT"],
    url_path=CONST["WEBHOOK_PATH"],
    webhook_url=CONST["WEBHOOK_URL"],
    key=CONST["WEBHOOK_CERT_PRIV_KEY"],
    cert=CONST["WEBHOOK_CERT"],
    secret_token=CONST["WEBHOOK_SECRET_TOKEN"],
    drop_pending_updates=True,
    allowed_updates=Update.ALL_TYPES
)

@kvazimoda24
Copy link
Contributor Author

Using the WEBHOOK_HOST variable for the listen parameter is a bad idea. listen defines the address that the bot's WEB server will listen to, and the hostname may resolve either to the wrong address or not resolve at all. If you want to set the listen parameter, then it is better to rename the variable to something like WEBHOOK_IP or WEBHOOK_LISTEN_IP.

@J-Rios
Copy link
Owner

J-Rios commented Apr 1, 2024

In this context "Host" doesn't mean "Hostname", it is expected for the Host address, but I agree that can be renamed to "WEBHOOK_IP" or "WEBHOOK_ADDRESS" for a better understanding.

It could be set to fixed "0.0.0.0" in the settings.py by default.

@kvazimoda24
Copy link
Contributor Author

Anything that can be misunderstood will be misunderstood. Therefore, it is better to actually use either "ADDRESS" or "IP". In the end, it looks like we have reached a consensus.

@kvazimoda24
Copy link
Contributor Author

I don't have much experience with Github. Tell me, should I make changes to my pull request?

@J-Rios
Copy link
Owner

J-Rios commented Apr 1, 2024

Sure, feel free to add new commits according to this whenever you want ;)

@J-Rios
Copy link
Owner

J-Rios commented Apr 1, 2024

I don't have much experience with Github. Tell me, should I make changes to my pull request?

You can just add new changes and commits to your fork branch, then when you push, that changes should automatically update and appear in this Merge Request.

Of course, if you have any issue you can open a new Pull Request, but feel free to use current one as test scenario for you to test and check how Github work in this regards ;)

@kvazimoda24
Copy link
Contributor Author

I did as agreed. The descriptions seem to have been corrected everywhere too.

Copy link
Owner

@J-Rios J-Rios left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, all good!

Thanks a lot, I'm going to merge now.

Best regards.

@J-Rios J-Rios merged commit 6ff4b43 into J-Rios:master Apr 1, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants