Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(backend-python): add basic telestion library for python #423

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

cb0s
Copy link
Member

@cb0s cb0s commented Mar 13, 2024

Summary

Add Telestion libraries for creating backend libraries in Python.

Details

The provided tools should - like all other backends - provide necessary tooling for parsing and combining various config methods and start the nats client.

Additional information

For testing the backend, a vitual environment with Python 3.12 is recommended (earlier versions might not work). The necessary requirements can be downloaded from the requirements.txt.

Related links

CLA

  • I have signed the individual contributor's license agreement and sent it to the board of the WüSpace e. V. organization.

cb0s added 4 commits March 13, 2024 21:11
@pklaschka
Copy link
Member

Please also adjust the .github/CODEOWNERS before we merge this – I would rather not own the Python library, considering that the last time I did major work in Python was over two years ago 🙈.

@cb0s
Copy link
Member Author

cb0s commented Mar 16, 2024

Please also adjust the .github/CODEOWNERS before we merge this – I would rather not own the Python library, considering that the last time I did major work in Python was over two years ago 🙈.

We already set this up when we committed Add a CODEOWNERS file.

@pklaschka
Copy link
Member

Please also adjust the .github/CODEOWNERS before we merge this – I would rather not own the Python library, considering that the last time I did major work in Python was over two years ago 🙈.

We already set this up when we committed Add a CODEOWNERS file.

image

Interesting... For some reason, GitHub seems to believe I'm the owner 🤔

@pklaschka
Copy link
Member

image
Ah, wait... It's only because it includes documentation (the README.md) – disregard what I said 🙈

Copy link
Member

@pklaschka pklaschka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preliminary review – see inline comments.

Comment on lines 32 to 34
def build_config(clazz: type[_TelestionConfigT] = None) -> _TelestionConfigT:
if clazz is None:
clazz = TelestionConfig
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why default to None instead of TelestionConfig if you re-set it immediately afterwards?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because in Python function default arguments are retained between executions. Yes, that's a feature, not a bug.

https://docs.python.org/3/faq/programming.html#why-are-default-values-shared-between-objects

backend-python/src/telestion/backend/config.py Outdated Show resolved Hide resolved
if '://' in url:
_, url = url.split('://', 1)

return f"nats://{config.username}:{config.password}@{url}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Breaks if the username contains a colon (e.g., database:backup)

return json.loads(msg, **loads_kwargs)


def _prepare_nats_url(config: TelestionConfig) -> str:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I think we should look at pre-existing URL handling APIs (may that be in the Python core or a library), due to the numerous potential attack vectors currently present in the code.

backend-python/src/telestion/backend/lib.py Outdated Show resolved Hide resolved
backend-python/README.md Outdated Show resolved Hide resolved
fussel178 and others added 3 commits December 21, 2024 18:52
Git tries to mimic CRLF files on Windows by changing the line endings on checkout and silently removes them on commit.

TLDR:

On Windows, please set:

```
git config --global core.autocrlf true
```

see https://docs.github.com/en/get-started/getting-started-with-git/configuring-git-to-handle-line-endings
@cb0s cb0s requested review from pklaschka and fussel178 December 23, 2024 00:02
@cb0s cb0s marked this pull request as ready for review December 23, 2024 00:06
@cb0s
Copy link
Member Author

cb0s commented Dec 23, 2024

One of the behaviour tests is still failing, trying to fix that tomorrow.
I would love to get some general feedback though that we can finally merge this tomorrow as an early X-Mas present to us.

…custom arguments in `build_config()` and fix tests
Copy link
Member

@fussel178 fussel178 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preliminary review, I currently don't have the time to further look into the config.py. Separate review is following.

Good first implementation. I've added my comments below.

Some things we need to consider:

  • we need to write some documentation with example usages to get a feeling how we want to use the library
  • deployment: maybe add a pyproject.toml specified in PEP518 and if it works with our special environment (project root ≠ git root)

# ])
# asyncio.run(main())
loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.wait_for(main(), 1)) # fail automatically after one second
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, 1 second is a little bit short especially on a lower end test machine.
I would generously increase this to 10 seconds.
And maybe extract it into global constant, e.g. RUN_TIMEOUT or so.

Unrelated question:
Should we add an execution timeout to the container.run command in backend-features/lib/testbed.py?

result = context.docker.containers.run(
context.testbed,
cmd,
environment=env,
auto_remove=True,
network=nats_network_name(context),
).splitlines()[-1].decode()

nats_connection_options = {
"allow_reconnect": False, # only try to connect once
"max_reconnect_attempts": 0, # test
"connect_timeout": 1, # fail after 1 sec.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe extract it to global constant? (e.g. NATS_CONNECT_TIMEOUT)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With our new E2E test I think we don't this folder anymore. 😉

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you put the test and some other example usages in a separate folder samples?

Maybe structured like the backend-go or backend-deno projects?

Comment on lines +113 to +119
if config.nats_user is None or config.nats_password is None:
return url

if '://' in url:
_, url = url.split('://', 1)

return f"nats://{config.nats_user}:{config.nats_password}@{url}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To prevent URL injection and other stuff can you use the urllib.parse.urlparse function to parse the nats_url input?
Then, create another urllib.parse.ParseResult object by copying over the important parts from urllib.parse.urlparse and inserting username and password here.
Finally, you can return a (hopefully) valid URL with urllib.parse.ParseResult.geturl().

Notice:

The urllib.parse function does not validate the user input!
Maybe we need to add this to the doc comments somewhere...

cf. https://docs.python.org/3/library/urllib.parse.html

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking if it would be nice to export the start_service function explicitly. 🤔

So you don't need to import start_service from telestion.backend.lib but from telestion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move the telestion folder with the library code from src into the project root?

The nats.py lib does it, too. (https://github.com/nats-io/nats.py)

Comment on lines +1 to +8
# Telestion Python Backend

This repository contains everything to build Telestion backend services in Python.
Additional examples help in setting everything up.

The Python backend is mainly added to allow for easier interfacing with many scientific libraries, such as numpy,
scipy, tensorflow or pytorch.
Creating static graphs with Matplotlib that can be rendered in the frontend could also be created.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Telestion Python Backend
This repository contains everything to build Telestion backend services in Python.
Additional examples help in setting everything up.
The Python backend is mainly added to allow for easier interfacing with many scientific libraries, such as numpy,
scipy, tensorflow or pytorch.
Creating static graphs with Matplotlib that can be rendered in the frontend could also be created.
# Telestion Service Framework for Python
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.10407142.svg)](https://doi.org/10.5281/zenodo.10407142)
![GitHub License](https://img.shields.io/github/license/wuespace/telestion)
This library provides a framework for building Telestion services in Python.
## Usage
```python
import asyncio
from telestion import start_service
async main():
service = await start_service(nats_disabled=True)
await service.publish('hello', lib.json_encode({'foo': 'bar'}))
await service.close()
if __name__ == '__main__':
asyncio.run(main())
```
TODO: Add short description of example code.
## Behavior Specification
The behavior of this library is specified in
the [Behavior Specification](https://docs.telestion.wuespace.de/Backend%20Development/service-behavior/).
This specification is also used to test the library.
The source code of the tests can be found in the repository under `/backend-features`.
## License
This project is licensed under the terms of the [MIT license](LICENSE).

(adapted from Deno backend)

@@ -6,7 +6,7 @@ Feature: Development mode
Given I have no service configuration
When I start the service with "--dev" without NATS
Then the service should start
And the service should be configured with "NATS_URL" set to "localhost:4222"
And the service should be configured with "NATS_URL" set to either "localhost:4222" or "nats://localhost:4222"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point.
We need to test that every backend can handle both the full url and only the hostname+port part correctly.
But I think that's the scope of another pull request.

Can you please revert that change and implement proper handling of NATS_URL?

@@ -18,4 +18,4 @@ Feature: Development mode
Then the service should start
And the service should connect to NATS
And the service should be configured with "DATA_DIR" set to "/tmp"
And the service should be configured with "NATS_URL" set to "nats:4255"
And the service should be configured with "NATS_URL" set to either "nats:4255" or "nats://nats:4255"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. 😉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants