
Initial structure #9

Merged 8 commits into v0.x.x on Mar 18, 2024

Conversation

@Marenz (Contributor) commented Feb 28, 2024

  • Prepare dependencies for dispatch
  • Fix labeler path
  • Add initial structure

@github-actions bot added labels on Feb 28, 2024: part:docs (Affects the documentation), part:tests (Affects the unit, integration and performance (benchmarks) tests), part:tooling (Affects the development tooling (CI, deployment, dependency management, etc.))
@Marenz Marenz force-pushed the beginnings branch 3 times, most recently from d67c8a0 to 364f7cb on February 28, 2024 16:43
@Marenz Marenz added the cmd:skip-release-notes It is not necessary to update release notes for this PR label Feb 28, 2024
@Marenz Marenz force-pushed the beginnings branch 2 times, most recently from f541548 to fb2655f on March 7, 2024 10:41
@Marenz Marenz marked this pull request as ready for review March 11, 2024 17:31
@Marenz Marenz requested a review from a team as a code owner March 11, 2024 17:31
@shsms commented Mar 12, 2024

Could you describe the motivation for this change? There's from frequenz import dispatch now? Aren't we supposed to have only namespace packages under frequenz, like frequenz.{actor, client, api, etc.}, and put code under those?

Or is this going to be a completely independent thing like frequenz.channels?

@Marenz (Contributor, Author) commented Mar 12, 2024

The structure is a kind of experiment that Luca suggested I try.
Basically, since the user shouldn't interact directly with the actor, the high-level interface should just directly offer the dispatch arrived/ready channels, as shown in https://github.com/frequenz-io/frequenz-apps-draft/blob/main/dispatch_exec/dispatch.py

(Several resolved review threads on src/frequenz/dispatch/actor.py)
Comment on lines 137 to 144
while True:
    next_time = self.calculate_next_run(
        dispatch, datetime.now().replace(tzinfo=timezone.utc)
    )
    if next_time is None:
        logging.info("Dispatch finished: %s", dispatch)
        self._scheduled.pop(dispatch.id)
        return
maybe more readable with:

while next_time := self.calculate_next_run(dispatch, datetime.now(tz=timezone.utc)):
    ...

# and after the loop
logging.info("Dispatch finished: %s", dispatch)
self._scheduled.pop(dispatch.id)

@Marenz (Contributor, Author) replied Mar 12, 2024
Then I can't easily reuse the now timestamp with your suggestion, but I modified it a bit:

        def next_run_info() -> tuple[datetime, datetime] | None:
            now = datetime.now(tz=timezone.utc)
            next_run = self.calculate_next_run(dispatch, now)

            if next_run is None:
                return None

            return now, next_run

        while pair := next_run_info():
            now, next_time = pair

            if next_time - now > _MAX_AHEAD_SCHEDULE:
                await asyncio.sleep(_MAX_AHEAD_SCHEDULE.total_seconds())
                continue

            _logger.info("Dispatch %s scheduled for %s", dispatch.id, next_time)
            _logger.info("Sleeping for %s", next_time - now)
            await asyncio.sleep((next_time - now).total_seconds())

            _logger.info("Dispatch ready: %s", dispatch)
            await self._ready_dispatch_sender.send(dispatch)

        _logger.info("Dispatch finished: %s", dispatch)
        self._scheduled.pop(dispatch.id)

@shsms commented Mar 12, 2024

> The structure is a kind of experiment that luca suggested to me to try.
> Basically, as the user shouldn't interact directly with the actor, the high-level interface should just directly offer the dispatch arrived/is ready channels like shown in https://github.com/frequenz-io/frequenz-apps-draft/blob/main/dispatch_exec/dispatch.py

Then maybe we keep frequenz.actor.dispatch as it is, and add frequenz.dispatch separately, such that frequenz.dispatch can eventually go into the SDK?

Either way, the reason for the change is not obvious at all, so it would be worth the effort to explain it in the commit message and in the PR description.

@Marenz (Contributor, Author) commented Mar 12, 2024

I don't see it so much as a "change of structure", given that previously there was basically nothing except empty or dummy files ;)

Anyway, I found what I was missing. The structure change is an attempt to test the rewrite of the structure according to https://github.com/frequenz-floss/frequenz-sdk-python/wiki/APIs-stack-and-repositories-structure#completely-rewrite-of-the-structure

@Marenz (Contributor, Author) commented Mar 12, 2024

  • Added link to structure rewrite description in commit
  • implemented proper scheduling with recurrence based on python-dateutils
  • can use more tests, but that can be added later (test suggestions welcome, part from the create/delete/update tests that are there/planned)
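As a rough illustration of the recurrence scheduling mentioned above, here is a stdlib-only stand-in for the simplest case, a daily recurrence. The function name and logic are illustrative only; the PR itself uses python-dateutil's rrule for the general case of arbitrary frequencies and intervals.

```python
from datetime import datetime, timedelta, timezone

def next_daily_run(start: datetime, now: datetime) -> datetime:
    """Return the first occurrence of a daily recurrence at or after `now`.

    A minimal stand-in for what python-dateutil's rrule computes in the
    general case (arbitrary frequencies and intervals).
    """
    if now <= start:
        return start
    # Number of whole periods that have already elapsed since the start.
    periods = (now - start) // timedelta(days=1)
    candidate = start + periods * timedelta(days=1)
    if candidate < now:
        candidate += timedelta(days=1)
    return candidate
```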

@Marenz Marenz force-pushed the beginnings branch 2 times, most recently from fd6c777 to d24ec9f on March 13, 2024 18:04
(Resolved review thread on README.md)
Marenz added 4 commits March 18, 2024 11:30
For this repository, we decided to apply the structure rewrite outlined in the following
document:

https://github.com/frequenz-floss/frequenz-sdk-python/wiki/APIs-stack-and-repositories-structure#completely-rewrite-of-the-structure

Signed-off-by: Mathias L. Baumann <[email protected]>
Marenz added 3 commits March 18, 2024 11:31
Also add some tests for it.

Signed-off-by: Mathias L. Baumann <[email protected]>
@shsms commented Mar 18, 2024

Looks good to me.

"""A highlevel interface for the dispatch API.

This class provides a highlevel interface to the dispatch API. It
allows to receive new dispatches and ready dispatches.


Suggestion:

This class provides a highlevel interface to the dispatch API. It 
allows for receiving of new dispatches, and dispatches labelled as 'ready'.

I would also suggest adding a small blurb about the difference between the two.
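A hypothetical sketch of such a blurb follows; the wording is illustrative, and the "new" vs. "ready" semantics are inferred from the discussion in this thread, where a ready dispatch is one whose scheduled start time has been reached.

```python
class Dispatcher:
    """A high-level interface for the dispatch API.

    Provides two channels to receive dispatches from:

    - new dispatches: delivered as soon as a dispatch is created or
      updated in the dispatch API,
    - ready dispatches: delivered when a dispatch's scheduled start
      time has been reached and it should be executed now.
    """
```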

) -> None:
"""Update the schedule for a dispatch.

Schedules, reschedules or cancels the dispatch based on the start_time


Could you clarify how this happens in this doc?

For example, "- If the start time of the new dispatch is different from the old one, the old one is cancelled."

@stefan-brus-frequenz left a comment

Added a couple of optional comments about the docs; the code overall LGTM, but my Python knowledge is limited.

@Marenz Marenz enabled auto-merge March 18, 2024 16:55
@Marenz Marenz added this pull request to the merge queue Mar 18, 2024
Merged via the queue into frequenz-floss:v0.x.x with commit 40456d6 Mar 18, 2024
14 checks passed
@Marenz Marenz deleted the beginnings branch March 18, 2024 17:17
@llucax (Contributor) left a comment

OK, quite a few comments, in particular about the design.

Important

Please let's not discuss the topics here, as discussion could get lengthy. I will create separate issues (or PRs where possible) and link them when I'm done.

(Resolved review threads on pyproject.toml, src/frequenz/dispatch/__init__.py and src/frequenz/dispatch/actor.py)
self._schedule_task(dispatch)
)

async def _schedule_task(self, dispatch: Dispatch) -> None:
Contributor:

My mental picture of how this should have worked was, instead of having one task per dispatch, to have just one task that calculates the next run for all dispatches, picks the one that needs to run next, sleeps only for that time, sends the ready dispatch through the channel, and then calculates the next dispatch to be run again.

I think this scheme should be simpler and more memory efficient, as you don't need one task per dispatch. Do you see any advantages with the current approach?

I guess the current approach might be a bit more CPU efficient, as the next run for all dispatches doesn't need to be recalculated every time a dispatch is ready, but this could be optimized fairly easily by keeping the next-run times for all dispatches in a sorted list and only recalculating the next run for the dispatch that was just ready. Then we just need to invalidate this list each time dispatches are changed.
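A minimal sketch of this single-task scheme, assuming hypothetical names (SingleTaskScheduler, on_ready, and the integer dispatch IDs are all illustrative): a min-heap keyed by next run time plays the role of the sorted list, and one task sleeps only until the earliest dispatch is due.

```python
import asyncio
import heapq
from datetime import datetime, timedelta, timezone

class SingleTaskScheduler:
    """One task for all dispatches: sleep only until the earliest next run."""

    def __init__(self) -> None:
        # Min-heap of (next_run, dispatch_id); must be rebuilt (invalidated)
        # whenever dispatches are created, updated or deleted.
        self._heap: list[tuple[datetime, int]] = []

    def add(self, dispatch_id: int, next_run: datetime) -> None:
        heapq.heappush(self._heap, (next_run, dispatch_id))

    async def run(self, on_ready) -> None:
        while self._heap:
            next_run, dispatch_id = self._heap[0]
            delay = (next_run - datetime.now(tz=timezone.utc)).total_seconds()
            if delay > 0:
                # Sleep only until the earliest dispatch is due; a real
                # implementation would also wake up when the heap changes.
                await asyncio.sleep(delay)
            heapq.heappop(self._heap)
            await on_ready(dispatch_id)
            # A real implementation would recalculate the next run for only
            # this dispatch here and push it back onto the heap if it recurs.
```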

Contributor:

This one might be too much work to implement, so I think it is better to discuss it first. It is probably better to sense the general feeling here before creating an issue, if it's not a trivial "let's go with this approach" (or let's NOT), so please go ahead and leave comments here.

PR author (@Marenz):

I think my only concern with your suggestion is that it might accidentally skip dispatches that are very close together, though I can't come up with a good explanation of why it would; it's just a gut feeling to pay extra attention to that area when implementing it.

I was also wondering whether awaiting the send might take a bit of time and thus delay the notification for the next very close dispatch. What if we waited longer than the next dispatch's scheduled time, how do we make sure we don't skip it?

Contributor:

> I think my only concern with your suggestion is that it might accidentally skip dispatches that are very close together, though I can't come up for a good explanation why it would, just an .. gut feeling to pay extra attention on that area when implementing it..

If done properly, I don't think it should. I didn't invent this; I've seen this (or a similar approach) suggested in libev as a very efficient way to implement timers (see point 4).

> I was also wondering if the awaiting for sending might take a bit of time and thus delays the notification for the next very close dispatch to be executed. What if we waited longer than the next dispatches scheduled time, how do we make sure we don't skip it?

If awaiting the send takes a long time, I guess the next notification will be delayed too, as it will use the same channel to notify, right? We could do the send in a separate task, but then we end up with many tasks again. I think sends will be serialized in any case because they operate on the same channel, so using a different task probably wouldn't have any real effect.

(Resolved review thread on src/frequenz/dispatch/actor.py)
interval=dispatch.recurrence.interval,
)

return cast(datetime | None, next_run.after(_from, inc=True))
Contributor:

Is this cast needed because dateutil lacks type info or what?

Contributor:

This one is too small to create an issue. So also please comment here.

PR author (@Marenz):

Yah, I was getting:

    src/frequenz/dispatch/actor.py:279: error: Returning Any from function declared to return "datetime | None"  [no-any-return]

Contributor:

OK, it is probably best to add a comment when using cast, so readers don't have to guess in the future.
