Delete old pipeline logs after X days or Y new runs #1068
Comments
Indeed, an option to manually delete pipeline runs (for example, after debugging the CI config on a testing branch) would be nice to have. |
I would like to add support for automatic deletion of old pipelines. My plan would be the following:
|
I don't get it. Why not keep the API first approach? Why would you hardcode this feature instead of a reusable API? |
How should any other users benefit from this feature, will you open-source it as a service? If the plan is to introduce it as an internal feature later on, wouldn't those new api endpoints be unnecessary? I guess we don't plan to call the api endpoints from the code? |
Still don't understand your point...
Of course not. Why should I create an external service? I just said, let's keep the PRs small, implement the API first and let's create a second PR that implements the UI components based on the API. I would have also added a CLI command to e.g. run Nobody has said that we don't want proper UI integration. As I wrote in the PR, the API can be called from the UI (that's how the UI works in general, right?). The API can also be called from the CLI (the cli client uses the go sdk, which only makes simple API calls). If this is not true, why do we provide APIs at all? Creating an internal-only function prevents manual cleanups whenever a user wants to do so with any scope they want to clean up. If you don't plan to create a reusable API first, what should the UI implementation even look like? Nevertheless, I thought Woodpecker is following an api-first approach. This means that users are able to write their own CLI clients or even their own web front-ends if they want to. That's why I don't understand why you don't want a public API (especially if such exists already that exposes a lot of Pipeline/Repo related operations). |
I thought about adding an API in the beginning as well. However, automatic deletion / retention of pipelines has to be integrated into some kind of routine. At first the idea was to have it executed by a cron scheduler, but then someone in chat suggested simply removing old pipelines each time a new pipeline is created, which seems pretty smart because it avoids checking inactive repos on large instances. This would make an API for this kind of obsolete. The main remaining question, from what I read, is whether a "keep x pipelines" setting or a "keep pipelines not older than x days" setting is more useful for the majority of users. "Keep x pipelines" seems nice, as projects often have different amounts of activity over time. Based on our UI, most users are probably looking at something like the last 10 pipelines. So keeping the 100 latest pipelines could do the job. If there is no activity on a project for a year or so, you would still have some pipelines once you come back to it. Keeping pipelines for x days could make more sense for legal requirements, but I haven't heard of this being a thing for CI pipelines yet. |
You still just ignore my main points.
Is not answered.
No, because users should still have the ability to delete pipelines whenever they want and in whatever amount they want. I can't understand why you try to avoid an API at all costs. However, I have outlined and explained my points pretty clearly multiple times now. |
I have chosen the time-based approach for the API because it can handle "Keep x pipelines" as well. If we prefer this for e.g. the Woodpecker UI, just do:
If some people want to use a time-based approach, they can use the same API, while doing it the other way around (implementing an API to "Keep x pipelines") prevents any time-based (e.g. older than 30d) deletion. In a real-world scenario, IMO time-based is preferred in general because "Keep x pipelines" has a drawback. Let's say someone is spamming your repo with PRs (by accident or as an "attack") while "Keep 100 pipelines" is configured. You will lose your build history in seconds (yes, one could enable required approvals). |
To achieve the goal "clean up the pipeline list / database to not fill up the disk", most users probably don't want to create an external service / cron job. Therefore a setting like the following, which enables an automatic cleanup, would be sufficient for the majority: If a user still has specific needs the |
You are making assumptions about what users want instead of letting them decide for themselves. As I am also a user, I can tell you that I do not want such a setting, and it would not help me at all.
I tried to find a middle ground and clearly showed how this can also be achieved with the proposed API. I was also willing to do exactly what you want for the web UI (just using APIs instead)... That way everyone would get something out of it: you get the setting in the UI, and we get an API. However, I'm out at this point 👋. |
What a hot discussion! It's funny and sad at the same time... TLDR: go on and implement it in your way. Who wants API/whatever will implement it on top of your service.
I doubt it is the main reason. The main one is avoiding a scheduling service, I think. It is and it was. I don't understand why though...
The second one is: why should one implement functionality that is orthogonal to what one wants/needs (requirements)?
Note automatic.
Who? Me? Manually? ^ Automatic.
What a beautiful definition of a user!
Exactly! I'm in that bucket of users. We do not consume cool Application Programming Interfaces, nor write frontends. We use GUI (CLI at worst).
Don't want it? Don't use it. It would not help me either, but it might help somebody. Anbraten is going to write an internal service (file/class) to clean up logs. That service will be triggered by pipeline creation (I prefer cron, as you know :). There will be a retention policy setting in the GUI.
But it's a different task. |
You missed the entire point...
|
|
Having an API to delete pipelines is a different feature than having a UI setting to do it? Why do we have an API to delete a Repo if we have a button in the UI to do it? And what is the red button "Delete Repo" doing? It calls the DeleteRepo API. And that's exactly what I'm talking about. |
I agree the UI might be the wrong place to automate it. |
Having an API to delete pipelines is a different feature than having a UI to set a retention policy / set up a trigger. Also, I've just read through #3506:
And this issue title:
I understood it. If Anbraten just wanted a button to delete pipeline/s (or logs? ;), then your points would be absolutely valid. |
You are right.
If that had been the main problem I could understand it as well, but nobody said anything like this, neither in the PR nor in this discussion. Anyway, I agree it might be better to separate these topics. |
I've been thinking for a while... Hooking into the pipeline lifecycle is fine if we could decouple it. Instead of calling directly
Looking forward to the implementation of this feature, because I could also apply this approach to cleaning up agents. |
Clear and concise description of the problem
From https://codeberg.org/Codeberg-CI/feedback/issues/63:
It would be nice to have an option to delete old logs of pipeline entries. Either automatic (delete logs of successful runs after X days or Y new runs?) or with some sort of checkbox selection and action?
Suggested solution
An option that automatically deletes old pipeline logs
Alternative
No response
Additional context
No response
Validations