Adding cost script to create a costs summary #26

Open
wants to merge 4 commits into
base: main

Conversation

ksinghal28
Member

Created a cost script that parses the costs TSV file and adds up costs for the same task.
Edited the Dockerfile to add cost_script.py to it.
Edited the requirements.txt file so that the Docker image also installs numpy, pandas, and regex.

scripts/cost_script.py
@@ -110,6 +110,16 @@ This functionality is also wrapped into estimate\_billing.py under the
I'd still run these separately just to have both, but if you're only
after the CSV this may be more convenient.

# cost\_script.py

This is a script to be used on the costs tsv.
Member

suggested edit:
`Takes the output of costs_json_to_csv.py and collapses tasks that have been rerun (due to failure or preemption) and those that have been split into shards, giving one cost for the entire task.`

Member Author

As of now, I have only collapsed tasks that were split into shards, not ones that were recorded as 'retry'.
So, if there were rows such as:
doBqsr.bqsr_shard-14_retry1
doBqsr.bqsr_shard-14
doBqsr.bqsr_shard-13_retry1
doBqsr.bqsr_shard-13
I collapsed them into:
doBqsr.bqsr
and
doBqsr.bqsr_retry1

That said, if we want to collapse all 4 of those into doBqsr.bqsr I can change the code to do that.

I'll also edit my comment with what you mentioned and make it clearer!
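
For reference, here is a minimal sketch of the collapsing behaviour described above: shard indices are dropped, while any `_retry` suffix is kept so retries stay separate. It is not the code from cost_script.py. The function names, the `(task_name, cost)` row layout, and the cost values are all assumptions made for illustration, and it uses only the standard library rather than pandas.

```python
import re
from collections import defaultdict

# Matches the "_shard-<N>" piece of a task name, e.g. "_shard-14".
# Assumption: shard names always follow this pattern, as in the rows above.
SHARD_PATTERN = re.compile(r"_shard-\d+")


def collapse_task_name(name):
    """Drop the shard index from a task name, keeping any retry suffix."""
    return SHARD_PATTERN.sub("", name)


def collapse_costs(rows):
    """Sum costs over (task_name, cost) pairs after removing shard indices.

    Hypothetical helper: the real TSV may carry more columns than this.
    """
    totals = defaultdict(float)
    for name, cost in rows:
        totals[collapse_task_name(name)] += cost
    return dict(totals)


# Made-up costs, purely for illustration.
example_rows = [
    ("doBqsr.bqsr_shard-14_retry1", 0.5),
    ("doBqsr.bqsr_shard-14", 1.0),
    ("doBqsr.bqsr_shard-13_retry1", 0.25),
    ("doBqsr.bqsr_shard-13", 2.0),
]
collapsed = collapse_costs(example_rows)
```

With these rows, `collapsed` has two entries, `doBqsr.bqsr` and `doBqsr.bqsr_retry1`. Collapsing all four rows into `doBqsr.bqsr` would only need the pattern extended to strip the retry suffix as well.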

Member

Nah, I think upon further reflection, keeping retries separate is probably the right move here, so yeah, you got it right. Nice work!

2 participants