resolves: 117 discoverer #182

Merged
merged 60 commits into from
Dec 13, 2024
e5bf5f2
start discoverer reframe config #117
JavierCladellas Nov 13, 2024
c59132f
start discoverer machine config #117
JavierCladellas Nov 13, 2024
a02b1ee
Merge branch 'master' into 117-discoverer
JavierCladellas Nov 26, 2024
742457a
rm tmp and cache dirs from discoverer #117
JavierCladellas Nov 26, 2024
c52ec5b
Merge branch 'master' into 117-discoverer
JavierCladellas Dec 3, 2024
162fcba
up discoverer config #117
JavierCladellas Dec 3, 2024
21f51d3
rm discoverer (will use girder to store for safety) #117
JavierCladellas Dec 3, 2024
72cfa03
rename environ #117
JavierCladellas Dec 3, 2024
21b991c
up heat thermal bridges config
JavierCladellas Dec 4, 2024
004378c
start dry-run mode #175
JavierCladellas Dec 4, 2024
990cdb9
Add initial support for dry runs #175
JavierCladellas Dec 4, 2024
7f64972
add omp_num_threads just in case #117
JavierCladellas Dec 4, 2024
de18c39
up timeout #117
JavierCladellas Dec 4, 2024
f951e0a
add plots config + add checkout in HPC res + up script options #117
JavierCladellas Dec 4, 2024
4485546
typo... [ci skip]
JavierCladellas Dec 4, 2024
f58d291
add python version #117
JavierCladellas Dec 4, 2024
a43ab8d
upgrade pip [ci skip]
JavierCladellas Dec 4, 2024
a21259d
use factory pattern on hpcDispatch
JavierCladellas Dec 4, 2024
624c9de
add machine shell scripts
JavierCladellas Dec 4, 2024
4bef946
up benchmark workflow to submit jobs differently #117 [ci skip]
JavierCladellas Dec 4, 2024
36813df
rm rendering and report resetting (new strat needed)
JavierCladellas Dec 4, 2024
60bbaf6
to json -> to dict
JavierCladellas Dec 4, 2024
b186a99
add executable permissions #117
JavierCladellas Dec 5, 2024
a3f3889
change name [skip ci]
JavierCladellas Dec 5, 2024
fb6934b
remove concrete hpc system classes [ci skip]
JavierCladellas Dec 5, 2024
4a664bd
add partition parameter to hpc system interface [ci skip]
JavierCladellas Dec 5, 2024
95c02af
try with python 3.9 #117 [ci skip]
JavierCladellas Dec 5, 2024
67b5b4f
try accept python 3.6 #117
JavierCladellas Dec 5, 2024
700f2f1
install . in discoverer shell script #117
JavierCladellas Dec 5, 2024
0aeeca8
rm 3.6 [ci skip]
JavierCladellas Dec 5, 2024
e78192d
change . to dist/*.whl [ci skip]
JavierCladellas Dec 5, 2024
cadbd46
create venv in script rather in ci [ci skip]
JavierCladellas Dec 5, 2024
5a25ea7
back to 3.8 syntax #117
JavierCladellas Dec 5, 2024
a7c73ab
compat with 3.8
JavierCladellas Dec 5, 2024
46f25f8
try add typing_extensions #117
JavierCladellas Dec 5, 2024
dc4d8d1
add --ignore-installed and --upgrade to reqs [ci-skip]
JavierCladellas Dec 5, 2024
be64b01
use specific env version of python #117
JavierCladellas Dec 5, 2024
623f1a0
ensurepip #117 [ci skip]
JavierCladellas Dec 5, 2024
00633ba
add force-reinstall flag [ci skip] #117
JavierCladellas Dec 5, 2024
ca96061
rm flags #117 [ci skip]
JavierCladellas Dec 5, 2024
fa9cc7a
rm ensure pip [ci skip]
JavierCladellas Dec 5, 2024
4ba3faa
add -I (ignore existing) #117 [ci skip]
JavierCladellas Dec 5, 2024
bdb77e0
up reqs #117
JavierCladellas Dec 5, 2024
6a80e46
rm openmpi module load #117
JavierCladellas Dec 5, 2024
4b4c21a
rm module, use direct path #117 [ci skip]
JavierCladellas Dec 5, 2024
8b1c833
up discoverer reframe config #117  [ci skip]
JavierCladellas Dec 5, 2024
13124bb
add default vals to workflow dispatch #117 [ci skip]
JavierCladellas Dec 5, 2024
48f2005
rm apptainer from prepare_cmds #117 [ci skip]
JavierCladellas Dec 5, 2024
4f13a49
try fix artifact uploading #117 [ci skip]
JavierCladellas Dec 5, 2024
659eb99
stop using move_results #117 [ci-skip]
JavierCladellas Dec 5, 2024
fd65dbc
remove expandvars #117 [ci skip]
JavierCladellas Dec 5, 2024
ee5a3ce
Merge branch 'master' into 117-discoverer
JavierCladellas Dec 6, 2024
a30d58a
rm not discoverer modules #117
JavierCladellas Dec 9, 2024
6d1c866
rm prepare_cmds #117
JavierCladellas Dec 9, 2024
dcda346
add optional env_variables to schema #117
JavierCladellas Dec 9, 2024
e7d5e31
add example of env_variables #117
JavierCladellas Dec 9, 2024
c2a594c
rm platform env variables
JavierCladellas Dec 9, 2024
45274b4
set env variables
JavierCladellas Dec 9, 2024
562253e
Merge branch 'master' into 117-discoverer
JavierCladellas Dec 10, 2024
d3f4733
Merge branch 'master' into 117-discoverer
JavierCladellas Dec 10, 2024
68 changes: 25 additions & 43 deletions .github/workflows/benchmark.yml
@@ -9,12 +9,19 @@ on:
machines_config:
description: 'Machine related configurations'
required: True
default: 67504e9a4c9ccbdde21a46fe
benchmark_config:
description: 'Application related configuration'
required: True
default: 67504e9a4c9ccbdde21a4701
plots_config:
description: 'Plots related configuration'
required: True
default: 675053424c9ccbdde21a470a
girder_folder_id:
description: 'ID of the folder to upload to'
required: True
default: 67504ecd4c9ccbdde21a4704

jobs:

@@ -68,25 +75,28 @@ jobs:
girder-download -gid $machine_cfg_id -o ./tmp/ -fn "machines_config.json"
env:
GIRDER_API_KEY: ${{secrets.GIRDER}}
- id: hpc-systems
name: Set HPC systems matrix
run: |
source .venv/bin/activate
matrix=$(hpc-dispatch -mp ./tmp/machines_config.json -o ./tmp/machines/)
echo $matrix
echo "matrix={ include : $matrix }" >> $GITHUB_OUTPUT
- name: Download benchmark configuration
run: |
source .venv/bin/activate

if [[ "${{ github.event_name }}" == "workflow_dispatch" ]]; then
bench_cfg_id=${{ github.event.inputs.benchmark_config }};
plots_cfg_id=${{ github.event.inputs.plots_config }};
elif [[ "${{ github.event_name}}" == "repository_dispatch" ]]; then
bench_cfg_id=${{ github.event.client_payload.benchmark_config }};
plots_cfg_id=${{ github.event.client_payload.plots_config }};
fi
girder-download -gid $bench_cfg_id -o ./tmp/ -fn "benchmark_config.json"
girder-download -gid $plots_cfg_id -o ./tmp/ -fn "plots.json"
env:
GIRDER_API_KEY: ${{secrets.GIRDER}}
- id: hpc-systems
name: Set HPC systems matrix
run: |
source .venv/bin/activate
matrix=$(hpc-dispatch -mcp ./tmp/machines_config.json -mod ./tmp/machines/ -bcp ./tmp/benchmark_config.json -pcp ./tmp/plots.json)
echo $matrix
echo "matrix={ include : $matrix }" >> $GITHUB_OUTPUT
- name: pull_images
run: |
source .venv/bin/activate
@@ -102,6 +112,7 @@ jobs:
name: config-artifacts
path: |
./tmp/benchmark_config.json
./tmp/plots.json
./tmp/machines/

benchmark:
@@ -113,6 +124,7 @@ jobs:
timeout-minutes: 7200
name: ${{matrix.machine}}
steps:
- uses: actions/checkout@v4
- name: Download wheel
uses: actions/download-artifact@v4
with:
@@ -123,20 +135,13 @@ jobs:
with:
name: config-artifacts
path: ./tmp/
- name: Create Virtual Environment
run: |
python3 -m venv .venv
source .venv/bin/activate
pip3 install -r requirements.txt
- name: Execute benchmarks
run: |
source .venv/bin/activate
execute-benchmark -ec ./${{matrix.machine_cfg}} --config ./tmp/benchmark_config.json --move-results ./tmp/results/ -v
run: ${{matrix.submit_command}}
- name: Upload reframe report
uses: actions/upload-artifact@v4
with:
name: benchmark-results
path: ./tmp/results/
name: benchmark-results-${{matrix.machine}}
path: ${{matrix.reports_path}}

results:
runs-on: self-ubuntu-22.04
@@ -148,8 +153,9 @@ jobs:
- name: Download results
uses: actions/download-artifact@v4
with:
name: benchmark-results
pattern: benchmark-results-*
path: ./tmp/results/
merge-multiple: false
- name: Create Virtual Environment
run: |
python3 -m venv .venv
@@ -168,28 +174,4 @@ jobs:
girder-upload --directory $new_foldername --girder_id $girder_upload_id
rm -r $new_foldername
env:
GIRDER_API_KEY: ${{ secrets.GIRDER }}
- name: Reset reports
run: |
rm -r ./docs/modules/ROOT/pages/applications/
rm -r ./docs/modules/ROOT/pages/machines/
rm -r ./docs/modules/ROOT/pages/reports/
rm -r ./docs/modules/ROOT/pages/use_cases/
rm -r ./reports/
- name: Render reports
run: |
source .venv/bin/activate
render-benchmarks
env:
GIRDER_API_KEY: ${{ secrets.GIRDER }}

- name: Create Pull Request
uses: peter-evans/create-pull-request@v7
with:
title: "Add benchmark for ${{ needs.factory.outputs.executable_name }} - ${{ needs.factory.outputs.use_case }}"
body: |
Auto-generated by [create-pull-request][1]
[1]: https://github.com/peter-evans/create-pull-request
reviewers: JavierCladellas
env:
GITHUB_TOKEN: ${{ secrets.CR_PAT }}
GIRDER_API_KEY: ${{ secrets.GIRDER }}
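The `hpc-systems` step in this workflow prints a JSON list and wraps it as `matrix={ include : $matrix }`, and the `benchmark` job then reads `matrix.machine`, `matrix.submit_command`, and `matrix.reports_path`. As a minimal sketch of what such a list could look like, assuming one entry per machine, the following builds it with the standard library. The function `build_matrix`, the input field names, and the sample values are hypothetical; the actual `hpc-dispatch` implementation is not shown in this diff and may differ.

```python
import json


def build_matrix(machines):
    """Return a JSON list suitable as the `include` value of a job matrix.

    `machines` is a hypothetical list of dicts, one per HPC system.
    """
    entries = []
    for machine in machines:
        entries.append({
            "machine": machine["name"],                   # read as matrix.machine
            "submit_command": machine["submit_command"],  # read as matrix.submit_command
            "reports_path": machine["reports_path"],      # read as matrix.reports_path
        })
    return json.dumps(entries)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_matrix([
        {
            "name": "discoverer",
            "submit_command": "sbatch ./tmp/machines/discoverer.sh",
            "reports_path": "./tmp/results/discoverer/",
        }
    ]))
```

Emitting every per-machine detail through the matrix keeps the `benchmark` job generic: each runner only executes the command and uploads the path it was given.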
25 changes: 16 additions & 9 deletions config/toolbox_heat/thermal_bridges_case_3.json
@@ -1,17 +1,18 @@
{
"executable": "feelpp_toolbox_heat",
"output_directory": "{{machine.output_app_dir}}/toolboxes/heat/thermal_bridges_case_3",
"use_case_name": "thermal_bridges_case_3",
"timeout":"0-01:00:00",
"output_directory": "{{machine.output_app_dir}}/toolboxes/heat/ThermalBridgesENISO10211/Case3",
"use_case_name": "ThermalBridgesENISO10211",
"timeout":"0-00:10:00",
"platforms": {
"apptainer":{
"image": {
"name":"{{machine.containers.apptainer.image_base_dir}}/feelpp.sif"
"name":"{{machine.containers.apptainer.image_base_dir}}/feelpp-noble.sif"
},
"input_dir":"/input_data/",
"options": [
"--home {{machine.output_app_dir}}",
"--bind {{machine.input_dataset_base_dir}}/{{use_case_name}}/:{{platforms.apptainer.input_dir}}"
"--bind {{machine.input_dataset_base_dir}}/{{use_case_name}}/:{{platforms.apptainer.input_dir}}",
"--env OMP_NUM_THREADS=1"
],
"append_app_option":[]
},
@@ -21,15 +22,17 @@
}
},
"options": [
"--config-files {{platforms.{{machine.platform}}.input_dir}}/case3.cfg",
"--config-files /usr/share/feelpp/data/testcases/toolboxes/heat/cases/Building/ThermalBridgesENISO10211/case3.cfg {{platforms.{{machine.platform}}.input_dir}}/{{parameters.solver.value}}.cfg",
"--directory {{output_directory}}/{{instance}}",
"--repository.case {{use_case_name}}",
"--fail-on-unknown-option 1",
"--heat.scalability-save=1",
"--repository.append.np 0",
"--case.discretization {{parameters.discretization.value}}",
"--heat.json.patch='{\"op\": \"replace\",\"path\": \"/Meshes/heat/Import/filename\",\"value\": \"$cfgdir/{{parameters.meshes.value}}/case3_p{{parameters.nb_tasks.tasks.value}}.json\" }'"
"--heat.json.patch='{\"op\": \"replace\",\"path\": \"/Meshes/heat/Import/filename\",\"value\": \"{{platforms.{{machine.platform}}.input_dir}}/partitioning/case3/{{parameters.meshes.value}}/case3_p{{parameters.nb_tasks.tasks.value}}.json\" }'"
],
"env_variables":{
"OMP_NUM_THREADS":1
},
"outputs": [
{
"filepath": "{{output_directory}}/{{instance}}/{{use_case_name}}/heat.measures/values.csv",
@@ -67,7 +70,7 @@
{
"name": "nb_tasks",
"sequence": [
{"tasks":128,"nodes":1,"exclusive_access":true}
{"tasks":128,"tasks_per_node":128,"exclusive_access":true}
]
},
{
@@ -77,6 +80,10 @@
{
"name": "discretization",
"sequence": ["P1"]
},
{
"name": "solver",
"sequence": ["gamg"]
}
]
}
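The configuration above nests placeholders, as in `{{platforms.{{machine.platform}}.input_dir}}`, so the inner key has to be resolved before the outer one. The sketch below shows one way this could be done: flatten the config into dotted keys, then substitute innermost placeholders repeatedly until nothing changes. The helpers `flatten` and `resolve` are hypothetical stand-ins; the project's own `flattenDict` and `recursiveReplace` are not shown in this diff and may work differently.

```python
import re

# Matches a placeholder that contains no further braces, i.e. the innermost one.
PLACEHOLDER = re.compile(r"\{\{([^{}]+)\}\}")


def flatten(data, prefix=""):
    """Flatten nested dicts into a single dict with dotted keys."""
    flat = {}
    for key, value in data.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


def resolve(template, values, max_passes=10):
    """Substitute placeholders from the inside out; unknown keys are left as-is."""
    for _ in range(max_passes):
        replaced = PLACEHOLDER.sub(
            lambda m: str(values.get(m.group(1).strip(), m.group(0))),
            template,
        )
        if replaced == template:
            break
        template = replaced
    return template


if __name__ == "__main__":
    values = flatten({
        "machine": {"platform": "apptainer"},
        "platforms": {"apptainer": {"input_dir": "/input_data/"}},
    })
    print(resolve("{{platforms.{{machine.platform}}.input_dir}}/case3.cfg", values))
```

Leaving unknown placeholders untouched lets a later pass (for example, one that knows `{{instance}}` or parameter values) fill them in.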
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -45,7 +45,7 @@ render-benchmarks = "feelpp.benchmarking.report.__main__:main_cli"
execute-benchmark = "feelpp.benchmarking.reframe.__main__:main_cli"
girder-download = "feelpp.benchmarking.scripts.girder:download_cli"
girder-upload = "feelpp.benchmarking.scripts.girder:upload_cli"
hpc-dispatch = "feelpp.benchmarking.scripts.hpcSystems:parseHpcSystems_cli"
hpc-dispatch = "feelpp.benchmarking.scripts.hpcSystems:hpcSystemDispatcher_cli"

[tool.pytest.ini_options]
minversion = "6.0"
1 change: 1 addition & 0 deletions requirements.txt
@@ -13,4 +13,5 @@ pandas
nbmake
traitlets
tabulate
typing-extensions>=4.12.2
.
15 changes: 12 additions & 3 deletions src/feelpp/benchmarking/reframe/__main__.py
@@ -33,6 +33,15 @@ def createReportFolder(self,executable,use_case):

return str(self.report_folder_path)

def buildExecutionMode(self):
"""Write the ReFrame execution flag depending on the parser arguments.
Examples are --dry-run or -r
"""
if self.parser.args.dry_run:
return "--dry-run"
else:
return "-r"

def buildCommand(self,timeout):
assert self.report_folder_path is not None, "Report folder path not set"
cmd = [
@@ -47,7 +56,7 @@ def buildCommand(self,timeout):
f"-J '#SBATCH --time={timeout}'",
f'--perflogdir={os.path.join(self.machine_config.reframe_base_dir,"logs")}',
f'{"-"+"v"*self.parser.args.verbose if self.parser.args.verbose else ""}',
'-r',
f'{self.buildExecutionMode()}'
]
return ' '.join(cmd)

@@ -56,7 +65,7 @@ def main_cli():
parser = Parser()
parser.printArgs()

machine_reader = ConfigReader(parser.args.machine_config,MachineConfig)
machine_reader = ConfigReader(parser.args.machine_config,MachineConfig,dry_run=parser.args.dry_run)
machine_reader.updateConfig()

#Sets the cachedir and tmpdir directories for containers
@@ -82,7 +91,7 @@ def main_cli():
configs = [config_filepath]
if parser.args.plots_config:
configs += [parser.args.plots_config]
app_reader = ConfigReader(configs,ConfigFile)
app_reader = ConfigReader(configs,ConfigFile,dry_run=parser.args.dry_run)
executable_name = os.path.basename(app_reader.config.executable).split(".")[0]
report_folder_path = cmd_builder.createReportFolder(executable_name,app_reader.config.use_case_name)
app_reader.updateConfig(machine_reader.processor.flattenDict(machine_reader.config,"machine"))
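The change to this file selects between ReFrame's `--dry-run` and `-r` flags based on `parser.args.dry_run`. The following is a rough, self-contained sketch of that selection using `argparse`. The parser option names, the helper names, and the shortened command are assumptions for illustration; the project's `Parser` class is not shown in this diff. The sketch also drops empty fragments before joining, which is a design choice of the sketch and not something the diff does.

```python
import argparse


def build_parser():
    """Hypothetical parser exposing only the two options used in this sketch."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--dry-run", action="store_true", dest="dry_run")
    parser.add_argument("-v", "--verbose", action="count", default=0)
    return parser


def execution_mode(args):
    """Return the ReFrame flag for a dry run or a real run."""
    return "--dry-run" if args.dry_run else "-r"


def build_command(args):
    """Assemble a shortened command line, skipping empty fragments."""
    parts = [
        "reframe",
        "-" + "v" * args.verbose if args.verbose else "",
        execution_mode(args),
    ]
    return " ".join(part for part in parts if part)


if __name__ == "__main__":
    print(build_command(build_parser().parse_args(["--dry-run", "-vv"])))
```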
8 changes: 5 additions & 3 deletions src/feelpp/benchmarking/reframe/config/configMachines.py
@@ -10,10 +10,13 @@ class Container(BaseModel):

@field_validator("cachedir","tmpdir","image_base_dir",mode="before")
@classmethod
def checkDirectories(cls,v):
def checkDirectories(cls,v, info):
"""Checks that the directories exist"""
if v and not os.path.exists(v):
raise FileNotFoundError(f"Cannot find {v}")
if info.context.get("dry_run", False):
print(f"Dry Run: Skipping directory check for {v}")
else:
raise FileNotFoundError(f"Cannot find {v}")

return v

@@ -36,7 +39,6 @@ class MachineConfig(BaseModel):
#TODO: maybe skipJsonSchema or something like that.
environment_map: Optional[Dict[str,List[str]]] = {}


@model_validator(mode="after")
def parseTargets(self):
if not self.targets:
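The `checkDirectories` validator in this file downgrades a missing directory from an error to a printed notice when the validation context carries `dry_run`. The plain-Python sketch below mirrors that logic outside Pydantic so it can be run on its own. The function `check_directory` is a hypothetical stand-in, and the `context or {}` guard is an addition of this sketch: in Pydantic v2, `info.context` is `None` when validation is called without a `context` argument, so calling `.get` on it directly would fail in that case.

```python
import os


def check_directory(path, context=None):
    """Return `path` if it exists; in dry-run mode, only report a missing path."""
    context = context or {}  # tolerate a missing context (assumption of this sketch)
    if path and not os.path.exists(path):
        if context.get("dry_run", False):
            print(f"Dry Run: Skipping directory check for {path}")
        else:
            raise FileNotFoundError(f"Cannot find {path}")
    return path


if __name__ == "__main__":
    # Placeholder path that is not expected to exist.
    check_directory("/nonexistent/cache/dir", {"dry_run": True})
```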
9 changes: 6 additions & 3 deletions src/feelpp/benchmarking/reframe/config/configReader.py
@@ -70,12 +70,15 @@ def decode(self, s: str):

class ConfigReader:
""" Class to load config files"""
def __init__(self, config_paths, schema):
def __init__(self, config_paths, schema, dry_run=False):
"""
Args:
config_paths (str | list[str]) : Path to the config JSON file. If a list is provided, files will be merged.
"""
self.schema = schema
self.context = {
"dry_run":dry_run
}
self.config = self.load(
config_paths if type(config_paths) == list else [config_paths],
schema
@@ -97,7 +100,7 @@ def load(self,config_paths, schema):
with open(config, "r") as cfg:
self.config.update(json.load(cfg, cls=JSONWithCommentsDecoder))

self.config = schema(**self.config)
self.config = schema.model_validate(self.config, context=self.context)

return self.config

@@ -109,7 +112,7 @@ def updateConfig(self, flattened_replace = None):
"""
if not flattened_replace:
flattened_replace = self.processor.flattenDict(self.config.model_dump())
self.config = self.schema(**self.processor.recursiveReplace(self.config.model_dump(),flattened_replace))
self.config = self.schema.model_validate(self.processor.recursiveReplace(self.config.model_dump(),flattened_replace), context=self.context)

def __repr__(self):
return json.dumps(self.config.dict(), indent=4)
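`ConfigReader.load` merges several JSON files by calling `dict.update` for each one before validating the result. Since `update` is shallow, a later file replaces a whole top-level key instead of merging nested values. The sketch below illustrates that behaviour with plain `json`; `merge_configs` is a hypothetical helper, and it parses strings with the standard decoder instead of the project's `JSONWithCommentsDecoder`.

```python
import json


def merge_configs(json_texts):
    """Merge JSON documents in order; later top-level keys overwrite earlier ones."""
    merged = {}
    for text in json_texts:
        merged.update(json.loads(text))  # shallow: nested dicts are replaced, not merged
    return merged


if __name__ == "__main__":
    benchmark = '{"executable": "feelpp_toolbox_heat", "env_variables": {"OMP_NUM_THREADS": 1}}'
    plots = '{"plots": []}'
    print(merge_configs([benchmark, plots]))
```

If two files are ever expected to contribute to the same nested section, a recursive merge would be needed in place of `update`.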
25 changes: 19 additions & 6 deletions src/feelpp/benchmarking/reframe/config/configSchemas.py
@@ -44,17 +44,29 @@ class Image(BaseModel):
protocol:Optional[Literal["oras","docker","library","local"]] = None
name:str

@model_validator(mode="after")
def extractProtocol(self):
@field_validator("protocol",mode="before")
@classmethod
def extractProtocol(cls, v, info):
""" Extracts the image protocol (oras, docker, etc..) or if a local image is provided.
If local, checks if the image exists """

if "://" in self.name:
self.protocol = self.name.split("://")[0]
name = info.data.get("name","")
if "://" in name:
return name.split("://")[0]
else:
self.protocol = "local"
return "local"

return self
@field_validator("name", mode="before")
@classmethod
def checkImage(cls,v,info):
if info.data["protocol"] == "local":
if not os.path.exists(v):
if info.context.get("dry_run", False):
print(f"Dry Run: Skipping image check for {v}")
else:
raise FileNotFoundError(f"Cannot find image {v}")

return v


class Platform(BaseModel):
@@ -74,6 +86,7 @@ class ConfigFile(BaseModel):
output_directory:Optional[str] = ""
use_case_name: str
options: List[str]
env_variables:Optional[Dict] = {}
outputs: List[AppOutput]
scalability: Scalability
sanity: Sanity
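The `Image` model derives its protocol from the image name: anything before `://` is the protocol, and a name without `://` is treated as a local file. A minimal stand-alone sketch of that rule follows, checked against the four values allowed by the `protocol` field. The function `extract_protocol` and the sample image names are hypothetical and only illustrate the string handling, not the Pydantic validator ordering.

```python
ALLOWED_PROTOCOLS = {"oras", "docker", "library", "local"}


def extract_protocol(name):
    """Return the protocol prefix of an image name, or "local" if there is none."""
    protocol = name.split("://")[0] if "://" in name else "local"
    if protocol not in ALLOWED_PROTOCOLS:
        raise ValueError(f"Unsupported image protocol: {protocol}")
    return protocol


if __name__ == "__main__":
    # Placeholder image names for illustration only.
    print(extract_protocol("oras://registry.example.org/feelpp:noble"))
    print(extract_protocol("/data/images/feelpp-noble.sif"))
```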