Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: use poetry for reproducible virtual environments #209

Draft
wants to merge 45 commits into
base: main
Choose a base branch
from

Conversation

VassilisVassiliadis
Copy link
Contributor

@VassilisVassiliadis VassilisVassiliadis commented Jun 21, 2024

Description of the change

  • Use poetry:
    1. to build reproducible virtual environments
    2. during CI/CD testing
    3. to build the container image
  • Update test_launch_script.py to handle exceptions that accelerate.commands.launch.launch_command(args) raises when the accelerate option num_processes is greater than 1
  • Update tests in test_launch_script.py to try and use a free port. This is useful if you have a default accelerate_config.yml file (e.g. under ~/.cache/transformers) with a num_processes > 1 field.
  • Updated tests that set the training argument --eval_strategy epoch to fallback to --evaluation_strategy for backwards compatibility with older version of transformers (e.g. 4.39.3 and 4.40.2).
    • due to fms_acceleration having a constraint to transformers<4.40.0 transformers gets set to 4.39.3 when you use poetry to install fms-hf-tuning (even if you do not add the cli arg --with fms-accel). fms_acceleration is no longer an optional package of fms-hf-tuning, there's an explicit constraint for transformers which restricts the package to <=4.40.2
    • This PR introduces a method get_train_args() in test_sft_trainer.py for building the configs.TrainingArguments object

Related issue number

Contributes to #60

How to verify the PR

Use poetry install to install fms-hf-tuning from source. You will end up installing the python dependencies listed in poetry.lock

Was the PR tested

  • I have added >=1 unit test(s) for every new method I have added.
  • I have ensured all unit tests pass

pyproject.toml Outdated
optional=true

[tool.poetry.group.fms-accel.dependencies]
fms_acceleration = {git = "https://github.com/foundation-model-stack/fms-acceleration.git", subdirectory="plugins/framework", rev = "40aad4683e01be3faebbcaae2670524c431cbf3e"}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this doesn't make sense, so I'll revert this change. Poetry already takes care of "pinning" this dependency for the reproducible virtual environments it creates (i.e. the poetry.lock file).

We just have to run poetry lock (or manually update python dependencies) whenever fms_acceleration releases a new version and we'd like to update the dependencies of fms-hf-tuning to match those of fms_acceleration.

For example, at the moment fms_acceleration has a constraint transformers<4.40.0 which propagates to fms-hf-tuning and sets transformers to 4.39.3.

@VassilisVassiliadis VassilisVassiliadis marked this pull request as ready for review June 24, 2024 09:01
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@@ -515,7 +515,7 @@ def main(**kwargs): # pylint: disable=unused-argument
exp_metadata,
)
except Exception as e: # pylint: disable=broad-except
logging.error(traceback.format_exc())
logging.get_logger("error").error(traceback.format_exc())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious what's this change for ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logging.error isn't part of transformers.utils.logging .

For example, when you run this:

 python -c 'from transformers.utils import logging; logging.error("foo")'

python raises the following exception:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: module 'transformers.utils.logging' has no attribute 'error'

so I just created a logger object called "error" and had that print the traceback. I assume that this line of code was originally using the logging module for which logging.error("foo") works

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, I don't remember how I made sft_trainer.py execute that code.

@VassilisVassiliadis
Copy link
Contributor Author

I updated the description of the issue with a couple more changes that I had to do for the tests to pass.

build/Dockerfile Outdated Show resolved Hide resolved
tox.ini Outdated Show resolved Hide resolved
tox.ini Outdated Show resolved Hide resolved
tox.ini Outdated Show resolved Hide resolved
tox.ini Outdated Show resolved Hide resolved
@tedhtchang
Copy link
Collaborator

tedhtchang commented Jul 1, 2024

For reviewers
Keep in mind this change won't affect fms-hf-tuning packages users who do pip install fms-hf-tuning. pip installs the appropriate indirect (latest) dependencies versions for their environment.

For package developers, these are the most used poetry commands. So you may follow those examples to verify the PR:

# (NOT optional) Installed poetry in a dedicated venv [manually](https://python-poetry.org/docs/#installing-manually)
python3 -m venv $VENV_PATH
$VENV_PATH/bin/pip install -U pip setuptools
$VENV_PATH/bin/pip install poetry
export PATH=$PATH:$VENV_PATH/bin/poetry # verify poetry -h prints help text

# To verify the PR, first clone and create a new venv.
git clone https://github.com/VassilisVassiliadis/fms-hf-tuning.git -b poetry
# If not already in a venv poetry will create and activate one automatically. So w/o activate a venv:
poetry install
poetry shell # to active the venv created by poetry.
# Run pip list in the shell to verify dependencies from the poetry.lock are installed + fms-hf-tuning in editing mode.

# Install deps with dev group
poetry install --extras=dev

# Install deps without fms-hf-tuning in editing mode(dependencies only)
poetry install --no-root

# Update packages version in lock file. 
# Supposedly dependabot should update the lock file and create PR
poetry update --lock # optionally re-run poetry install to update the packages in venv.
# verify with `git diff poetry.lock` that some versions are bumped.

# Verify tox tasks ran withtou issue
tox -e py,fmt,lint,build,twinecheck,coverage

# Verified image has deps specified in the lock file
docker build . -t sft-trainer:dev -f build/Dockerfile
docker run -it sft-trainer:dev pip freeze

@tedhtchang
Copy link
Collaborator

@VassilisVassiliadis some files have been updated could you rebase. thanks

VassilisVassiliadis and others added 10 commits July 2, 2024 08:50
Signed-off-by: Vassilis Vassiliadis <[email protected]>
pip install `fms-hf-tuning[flash-attn]` is equivalent to `poetry install --extras flash-attn`
not `poetry install --with flash-attn`.

Signed-off-by: Vassilis Vassiliadis <[email protected]>
Co-authored-by: ted chang <[email protected]>
Signed-off-by: Vassilis Vassiliadis <[email protected]>
Signed-off-by: Vassilis Vassiliadis <[email protected]>
@VassilisVassiliadis
Copy link
Contributor Author

I rebased the PR and removed the instructions to install fms-accell using poetry. I replaced that text with a link to the relevant section in the README.md file.

@VassilisVassiliadis
Copy link
Contributor Author

I also tested building the dockerfile for the last release of fms-hf-tuning on pyli i.e.

 docker build --progress plain  . -t sft-trainer:dev -f build/Dockerfile --build-arg WHEEL_VERSION=0.4.0

In a future PR we could have the images built for a release actually install the dependencies from the poetry.lock file associated with that release (assuming there's a way to guess the git-tag/commit that was used to produce the pypi release)

tedhtchang
tedhtchang previously approved these changes Jul 2, 2024
Copy link
Collaborator

@tedhtchang tedhtchang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@VassilisVassiliadis Thanks this looks good. And I agree. We need a mechanism to build/rebuild any release image with consistent dependencies for a given commit/tag/sha.

@Ssukriti @anhuong @jbusche Could you take a look the PR.

@jbusche
Copy link
Collaborator

jbusche commented Jul 2, 2024

Hi @tedhtchang, I tried it and it looked good...
I was able to poetry install and try the

poetry update --lock
git diff poetry.lock

and see that it had an update:

 [[package]]
 name = "orjson"
-version = "3.10.5"
+version = "3.10.6"

I was then able to install aim with the original orjson and then update it:

poetry install --extras=aim
poetry shell
pip list |grep orj
orjson                   3.10.5
poetry update --lock
git diff poetry.lock
-version = "3.10.5"
+version = "3.10.6"
poetry install --extras=aim
Installing dependencies from lock file
Package operations: 0 installs, 1 update, 0 removals
 - Updating orjson (3.10.5 -> 3.10.6)

jbusche
jbusche previously approved these changes Jul 2, 2024
Copy link
Collaborator

@jbusche jbusche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, tested on my Red Hat Server

```

If you wish to use [fms-acceleration](https://github.com/foundation-model-stack/fms-acceleration) follow the instructions in [this section of README.md](README.md#fms-acceleration).

Copy link
Collaborator

@Ssukriti Ssukriti Jul 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Alternatively, you could continue `pip install -e . `, if you do not wish to leverage the lock file and have environmental contrainsts

@Ssukriti Ssukriti dismissed stale reviews from jbusche and tedhtchang via fcba45f July 3, 2024 17:42
Copy link
Collaborator

@Ssukriti Ssukriti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed some changes, please review.

Additionally in release process, @tedhtchang please add a post step to run poetry update and commit the new lock file. This has to be done post release by maintainers to keep it updated


# Install from the wheel
# Install using poetry if PyPi wheel_version is empty else download the wheel from PyPi
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Install using poetry if PyPi wheel_version is empty else download the wheel from PyPi
# Install using poetry if PyPi wheel_version is empty else download the wheel from PyPi
# If creating your own dockerfile we suggest to use poetry export for a reproducible environment

Ssukriti added 2 commits July 3, 2024 12:57
Signed-off-by: Sukriti-Sharma4 <[email protected]>
Signed-off-by: Sukriti-Sharma4 <[email protected]>
@@ -43,6 +43,8 @@ jobs:
run: |
python -m pip install --upgrade pip
python -m pip install tox
python -m pip install poetry --user
Copy link
Collaborator

@fabianlim fabianlim Jul 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is there a need to install this to --user? If you can install peotry the same way as tox, that this obliviates the need for the PATH setting below.

If the aim is for isolation, then I dont think installing in the user space may achieve it. Since any further pip install commands will still find the peoetry in user and try to update it?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There quite a few instances of these in other worflows as well.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tox always creates it's own isolated venv.
Agree it would probably work without --user

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, tox and poetry should go in the same virtual-environment and that environment should not be the one that we install fms-hf-tuning in. Because tox creates a new virtual environment for the environments it processes (e.g. py, `fmt, etc) poetry and tox will be outside the tested virtual-environment.

We don't need to run tox "inside" one of the virtual environments that tox creates but we do need to run poetry. Here's how I chose to do that.

First, I installed poetry under the user pip install directory (which by default for linux is ~/.local/. Then I updated the file that $GITHUB_ENV points to so that in subsequent steps the $PATH environment variable contains the path ~/.local/bin.

This ensures that the poetry commands running inside tox can use the poetry executable. Alternatively, we can create a new virtual environment under a directory of our choosing e.g. /tmp/isolated in there, we install just poetry and then update $GITHUB_ENV so that $PATH includes /tmp/isolated/bin.

@@ -42,6 +42,8 @@ If additional new Python module dependencies are required, think about where to
- If they're optional dependencies for additional functionality, then put them in the pyproject.toml file like were done for [flash-attn](https://github.com/foundation-model-stack/fms-hf-tuning/blob/main/pyproject.toml#L44) or [aim](https://github.com/foundation-model-stack/fms-hf-tuning/blob/main/pyproject.toml#L45).
- If it's an additional dependency for development, then add it to the [dev](https://github.com/foundation-model-stack/fms-hf-tuning/blob/main/pyproject.toml#L43) dependencies.

if using `poetry` to install dependencies, you can optionally leverage [poetry add](https://python-poetry.org/docs/cli/#add) to add new dependencies to pyproject.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this is abit confusing. If we have a peoetry flow, and there is a new dep, shouldnt we have some guidance on when it has to be added to the lockfile?

Also some guidance to say that for optional deps it should not be added.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true, maybe we can continue adding to pyproject.yml , like we do now and then we have to run poetry update ?
I can change it to that instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There're 2 ways to add a new dependency. One is to use poetry i.e. poetry add the other is to manually edit the pyproject.toml file and then run poetry update which will then ensure poetry.lock is consistent with pyproject.toml (that's basically what poetry add does on your behalf).

If you opt for the 2nd route but forget to run poetry update then the next time you run poetry install (or the CI/CD runs it) you'll get an exception that the poetry.lock and pyproject.toml files are inconsistent and that you should run the poetry update command.

@fabianlim
Copy link
Collaborator

@VassilisVassiliadis there is a PR here to add back the [.fms-accel] dep. It was removed before with the intention of being added back #223

  • why will poetry constraint packages under optional dependencies? that makes no sense


We have committed a `poetry.lock` file to allow reproducible enviornments. If building from source, you can clone the repository and use poetry to install as mentioned in [development docs](/CONTRIBUTING.md#development)

If building in a dockerfile use `poetry export --format requirements.txt` can install same dependencies from the lock file. Maintainers regularly update the lock file.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line is not very clear.. this will do not any installation correct? Its just an export of the locked deps to a requirements file?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ya its just to export the requirements and same lock file and then we have to install. hmm should be just link our dockerfile as example?

Copy link
Contributor Author

@VassilisVassiliadis VassilisVassiliadis Jul 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@VassilisVassiliadis there is a PR here to add back the [.fms-accel] dep. It was removed before with the intention of being added back #223

@fabianlim I'll rebase the PR once #223 makes its way into the main branch.

why will poetry constraint packages under optional dependencies? that makes no sense

I'm actually not sure. I assume that they do this so that poetry install and poetry install -E fms-accel both end up installing the same non-optional dependencies (in our case transformers).

@Ssukriti Ssukriti marked this pull request as draft July 8, 2024 16:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants