unify how to freeze some parameters for coca pre-training (#526)
Summary:

1. Freezing the vision encoder is already supported. As experiments progress, we also want to try freezing other parts of CoCa, e.g., the text decoder. This diff provides a unified way to freeze/unfreeze modules, the same way we already do for linear probe or finetune.
2. Add a configuration option to use an MLP instead of the attention pooler for the vision adapter.
3. For the output projection in the text decoder, change bias=False to bias=True. Many other places, e.g., the LP head, ember's output module and LLAVA, use bias=True (which is the default value in Linear).
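
To illustrate item 1, here is a minimal sketch of what name-based freezing can look like in PyTorch. It is not the code from this diff (the changed file shown here is only the test helper). The `freeze_modules` helper, the `ToyCoCa` class and its submodule names are hypothetical, chosen only to make the example runnable.

```python
from typing import List

import torch.nn as nn


class ToyCoCa(nn.Module):
    """Hypothetical stand-in for a CoCa-style model with named submodules."""

    def __init__(self) -> None:
        super().__init__()
        self.vision_encoder = nn.Linear(4, 4)
        self.text_decoder = nn.Linear(4, 4)
        self.output_projection = nn.Linear(4, 2)


def freeze_modules(model: nn.Module, prefixes: List[str]) -> List[str]:
    """Disable gradients for every parameter under the given module names.

    Returns the names of the parameters that were frozen.
    """
    frozen = []
    for name, param in model.named_parameters():
        # Match "text_decoder" itself or anything below it ("text_decoder.weight"),
        # but not an unrelated module such as "text_decoder_v2".
        if any(name == p or name.startswith(p + ".") for p in prefixes):
            param.requires_grad_(False)
            frozen.append(name)
    return frozen


model = ToyCoCa()
frozen = freeze_modules(model, ["text_decoder"])
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
```

With this approach the same list of module names can be passed for pre-training, linear probe or finetune, and the optimizer can be built from the parameters that still have `requires_grad=True`.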

Differential Revision: D54559503

Privacy Context Container: 303860477774201
zhangtemplar authored and facebook-github-bot committed Mar 20, 2024
1 parent dbeed97 commit da89229
Showing 1 changed file with 11 additions and 1 deletion.
12 changes: 11 additions & 1 deletion tests/test_utils.py
@@ -192,8 +192,18 @@ def assert_expected_namedtuple(


 def init_weights_with_constant(model: nn.Module, constant: float = 1.0) -> None:
-    for p in model.parameters():
+    for n, p in model.named_parameters():
         nn.init.constant_(p, constant)
+        # reduce the change to the tests
+        for k in {
+            "text_projection.bias",
+            "pooled_projection.bias",
+            "output_projection.bias",
+            "vision_proj.bias",
+        }:
+            if n.endswith(k):
+                nn.init.constant_(p, 0.0)
+                break


 def tensor_hash(x: torch.tensor, scaling=0.05, buckets=1000) -> torch.tensor:
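
The suffix check in `init_weights_with_constant` decides which parameters are reset to 0.0 after the constant fill. Presumably this keeps the newly enabled biases from shifting the expected values in existing tests. The sketch below isolates that decision in plain Python so it can be checked without building a model. The `init_value_for` helper and the example parameter names are hypothetical. The four suffixes are the ones from the diff.

```python
from typing import Tuple

# Bias suffixes listed in the diff; parameters ending in one of these are zeroed.
ZERO_BIAS_SUFFIXES: Tuple[str, ...] = (
    "text_projection.bias",
    "pooled_projection.bias",
    "output_projection.bias",
    "vision_proj.bias",
)


def init_value_for(name: str, constant: float = 1.0) -> float:
    """Return the value a parameter called `name` ends up with after init."""
    # str.endswith accepts a tuple and is True if any of the suffixes matches,
    # which is equivalent to the inner loop with `break` in the diff.
    return 0.0 if name.endswith(ZERO_BIAS_SUFFIXES) else constant


# Hypothetical parameter names, for illustration only.
names = [
    "vision_encoder.proj.weight",
    "text_decoder.output_projection.weight",
    "text_decoder.output_projection.bias",
    "vision_proj.bias",
]
values = {n: init_value_for(n) for n in names}
```

Only the two names ending in a listed bias suffix map to 0.0. The weights, including `output_projection.weight`, keep the constant.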