Replies: 5 comments
-
Thanks for your information!
-
Although it's ultimately up to you, I disagree with closing this issue. I think there are other useful features in sd-meh, for example weights clipping. Something you may want to consider is using the library directly. If the extension uses the library instead of re-implementing everything, you would only have to bump the version of sd-meh when new useful merge techniques are found. But again, up to you.
-
model-mixer can merge at the block level, and internally even at the key level, which explains its high speed. Please tell me if I'm wrong.
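For illustration, here is a minimal sketch of what key-level merging with per-block ratios could look like. This is hypothetical, not model-mixer's actual code; `block_of` is an assumed helper that maps a state-dict key to its block id:

```python
import torch

def merge_key_level(theta_a, theta_b, alphas, block_of):
    """Weighted-sum merge where each key uses the alpha of its block.

    theta_a, theta_b: state dicts of tensors; alphas: per-block ratios;
    block_of: maps a key name to a block id (hypothetical helper).
    """
    merged = {}
    for key, t0 in theta_a.items():
        alpha = alphas[block_of(key)]
        merged[key] = (1 - alpha) * t0 + alpha * theta_b[key]
    return merged
```

Iterating over keys while looking up a block-level alpha is what lets a merger work "internally at the key level" in a single pass.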
-
What does clip_weights do? It's a very simple algorithm that reduces overfitting in the merged result:

```python
import torch

def clip_weights_key(thetas, merged_weights, key):
    t0 = thetas["model_a"][key]
    t1 = thetas["model_b"][key]
    maximums = torch.maximum(t0, t1)
    minimums = torch.minimum(t0, t1)
    return torch.minimum(torch.maximum(merged_weights, minimums), maximums)
```

As you can see, this procedure works at the key level and could easily be applied to model-mixer.
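For concreteness, here is a self-contained usage sketch of the clipping step above; the tensor values are made up for illustration. A merged weight that overshoots both source models gets pulled back into their range:

```python
import torch

def clip_weights_key(thetas, merged_weights, key):
    # Clamp each merged element to [min(a, b), max(a, b)].
    t0 = thetas["model_a"][key]
    t1 = thetas["model_b"][key]
    maximums = torch.maximum(t0, t1)
    minimums = torch.minimum(t0, t1)
    return torch.minimum(torch.maximum(merged_weights, minimums), maximums)

thetas = {
    "model_a": {"w": torch.tensor([0.0, 0.5])},
    "model_b": {"w": torch.tensor([1.0, 0.2])},
}
merged = torch.tensor([1.5, -0.3])  # overshoots the source range on both sides
clipped = clip_weights_key(thetas, merged, "w")
# each element ends up within [min(a, b), max(a, b)]: tensor([1.0, 0.2])
```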
-
See also #113.
-
https://github.com/s1dlx/meh is a small library with a good number of merge methods, some of which are not in supermerger. One very important feature is "weights clipping", which makes it possible to merge models using add difference at alpha=1.0 with limited distortion, by clipping the merged weights to the range of the original models A and B. There's also rebasin, which makes it possible to reduce the loss when merging with weighted sum.
Note that the library does not yet support SDXL in the main branch.
I did contribute to it a little bit, which is why I know about this library. Just wanted to mention this in case you were considering adding more merge options.
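As a rough sketch of the idea described above (not sd-meh's actual API; the function name and signature are illustrative): an add-difference merge at alpha=1.0, followed by clipping the result to the range spanned by A and B:

```python
import torch

def add_difference_clipped(a, b, c, alpha=1.0):
    """Add difference merge, then clip to the range of A and B.

    a, b, c: corresponding tensors from models A, B, C, where C is the
    base model subtracted out. Illustrative names, not sd-meh's API.
    """
    merged = a + alpha * (b - c)
    lo = torch.minimum(a, b)
    hi = torch.maximum(a, b)
    return torch.minimum(torch.maximum(merged, lo), hi)
```

The clipping step is what keeps distortion limited even at alpha=1.0: any element that the difference pushes outside the span of the two source models is snapped back to the nearest boundary.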