Replication of the Anthropic interpretability paper "Toy Models of Superposition" by Elhage et al. (2022)

ishanjmukherjee/toy-models-of-superposition-replication
