Semantics of LLMs, Weight Tying, and a Story
This summer, I was lucky enough to be accepted to ICML 2024 with a spotlight poster. I told this story on Reddit (you can see the original post here), but now that I have a blog, I thought it would be nice to keep it here as well. Enough with the introduction; let's get to the story behind the paper titled By Tying Embeddings You Are Assuming the Distributional Hypothesis.
Date: 26 September, 2024 | Estimated Reading Time: ~11 min