
Phrase embeddings in context #108

Open

jnferfer opened this issue Feb 12, 2024 · 2 comments

Comments

@jnferfer

Hi,

I need to get the embedding of a word or a phrase within a sentence, where the sentence provides the context for the word/phrase.

For example, I need the different embedding values of big apple in these two sentences:

I'm living in the Big Apple since 2012
I ate a big apple yesterday

When using model.encode(), I can set the parameter output_value to "token_embeddings" to get token embeddings. However, I don't know how to properly map the output vectors onto the tokens that correspond to the big apple text. Is there a straightforward approach for this?
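For reference, a minimal version of the call being described; the model name is only a placeholder:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model name

# One row per token, including special tokens such as [CLS]/[SEP], so
# row positions do not line up one-to-one with whitespace-separated words.
token_embs = model.encode("I ate a big apple yesterday",
                          output_value="token_embeddings")
print(token_embs.shape)  # (num_tokens, hidden_dim)
```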

Thanks!

@hongjin-su
Copy link
Collaborator

You may first check the tokenization of the sentences, record the indices of the tokens that make up the desired words (e.g., big apple), and then select the token embeddings at those indices.
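For concreteness, here is a minimal sketch of that recipe, assuming a sentence-transformers-style model with a fast Hugging Face tokenizer (the model name and the phrase_token_embeddings helper are illustrative, not part of this repo). The tokenizer's offset mapping turns the phrase's character span into token indices:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model name

def phrase_token_embeddings(sentence, phrase):
    # Character span of the phrase (case-insensitive match).
    start = sentence.lower().find(phrase.lower())
    if start == -1:
        raise ValueError(f"{phrase!r} not found in {sentence!r}")
    end = start + len(phrase)

    # Each token gets a (char_start, char_end) pair; special tokens such
    # as [CLS]/[SEP] map to (0, 0) and never overlap the phrase span.
    # Note: offset mappings require a fast tokenizer, and for sentences
    # longer than model.max_seq_length the two tokenizations can diverge.
    encoding = model.tokenizer(sentence, return_offsets_mapping=True)
    idx = [
        i for i, (s, e) in enumerate(encoding["offset_mapping"])
        if s < end and e > start
    ]

    # encode() tokenizes the same way, so embedding rows align with idx.
    token_embs = model.encode(sentence, output_value="token_embeddings")
    return token_embs[idx]

embs = phrase_token_embeddings("I'm living in the Big Apple since 2012",
                               "Big Apple")
print(embs.shape)  # (num_phrase_tokens, hidden_dim)
```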

@jnferfer
Author

jnferfer commented Apr 21, 2024

Thanks! Then, if I want a single embedding for "big apple", how should I proceed? I'm averaging the embeddings of "big" and "apple", but I sometimes get odd results when comparing that averaged embedding against others.
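One common recipe (a general technique, not something specific to this repo) is to mean-pool the phrase's token vectors and L2-normalize the result before comparing with cosine similarity; unnormalized averages can have very different norms, which makes raw dot-product comparisons look odd. A minimal sketch, reusing the hypothetical phrase_token_embeddings helper from the sketch above:

```python
import torch
import torch.nn.functional as F

def phrase_embedding(sentence, phrase):
    token_embs = phrase_token_embeddings(sentence, phrase)  # helper from above
    pooled = token_embs.mean(dim=0)      # mean pooling over the phrase tokens
    return F.normalize(pooled, dim=0)    # unit length, so dot product = cosine

a = phrase_embedding("I'm living in the Big Apple since 2012", "Big Apple")
b = phrase_embedding("I ate a big apple yesterday", "big apple")
print(torch.dot(a, b).item())  # cosine similarity between the two usages
```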
