Skip to content
This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

KNNVectorFieldMapper unable to get algorithm params from settings when built from merge #288

Closed
jmazanec15 opened this issue Dec 29, 2020 · 0 comments
Assignees
Labels
bug Issue that exposes a bug

Comments

@jmazanec15
Copy link
Member

When the KNNVectorFieldMapper is built from a mapper merge, the mapper is unable to read the algorithm parameters that are stored as index settings. Because of this, it falls back to default settings. Because spaceType is one of the parameters, this can lead to incorrect graphs being built for an index. For example, a user may create a cosinesimil index and the graphs will be built using the l2 case.

Relevant lines of code:
1 https://github.com/opendistro-for-elasticsearch/k-NN/blob/master/src/main/java/com/amazon/opendistroforelasticsearch/knn/index/KNNVectorFieldMapper.java#L124-L163
2. https://github.com/opendistro-for-elasticsearch/k-NN/blob/master/src/main/java/com/amazon/opendistroforelasticsearch/knn/index/KNNVectorFieldMapper.java#L349
3. https://github.com/elastic/elasticsearch/blob/7.10/server/src/main/java/org/elasticsearch/index/mapper/ParametrizedFieldMapper.java#L104

For basic use cases where the mapping does not change after index creation, this bug will not make any impact. However, this bug may impact indices with dynamic mappings.

Reproduction steps:

curl -X PUT "localhost:9200/myindex" -H 'Content-Type: application/json' -d'
{
  "settings" : {
      "index": {
        "knn": "true",
        "knn.space_type": "cosinesimil"
      }
  },
  "mappings": {
    "properties": {
      "embedding": {
        "type": "knn_vector",
        "dimension": 2
      },
      "metadata" : {
        "type" : "object",
        "dynamic" : true
      }
    }
  }
}
'

curl -X POST "localhost:9200/myindex/_doc" -H 'Content-Type: application/json' -d'
{
  "metadata": {"test": "object"},
  "embedding":  [1, 2]
}
'

curl -X POST "localhost:9200/myindex/_search" -H 'Content-Type: application/json' -d'
{
  "size" : 1,
  "query": {
    "knn": {
      "embedding": {
        "vector": [3, 4],
        "k": 1
      }
    }
  }
}
'

This change should resolve the problem. In the future, we should switch these algorithm parameters from index settings to mapping parameters (tracking this in #282).

Impacted ODFE Version:

  1. 1.12
  2. 1.11
  3. 1.10

Related issues:

  1. Make HNSW algorithm parameters mapping parameters #282
  2. Wrong kNN similarity score when indexing with dynamic object  #287
  3. https://discuss.opendistrocommunity.dev/t/cosine-similarity-formula/4390/12
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Issue that exposes a bug
Projects
None yet
Development

No branches or pull requests

1 participant