[Issue] Cross cluster replication is too slow between Opensearch clusters deployed on two separate kubernetes clusters. #1385

VinDataR · 2024-05-06T05:41:06Z

Describe the bug

Environment information:

VERSON: opensearch:2.9.0-release-4.14.0-29.12.2023

We have two opensearch clusters deployed on two different Kubernetes clusters in Azure.

We have enabled cross cluster replication between these two opensearch clusters deployed on two seaprate Kuberenetes clusters.

Whenever we push some documents into indices to the leader cluster and it is taking much longer time to replica.
We have tried with indices which are in KiloBytes as well as larger indices which are in GigaBytes.

In both cases, there is a definite lag we could observe from the time of pushing the documents in the leader site and the same documents being available in the follower site.

We had pushed an index 'projecttask' on the leader site. These are stats from leader site.

health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open projecttask atOBXJVWT_6wp7cwV_Y0Mw 3 2 4657988 1786411 51.3gb 17.1gb

When we kept monitoring the remote site, the replication status on follower site was showing as BOOTSTRAPPING for a long time. Now status is showing as SYNCING but the time duration is taking a long time.

Posted output of replication status API at several times yesterday to monitor the status. If you can see the bytes_percent and files_percent, it is progressing very gradually. Seeking help on how to troubleshoot this further and to speed up the replication.

curl -XGET -k -u ':' 'http://localhost:9200/_plugins/_replication/projecttask/_status?pretty'

{
  "status" : "BOOTSTRAPPING",
  "reason" : "User initiated",
  "leader_alias" : "my-connection-test",
  "leader_index" : "projecttask",
  "follower_index" : "projecttask",
  "syncing_details" : {
    "leader_checkpoint" : 2161975,
    "follower_checkpoint" : 2161975,
    "seq_no" : 2161975
  },
  "bootstrap_details" : {
    "bytes_total" : 12246723560,
    "bytes_recovered" : 62150110,
    "bytes_percent" : 0.50800186,
    "files_total" : 419,
    "files_recovered" : 12,
    "files_percent" : 2.8643699,
    "start_time" : 1714390992205,
    "running_time" : 209247
  }
}

{
  "status" : "BOOTSTRAPPING",
  "reason" : "User initiated",
  "leader_alias" : "my-connection-test",
  "leader_index" : "projecttask",
  "follower_index" : "projecttask",
  "syncing_details" : {
    "leader_checkpoint" : 2161975,
    "follower_checkpoint" : 2161975,
    "seq_no" : 2161975
  },
  "bootstrap_details" : {
    "bytes_total" : 12246723560,
    "bytes_recovered" : 62150110,
    "bytes_percent" : 0.50800186,
    "files_total" : 419,
    "files_recovered" : 12,
    "files_percent" : 2.8643699,
    "start_time" : 1714390992205,
    "running_time" : 229621
  }
}

{
  "status" : "BOOTSTRAPPING",
  "reason" : "User initiated",
  "leader_alias" : "my-connection-test",
  "leader_index" : "projecttask",
  "follower_index" : "projecttask",
  "syncing_details" : {
    "leader_checkpoint" : 2161975,
    "follower_checkpoint" : 2161975,
    "seq_no" : 2161975
  },
  "bootstrap_details" : {
    "bytes_total" : 12246723560,
    "bytes_recovered" : 166715937,
    "bytes_percent" : 1.361734,
    "files_total" : 419,
    "files_recovered" : 29,
    "files_percent" : 6.9193783,
    "start_time" : 1714390992205,
    "running_time" : 1846339
  }
}

{
  "status" : "BOOTSTRAPPING",
  "reason" : "User initiated",
  "leader_alias" : "my-connection-test",
  "leader_index" : "projecttask",
  "follower_index" : "projecttask",
  "syncing_details" : {
    "leader_checkpoint" : 2161975,
    "follower_checkpoint" : 2161975,
    "seq_no" : 2161975
  },
  "bootstrap_details" : {
    "bytes_total" : 12246723560,
    "bytes_recovered" : 166715937,
    "bytes_percent" : 1.361734,
    "files_total" : 419,
    "files_recovered" : 29,
    "files_percent" : 6.9193783,
    "start_time" : 1714390992205,
    "running_time" : 2521097
  }
}

{
  "status" : "BOOTSTRAPPING",
  "reason" : "User initiated",
  "leader_alias" : "my-connection-test",
  "leader_index" : "projecttask",
  "follower_index" : "projecttask",
  "syncing_details" : {
    "leader_checkpoint" : 2161975,
    "follower_checkpoint" : 2161975,
    "seq_no" : 2161975
  },
  "bootstrap_details" : {
    "bytes_total" : 12246723560,
    "bytes_recovered" : 282059954,
    "bytes_percent" : 2.3038514,
    "files_total" : 419,
    "files_recovered" : 31,
    "files_percent" : 7.391076,
    "start_time" : 1714390992205,
    "running_time" : 2800792
  }
}

{
  "status" : "SYNCING",
  "reason" : "User initiated",
  "leader_alias" : "my-connection-test",
  "leader_index" : "projecttask",
  "follower_index" : "projecttask",
  "syncing_details" : {
    "leader_checkpoint" : 6474917,
    "follower_checkpoint" : 6474917,
    "seq_no" : 6474917
  }
}

 curl -XGET -vkguadmin:admin "http://localhost:9200/_cluster/settings?pretty"
Note: Unnecessary use of -X or --request, GET is already inferred.
*   Trying 127.0.0.1:9200...
* Connected to localhost (127.0.0.1) port 9200
* Server auth using Basic with user 'admin'
> GET /_cluster/settings?pretty HTTP/1.1
> Host: localhost:9200
> Authorization: Basic YWRtaW46YWRtaW4=
> User-Agent: curl/8.3.0
> Accept: */*
>
< HTTP/1.1 200 OK
< content-type: application/json; charset=UTF-8
< content-length: 471
<
{
  "persistent" : {
    "cluster" : {
      "remote" : {
        "my-connection-test" : {
          "mode" : "proxy",
          "proxy_address" : "*.*.*.100:9300"
        }
      }
    },
    "plugins" : {
      "index_state_management" : {
        "template_migration" : {
          "control" : "-1"
        }
      }
    },
    "logger" : {
      "org" : {
        "opensearch" : {
          "transport" : "DEBUG"
        }
      }
    }
  },
  "transient" : { }
}

curl -XGET -vkguadmin:admin "http://localhost:9200/_plugins/_replication/autofollow_stats

{
  "num_success_start_replication" : 2,
  "num_failed_start_replication" : 0,
  "num_failed_leader_calls" : 0,
  "failed_indices" : [ ],
  "autofollow_stats" : [
    {
      "name" : "replicate-all-rule",
      "pattern" : "*",
      "num_success_start_replication" : 2,
      "num_failed_start_replication" : 0,
      "num_failed_leader_calls" : 0,
      "failed_indices" : [ ],
      "last_execution_time" : 1714468771677
    }
  ]
curl -XGET -vkguadmin:admin "http://localhost:9200/_plugins/_replication/follower_stats"
Note: Unnecessary use of -X or --request, GET is already inferred.
*   Trying 127.0.0.1:9200...
* Connected to localhost (127.0.0.1) port 9200
* Server auth using Basic with user 'admin'
> GET /_plugins/_replication/follower_stats HTTP/1.1
> Host: localhost:9200
> Authorization: Basic YWRtaW46YWRtaW4=
> User-Agent: curl/8.3.0
> Accept: */*
>
< HTTP/1.1 200 OK
< content-type: application/json; charset=UTF-8
< content-length: 1844
<
{
"num_syncing_indices" : 4,
"num_bootstrapping_indices" : 0,
"num_paused_indices" : 2,
"num_failed_indices" : 1,
"num_shard_tasks" : 6,
"num_index_tasks" : 4,
"operations_written" : 0,
"operations_read" : 0,
"failed_read_requests" : 60,
"throttled_read_requests" : 0,
"failed_write_requests" : 0,
"throttled_write_requests" : 0,
"follower_checkpoint" : 6475327,
"leader_checkpoint" : 6475327,
"total_write_time_millis" : 0,
"index_stats" : {
  "security-auditlog-2024.04.08" : {
    "operations_written" : 0,
    "operations_read" : 0,
    "failed_read_requests" : 11,
    "throttled_read_requests" : 0,
    "failed_write_requests" : 0,
    "throttled_write_requests" : 0,
    "follower_checkpoint" : 18,
    "leader_checkpoint" : 18,
    "total_write_time_millis" : 0
  },
  "projecttask" : {
    "operations_written" : 0,
    "operations_read" : 0,
    "failed_read_requests" : 8,
    "throttled_read_requests" : 0,
    "failed_write_requests" : 0,
    "throttled_write_requests" : 0,
    "follower_checkpoint" : 6474917,
    "leader_checkpoint" : 6474917,
    "total_write_time_millis" : 0
  },
  "test-01" : {
    "operations_written" : 0,
    "operations_read" : 0,
    "failed_read_requests" : 9,
    "throttled_read_requests" : 0,
    "failed_write_requests" : 0,
    "throttled_write_requests" : 0,
    "follower_checkpoint" : 392,
    "leader_checkpoint" : 392,
    "total_write_time_millis" : 0
  },
  "auto-01" : {
    "operations_written" : 0,
    "operations_read" : 0,
    "failed_read_requests" : 32,
    "throttled_read_requests" : 0,
    "failed_write_requests" : 0,
    "throttled_write_requests" : 0,
    "follower_checkpoint" : 0,
    "leader_checkpoint" : 0,
    "total_write_time_millis" : 0
  }
}

Related component

Plugins

To Reproduce

Enable autofollow in follower site.
Create new index in leader site.
Check for index replication in follower site.
Check for Replication status which is in bootstrapping mode for the test index and it takes longer for the entire data to get replicated.

Expected behavior

The time taken to replicate the leader index to the follower site should not take hours.

Additional Details

Plugins
Please list all plugins currently enabled.

Screenshots
If applicable, add screenshots to help explain your problem.

Host/Environment (please complete the following information):

OS: [e.g. iOS]
Version [e.g. 22]

Additional context
Add any other context about the problem here.

The text was updated successfully, but these errors were encountered:

VinDataR · 2024-05-06T05:43:23Z

We deleted the follower index again and it got recreated but took several hours to replicate 14GB of index data to the secondary site. Now the status of the replication status is showing as below;

{
  "status" : "PAUSED",
  "reason" : "AutoPaused:  + [[projecttask-clone][0] - org.opensearch.index.seqno.RetentionLeaseNotFoundException - \"retention lease with ID [replication:dev2-opensearch2x-cluster:uWToVkHzTomO2pkkqmbMSg:[projecttask-clone][0]] not found\"], ",
  "leader_alias" : "my-connection-test",
  "leader_index" : "projecttask-clone",
  "follower_index" : "projecttask-clone"
}

VinDataR · 2024-05-13T05:09:45Z

Can someone please acknowledge and help on this issue.

ankitkala · 2024-05-14T03:45:14Z

The replication isn't expected to be this slow. For all replicating indices, we usually see data replication within 1 minute(most of the time under 20 seconds).
For bootstrapping, several hours to replicate 14GB seems too high. From our testing, we had replicated 1TB data(across region) under 20 minutes without any additional changes.

There isn't a easy way to figure out why performance is slow here. My suspicion is that either:

the leader cluster is slow & taking too long to respond.
or the network is slow and file copy is taking too long.

Can you check the system metrics(CPU/JVM/Disk) for leader's data nodes?
Can you revisit the network to ensure that we aren't hitting the bandwidth limits?

VinDataR · 2024-05-14T08:45:18Z

@ankitkala ,

We were able to deduce the source of issue to the number of socket connections to the remote cluster.

 $ curl -XGET -k -H 'Content-Type: application/json' -u '****:******' 'http://localhost:9200/_remote/info?pretty'

{
  "Dev2-to-uat" : {
    "connected" : true,
    "mode" : "proxy",
    "proxy_address" : "*.*.*.*:9300",
    "server_name" : "",
    "num_proxy_sockets_connected" : 18,
    "max_proxy_socket_connections" : 18,
    "initial_connect_timeout" : "3m",
    "skip_unavailable" : false
  }
}

Here we could see num_proxy_sockets_connected has reached the value of num_proxy_sockets_connected which is 18. We could the replication tasks also stuck at this number;

$ curl -XGET -u ****:***** 'http://localhost:9200/_cat/tasks?pretty'
cluster:admin/plugins/replication/autofollow[c] VvoF6xL_RaiFJ9lxCS0aDg:507516  cluster:1                      persistent 1715341742736 11:49:02 2.9d        *.*.14.202 platform-opensearch2x-data-1
cluster:indices/admin/replication[c]            VvoF6xL_RaiFJ9lxCS0aDg:507533  cluster:2                      persistent 1715341742993 11:49:02 2.9d        *.*.14.202 platform-opensearch2x-data-1
cluster:indices/shards/replication[c]           VvoF6xL_RaiFJ9lxCS0aDg:507579  cluster:3                      persistent 1715341743980 11:49:03 2.9d        *.*.14.202 platform-opensearch2x-data-1
cluster:indices/admin/replication[c]            L-gJPsFhSLGaUig2CaNp6w:478218  cluster:4                      persistent 1715342823679 12:07:03 2.9d        *.*.9.14   platform-opensearch2x-data-4
cluster:indices/shards/replication[c]           aP9XS6j7TBm0o34pqp5zMg:472346  cluster:5                      persistent 1715343888989 12:24:48 2.9d        *.*.11.35  platform-opensearch2x-data-0
cluster:indices/admin/replication[c]            aP9XS6j7TBm0o34pqp5zMg:1196554 cluster:6                      persistent 1715581542551 06:25:42 5h          *.*.11.35  platform-opensearch2x-data-0
cluster:indices/shards/replication[c]           VvoF6xL_RaiFJ9lxCS0aDg:1326777 cluster:7                      persistent 1715581543309 06:25:43 5h          *.*.14.202 platform-opensearch2x-data-1
cluster:indices/admin/replication[c]            GQ_QGxxEQBizEIv8VCbeng:1177761 cluster:9                      persistent 1715586495397 07:48:15 3.6h        *.*.16.104 platform-opensearch2x-data-2
cluster:indices/shards/replication[c]           3_Ir3geFTpCrV1AirO8g_A:1271639 cluster:13                     persistent 1715586686679 07:51:26 3.6h        *.*.9.112  platform-opensearch2x-data-3
cluster:indices/admin/replication[c]            3_Ir3geFTpCrV1AirO8g_A:1290991 cluster:14                     persistent 1715592123245 09:22:03 2h          *.*.9.112  platform-opensearch2x-data-3
cluster:indices/admin/replication[c]            L-gJPsFhSLGaUig2CaNp6w:1312434 cluster:15                     persistent 1715592123638 09:22:03 2h          *.*.9.14   platform-opensearch2x-data-4
cluster:indices/shards/replication[c]           GQ_QGxxEQBizEIv8VCbeng:1201388 cluster:16                     persistent 1715592124436 09:22:04 2h          *.*.16.104 platform-opensearch2x-data-2
cluster:indices/shards/replication[c]           aP9XS6j7TBm0o34pqp5zMg:1227312 cluster:17                     persistent 1715592184310 09:23:04 2h          *.*.11.35  platform-opensearch2x-data-0
cluster:indices/admin/replication[c]            VvoF6xL_RaiFJ9lxCS0aDg:1377296 cluster:18                     persistent 1715598672724 11:11:12 16.6m       *.*.14.202 platform-opensearch2x-data-1
cluster:indices/admin/replication[c]            aP9XS6j7TBm0o34pqp5zMg:1246481 cluster:19                     persistent 1715598673055 11:11:13 16.6m       *.*.11.35  platform-opensearch2x-data-0
cluster:monitor/tasks/lists                     FqZ867VYQfyWO6a6ABrFYA:1112689 -                              transport  1715599674584 11:27:54 984.4micros *.*.10.214 platform-opensearch2x-master-0

We increased the proxy_socket_connections to 100 first and we could all the pending indexes got immediately replicated. (NOTE: cluster settings API didn't accept max_proxy_socket_connections field but accpeted max_proxy_socket_connections)

{
  "persistent" : {
    "cluster" : {
      "remote" : {
        "Dev2-to-uat" : {
          "mode" : "proxy",
          "proxy_address" : "*.*.*.*:9300",
          "max_proxy_socket_connections" : "100"
        }
      }

We coould see new replicationt asks getting created and successful replicaiton of the pending indices.

$ curl -XGET -u admin:admin 'http://localhost:9200/_cat/tasks?pretty'
cluster:admin/plugins/replication/autofollow[c] VvoF6xL_RaiFJ9lxCS0aDg:507516  cluster:1                      persistent 1715341742736 11:49:02 2.9d        *.*.14.202 platform-opensearch2x-data-1
cluster:indices/admin/replication[c]            VvoF6xL_RaiFJ9lxCS0aDg:507533  cluster:2                      persistent 1715341742993 11:49:02 2.9d        *.*.14.202 platform-opensearch2x-data-1
cluster:indices/shards/replication[c]           VvoF6xL_RaiFJ9lxCS0aDg:507579  cluster:3                      persistent 1715341743980 11:49:03 2.9d        *.*.14.202 platform-opensearch2x-data-1
cluster:indices/admin/replication[c]            L-gJPsFhSLGaUig2CaNp6w:478218  cluster:4                      persistent 1715342823679 12:07:03 2.9d        *.*.9.14   platform-opensearch2x-data-4
cluster:indices/shards/replication[c]           aP9XS6j7TBm0o34pqp5zMg:472346  cluster:5                      persistent 1715343888989 12:24:48 2.9d        *.*.11.35  platform-opensearch2x-data-0
cluster:indices/admin/replication[c]            aP9XS6j7TBm0o34pqp5zMg:1196554 cluster:6                      persistent 1715581542551 06:25:42 5.3h        *.*.11.35  platform-opensearch2x-data-0
cluster:indices/shards/replication[c]           VvoF6xL_RaiFJ9lxCS0aDg:1326777 cluster:7                      persistent 1715581543309 06:25:43 5.3h        *.*.14.202 platform-opensearch2x-data-1
cluster:indices/admin/replication[c]            GQ_QGxxEQBizEIv8VCbeng:1177761 cluster:9                      persistent 1715586495397 07:48:15 3.9h        *.*.16.104 platform-opensearch2x-data-2
cluster:indices/shards/replication[c]           3_Ir3geFTpCrV1AirO8g_A:1271639 cluster:13                     persistent 1715586686679 07:51:26 3.8h        *.*.9.112  platform-opensearch2x-data-3
cluster:indices/admin/replication[c]            3_Ir3geFTpCrV1AirO8g_A:1290991 cluster:14                     persistent 1715592123245 09:22:03 2.3h        *.*.9.112  platform-opensearch2x-data-3
cluster:indices/admin/replication[c]            L-gJPsFhSLGaUig2CaNp6w:1312434 cluster:15                     persistent 1715592123638 09:22:03 2.3h        *.*.9.14   platform-opensearch2x-data-4
cluster:indices/shards/replication[c]           GQ_QGxxEQBizEIv8VCbeng:1201388 cluster:16                     persistent 1715592124436 09:22:04 2.3h        *.*.16.104 platform-opensearch2x-data-2
cluster:indices/shards/replication[c]           aP9XS6j7TBm0o34pqp5zMg:1227312 cluster:17                     persistent 1715592184310 09:23:04 2.3h        *.*.11.35  platform-opensearch2x-data-0
cluster:indices/admin/replication[c]            VvoF6xL_RaiFJ9lxCS0aDg:1377296 cluster:18                     persistent 1715598672724 11:11:12 33.3m       *.*.14.202 platform-opensearch2x-data-1
cluster:indices/admin/replication[c]            aP9XS6j7TBm0o34pqp5zMg:1246481 cluster:19                     persistent 1715598673055 11:11:13 33.3m       *.*.11.35  platform-opensearch2x-data-0
cluster:indices/admin/replication[c]            GQ_QGxxEQBizEIv8VCbeng:1227566 cluster:20                     persistent 1715600571948 11:42:51 1.7m        *.*.16.104 platform-opensearch2x-data-2
cluster:indices/admin/replication[c]            3_Ir3geFTpCrV1AirO8g_A:1321408 cluster:21                     persistent 1715600630643 11:43:50 43.3s       *.*.9.112  platform-opensearch2x-data-3
cluster:indices/shards/replication[c]           L-gJPsFhSLGaUig2CaNp6w:1347842 cluster:22                     persistent 1715600631184 11:43:51 42.8s       *.*.9.14   platform-opensearch2x-data-4
cluster:indices/shards/replication[c]           aP9XS6j7TBm0o34pqp5zMg:1252379 cluster:23                     persistent 1715600631643 11:43:51 42.3s       *.*.11.35  platform-opensearch2x-data-0
cluster:indices/shards/replication[c]           VvoF6xL_RaiFJ9lxCS0aDg:1383053 cluster:24                     persistent 1715600633473 11:43:53 40.5s       *.*.14.202 platform-opensearch2x-data-1
cluster:indices/admin/replication[c]            VvoF6xL_RaiFJ9lxCS0aDg:1383188 cluster:25                     persistent 1715600661004 11:44:21 13s         *.*.14.202 platform-opensearch2x-data-1
cluster:indices/shards/replication[c]           VvoF6xL_RaiFJ9lxCS0aDg:1383233 cluster:26                     persistent 1715600661641 11:44:21 12.3s       *.*.14.202 platform-opensearch2x-data-1
cluster:indices/shards/replication[c]           VvoF6xL_RaiFJ9lxCS0aDg:1383299 cluster:27                     persistent 1715600670570 11:44:30 3.4s        *.*.14.202 platform-opensearch2x-data-1
cluster:monitor/tasks/lists                     FqZ867VYQfyWO6a6ABrFYA:1115330 -                              transport  1715600674028 11:44:34 1ms         *.*.10.214 platform-opensearch2x-master-0

However, now when check the remote/info API, the num_proxy_sockets_connections and max_proxy_socket_connections are showing as same number.

$ curl -XGET -k -H 'Content-Type: application/json' -u '****:*****' 'http://localhost:9200/_remote/info?pretty'
{
  "Dev2-to-uat" : {
    "connected" : true,
    "mode" : "proxy",
    "proxy_address" : "*.*.*.*:9300",
    "server_name" : "",
    "max_proxy_socket_connections" : 100,
    "max_proxy_socket_connections" : 100,
    "initial_connect_timeout" : "3m",
    "skip_unavailable" : false
  }
}

Now, when I try to create a new index on the Leader, it is not getting replicated until I increase the proxy_socket_connections to a higher value, thereby allowing the follower to create new replication connections.
I tried digging a lot but couldn't find much documentation on Proxy mode for remote connections in opensearch.

Is there any way we can set just the max_proxy_socket_connections without affecting the num_proxy_socket_connections ?

dblock · 2024-06-24T16:33:37Z

Catch All Triage - 1 2 3 4 5 6

VinDataR added bug Something isn't working untriaged labels May 6, 2024

dblock transferred this issue from opensearch-project/OpenSearch May 6, 2024

dblock removed the untriaged label Jun 24, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Issue] Cross cluster replication is too slow between Opensearch clusters deployed on two separate kubernetes clusters. #1385

[Issue] Cross cluster replication is too slow between Opensearch clusters deployed on two separate kubernetes clusters. #1385

VinDataR commented May 6, 2024

VinDataR commented May 6, 2024

VinDataR commented May 13, 2024

ankitkala commented May 14, 2024

VinDataR commented May 14, 2024 •

edited

Loading

dblock commented Jun 24, 2024

[Issue] Cross cluster replication is too slow between Opensearch clusters deployed on two separate kubernetes clusters. #1385

[Issue] Cross cluster replication is too slow between Opensearch clusters deployed on two separate kubernetes clusters. #1385

Comments

VinDataR commented May 6, 2024

Describe the bug

Related component

To Reproduce

Expected behavior

Additional Details

VinDataR commented May 6, 2024

VinDataR commented May 13, 2024

ankitkala commented May 14, 2024

VinDataR commented May 14, 2024 • edited Loading

dblock commented Jun 24, 2024

VinDataR commented May 14, 2024 •

edited

Loading