Myluzh Blog

Migrating ElasticSearch Data with elasticdump

Published: 2025-2-10  Author: myluzh  Category: NOTE


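0x01 Introduction
I have an ES instance used for logs, and the data on the old ES needs to be migrated to a new ES.
elasticdump is a common choice for migrating Elasticsearch (ES) data. It is an open-source tool for exporting data from, and importing data into, an Elasticsearch cluster.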

0x02 Deploying elasticdump
1. Since my environment is a k8s cluster, I simply start a pod. The image is node:14, which ships with npm, so installing elasticdump is easy.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: es-migration
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: es-migration
  template:
    metadata:
      labels:
        app: es-migration
    spec:
      containers:
        - name: es-migration
          image: 172.30.82.223:5443/base/node:14
          command: ["sleep", "infinity"]
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
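The manifest above could be applied, and the resulting pod entered, roughly as follows. This is only a minimal sketch: the filename es-migration.yaml and the helper names deploy_migration_pod and enter_migration_pod are assumptions made for illustration, while the namespace and Deployment name are taken from the manifest.

```shell
#!/bin/bash
# Hypothetical helpers. es-migration.yaml is an assumed filename for the
# manifest shown above; adjust it to wherever the manifest was saved.

deploy_migration_pod() {
    local manifest="${1:-es-migration.yaml}"
    # Create (or update) the Deployment, then wait for the rollout to finish.
    kubectl apply -f "$manifest" &&
        kubectl -n default rollout status deployment/es-migration
}

enter_migration_pod() {
    # Open an interactive shell in a pod that belongs to the Deployment.
    kubectl -n default exec -it deployment/es-migration -- bash
}
```

A typical call would be `deploy_migration_pod es-migration.yaml && enter_migration_pod`.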
2. Install elasticdump
After entering the container, install elasticdump:
root@es-migration-5987b859b4-7zjwb:/# npm install -g elasticdump

npm WARN deprecated querystring@0.2.0: The querystring API is considered Legacy. new code should use the URLSearchParams API instead.
npm WARN deprecated lodash.isequal@4.5.0: This package is deprecated. Use require('node:util').isDeepStrictEqual instead.
npm WARN deprecated s3signed@0.1.0: This module is no longer maintained. It is provided as is.
/usr/local/bin/elasticdump -> /usr/local/lib/node_modules/elasticdump/bin/elasticdump
/usr/local/bin/multielasticdump -> /usr/local/lib/node_modules/elasticdump/bin/multielasticdump
npm WARN requestretry@7.1.0 requires a peer of request@2.*.* but none is installed. You must install peer dependencies yourself.

+ elasticdump@6.115.0
added 144 packa

0x03 Migrating the data
1. I wrote a batch migration script; adjust the parameters and it is ready to use.
elasticdump.sh
#!/bin/bash

# Old (source) ES address
SOURCE_ES_ADDR="172.16.12.66:9200"
SOURCE_ES_USER="elastic"
SOURCE_ES_PASS="SaRorwWAC9aSOR6asyBD"

# New (destination) ES address
DEST_ES_ADDR="elasticsearch-logging.base.svc.cluster.local:9200"
DEST_ES_USER="elastic"
DEST_ES_PASS="TmFr38nhOjaMYH4OR2y7"

# Number of objects moved per batch (passed to elasticdump --limit)
MAX_LIMIT=10000

# Fetch all indices, skipping those whose names start with "."
echo "Fetching all indices from $SOURCE_ES_ADDR"
indices=$(curl -s -u "$SOURCE_ES_USER:$SOURCE_ES_PASS" "$SOURCE_ES_ADDR/_cat/indices?h=index" | grep -v '^\.')

# Abort if no indices were found
if [ -z "$indices" ]; then
    echo "No indices found, migration aborted!"
    exit 1
fi

echo "Found the following indices:"
echo "$indices"

# Start the migration
for index in $indices; do
    echo "Migrating index: $index"

    # 1. Migrate the index mapping
    echo "Migrating mapping: $index"
    elasticdump --input="http://$SOURCE_ES_USER:$SOURCE_ES_PASS@$SOURCE_ES_ADDR/$index" \
                --output="http://$DEST_ES_USER:$DEST_ES_PASS@$DEST_ES_ADDR/$index" \
                --type=mapping \
                --limit=$MAX_LIMIT || { echo "Failed to migrate mapping: $index"; continue; }

    # 2. Migrate the index data
    echo "Migrating data: $index"
    elasticdump --input="http://$SOURCE_ES_USER:$SOURCE_ES_PASS@$SOURCE_ES_ADDR/$index" \
                --output="http://$DEST_ES_USER:$DEST_ES_PASS@$DEST_ES_ADDR/$index" \
                --type=data \
                --limit=$MAX_LIMIT || { echo "Failed to migrate data: $index"; continue; }

    echo "Index migrated successfully: $index"
done
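The script embeds the user name and password directly in the --input and --output URLs. If a password contained characters such as "@", "/" or ":", the URL would presumably no longer parse as intended, so percent-encoding the credentials first is a reasonable precaution. The sketch below assumes elasticdump treats these endpoints as standard URLs; urlencode and es_url are hypothetical helper names, not part of elasticdump, and the encoder is only meant for ASCII input.

```shell
#!/bin/bash
# Percent-encode an ASCII string so it is safe inside the user:password
# part of a URL. Unreserved characters (letters, digits, . ~ _ -) pass through.
urlencode() {
    local LC_ALL=C
    local s="$1" out="" c i
    for (( i = 0; i < ${#s}; i++ )); do
        c="${s:i:1}"
        case "$c" in
            [a-zA-Z0-9.~_-]) out+="$c" ;;
            *) printf -v c '%%%02X' "'$c"; out+="$c" ;;
        esac
    done
    printf '%s\n' "$out"
}

# Hypothetical helper: build an endpoint URL in the form the script uses.
# usage: es_url USER PASS HOST:PORT INDEX
es_url() {
    printf 'http://%s:%s@%s/%s\n' "$(urlencode "$1")" "$(urlencode "$2")" "$3" "$4"
}
```

Inside the loop, `--input="$(es_url "$SOURCE_ES_USER" "$SOURCE_ES_PASS" "$SOURCE_ES_ADDR" "$index")"` could then replace the hand-built URL.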

2. Run the migration
root@es-migration-5987b859b4-7zjwb:/# chmod +x elasticdump.sh
root@es-migration-5987b859b4-7zjwb:/# nohup bash elasticdump.sh > elasticdump.log 2>&1 &
[1] 354
root@es-migration-5987b859b4-7zjwb:/# cat elasticdump.log
Fetching all indices from 172.16.12.66:9200
Found the following indices: dapr-2025.01.21 dapr-2025.02.10 ....
Migrating index: dapr-2025.01.21
Migrating mapping: dapr-2025.01.21
Mon, 10 Feb 2025 01:57:47 GMT | starting dump
Mon, 10 Feb 2025 01:57:47 GMT | got 1 objects from source elasticsearch (offset: 0)
Mon, 10 Feb 2025 01:57:48 GMT | sent 1 objects to destination elasticsearch, wrote 1
Mon, 10 Feb 2025 01:57:48 GMT | got 0 objects from source elasticsearch (offset: 1)
Mon, 10 Feb 2025 01:57:48 GMT | Total Writes: 1
Mon, 10 Feb 2025 01:57:48 GMT | dump complete
Migrating data: dapr-2025.01.21
Mon, 10 Feb 2025 01:57:49 GMT | starting dump
Mon, 10 Feb 2025 01:57:50 GMT | got 10000 objects from source elasticsearch (offset: 0)
Mon, 10 Feb 2025 01:57:51 GMT | sent 10000 objects to destination elasticsearch, wrote 10000
Mon, 10 Feb 2025 01:57:52 GMT | got 10000 objects from source elasticsearch (offset: 10000)
Mon, 10 Feb 2025 01:57:53 GMT | sent 10000 objects to destination elasticsearch, wrote 10000
Mon, 10 Feb 2025 01:57:54 GMT | got 10000 objects from source elasticsearch (offset: 20000)
...
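Once the log shows an index as complete, one way to check the result is to compare document counts on both sides through the Elasticsearch _count API. This is a minimal sketch and not part of the original script: parse_count, index_count and report_counts are hypothetical helper names, and the sed expression assumes the compact single-line JSON that _count returns by default.

```shell
#!/bin/bash
# Extract the "count" field from a _count response read on stdin,
# e.g. {"count":30000,"_shards":{...}}.
parse_count() {
    sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: fetch the document count of one index.
# usage: index_count USER PASS HOST:PORT INDEX
index_count() {
    curl -s -u "$1:$2" "http://$3/$4/_count" | parse_count
}

# Compare two counts and report the result for an index.
# usage: report_counts INDEX SOURCE_COUNT DEST_COUNT
report_counts() {
    if [ -n "$2" ] && [ "$2" = "$3" ]; then
        echo "OK $1: $2 documents"
    else
        echo "MISMATCH $1: source=$2 dest=$3"
        return 1
    fi
}
```

For example, `report_counts "$index" "$(index_count "$SOURCE_ES_USER" "$SOURCE_ES_PASS" "$SOURCE_ES_ADDR" "$index")" "$(index_count "$DEST_ES_USER" "$DEST_ES_PASS" "$DEST_ES_ADDR" "$index")"`. The destination count may lag briefly until the index refreshes, and a source index that is still receiving logs will keep growing, so a small mismatch is not necessarily a failed migration.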



Tags: elastic es elasticdump
