Compiling DataX Plugins

I built doriswriter and clickhousewriter for my own needs. Anyone with the same need can download them directly from the cloud drive.

https://wwsx.lanzouw.com/b00y9w9ovg Password: byxl

Prerequisites: JDK 8 or later, with Maven installed.
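
A quick way to confirm both prerequisites before starting is sketched below. The helper name require_tool is my own invention, and the sketch assumes a POSIX shell (on Windows, something like Git Bash). It only checks that the commands are on PATH, not which versions they are.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named command is on PATH,
# otherwise print a message to stderr and fail.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "missing required tool: $1" >&2
    return 1
}

# Usage (run manually):
#   require_tool java && java -version   # should report 1.8 or later
#   require_tool mvn  && mvn -v
```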

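Getting the DataX source code

There are two ways to do this: clone the DataX project with git, or download the source archive that the project provides.

  1. Clone the source: create an empty folder, then run the following commands inside it: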
git clone https://github.com/alibaba/DataX.git
cd DataX
  2. Download the source: download the archive, extract it, and open the resulting folder.

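Building the whole project

Run: mvn clean package. This command removes the previous build output (clean) and repackages the project (package). It walks through every module, runs the tests, and produces the final jar files and other build artifacts. This step needs a working network connection, because Maven downloads the required dependencies from remote repositories during the build.

If all goes well, that is the end of it, but of course it is not that simple. What follows is a pile of errors, and in plugins we do not even care about. So can we just use the prebuilt DataX that the Atguigu (尚硅谷) tutorial provides?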

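Building only the plugin you need

Note: the Doris website builds the doriswriter plugin by running shell commands on Linux. I did not feel like setting up another Linux environment, so I dropped that route. Linux users can go straight to the relevant section of the Doris website.

The prebuilt archive from Atguigu is convenient, but the doriswriter plugin it ships is a very early version and is not compatible with the Doris 2.1.7 I am running. So I need to build the doriswriter plugin myself and then replace the jar in the Atguigu archive with the newly built one.

Running mvn clean package -DskipTests in the doriswriter directory should produce the required jar.

Instead, I ran into the following error: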

[ERROR] Failed to execute goal on project doriswriter: Could not resolve dependencies for project com.alibaba.datax:doriswriter:jar:0.0.1-SNAPSHOT: The following artifacts could not be resolved: com.alibaba.datax:datax-common:jar:0.0.1-SNAPSHOT, com.alibaba.datax:plugin-rdbms-util:jar:0.0.1-SNAPSHOT: Could not find artifact com.alibaba.datax:datax-common:jar:0.0.1-SNAPSHOT in central (https://maven.aliyun.com/repository/central) -> [Help 1]

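The error shows that Maven could not find the dependencies datax-common and plugin-rdbms-util while building the doriswriter plugin. They are internal modules of the DataX project and are not published to the central repository.

So the modules that the doriswriter plugin depends on (datax-common and plugin-rdbms-util) must first be installed into the local Maven repository.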

cd common
mvn clean install -DskipTests

cd ../plugin-rdbms-util
mvn clean install -DskipTests

cd ../doriswriter/
mvn clean package -DskipTests
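
As an alternative to the three separate steps above, Maven's reactor options can build a module together with the in-project modules it depends on, in a single run from the project root. The sketch below wraps that in a helper. The function name build_plugin is hypothetical, and it assumes the plugin is listed as a module in the root pom.xml. I have not checked this route against every DataX release, so treat it as something to try rather than a guaranteed fix.

```shell
#!/bin/sh
# Hypothetical helper: build one plugin module plus the in-project
# modules it depends on. Run it from the DataX root directory.
#   -pl <module>  restrict the build to the listed module(s)
#   -am           also build the modules those modules depend on
build_plugin() {
    mvn -pl "$1" -am clean package -DskipTests
}

# Usage (from the DataX root):
#   build_plugin doriswriter
```

Because the dependencies are built in the same reactor run, they should not need to be installed into the local repository first.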

Console output from the three builds:
D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\common>mvn clean install -DskipTests
[INFO] Scanning for projects...
[INFO]
[INFO] -------------------< com.alibaba.datax:datax-common >-------------------
[INFO] Building datax-common 0.0.1-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ datax-common ---
[INFO] Deleting D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\common\target
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ datax-common ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 6 resources
[INFO]
[INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ datax-common ---
[INFO] Compiling 45 source files to D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\common\target\classes
[INFO]
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ datax-common ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\common\src\test\resources
[INFO]
[INFO] --- maven-compiler-plugin:2.3.2:testCompile (default-testCompile) @ datax-common ---
[INFO] No sources to compile
[INFO]
[INFO] --- maven-surefire-plugin:2.12.4:test (default-test) @ datax-common ---
[INFO] Tests are skipped.
[INFO]
[INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ datax-common ---
[INFO] Building jar: D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\common\target\datax-common-0.0.1-SNAPSHOT.jar
[INFO]
[INFO] --- maven-install-plugin:2.4:install (default-install) @ datax-common ---
[INFO] Installing D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\common\target\datax-common-0.0.1-SNAPSHOT.jar to D:\maven-3.8.6\respository\com\alibaba\datax\datax-common\0.0.1-SNAPSHOT\datax-common-0.0.1-SNAPSHOT.jar
[INFO] Installing D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\common\pom.xml to D:\maven-3.8.6\respository\com\alibaba\datax\datax-common\0.0.1-SNAPSHOT\datax-common-0.0.1-SNAPSHOT.pom
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  2.041 s
[INFO] Finished at: 2025-02-03T17:26:33+08:00
[INFO] ------------------------------------------------------------------------

D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\common>cd ../plugin-rdbms-util

D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\plugin-rdbms-util>mvn clean install -DskipTests
[INFO] Scanning for projects...
[INFO]
[INFO] ----------------< com.alibaba.datax:plugin-rdbms-util >-----------------
[INFO] Building plugin-rdbms-util 0.0.1-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
Downloading from spring: https://maven.aliyun.com/repository/spring/com/alibaba/datax/datax-all/0.0.1-SNAPSHOT/maven-metadata.xml
Downloading from central: https://maven.aliyun.com/repository/central/com/alibaba/datax/datax-all/0.0.1-SNAPSHOT/maven-metadata.xml
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ plugin-rdbms-util ---
[INFO] Deleting D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\plugin-rdbms-util\target
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ plugin-rdbms-util ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 0 resource
[INFO]
[INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ plugin-rdbms-util ---
[INFO] Compiling 25 source files to D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\plugin-rdbms-util\target\classes
[INFO]
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ plugin-rdbms-util ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\plugin-rdbms-util\src\test\resources
[INFO]
[INFO] --- maven-compiler-plugin:2.3.2:testCompile (default-testCompile) @ plugin-rdbms-util ---
[INFO] No sources to compile
[INFO]
[INFO] --- maven-surefire-plugin:2.12.4:test (default-test) @ plugin-rdbms-util ---
[INFO] Tests are skipped.
[INFO]
[INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ plugin-rdbms-util ---
[INFO] Building jar: D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\plugin-rdbms-util\target\plugin-rdbms-util-0.0.1-SNAPSHOT.jar
[INFO]
[INFO] --- maven-install-plugin:2.4:install (default-install) @ plugin-rdbms-util ---
[INFO] Installing D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\plugin-rdbms-util\target\plugin-rdbms-util-0.0.1-SNAPSHOT.jar to D:\maven-3.8.6\respository\com\alibaba\datax\plugin-rdbms-util\0.0.1-SNAPSHOT\plugin-rdbms-util-0.0.1-SNAPSHOT.jar
[INFO] Installing D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\plugin-rdbms-util\pom.xml to D:\maven-3.8.6\respository\com\alibaba\datax\plugin-rdbms-util\0.0.1-SNAPSHOT\plugin-rdbms-util-0.0.1-SNAPSHOT.pom
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  2.183 s
[INFO] Finished at: 2025-02-03T17:27:38+08:00
[INFO] ------------------------------------------------------------------------

D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\plugin-rdbms-util>cd ../doriswriter/

D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\doriswriter>mvn clean package -DskipTests
[INFO] Scanning for projects...
[INFO]
[INFO] -------------------< com.alibaba.datax:doriswriter >--------------------
[INFO] Building doriswriter 0.0.1-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ doriswriter ---
[INFO] Deleting D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\doriswriter\target
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ doriswriter ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 0 resource
[INFO]
[INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ doriswriter ---
[INFO] Compiling 13 source files to D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\doriswriter\target\classes
[INFO]
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ doriswriter ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\doriswriter\src\test\resources
[INFO]
[INFO] --- maven-compiler-plugin:2.3.2:testCompile (default-testCompile) @ doriswriter ---
[INFO] No sources to compile
[INFO]
[INFO] --- maven-surefire-plugin:2.12.4:test (default-test) @ doriswriter ---
[INFO] Tests are skipped.
[INFO]
[INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ doriswriter ---
[INFO] Building jar: D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\doriswriter\target\doriswriter-0.0.1-SNAPSHOT.jar
[INFO]
[INFO] --- maven-assembly-plugin:2.2-beta-5:single (dwzip) @ doriswriter ---
[INFO] Reading assembly descriptor: src/main/assembly/package.xml
[INFO] Copying files to D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\doriswriter\target\datax
[WARNING] Assembly file: D:\DeskTop\煊\笔\DataX\DataX-datax_v202309\doriswriter\target\datax is not a regular file (it may be a directory). It cannot be attached to the project build for installation or deployment.
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  2.912 s
[INFO] Finished at: 2025-02-03T17:28:12+08:00
[INFO] ------------------------------------------------------------------------

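The build succeeded, and the generated JAR file can now be found under doriswriter/target.

Then replace the old plugin. The old plugin is located in datax\datax\plugin\writer\doriswriter. Note that this is not datax\plugin\writer\doriswriter.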
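
The replacement can be scripted so that the old jar is kept as a backup. This is a minimal sketch for a POSIX shell. The function name replace_plugin_jar and the .bak suffix are my own choices, and the target path in the usage comment is a placeholder for your own layout.

```shell
#!/bin/sh
# Hypothetical helper: replace_plugin_jar <new_jar> <plugin_dir>
# Renames every doriswriter-*.jar already in <plugin_dir> to *.jar.bak
# (so it no longer ends in .jar), then copies the freshly built jar in.
replace_plugin_jar() (
    new_jar=$1
    plugin_dir=$2
    [ -f "$new_jar" ] || { echo "no such jar: $new_jar" >&2; exit 1; }
    [ -d "$plugin_dir" ] || { echo "no such directory: $plugin_dir" >&2; exit 1; }
    for old in "$plugin_dir"/doriswriter-*.jar; do
        # The glob stays literal when nothing matches, so test for a real file.
        if [ -f "$old" ]; then
            mv "$old" "$old.bak"
        fi
    done
    cp "$new_jar" "$plugin_dir"/
)

# Usage (from the DataX source root; adjust the target path):
#   replace_plugin_jar doriswriter/target/doriswriter-0.0.1-SNAPSHOT.jar \
#       /path/to/datax/datax/plugin/writer/doriswriter
```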

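Additional error

In actual use I also hit an error saying that the class com.alibaba.fastjson2.JSON could not be found. This kind of error is usually caused by a missing dependency or a version mismatch.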

DataX runtime error: Exception in thread "Thread-1" java.lang.NoClassDefFoundError: com/alibaba/fastjson2/JSON
	at com.alibaba.datax.plugin.writer.doriswriter.DorisStreamLoadObserver.put(DorisStreamLoadObserver.java:186)
	at com.alibaba.datax.plugin.writer.doriswriter.DorisStreamLoadObserver.streamLoad(DorisStreamLoadObserver.java:63)
	at com.alibaba.datax.plugin.writer.doriswriter.DorisWriterManager.asyncFlush(DorisWriterManager.java:163)
	at com.alibaba.datax.plugin.writer.doriswriter.DorisWriterManager.access$000(DorisWriterManager.java:19)
	at com.alibaba.datax.plugin.writer.doriswriter.DorisWriterManager$1.run(DorisWriterManager.java:134)
	at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.ClassNotFoundException: com.alibaba.fastjson2.JSON
	at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
	... 6 more
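
To check which jars, if any, actually contain the missing class, the JDK's jar tool can list each archive's entries. The sketch below assumes a POSIX shell with jar on PATH. The helper names class_to_entry and find_class_in_jars are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: turn a fully qualified class name into the
# entry path it has inside a jar, e.g. a.b.C -> a/b/C.class
class_to_entry() {
    printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Hypothetical helper: find_class_in_jars <class_name> <dir>
# Prints every jar in <dir> whose listing contains the class.
find_class_in_jars() (
    entry=$(class_to_entry "$1")
    for jar_file in "$2"/*.jar; do
        if [ -f "$jar_file" ] && jar tf "$jar_file" | grep -qF "$entry"; then
            echo "$jar_file"
        fi
    done
)

# Usage (run inside the plugin's lib folder):
#   find_class_in_jars com.alibaba.fastjson2.JSON .
```

If nothing is printed, no jar in that folder provides the class.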

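datax\datax\plugin\writer\doriswriter\lib holds the libraries the Doris plugin needs, and it contains fastjson-1.1.46.sec10.jar. Sure enough, it is a dependency version problem. Download fastjson2 (version 2.0.3 in the command below) and use it to replace that jar. Run the command inside the lib folder and it will download the jar into the current directory. Once the download finishes, remember to delete the original jar or change its file extension.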

wget https://repo1.maven.org/maven2/com/alibaba/fastjson2/fastjson2/2.0.3/fastjson2-2.0.3.jar
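
The download URL follows the standard Maven repository layout: the groupId with dots turned into slashes, then the artifactId, the version, and a file named artifactId-version.jar. The sketch below builds such a URL, which helps when a different version is wanted or when wget is not available and curl has to be used instead. The function name maven_jar_url is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: maven_jar_url <groupId> <artifactId> <version>
# Prints the jar's URL on Maven Central (repo1.maven.org).
maven_jar_url() (
    group_path=$(printf '%s' "$1" | tr '.' '/')
    printf 'https://repo1.maven.org/maven2/%s/%s/%s/%s-%s.jar\n' \
        "$group_path" "$2" "$3" "$2" "$3"
)

# Usage:
#   curl -fLO "$(maven_jar_url com.alibaba.fastjson2 fastjson2 2.0.3)"
```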

After that, the job ran normally.
