https://groups.google.com/forum/#!msg/druid-user/vpAOj9KIoTg/EkivfryCBgAJ
When the job is run as in the URL above, it keeps falling back to the default AWS credentials (it fails to pick up the S3 settings configured in Druid).
What I found: org.apache.hadoop:hadoop-aws:3.0.0-alpha1 pulls in aws-java-sdk 1.10.6, while org.apache.hadoop:hadoop-aws:2.7.2 pulls in aws-java-sdk 1.7.4.
Dependency | Original | Modified |
aws-java-sdk | 1.10.21 | 1.10.21 |
hadoop-client | 2.3.0 | 2.7.2 |
hadoop-aws | X (not included) | 3.0.0-alpha1 |
aws-java-sdk | hadoop-client | hadoop-aws | Result |
1.10.21 | 2.7.2 | 3.0.0-alpha1 | OK |
1.10.21 | 2.3.0 | 3.0.0-alpha1 | OK |
1.7.4 | 2.7.2 | 2.7.2 | X (workaround) |
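To check which aws-java-sdk version a given hadoop-aws release declares, one option (a sketch; assumes curl and access to Maven Central, and note the version may be inherited from the Hadoop parent POM) is to grep the published POM:
$ curl -s https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.7.2/hadoop-aws-2.7.2.pom | grep -B 2 -A 2 aws-java-sdk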
* Built from the 0.9.0 source; the build output comes out versioned as 0.9.1
- indexing-hadoop/src/main/java/io/druid/indexer/JobHelper.java
  add case "s3a": at line 401 (a sketch of the change follows)
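The surrounding method differs between Druid versions, so the following is only a hypothetical sketch (the helper name isS3Scheme is invented, not the actual Druid code); the point is simply to handle "s3a" the same way as the other S3 schemes:

// Hypothetical sketch, not Druid source: the patch adds "s3a" to the
// schemes treated as S3 in JobHelper's scheme switch.
static boolean isS3Scheme(String scheme) {
    switch (scheme) {
        case "s3":
        case "s3n":
        case "s3a": // added case
            return true;
        default:
            return false;
    }
}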
1. Run the pull-deps command from the downloaded source tree
$ java -cp "lib/*" -Ddruid.extensions.directory="extensions" -Ddruid.extensions.hadoopDependenciesDir="hadoop-dependencies" io.druid.cli.Main tools pull-deps --no-default-hadoop -h "org.apache.hadoop:hadoop-client:2.7.2" -h "org.apache.hadoop:hadoop-aws:3.0.0-alpha1"
* If pull-deps fails to fetch the artifacts, copy the jars into that path by hand (a sketch follows)
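A minimal sketch of the manual fallback, using the directory layout pull-deps itself creates (the source paths below are placeholders for wherever the jars were obtained, e.g. a local Maven repository):
$ mkdir -p hadoop-dependencies/hadoop-aws/3.0.0-alpha1
$ cp /path/to/hadoop-aws-3.0.0-alpha1.jar /path/to/aws-java-sdk-*.jar hadoop-dependencies/hadoop-aws/3.0.0-alpha1/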
2. cat conf/_common/common.runtime.properties
# add the s3 extension (append druid-s3-extensions to any existing loadList)
druid.extensions.loadList=["druid-s3-extensions"]
# use s3 as deep storage
druid.storage.type=s3
druid.storage.bucket=your-bucket
druid.storage.baseKey=druid/segments
druid.s3.accessKey=XXXXXXXXXXXXXX
druid.s3.secretKey=XXXXXXXXXXXXXX
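Before restarting, a quick sanity check (assuming the default distribution layout) that the extension directory actually exists:
$ ls extensions/druid-s3-extensions/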
3. Set the peon defaultHadoopCoordinates on the MiddleManager (per the result table above, hadoop-client 2.3.0 or 2.7.2 both work with hadoop-aws 3.0.0-alpha1)
$ cat conf/druid/middleManager/runtime.properties
> druid.indexer.task.defaultHadoopCoordinates=["org.apache.hadoop:hadoop-client:2.3.0", "org.apache.hadoop:hadoop-aws:3.0.0-alpha1"]
4. Start the nodes with hadoop-aws on the classpath
- the command below starts the MiddleManager; the Historical node needs the same classpath addition (see the sketch after it)
$ nohup java `cat conf/druid/middleManager/jvm.config | xargs` -cp conf/druid/_common:conf/druid/middleManager:lib/*:hadoop-dependencies/hadoop-aws/3.0.0-alpha1/* io.druid.cli.Main server middleManager > ~/druid/log/middleManager.log 2>&1 &
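For the Historical node, the analogous command would look like this (a sketch; assumes conf/druid/historical and the log path mirror the MiddleManager layout above):
$ nohup java `cat conf/druid/historical/jvm.config | xargs` -cp conf/druid/_common:conf/druid/historical:lib/*:hadoop-dependencies/hadoop-aws/3.0.0-alpha1/* io.druid.cli.Main server historical > ~/druid/log/historical.log 2>&1 &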
5. Add the S3 access/secret keys to the index task file
"jobProperties" : {
"fs.s3.impl" : "org.apache.hadoop.fs.s3a.S3AFileSystem",
"fs.s3n.impl" : "org.apache.hadoop.fs.s3a.S3AFileSystem",
"fs.s3a.endpoint" : "s3.ap-northeast-2.amazonaws.com",
"fs.s3a.access.key" : "XXXXXXX",
"fs.s3a.secret.key" : "XXXXXXXXX",
"mapreduce.job.classloader": "true"
6. Add hadoopDependencyCoordinates to the index task file (see the combined sketch below)
"hadoopDependencyCoordinates" : ["org.apache.hadoop:hadoop-client:2.7.2", "org.apache.hadoop:hadoop-aws:3.0.0-alpha1"]
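Putting steps 5 and 6 together, a minimal sketch of where the two fields sit in a Hadoop index task spec (dataSchema and ioConfig are left empty here for brevity; jobProperties goes under tuningConfig):

{
  "type" : "index_hadoop",
  "hadoopDependencyCoordinates" : ["org.apache.hadoop:hadoop-client:2.7.2", "org.apache.hadoop:hadoop-aws:3.0.0-alpha1"],
  "spec" : {
    "dataSchema" : {},
    "ioConfig" : {},
    "tuningConfig" : {
      "type" : "hadoop",
      "jobProperties" : {
        "fs.s3.impl" : "org.apache.hadoop.fs.s3a.S3AFileSystem",
        "fs.s3n.impl" : "org.apache.hadoop.fs.s3a.S3AFileSystem",
        "fs.s3a.endpoint" : "s3.ap-northeast-2.amazonaws.com",
        "fs.s3a.access.key" : "XXXXXXX",
        "fs.s3a.secret.key" : "XXXXXXXXX",
        "mapreduce.job.classloader" : "true"
      }
    }
  }
}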
7. Add jets3t properties
: required when the Historical node accesses files over s3a (the file must be on the process classpath, e.g. the conf/_common directory used below)
$ cat conf/_common/jets3t.properties
s3service.s3-endpoint = s3.ap-northeast-2.amazonaws.com
storage-service.request-signature-version=AWS4-HMAC-SHA256