Datahub Installation and Configuration, with Detailed Steps
0 Purpose
Use Datahub to extract table schema information from the source database.
1 Install Docker
$ yum -y install docker
# Start docker
$ sudo systemctl start docker
# Verify that the installation works
$ sudo docker run hello-world
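Install docker-compose (Docker's service orchestration tool, mainly used to build and run multiple services together):
$ curl -L "/docker/compose/releases/download/1.27.4/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
$ chmod +x /usr/local/bin/docker-compose
Restart docker:
Reload the daemon configuration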
$ sudo systemctl daemon-reload
Restart the docker service
$ sudo systemctl restart docker
Check that docker is running:
$ docker container ls
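The installation checks above can also be scripted. The following is a minimal sketch in Python that reports which of the tools used in this section are missing from the PATH. The helper name `missing_tools` is hypothetical, and only the standard library is used.

```python
import shutil


def missing_tools(tools):
    """Return the names from `tools` that cannot be found on the PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # Tool names taken from the installation steps above.
    missing = missing_tools(["docker", "docker-compose"])
    if missing:
        print("not installed:", ", ".join(missing))
    else:
        print("docker and docker-compose are installed")
```

This only confirms that the executables exist. It does not check that the docker service is running, which is what `docker container ls` verifies.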
2 Install and start Datahub
$ cd /opt
$ yum -y install git
$ git --version
$ git clone /linkedin/datahub.git
$ cd /opt/datahub/docker
$ source ./quickstart.sh
Extras
$ python3 -m pip install --upgrade pip wheel setuptools
$ python3 -m pip uninstall datahub acryl-datahub || true  # sanity check - ok if it fails
$ python3 -m pip install --upgrade acryl-datahub
$ datahub version
$ datahub docker quickstart
3 Import data with Datahub
$ pip install pymysql
$ cd /opt/datahub/docker/ingestion
Edit the yml configuration file:
$ l
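Change its content as follows:
The fields to change are:
type: the type of data source
username, password, host_port, database (database is optional and had no effect here): the database account, password, and IP address and port
sink: the data destination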
source:
  type: "mysql"
  config:
    username: "root"
    password: "root"
    database: "hero"
    host_port: "192.168.101.110:9876"
sink:
  type: "datahub-rest"
  config:
    server: 'localhost:8080'
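Before running an import, it can help to confirm that the recipe contains every field described above. The following is a minimal sketch in Python, assuming the recipe has already been loaded into a dict. The helper `missing_fields` and the constant `REQUIRED_SOURCE_KEYS` are hypothetical names, not part of Datahub.

```python
# Fields the text above lists as needing to be filled in (database is optional).
REQUIRED_SOURCE_KEYS = ("username", "password", "host_port")


def missing_fields(recipe):
    """Return dotted names of the fields described above that `recipe` lacks."""
    missing = []
    source = recipe.get("source", {})
    sink = recipe.get("sink", {})
    if "type" not in source:
        missing.append("source.type")
    for key in REQUIRED_SOURCE_KEYS:
        if key not in source.get("config", {}):
            missing.append("source.config." + key)
    if "type" not in sink:
        missing.append("sink.type")
    if "server" not in sink.get("config", {}):
        missing.append("sink.config.server")
    return missing


# The recipe above, written as a Python dict.
recipe = {
    "source": {
        "type": "mysql",
        "config": {
            "username": "root",
            "password": "root",
            "database": "hero",
            "host_port": "192.168.101.110:9876",
        },
    },
    "sink": {"type": "datahub-rest", "config": {"server": "localhost:8080"}},
}

print(missing_fields(recipe))  # prints []
```

This checks only that the keys are present. Whether the values are accepted is decided by Datahub when the import runs.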
An alternative form of the yml file:
source:
  type: mysql
  config:
    username: "root"
    password: "root"
    database: "hero"
    host_port: "192.168.101.177:3306"
    table_pattern:
      deny:
        # Note that the deny patterns take precedence over the allow patterns.
        - "performance_schema"
      allow:
        - "schema1.table2"
    # Although the 'table_pattern' enables you to skip everything from certain schemas,
    # having another option to allow/deny on schema level is an optimization for the case when there is a large number
    # of schemas that one wants to skip and you want to avoid the time to needlessly fetch those tables only to filter
    # them out afterwards via the table_pattern.
    schema_pattern:
      deny:
        - "garbage_schema"
      allow:
        - "schema1"
sink:
  type: "datahub-rest"
  config:
    server: 'datahub-gms:8080'
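The comment in the recipe states that deny patterns take precedence over allow patterns. The following is a simplified sketch of that precedence rule in Python, treating each pattern as a regular expression matched from the start of the name. It is an illustration only, not Datahub's own implementation: the function `is_allowed` is hypothetical, and details such as case sensitivity or default patterns may differ in Datahub.

```python
import re


def is_allowed(name, allow, deny):
    """Return True if `name` matches an allow pattern and no deny pattern."""
    if any(re.match(pattern, name) for pattern in deny):
        return False  # deny wins, even if an allow pattern also matches
    return any(re.match(pattern, name) for pattern in allow)


# Patterns taken from the schema_pattern block above.
allow = ["schema1"]
deny = ["garbage_schema"]

print(is_allowed("schema1", allow, deny))         # True
print(is_allowed("garbage_schema", allow, deny))  # False
print(is_allowed("other", allow, deny))           # False
```

Checking the deny list first is what makes it take precedence: a name that appears in both lists is rejected.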
Import the data:
$ datahub ingest -l
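The command completes successfully:
If the import succeeds, the following screen appears:
To obtain the schema by calling the API instead, use the following endpoint: 10.20.3.32:9002/api/v2/datasets/urn:li:dataset:(urn:li:dataPlatform:mysql,db_st_cdc,PROD)/schema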
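The endpoint follows a fixed pattern, so it can be built from the platform, dataset name, and environment. The following is a minimal sketch in Python using only the standard library. The function names are hypothetical, the `http://` scheme is an assumption (the address above gives no scheme), and any login or session handling the frontend requires is left out.

```python
import json
import urllib.request


def schema_url(base, platform, name, env):
    """Build the schema endpoint for one dataset, following the pattern above."""
    urn = "urn:li:dataset:(urn:li:dataPlatform:{},{},{})".format(platform, name, env)
    return "{}/api/v2/datasets/{}/schema".format(base, urn)


def column_types(payload):
    """Map each column's fieldName to its dataType in a schema response."""
    return {col["fieldName"]: col["dataType"] for col in payload["schema"]["columns"]}


def fetch_schema(url):
    """GET `url` and decode the JSON body (needs a reachable Datahub frontend)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


if __name__ == "__main__":
    # Assumption: the frontend is served over plain http at this address.
    url = schema_url("http://10.20.3.32:9002", "mysql", "db_st_cdc", "PROD")
    print(url)
```

`column_types` relies on the `schema.columns[].fieldName` and `dataType` keys that appear in the response for this endpoint. Passing it the result of `fetch_schema(url)` would give a name-to-type mapping for the table.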
The response data is:
{
"schema":{
"schemaless": false,
"rawSchema":"",
"keySchema": null,
"columns":[
{
"id": null,
"sortID":0,
"parentSortID":0,
"fieldName":"id",
"parentPath": null,
"fullFieldPath":"id",
"dataType":"VARCHAR(length=128)",
"comment":"",
"commentCount": null,
"partitionedStr": null,
"partitioned": false,
"nullableStr": null,
"nullable": false,
"indexedStr": null,
"indexed": false,
"distributedStr": null,
"distributed": false,
"treeGridClass": null