Clickhou官方测试数据集之SSB

更新时间:2023-07-08 15:42:30 阅读: 评论:0

Clickhou官⽅测试数据集之SSB 0.运⾏环境:
centos 7.6
clickhou 20.4.4.4
RAM:3G
磁盘:vmware 虚拟磁盘 60G
物理磁盘:NVME-SSD
1.SSB概述
概述:
2.SSB操作步骤:
1.安装需要的软件:
yum -y install gcc gcc-c++ make cmake git
2.下载代码:
git /vadimtk/ssb-dbgen.git
3.编译⽣成数据:
cd ssb-dbgen
make
$ ./dbgen -s 10 -T c
$ ./dbgen -s 10 -T l
$ ./dbgen -s 10 -T p
$ ./dbgen -s 10 -T s
$ ./dbgen -s 10 -T d
说明:
c--customer.tbl
p--part.tbl
s--supplier.tbl
d--date.tbl
l--lineorder.tbl
上⾯的表数据可以⽤如下的命令⼀次性⽣成:(for all SSBM tables)
$ dbgen -s 10 -T a
-- 查看⽣成的数据:
# du -sh *.tbl
32M    customer.tbl
272K    date.tbl
6.5G    lineorder.tbl
77M    part.tbl
1.9M    supplier.tbl
4.在clickhou client中执⾏表定义的脚本:
-- 创建表:
CREATE TABLE customer
(
C_CUSTKEY      UInt32,
C_NAME          String,
C_ADDRESS      String,
C_CITY          LowCardinality(String),
C_NATION        LowCardinality(String),
C_REGION        LowCardinality(String),
C_PHONE        String,
C_MKTSEGMENT    LowCardinality(String)
)
ENGINE = MergeTree ORDER BY (C_CUSTKEY);
CREATE TABLE lineorder
(
LO_ORDERKEY            UInt32,
LO_LINENUMBER          UInt8,
LO_CUSTKEY              UInt32,
LO_PARTKEY              UInt32,
LO_SUPPKEY              UInt32,
LO_ORDERDATE            Date,
LO_ORDERPRIORITY        LowCardinality(String),
LO_SHIPPRIORITY        UInt8,广州沙面岛
LO_QUANTITY            UInt8,
LO_EXTENDEDPRICE        UInt32,
LO_ORDTOTALPRICE        UInt32,
LO_DISCOUNT            UInt8,
LO_REVENUE              UInt32,
LO_SUPPLYCOST          UInt32,
LO_TAX                  UInt8,
LO_COMMITDATE          Date,
LO_SHIPMODE            LowCardinality(String)
)
ENGINE = MergeTree PARTITION BY toYear(LO_ORDERDATE) ORDER BY (LO_ORDERDATE, LO_ORDERKEY);
CREATE TABLE part
(
P_PARTKEY      UInt32,
P_NAME          String,
P_MFGR          LowCardinality(String),
P_CATEGORY      LowCardinality(String),
P_BRAND        LowCardinality(String),
P_COLOR        LowCardinality(String),
P_TYPE          LowCardinality(String),
P_SIZE          UInt8,
P_CONTAINER    LowCardinality(String)
)
ENGINE = MergeTree ORDER BY P_PARTKEY;
CREATE TABLE supplier
(
S_SUPPKEY      UInt32,
S_NAME          String,
S_ADDRESS      String,死徒
S_CITY          LowCardinality(String),
S_NATION        LowCardinality(String),
S_REGION        LowCardinality(String),
S_PHONE        String
)
ENGINE = MergeTree ORDER BY S_SUPPKEY;
5.将⽣成的数据导⼊数据库:
导⼊数据:
$ clickhou-client  --databa ssb --query "INSERT INTO customer FORMAT CSV" < customer.tbl
$ clickhou-client  --databa ssb --query "INSERT INTO part FORMAT CSV" < part.tbl
$ clickhou-client  --databa ssb --query "INSERT INTO supplier FORMAT CSV" < supplier.tbl
$ clickhou-client  --databa ssb --query "INSERT INTO lineorder FORMAT CSV" < lineorder.tbl
6.⽣成测试数据:将“星型模式”转换为⾮规范化的“平⾯模式”
SET max_memory_usage = 30000000000;
CREATE TABLE lineorder_flat
ENGINE = MergeTree
PARTITION BY toYear(LO_ORDERDATE)
ORDER BY (LO_ORDERDATE, LO_ORDERKEY) AS
SELECT
l.LO_ORDERKEY AS LO_ORDERKEY,
l.LO_LINENUMBER AS LO_LINENUMBER,
l.LO_CUSTKEY AS LO_CUSTKEY,
l.LO_PARTKEY AS LO_PARTKEY,
l.LO_SUPPKEY AS LO_SUPPKEY,
l.LO_ORDERDATE AS LO_ORDERDATE,
活动策划网l.LO_ORDERPRIORITY AS LO_ORDERPRIORITY,
l.LO_SHIPPRIORITY AS LO_SHIPPRIORITY,
l.LO_QUANTITY AS LO_QUANTITY,
l.LO_EXTENDEDPRICE AS LO_EXTENDEDPRICE,
l.LO_ORDTOTALPRICE AS LO_ORDTOTALPRICE,
l.LO_DISCOUNT AS LO_DISCOUNT,
l.LO_REVENUE AS LO_REVENUE,
l.LO_SUPPLYCOST AS LO_SUPPLYCOST,
l.LO_TAX AS LO_TAX,
l.LO_COMMITDATE AS LO_COMMITDATE,
l.LO_SHIPMODE AS LO_SHIPMODE,
c.C_NAME AS C_NAME,
c.C_ADDRESS AS C_ADDRESS,
c.C_CITY AS C_CITY,
c.C_NATION AS C_NATION,
c.C_REGION AS C_REGION,
c.C_PHONE AS C_PHONE,
c.C_MKTSEGMENT AS C_MKTSEGMENT,
s.S_NAME AS S_NAME,
s.S_ADDRESS AS S_ADDRESS,
s.S_CITY AS S_CITY,
s.S_NATION AS S_NATION,
s.S_REGION AS S_REGION,
s.S_PHONE AS S_PHONE,
p.P_NAME AS P_NAME,
p.P_MFGR AS P_MFGR,
p.P_CATEGORY AS P_CATEGORY,
p.P_BRAND AS P_BRAND,
p.P_COLOR AS P_COLOR,
p.P_TYPE AS P_TYPE,
p.P_SIZE AS P_SIZE,
p.P_CONTAINER AS P_CONTAINER
FROM lineorder AS l
INNER JOIN customer AS c ON c.C_CUSTKEY = l.LO_CUSTKEY
INNER JOIN supplier AS s ON s.S_SUPPKEY = l.LO_SUPPKEY
INNER JOIN part AS p ON p.P_PARTKEY = l.LO_PARTKEY;
0 rows in t. Elapd: 169.826 c. Procesd 61.11 million rows, 2.63 GB (359.82 thousand rows/s., 15.51 MB/s.)
Clickhou> OPTIMIZE TABLE ssb.lineorder_flat FINAL ;
OPTIMIZE TABLE ssd.lineorder_flat FINAL  ;
OPTIMIZE TABLE ssb.lineorder_flat FINAL
7.查询导⼊的数据信息:
lect
databa,
table,
formatReadableSize(size) as size,
formatReadableSize(size) as size,
formatReadableSize(bytes_on_disk) as bytes_on_disk,
formatReadableSize(data_uncompresd_bytes) as data_uncompresd_bytes,
formatReadableSize(data_compresd_bytes) as data_compresd_bytes,
compress_rate,读报纸的英文
rows,
days,
formatReadableSize(avgDaySize) as avgDaySize
from
(
lect
databa,
table,
sum(bytes) as size,
sum(rows) as rows,
min(min_date) as min_date,
max(max_date) as max_date,
sum(bytes_on_disk) as bytes_on_disk,
sum(data_uncompresd_bytes) as data_uncompresd_bytes,
sum(data_compresd_bytes) as data_compresd_bytes,
(data_compresd_bytes / data_uncompresd_bytes) * 100 as compress_rate,
max_date - min_date as days,
size / (max_date - min_date) as avgDaySize
from system.parts
where active
and databa='ssb'
group by
databa,
table
);
┌─databa─┬─table──────────┬─size───────┬─bytes_on_disk─┬─data_uncompresd_bytes─┬─data_compresd_bytes─┬──────compress_rate─┬─────ro │ ssb      │ supplier      │ 771.98 KiB │ 771.98 KiB    │ 1.11 MiB                │ 771.02 KiB            │  67.91314318351702 │    20000 │    0 │ inf YiB    │
│ ssb      │ part          │ 13.79 MiB  │ 13.79 MiB    │ 19.59 MiB              │ 13.76 MiB            │  70.2197570555349 │  800000 │    0 │ inf YiB    │
│ ssb      │ customer      │ 11.50 MiB  │ 11.50 MiB    │ 16.89 MiB              │ 11.49 MiB            │  67.99832662134101 │  300000 │    0 │ inf YiB    │
│ ssb      │ lineorder_flat │ 5.19 GiB  │ 5.19 GiB      │ 9.70 GiB                │ 5.18 GiB              │ 53.379435463024095 │ 59986052 │ 2405 │ 2.21 MiB  │
└──────────┴────────────────┴────────────┴───────────────┴─────────────────────────┴───────────────────────┴────────────────────
⾏数:
Clickhou> lect count(1) from lineorder;
┌─count(1)─┐
│ 59986052 │
└──────────┘
1 rows in t. Elapd: 0.009 c.
433阵型Clickhou> lect count(1) from lineorder_flat;
┌─count(1)─┐
excel加斜线│ 59986052 │
└──────────┘
1 rows in t. Elapd: 0.00
2 c.
野棱角
3.SSB测试脚本:
3.1.1
SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
FROM lineorder_flat
WHERE toYear(LO_ORDERDATE) = 1993 AND LO_DISCOUNT BETWEEN 1 AND 3 AND LO_QUANTITY < 25;
┌───────revenue─┐
│ 4472807765583 │
└───────────────┘
1 rows in t. Elapd: 0.174 c. Procesd 9.11 million rows, 72.86 MB (52.48 million rows/s., 419.87 MB/s.)
3.1.2
SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
FROM lineorder_flat
WHERE toYYYYMM(LO_ORDERDATE) = 199401 AND LO_DISCOUNT BETWEEN 4 AND 6 AND LO_QUANTITY BETWEEN 26 AND 35;
┌──────revenue─┐
│ 965049065847 │
└──────────────┘
1 rows in t. Elapd: 0.035 c. Procesd 786.43 thousand rows, 6.29 MB (22.67 million rows/s., 181.33 MB/s.)
3.1.3
SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
FROM lineorder_flat
WHERE toISOWeek(LO_ORDERDATE) = 6 AND toYear(LO_ORDERDATE) = 1994
AND LO_DISCOUNT BETWEEN 5 AND 7 AND LO_QUANTITY BETWEEN 26 AND 35;
┌──────revenue─┐
│ 261680925983 │
老师再见了└──────────────┘
1 rows in t. Elapd: 0.011 c. Procesd 212.99 thousand rows, 1.64 MB (20.16 million rows/s., 155.64 MB/s.)
Clickhou> lect distinct LO_DISCOUNT  from lineorder_flat order by LO_DISCOUNT ;
Clickhou> lect distinct LO_DISCOUNT  from lineorder_flat order by LO_DISCOUNT ;
SELECT DISTINCT LO_DISCOUNT
FROM lineorder_flat
ORDER BY LO_DISCOUNT ASC
┌─LO_DISCOUNT─┐
│          0 │
│          1 │
│          2 │
│          3 │
│          4 │
│          5 │
│          6 │
│          7 │
│          8 │
│          9 │
│          10 │
└─────────────┘
11 rows in t. Elapd: 0.079 c. Procesd 59.99 million rows, 59.99 MB (755.22 million rows/s., 755.22 MB/s.)

本文发布于:2023-07-08 15:42:30,感谢您对本站的认可!

本文链接:https://www.wtabcd.cn/fanwen/fan/89/1073143.html

版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。

标签:数据   脚本   磁盘   转换   星型   需要
相关文章
留言与评论(共有 0 条评论)
   
验证码:
推荐文章
排行榜
Copyright ©2019-2022 Comsenz Inc.Powered by © 专利检索| 网站地图