ClickHouse存储引擎之MySQL引擎

一、介绍

​ ClickHouse提供了MySQL库引擎可以将MySQL中的表映射到ClickHouse中,并允许用户通过clickhouse对数据进行insert和update操作。实际的操作其实都在MySQL服务器上完成,MySQL引擎会将查询转换为MySQL语法并发送到MySQL服务器上面并实现数据交互,但无法通过clickhouse对MySQL数据执行rename、create table、alter等操作。

二、使用

1、ClickHouse建库语法

1
2
CREATE DATABASE [IF NOT EXISTS] db_name [ON CLUSTER cluster]
ENGINE = MySQL('host:port', 'database', 'user', 'password')

MySQL引擎参数:

  1. host:port:MySQL地址
  2. database:MySQL数据库名,不指定则表示全部映射
  3. user:使用的用户名
  4. password:用户密码

2、使用示例

2.1 环境准备

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
-- 创建映射库,本次使用的clickhouse是由docker启动的,故使用以下命令进入clickhouse客户端
[root@xxxx docker_compose]# docker exec -it 817eeb4e2569 clickhouse-client
-- 用户需有对应操作的权限
clickhouse-server_1 :) create database test_4098 \
:-] ENGINE = MySQL('x.x.x.x:4098', 'dbzz_dbreport', \
:-] 'dba', 'xxxxxx')

CREATE DATABASE test_4098
ENGINE = MySQL('x.x.x.x:4098', 'dbzz_dbreport', 'dba', 'xxxxxx')

Ok.

0 rows in set. Elapsed: 0.030 sec.

clickhouse-server_1 :) show databases;

SHOW DATABASES

┌─name──────┐
│ datasets │
default
system
test
│ test_4098 │
└───────────┘

5 rows in set. Elapsed: 0.002 sec.
-- 创建成功之后在MySQL数据库中执行 show processlist; 可以看到成功创建的连接

-- 在对应MySQL库上面创建一个测试表,并插入一条数据
(dba:4098)@[dbzz_dbreport]>create table clickhouse_test_table (
-> `int_id` INT NOT NULL AUTO_INCREMENT,
-> `float` FLOAT NOT NULL,
-> PRIMARY KEY (`int_id`));
Query OK, 0 rows affected (0.09 sec)
(dba:4098)@[dbzz_dbreport]>insert into clickhouse_test_table values(1, 2);
Query OK, 1 row affected (0.04 sec)

​ 上图所在的metadata目录是clickhouse的元数据目录,里面记录了所有建库、建表的语句,可以看出本次测试的test_4098库仅有建库语句,表并不是通过clickhouse创建的;上一级的data目录记录了所有库具体的数据文件信息,可看出并没有test_4098和test_4098_ro库相关数据信息。

2.2 clickhouse测试

  • 在clickhouse上执行show tables
1
2
3
4
5
6
7
8
9
10
# clickhouse对应日志如下,可以看到将show tables语句转换为了SELECT name FROM system.tables WHERE database = 'test_4098'语句
2020.06.10 16:10:12.871697 [ 81 ] {3c0db0d4-5f48-4896-8418-6d1e0211a3c8} <Debug> executeQuery: (from 127.0.0.1:55258) SHOW TABLES
2020.06.10 16:10:12.871899 [ 81 ] {3c0db0d4-5f48-4896-8418-6d1e0211a3c8} <Debug> executeQuery: (internal) SELECT name FROM system.tables WHERE database = 'test_4098'
2020.06.10 16:10:12.872213 [ 81 ] {3c0db0d4-5f48-4896-8418-6d1e0211a3c8} <Trace> AccessRightsContext (default): Access granted: SELECT(database, name) ON system.tables
2020.06.10 16:10:12.872344 [ 81 ] {3c0db0d4-5f48-4896-8418-6d1e0211a3c8} <Trace> InterpreterSelectQuery: FetchColumns -> Complete
2020.06.10 16:10:12.872580 [ 99 ] {3c0db0d4-5f48-4896-8418-6d1e0211a3c8} <Trace> AccessRightsContext (default): Access granted: SHOW ON *.*
2020.06.10 16:10:12.876803 [ 81 ] {3c0db0d4-5f48-4896-8418-6d1e0211a3c8} <Information> executeQuery: Read 27 rows, 1.36 KiB in 0.005 sec., 5360 rows/sec., 269.67 KiB/sec.
2020.06.10 16:10:12.876840 [ 81 ] {3c0db0d4-5f48-4896-8418-6d1e0211a3c8} <Debug> MemoryTracker: Peak memory usage (for query): 0.00 B.
2020.06.10 16:10:12.876934 [ 81 ] {} <Debug> MemoryTracker: Peak memory usage (total): 0.00 B.
2020.06.10 16:10:12.876956 [ 81 ] {} <Information> TCPHandler: Processed in 0.005 sec.
  • clickhouse上查询表中信息
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
-- 执行对应表查询
clickhouse-server_1 :) select * from clickhouse_test_table;

SELECT *
FROM clickhouse_test_table

┌─int_id─┬─float─┐
12
└────────┴───────┘

1 rows in set. Elapsed: 0.034 sec.

-- clickhouse日志显示,可以看到会先和对应的MySQL实例建立连接
-- 查询语句被转换为了 SELECT(int_id, float) ON test_4098.clickhouse_test_table
2020.06.10 16:13:14.851866 [ 81 ] {54864811-5b01-4f41-9abf-b4280e7c277b} <Debug> executeQuery: (from 127.0.0.1:55258) SELECT * FROM clickhouse_test_table
2020.06.10 16:13:14.864183 [ 81 ] {54864811-5b01-4f41-9abf-b4280e7c277b} <Trace> AccessRightsContext (default): Access granted: SELECT(int_id, float) ON test_4098.clickhouse_test_table
2020.06.10 16:13:14.868456 [ 81 ] {54864811-5b01-4f41-9abf-b4280e7c277b} <Information> Application: MYSQL: Connecting to dbzz_dbreport@10.148.16.25:4098 as user dba
2020.06.10 16:13:14.880487 [ 81 ] {54864811-5b01-4f41-9abf-b4280e7c277b} <Trace> InterpreterSelectQuery: FetchColumns -> Complete
2020.06.10 16:13:14.882556 [ 81 ] {54864811-5b01-4f41-9abf-b4280e7c277b} <Information> executeQuery: Read 1 rows, 8.00 B in 0.031 sec., 32 rows/sec., 261.44 B/sec.
2020.06.10 16:13:14.882602 [ 81 ] {54864811-5b01-4f41-9abf-b4280e7c277b} <Debug> MemoryTracker: Peak memory usage (for query): 0.00 B.
2020.06.10 16:13:14.884205 [ 81 ] {} <Debug> MemoryTracker: Peak memory usage (total): 0.00 B.
2020.06.10 16:13:14.884845 [ 81 ] {} <Information> TCPHandler: Processed in 0.033 sec.

-- 打开MySQL全日志后,看到所有操作均在MySQL上执行
2020-06-10T08:54:11.541873Z 74904749 Query SELECT TABLE_NAME AS table_name, CREATE_TIME AS modification_time FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'dbzz_dbreport'
2020-06-10T08:54:11.545698Z 74904749 Query SELECT TABLE_NAME AS table_name, CREATE_TIME AS modification_time FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'dbzz_dbreport'
2020-06-10T08:54:11.549140Z 74904749 Query SELECT TABLE_NAME AS table_name, CREATE_TIME AS modification_time FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'dbzz_dbreport'
2020-06-10T08:54:11.552753Z 74904749 Query SELECT TABLE_NAME AS table_name, CREATE_TIME AS modification_time FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'dbzz_dbreport'
2020-06-10T08:54:11.556415Z 75644195 Query SELECT `int_id`, `float` FROM `dbzz_dbreport`.`clickhouse_test_table`
  • 在clickhouse上插入数据
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
-- 执行对应插入数据操作
clickhouse-server_1 :) insert into clickhouse_test_table values(3,4)

INSERT INTO clickhouse_test_table VALUES

Ok.

1 rows in set. Elapsed: 0.042 sec.

-- clickhouse日志显示:在连接已经存在的情况下,可以看到进行权限检查后执行了insert操作
2020.06.10 16:18:51.172346 [ 81 ] {2a1b1d66-dfed-460e-89fe-e7cc9ae3cfa6} <Debug> executeQuery: (from 127.0.0.1:55258) insert into clickhouse_test_table values
2020.06.10 16:18:51.177580 [ 81 ] {2a1b1d66-dfed-460e-89fe-e7cc9ae3cfa6} <Trace> AccessRightsContext (default): Access granted: INSERT(int_id, float) ON test_4098.clickhouse_test_table
2020.06.10 16:18:51.209389 [ 81 ] {2a1b1d66-dfed-460e-89fe-e7cc9ae3cfa6} <Debug> MemoryTracker: Peak memory usage (for query): 0.00 B.
2020.06.10 16:18:51.209481 [ 81 ] {} <Debug> MemoryTracker: Peak memory usage (total): 0.00 B.
2020.06.10 16:18:51.209545 [ 81 ] {} <Information> TCPHandler: Processed in 0.037 sec.

-- 解析MySQL二进制日志,可看出写入操作在MySQL上进行
BEGIN
/*!*/;
# at 340
#200610 16:18:51 server id 210894098 end_log_pos 415 CRC32 0x46a5ba76 Table_map: `dbzz_dbreport`.`clickhouse_test_table` mapped to number 2067
# at 415
#200610 16:18:51 server id 210894098 end_log_pos 459 CRC32 0xd8f0a3ac Write_rows: table id 2067 flags: STMT_END_F
### INSERT INTO `dbzz_dbreport`.`clickhouse_test_table`
### SET
### @1=3
### @2=4
# at 459
#200610 16:18:51 server id 210894098 end_log_pos 490 CRC32 0xcb88aafc Xid = 36747174953
COMMIT/*!*/;