创建分区表

最近更新时间: 2026-03-13 09:03:00

在TDSQL PG数据库中,分区表 是一个逻辑上的大表, 通过 分区策略 或规则 关联1个或多个子表(称为 分区 ), 分区表不存储数据,数据库引擎将根据 分区表定义的 策略 对数据进行处理,存储到指定的子表中。而子表还可以定义为分区表,称为子分区,这样就构成 多级分区表。使用分区表可提高各种应用程序的性能、可用性和管理性,并有助于降低存储数据的总拥有成本。TDSQL PG 提供丰富多样的分区策略和扩展 来满足不同业务需求。
分区特点:

  1. 每个分区都有一个由其分区边界定义的数据子集。
  2. 每个分区是相互独立的,具有相同的逻辑属性,例如列名、数据类型和约束.
  3. 每个分区具有自己的名称和可选的存储特性, 从而能更精细的粒度级别管理和访问这些数据库对象。
  4. 分区对应用是完全透明的,不影响应用的业务逻辑。

分区键

分区键 是分区表中一个或多个列的集合,用于确定分区表中的每一行所在的分区, 插入到表中的数据行将根据分区键中的列或表达式的值路由到分区。
例如,对于一个用户表,可以指定 user_id 列作为 Range 分区的键。数据库根据此列中的用户号是否在指定范围内,将行分配给分区。
TDSQL PG 使用分区键自动地将插入、更新和删除操作定向到合适的分区。

分区创建

TDSQL PG 分区语法声明方式 创建,分2个步骤: 先建立分区表,然后建立分区子表。

创建分区表

语法如下:

CREATE TABLE table_name (  ...  ) [ PARTITION BY { RANGE | LIST | HASH } ( { column_name | ( expression ) }

参数说明:
PARTITION BY子句指定对表进行分区的策略,包括范围,列表,散列三种,带括号的列或表达式列表形成表的分区键。

创建分区子表

语法如下

/* -- 创建分区表语法 -- */
CREATE TABLE table_name PARTITION OF parent_table
    [ ( { column_name [ WITH OPTIONS ] [ column_constraint [ ... ] ] | table_constraint } [, ... ] ) ]
    FOR VALUES partition_bound_spec

参数说明:
partition_bound_spec: FOR VALUES WITH '(' hash_partbound ')'
| FOR VALUES IN_P '(' partbound_datum_list ')'
| FOR VALUES FROM '(' range_datum_list ')' TO '(' range_datum_list ')'
| DEFAULT
hash_partbound: MODULUS x, REMAINDER y;
partbound_datum_list:常量,number,NULL,MINVALUE, MAXVALUE;
range_datum_list:常量,计算结果为常量的表达式,MAXVALUE, MINVALUE等;

说明

  • partition_bound_spec 必须对应父表的分区方法和分区键,且不能与父表现有分区重叠,VALUES WITH (xxx) 用于散列分区,VALUES IN_P (xxx) 用于列表分区,VALUES FROM ( xxx ) TO ( xxx) 形式用于范围分区。
  • partition_bound_spec 中 partbound_datum_list 是一个文字、NULL、MINVALUE 或MAXVALUE,每个文字值必须是可对相应的分区键列类型强制的数字常量,或是该类型的有效输入的字符串文字。

示例

假如有一个大型的冰激凌公司,每天想记录测量最高温度以及每个区域的冰激凌销售情况,大部分查询只会访问上周的、上月的或者上季度的数据,该表的主要用途是为管理层提供在线报告,为减少旧数据量,只保留最近3年的数据,则可使用如下步骤使用分区技术:

  1. 通过指定PARTITION BY子句 创建分区表(measurement表), 该子句包括分区方法(RANGE)以及用作分区键的列表(logdate)。

    CREATE TABLE measurement (
     city_id         int not null,
     logdate         date not null,
     peaktemp        int,
     unitsales       int
    ) PARTITION BY RANGE (logdate);
    
  2. 创建分区: 每个分区的定义必须指定对应父表的分区方法和分区键的边界, 分区子表以普通PostgreSQL表的方式创建, 可为每个分区单独指定存储参数。

    注意:

    如果指定边界使新分区的值与已有分区中值重叠,则会产生错误, 即向父表中插入无法映射到任何现有分区的数据将会导致错误,这种情况需手工增加一个合适的分区。

    CREATE TABLE measurement_y2006m02 PARTITION OF measurement FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
    CREATE TABLE measurement_y2006m03 PARTITION OF measurement FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
    CREATE TABLE measurement_y2007m11 PARTITION OF measurement FOR VALUES FROM ('2007-11-01') TO ('2007-12-01');
    CREATE TABLE measurement_y2007m12 PARTITION OF measurement FOR VALUES FROM ('2007-12-01') TO ('2008-01-01');
    CREATE TABLE measurement_y2008m01 PARTITION OF measurement FOR VALUES FROM ('2008-01-01') TO ('2008-02-01')
        WITH (parallel_workers = 4);
    
  3. 在分区表的键列上创建一个索引,会自动在每个分区上创建一个索引,且后续创建或绑定(attach)的分区也将包含该索引。

    CREATE INDEX ON measurement (logdate);
    
  4. 查看分区表相关信息

    tdsql=# \d+ measurement
                                  Table "public.MEASUREMENT"
      Column   |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
    -----------+---------+-----------+----------+---------+---------+--------------+-------------
     CITY_ID   | INTEGER |           | not null |         | plain   |              |
     LOGDATE   | DATE    |           | not null |         | plain   |              |
     PEAKTEMP  | INTEGER |           |          |         | plain   |              |
     UNITSALES | INTEGER |           |          |         | plain   |              |
    Partition key: RANGE (LOGDATE)
    Indexes:
     "MEASUREMENT_LOGDATE_IDX" btree (LOGDATE)
    Partitions: MEASUREMENT_Y2006M02 FOR VALUES FROM ('2006-02-01 00:00:00') TO ('2006-03-01 00:00:00'),
             MEASUREMENT_Y2006M03 FOR VALUES FROM ('2006-03-01 00:00:00') TO ('2006-04-01 00:00:00'),
             MEASUREMENT_Y2007M11 FOR VALUES FROM ('2007-11-01 00:00:00') TO ('2007-12-01 00:00:00'),
             MEASUREMENT_Y2007M12 FOR VALUES FROM ('2007-12-01 00:00:00') TO ('2008-01-01 00:00:00'),
             MEASUREMENT_Y2008M01 FOR VALUES FROM ('2008-01-01 00:00:00') TO ('2008-02-01 00:00:00')
    Distribute By: SHARD(CITY_ID)
    Location Nodes: ALL DATANODES
    
    tdsql=# \d+ measurement_y2008m01
                              Table "public.MEASUREMENT_Y2008M01"
      Column   |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
    -----------+---------+-----------+----------+---------+---------+--------------+-------------
     CITY_ID   | INTEGER |           | not null |         | plain   |              |
     LOGDATE   | DATE    |           | not null |         | plain   |              |
     PEAKTEMP  | INTEGER |           |          |         | plain   |              |
     UNITSALES | INTEGER |           |          |         | plain   |              |
    Partition of: MEASUREMENT FOR VALUES FROM ('2008-01-01 00:00:00') TO ('2008-02-01 00:00:00')
    Partition constraint: ((LOGDATE IS NOT NULL) AND (LOGDATE >= '2008-01-01 00:00:00'::DATE) AND (LOGDATE < '2008-02-01 00:00:00'::DATE))
    Indexes:
     "MEASUREMENT_Y2008M01_LOGDATE_IDX" btree (LOGDATE)
    Distribute By: SHARD(CITY_ID)
    Location Nodes: ALL DATANODES
    Options: parallel_workers=4
    

分区维护

通常创建完分区表的各个分区子表后并非保持静态不变,对于移除旧分区数据,为新数据周期性地增加新分区的需求比比皆是,所以需要分区维护操作包括如何删除分区,拆分分区,合并分区,添加新分区等操作 来管理分区。

删除分区

对于某个业务如不需要某个分区子表的数据,可通过 drop table ... 命令删除分区(包括该分区的表定义 和 数据 ), 不过需注意:

  1. 通过drop table 方式删除后,数据将清除,子表不能作为普通表独立存在,安全的做法使用 解绑分区 alter table ... detach partition ... 方式。
  2. 要在父表上拿到 ACCESS EXCLUSIVE锁才能操作。
    相关操作示例如下:
tdsql=# DROP TABLE measurement_y2006m02;

/* -- 查看分区表信息 -- */
tdsql=# \d+ measurement
                                 Table "public.MEASUREMENT"
  Column   |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
-----------+---------+-----------+----------+---------+---------+--------------+-------------
 CITY_ID   | INTEGER |           | not null |         | plain   |              |
 LOGDATE   | DATE    |           | not null |         | plain   |              |
 PEAKTEMP  | INTEGER |           |          |         | plain   |              |
 UNITSALES | INTEGER |           |          |         | plain   |              |
Partition key: RANGE (LOGDATE)
Indexes:
    "MEASUREMENT_LOGDATE_IDX" btree (LOGDATE)
Partitions: MEASUREMENT_Y2006M03 FOR VALUES FROM ('2006-03-01 00:00:00') TO ('2006-04-01 00:00:00'),
            MEASUREMENT_Y2007M11 FOR VALUES FROM ('2007-11-01 00:00:00') TO ('2007-12-01 00:00:00'),
            MEASUREMENT_Y2007M12 FOR VALUES FROM ('2007-12-01 00:00:00') TO ('2008-01-01 00:00:00'),
            MEASUREMENT_Y2008M01 FOR VALUES FROM ('2008-01-01 00:00:00') TO ('2008-02-01 00:00:00')
Distribute By: SHARD(CITY_ID)
Location Nodes: ALL DATANODES

/* -- 查看删除分区子表信息 -- */
tdsql=# \d+ MEASUREMENT_Y2006M02
没有找到任何名称为 "MEASUREMENT_Y2006M02" 的关联。	-- 表示已经无该关系表,无法使用SQL 命令来操作该分区子表

清理分区数据

如果想清理某个分区子表的数据,而不删除该子分区子表的定义。
方法一:
可以通过 alter table ... truncate ... partition( 分区子表名 ) 方式清除某个分区的数据

/* -- 写入数据 -- */
tdsql=# INSERT INTO measurement VALUES(1, '2006-03-01 12:23:32', 1, 100);
tdsql=# INSERT INTO measurement VALUES(1, '2007-11-01 12:23:32', 2, 200);
tdsql=# INSERT INTO measurement VALUES(1, '2007-12-01 12:23:32', 3, 300);
tdsql=# INSERT INTO measurement VALUES(1, '2008-01-01 12:23:32', 4, 400);

/* -- 验证各个分区数据 -- */
tdsql=# select "tableoid"::regclass, count(*) from measurement group by "tableoid" order by 1;
       tableoid       | COUNT
----------------------+-------
 MEASUREMENT_Y2006M03 |     1
 MEASUREMENT_Y2007M11 |     1
 MEASUREMENT_Y2007M12 |     1
 MEASUREMENT_Y2008M01 |     1
(4 rows)

/* -- Truncate 子表, 将只清理该分区数据 -- */
tdsql=# alter table measurement truncate partition(measurement_y2006m03);

/* -- 验证各个分区数据 -- */
tdsql=# select "tableoid"::regclass, count(*) from measurement group by "tableoid" order by 1;
       tableoid       | COUNT
----------------------+-------
 MEASUREMENT_Y2007M11 |     1
 MEASUREMENT_Y2007M12 |     1
 MEASUREMENT_Y2008M01 |     1
(3 rows)

方法二:
也可通过 delete from ... partition( 分区子表名 ) 方式清除某个分区的数据。

/* -- delete分区., 将只清理该分区数据 -- */
tdsql=# delete from measurement partition(measurement_y2007m12);

/* -- 验证各个分区数据 -- */
tdsql=# select "tableoid"::regclass, count(*) from measurement group by "tableoid" order by 1;

       tableoid       | COUNT 
----------------------+-------
 MEASUREMENT_Y2008M01 |     1
 MEASUREMENT_Y2007M11 |     1
(2 rows)

解绑 && 绑定分区

如果想把某个分区子表从当前分区表分离出去,而保留该分区子表的数据,使得 SQL(查询,修改,删除,统计等)操作分区表时,将不再考虑该分区子表的数据,则可通过 detach 方式 把某个分区子表从分区表中解绑,使它作为一个独立的普通表存在, 而往分区表读写数据时将不再访问该分区子表,不过可通过单独的命令操作该分区子表,而解绑后的分区子表可通过 attach 方式 重新加入到分区表中,具体操作示例如下:

/* -- 构造数据 -- */
tdsql=# INSERT INTO measurement VALUES(1, '2006-03-01 12:23:32', 1, 100);
tdsql=# INSERT INTO measurement VALUES(1, '2006-03-02 12:23:32', 2, 200);
tdsql=# select * from measurement;
 CITY_ID |       LOGDATE       | PEAKTEMP | UNITSALES
---------+---------------------+----------+-----------
       1 | 2006-03-01 12:23:32 |        1 |       100
       1 | 2006-03-02 12:23:32 |        2 |       200
(2 rows)

tdsql=# select * from measurement partition(measurement_y2006m03);
 CITY_ID |       LOGDATE       | PEAKTEMP | UNITSALES
---------+---------------------+----------+-----------
       1 | 2006-03-01 12:23:32 |        1 |       100
       1 | 2006-03-02 12:23:32 |        2 |       200
(2 rows)

/* -- 解绑分区子表 -- */
tdsql=# ALTER TABLE measurement DETACH PARTITION measurement_y2006m03;

/* -- 查看分区表信息 -- */
tdsql=# \d+ measurement
                                 Table "public.MEASUREMENT"
  Column   |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
-----------+---------+-----------+----------+---------+---------+--------------+-------------
 CITY_ID   | INTEGER |           | not null |         | plain   |              |
 LOGDATE   | DATE    |           | not null |         | plain   |              |
 PEAKTEMP  | INTEGER |           |          |         | plain   |              |
 UNITSALES | INTEGER |           |          |         | plain   |              |
Partition key: RANGE (LOGDATE)
Indexes:
    "MEASUREMENT_LOGDATE_IDX" btree (LOGDATE)
Partitions: MEASUREMENT_Y2007M11 FOR VALUES FROM ('2007-11-01 00:00:00') TO ('2007-12-01 00:00:00'),
            MEASUREMENT_Y2007M12 FOR VALUES FROM ('2007-12-01 00:00:00') TO ('2008-01-01 00:00:00'),
            MEASUREMENT_Y2008M01 FOR VALUES FROM ('2008-01-01 00:00:00') TO ('2008-02-01 00:00:00')
Distribute By: SHARD(CITY_ID)
Location Nodes: ALL DATANODES

/* -- 查看解绑后该分区子表信息 -- */
tdsql=# \d+ measurement_y2006m03
                             Table "public.MEASUREMENT_Y2006M03"
  Column   |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
-----------+---------+-----------+----------+---------+---------+--------------+-------------
 CITY_ID   | INTEGER |           | not null |         | plain   |              |
 LOGDATE   | DATE    |           | not null |         | plain   |              |
 PEAKTEMP  | INTEGER |           |          |         | plain   |              |
 UNITSALES | INTEGER |           |          |         | plain   |              |
Indexes:
    "MEASUREMENT_Y2006M03_LOGDATE_IDX" btree (LOGDATE)
Distribute By: SHARD(CITY_ID)
Location Nodes: ALL DATANODES

/* -- 1. 查看解绑后分区表数据 -- */
tdsql=# select * from measurement;
 CITY_ID | LOGDATE | PEAKTEMP | UNITSALES
---------+---------+----------+-----------
(0 rows)

/* -- 2. 查看解绑后分区子表数据, 将出错 -- */
tdsql=# select * from measurement partition(measurement_y2006m03);
ERROR:  relation MEASUREMENT_Y2006M03 is not a child table of MEASUREMENT .

/* -- 3. 直接查看解绑后分区子表数据 -- */
tdsql=# select * from measurement_y2006m03;
 CITY_ID |       LOGDATE       | PEAKTEMP | UNITSALES
---------+---------------------+----------+-----------
       1 | 2006-03-01 12:23:32 |        1 |       100
       1 | 2006-03-02 12:23:32 |        2 |       200
(2 rows)

/* -- 4. 验证写入分区子表数据 -- */
tdsql=# insert into measurement_y2006m03 values(1, '2006-03-03 12:23:32', 3, 300);
tdsql=# select * from MEASUREMENT_Y2006M03;
 CITY_ID |       LOGDATE       | PEAKTEMP | UNITSALES
---------+---------------------+----------+-----------
       1 | 2006-03-01 12:23:32 |        1 |       100
       1 | 2006-03-02 12:23:32 |        2 |       200
       1 | 2006-03-03 12:23:32 |        3 |       300
(3 rows)

/* -- 5. 回归验证分区表,则无数据 -- */
tdsql=# select * from measurement;
 CITY_ID | LOGDATE | PEAKTEMP | UNITSALES
---------+---------+----------+-----------
(0 rows)

/* -- 6. 通过绑定操作可以把表重新绑定到分区表上 -- */
tdsql=# alter table measurement attach partition measurement_y2006m03 FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');

/* -- 7. 查看绑定的分区表信息 -- */
tdsql=# \d+ measurement;
                                 Table "public.MEASUREMENT"
  Column   |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
-----------+---------+-----------+----------+---------+---------+--------------+-------------
 CITY_ID   | INTEGER |           | not null |         | plain   |              |
 LOGDATE   | DATE    |           | not null |         | plain   |              |
 PEAKTEMP  | INTEGER |           |          |         | plain   |              |
 UNITSALES | INTEGER |           |          |         | plain   |              |
Partition key: RANGE (LOGDATE)
Indexes:
    "MEASUREMENT_LOGDATE_IDX" btree (LOGDATE)
Partitions: MEASUREMENT_Y2006M03 FOR VALUES FROM ('2006-03-01 00:00:00') TO ('2006-04-01 00:00:00'),
            MEASUREMENT_Y2007M11 FOR VALUES FROM ('2007-11-01 00:00:00') TO ('2007-12-01 00:00:00'),
            MEASUREMENT_Y2007M12 FOR VALUES FROM ('2007-12-01 00:00:00') TO ('2008-01-01 00:00:00'),
            MEASUREMENT_Y2008M01 FOR VALUES FROM ('2008-01-01 00:00:00') TO ('2008-02-01 00:00:00')
Distribute By: SHARD(CITY_ID)
Location Nodes: ALL DATANODES

/* -- 8. 查看绑定的分区表数据 -- */
tdsql=# select * from measurement;
 CITY_ID |       LOGDATE       | PEAKTEMP | UNITSALES
---------+---------------------+----------+-----------
       1 | 2006-03-01 12:23:32 |        1 |       100
       1 | 2006-03-02 12:23:32 |        2 |       200
       1 | 2006-03-03 12:23:32 |        3 |       300
(3 rows)

拆分分区

当分区表内数据分布倾斜或不均衡时,可通过拆分分区使得数据分布在各个分区上保持均衡,TDSQL PG支持RANGE、LIST的一级和多级分区拆分,分区拆分通常用于将DEFAULT分区拆分到新的分区,或者将一个较大的分区拆分成两个分区。
范围(RANGE)分区拆分语法:

ALTER TABLE table_name SPLIT PARTITION part_3 AT ( { numeric_literal | string_literal } ) INTO (PARTITION part_1, PARTITION part_2);

列表(LIST)分区拆分语法:

ALTER TABLE table_name SPLIT PARTITION part_3 VALUES ( { numeric_literal | string_literal | NULL } , { numeric_literal | string_literal | NULL } )
        INTO (PARTITION part_1, PARTITION part_2);

例子:把默认分区分成两个分区

/* -- 1. 创建分区表 -- */
tdsql=# CREATE TABLE tbl_split (
                id serial,
                create_time timestamp(0) without time zone
            ) PARTITION BY RANGE(create_time);
tdsql=# CREATE TABLE tbl_split_p1 PARTITION OF tbl_split FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
tdsql=# CREATE TABLE tbl_split_p2 PARTITION OF tbl_split FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
tdsql=# CREATE TABLE tbl_split_def PARTITION OF tbl_split DEFAULT;

/* -- 2. 查看分区信息 -- */
tdsql=# \d+ tbl_split;
                                                              Table "public.TBL_SPLIT"
   Column    |              Type              | Collation | Nullable |                Default                | Storage | Stats target | Description
-------------+--------------------------------+-----------+----------+---------------------------------------+---------+--------------+-------------
 ID          | INTEGER                        |           | not null | NEXTVAL('TBL_SPLIT_ID_SEQ'::REGCLASS) | plain   |              |
 CREATE_TIME | TIMESTAMP(0) WITHOUT TIME ZONE |           |          |                                       | plain   |              |
Partition key: RANGE (CREATE_TIME)
Partitions: TBL_SPLIT_P1 FOR VALUES FROM ('2024-01-01 00:00:00') TO ('2024-02-01 00:00:00'),
            TBL_SPLIT_P2 FOR VALUES FROM ('2024-02-01 00:00:00') TO ('2024-03-01 00:00:00'),
            TBL_SPLIT_DEF DEFAULT
Distribute By: SHARD(ID)
Location Nodes: ALL DATANODES

/* -- 3. 写入数据 -- */
INSERT INTO tbl_split (id, create_time) VALUES (1,'2024-01-01');
INSERT INTO tbl_split (id, create_time) VALUES (2,'2024-02-01');
INSERT INTO tbl_split (id, create_time) VALUES (3,'2024-03-01');
INSERT INTO tbl_split (id, create_time) VALUES (4,'2024-04-01');
INSERT INTO tbl_split (id, create_time) VALUES (5,'2024-05-01');

/* -- 4. 查看分区数据分布 -- */
tdsql=# select "tableoid"::regclass, count(*) from tbl_split group by "tableoid" order by 1;
   tableoid    | COUNT
---------------+-------
 TBL_SPLIT_P1  |     1
 TBL_SPLIT_P2  |     1
 TBL_SPLIT_DEF |     3
(3 rows)

/* -- 5. 拆分默认分区 -- */
tdsql=# alter table tbl_split split partition tbl_split_def at ('2024-04-01')
                into (partition tbl_split_p3, partition tbl_split_def);

/* -- 6. 查看拆分后的分区表信息 -- */
tdsql=# \d+ tbl_split
                                                              Table "public.TBL_SPLIT"
   Column    |              Type              | Collation | Nullable |                Default                | Storage | Stats target | Description
-------------+--------------------------------+-----------+----------+---------------------------------------+---------+--------------+-------------
 ID          | INTEGER                        |           | not null | NEXTVAL('TBL_SPLIT_ID_SEQ'::REGCLASS) | plain   |              |
 CREATE_TIME | TIMESTAMP(0) WITHOUT TIME ZONE |           |          |                                       | plain   |              |
Partition key: RANGE (CREATE_TIME)
Partitions: TBL_SPLIT_P1 FOR VALUES FROM ('2024-01-01 00:00:00') TO ('2024-02-01 00:00:00'),
            TBL_SPLIT_P2 FOR VALUES FROM ('2024-02-01 00:00:00') TO ('2024-03-01 00:00:00'),
            TBL_SPLIT_P3 FOR VALUES FROM ('2024-03-01 00:00:00') TO ('2024-04-01 00:00:00'),
            TBL_SPLIT_DEF DEFAULT
Distribute By: SHARD(ID)
Location Nodes: ALL DATANODES

/* -- 7. 查看拆分后的分区表数据分布 -- */
tdsql=# select "tableoid"::regclass, count(*) from tbl_split group by "tableoid" order by 1;
   tableoid    | COUNT
---------------+-------
 TBL_SPLIT_P1  |     1
 TBL_SPLIT_P2  |     1
 TBL_SPLIT_P3  |     1
 TBL_SPLIT_DEF |     2
(4 rows)

合并分区

TDSQL PG支持分区表合并,通过分区合并,可将两个分区的数据合并到另一个分区中。分区合并通常用于将一些较小的列表分区或相邻的较小的范围分区合并,减少分区数量,从而提高效率。

注意:

  • 不能用于散列(HASH)分区下的子分区的合并。
  • 范围(RANGE)分区合并时,允许合并两临近范围的分区到另外分区,不相邻分区无法合并。合并结果分区将继承两个源分区的最大边界。
  • 暂不支持UPDATE [ GLOBAL ] INDEXES。

语法:

ALTER TABLE table_name MERGE PARTITIONS ( part1, part2 ) INTO PARTITION part3 ;

例子:合并两个分区

/* -- 1. 创建分区信息 -- */
tdsql=# CREATE TABLE tbl_merge (
                    id serial,
                    create_time timestamp(0) without time zone
            ) PARTITION BY RANGE(create_time);
tdsql=# CREATE TABLE tbl_merge_p1 PARTITION OF tbl_merge FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
tdsql=# CREATE TABLE tbl_merge_p2 PARTITION OF tbl_merge FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
tdsql=# CREATE TABLE tbl_merge_p3 PARTITION OF tbl_merge FOR VALUES FROM ('2024-03-01') TO ('2024-04-01');
tdsql=# CREATE TABLE tbl_merge_def PARTITION OF tbl_merge DEFAULT;

/* -- 2. 查看分区信息 -- */
tdsql=# \d+ tbl_merge;
                                                              Table "public.TBL_MERGE"
   Column    |              Type              | Collation | Nullable |                Default                | Storage | Stats target | Description
-------------+--------------------------------+-----------+----------+---------------------------------------+---------+--------------+-------------
 ID          | INTEGER                        |           | not null | NEXTVAL('TBL_MERGE_ID_SEQ'::REGCLASS) | plain   |              |
 CREATE_TIME | TIMESTAMP(0) WITHOUT TIME ZONE |           |          |                                       | plain   |              |
Partition key: RANGE (CREATE_TIME)
Partitions: TBL_MERGE_P1 FOR VALUES FROM ('2024-01-01 00:00:00') TO ('2024-02-01 00:00:00'),
            TBL_MERGE_P2 FOR VALUES FROM ('2024-02-01 00:00:00') TO ('2024-03-01 00:00:00'),
            TBL_MERGE_P3 FOR VALUES FROM ('2024-03-01 00:00:00') TO ('2024-04-01 00:00:00'),
            TBL_MERGE_DEF DEFAULT
Distribute By: SHARD(ID)
Location Nodes: ALL DATANODES

/* -- 3. 写入数据 -- */
tdsql=# INSERT INTO tbl_merge (id, create_time) VALUES (1,'2024-01-01');
tdsql=# INSERT INTO tbl_merge (id, create_time) VALUES (2,'2024-02-01');
tdsql=# INSERT INTO tbl_merge (id, create_time) VALUES (3,'2024-03-01');
tdsql=# INSERT INTO tbl_merge (id, create_time) VALUES (4,'2024-04-01');
tdsql=# INSERT INTO tbl_merge (id, create_time) VALUES (5,'2024-05-01');

/* -- 4. 查看分区数据分布 -- */
tdsql=# select "tableoid"::regclass, count(*) from tbl_merge group by "tableoid" order by 1;
   tableoid    | COUNT
---------------+-------
 TBL_MERGE_P1  |     1
 TBL_MERGE_P2  |     1
 TBL_MERGE_P3  |     1
 TBL_MERGE_DEF |     2
(4 rows)

/* -- 5. 合并 tbl_merge_p3 和 tbl_merge_def 分区 -- */
tdsql=# alter table tbl_merge merge partitions(tbl_merge_p3, tbl_merge_def) into partition tbl_merge_def;

/* -- 6. 查看合并后的分区表信息 -- */
tdsql=# \d+ tbl_merge
                              Table "public.TBL_MERGE"
   Column    |              Type              | Collation | Nullable |                Default                | Storage | Stats target | Description
-------------+--------------------------------+-----------+----------+---------------------------------------+---------+--------------+-------------
 ID          | INTEGER                        |           | not null | NEXTVAL('TBL_MERGE_ID_SEQ'::REGCLASS) | plain   |              |
 CREATE_TIME | TIMESTAMP(0) WITHOUT TIME ZONE |           |          |                                       | plain   |              |
Partition key: RANGE (CREATE_TIME)
Partitions: TBL_MERGE_P1 FOR VALUES FROM ('2024-01-01 00:00:00') TO ('2024-02-01 00:00:00'),
            TBL_MERGE_P2 FOR VALUES FROM ('2024-02-01 00:00:00') TO ('2024-03-01 00:00:00'),
            TBL_MERGE_DEF DEFAULT
Distribute By: SHARD(ID)
Location Nodes: ALL DATANODES

/* -- 7. 查看合并后的分区表数据分布 -- */
tdsql=# select "tableoid"::regclass, count(*) from tbl_merge group by "tableoid" order by 1;
   tableoid    | COUNT
---------------+-------
 TBL_MERGE_P1  |     1
 TBL_MERGE_P2  |     1
 TBL_MERGE_DEF |     3
(3 rows)

分区策略

TDSQL PG 支持以下多种分区策略,用于控制数据库如何将数据放入分区:

  • 范围(RANGE)分区:一种分区技术,根据分区键值的范围将数据映射到分区,最常见的是基于时间字段做分区。
  • 列表(LIST)分区:一种分区技术,数据库使用离散值列表作为每个分区的分区键。
  • 散列(HASH)分区:一种分区技术,散列分区根据指定的分区键的散列算法将数据映射到分区。
  • 多级分区:一种结合两种或多种分区方法的分区技术,可使用范围(RANGE)、列表(LIST)、散列(HASH)根据业务特点进行组合。

范围分区

根据分区键值的范围将数据映射到分区,最常见的是基于时间字段做分区,把数据按不同时间段存储在不同的分区子表中。

范围分区格式

CREATE TABLE table_name (col1, col2, col3 ...) PARTITION BY RANGE ( col1 );
CREATE TABLE table_name_part1 PARTITION OF table_name FOR VALUES FROM ( MINVALUE ) TO ( expr );
CREATE TABLE table_name_part2 PARTITION OF table_name FOR VALUES FROM ( exp1 ) TO ( expr2 );
CREATE TABLE table_name_part3 PARTITION OF table_name FOR VALUES FROM ( exp1 ) TO ( expr2 );
...
CREATE TABLE table_name_part4 PARTITION OF table_name FOR VALUES FROM ( MAXVALUE ) TO ( expr2 );
CREATE TABLE table_name_part5 PARTITION OF table_name DEFAULT;

说明

  • 分区键可以包含多个列或表达式(最多32个)
  • 创建范围分区时,由FROM指定下限是一个包含范围,而用TO指定的上限是排除范围。在FROM列表中指定的值是该分区的相应分区键列的有效值,而TO列表中的值不是。
  • 创建范围分区时,可以使用特殊值MINVALUE 和MAXVALUE,表示列值下限或上限,如 FROM (MINVALUE) TO (10) 表示允许任何小于10的值,FROM (10) TO (MAXVALUE)表示允许任何大于或等于10的值。
  • 默认分区:如写数据不在分区范围,会报告错误,可通过引入默认分区来存储超出范围的数据, 注意点:
    • 一个分区表只能拥有一个 DEFAULT 分区;
    • 对于已经存储在 DEFAULT 分区中的数据,不能再创建相应的分区;
    • 如果将已有的表挂载为 DEFAULT 分区,将会检查该表中的所有数据;如果在已有的分区中存在相同的数据,将会产生一个错误;
    • 散列分区表不支持 DEFAULT 分区。
    • MAXVALUE && MINVALUE: MAXVALUE 和 MINVALUE 是 范围的上限下限, 进一步划分分区的边界。

示例 - 创建时间范围分区,包括默认分区 && MAXVALUE && MINVALUE

创建一个日志分区表,存储2023年1月和2月的日志,使用创建时间作为分区键,按月分区。

/* -- 1. 创建分区表 --*/
create table tbl_log_2023 (id serial, create_time timestamp(0) without time zone, remark char(1)) partition by RANGE(create_time);

/* -- 2. 创建分区子表 --*/
CREATE TABLE tbl_log_202301 PARTITION OF tbl_log_2023 FOR VALUES FROM ('2023-01-01') TO ('2023-02-01');
CREATE TABLE tbl_log_202302 PARTITION OF tbl_log_2023 FOR VALUES FROM ('2023-02-01') TO ('2023-03-01');

/* -- 3. 查看表信息 -- */
tdsql=# \d+ tbl_log_2023

                                                              Table "public.TBL_LOG_2023"
   Column    |              Type              | Collation | Nullable |                 Default                  | Storage  | Stats target | Description
-------------+--------------------------------+-----------+----------+------------------------------------------+----------+--------------+-------------
 ID          | INTEGER                        |           | not null | NEXTVAL('TBL_LOG_2023_ID_SEQ'::REGCLASS) | plain    |              |
 CREATE_TIME | TIMESTAMP(0) WITHOUT TIME ZONE |           |          |                                          | plain    |              |
 REMARK      | CHARACTER(1)                   |           |          |                                          | extended |              |
Partition key: RANGE (CREATE_TIME)
Partitions: TBL_LOG_202301 FOR VALUES FROM ('2023-01-01 00:00:00') TO ('2023-02-01 00:00:00'),
            TBL_LOG_202302 FOR VALUES FROM ('2023-02-01 00:00:00') TO ('2023-03-01 00:00:00')
Distribute By: SHARD(ID)
Location Nodes: ALL DATANODES

/* -- 4. 验证分区 -- */
/* -- 4.1 写入2023 1月到 2月数据 -- */
tdsql=# insert into tbl_log_2023(create_time, remark) values('2023-01-01 10:55:01', 1);
tdsql=# insert into tbl_log_2023(create_time, remark) values('2023-01-02 10:55:01', 1);
tdsql=# insert into tbl_log_2023(create_time, remark) values('2023-02-01 10:55:01', 1);
tdsql=# insert into tbl_log_2023(create_time, remark) values('2023-02-02 10:55:01', 1);

/* -- 4.2 查看TBL_LOG_202301 分区数据 -- */
tdsql=# select * from tbl_log_2023 partition(TBL_LOG_202301) ORDER BY ID;
 ID |     CREATE_TIME     | REMARK
----+---------------------+--------
  1 | 2023-01-01 10:55:01 | 1
  2 | 2023-01-02 10:55:01 | 1
(2 rows)

/* -- 4.3 查看TBL_LOG_202302 分区数据 -- */
tdsql=# select * from tbl_log_2023 partition(TBL_LOG_202302) ORDER BY ID;
 ID |     CREATE_TIME     | REMARK
----+---------------------+--------
  3 | 2023-02-01 10:55:01 | 1
  4 | 2023-02-02 10:55:01 | 1
(2 rows)

/* -- 4.4 统计 TBL_LOG_2023 各个分区数据 -- */
tdsql=# select "tableoid"::regclass, count(*) from tbl_log_2023 group by "tableoid" order by 1;
RELNAME     | COUNT
----------------+-------
TBL_LOG_202301 |     2
TBL_LOG_202302 |     2
(2 rows)

/* -- 5. 使用默认分区  -- */
/* -- 5.1 当前无默认分区,写入范围外的记录将报告错误  -- */
tdsql=# insert into tbl_log_2023(create_time, remark) values('2022-12-01 10:55:01', 1);
ERROR:  no partition of relation "TBL_LOG_2023" found for row
DETAIL:  Partition key of the failing row contains (CREATE_TIME) = (2022-12-01 10:55:01).

tdsql=# insert into tbl_log_2023(create_time, remark) values('2023-03-01 10:55:01', 1);
ERROR:  no partition of relation "TBL_LOG_2023" found for row
DETAIL:  Partition key of the failing row contains (CREATE_TIME) = (2023-03-01 10:55:01).

/* -- 5.2 创建默认分区 -- */
tdsql=# CREATE TABLE tbl_log_2023_default PARTITION OF tbl_log_2023 default;
tdsql=# insert into tbl_log_2023(create_time, remark) values('2022-12-01 10:55:01', 1);
tdsql=# insert into tbl_log_2023(create_time, remark) values('2023-03-01 10:55:01', 1);

tdsql=# select * from tbl_log_2023 partition(tbl_log_2023_default) ORDER BY ID;
 ID |     CREATE_TIME     | REMARK
----+---------------------+--------
  5 | 2022-12-01 10:55:01 | 1
  6 | 2023-03-01 10:55:01 | 1
(2 rows)

/*
 * 6. 设置 MAXVALUE && MINVALUE 边界
 *
 * 如2023年之前的日志建立一个分区,以及超过2023年3月的日志建立一个分区
 */
/* -- 6.1 清理老表 -- */
tdsql=# drop table if exists tbl_log_2023;

/* -- 6.2 创建分区主表 -- */
tdsql=# create table tbl_log_2023(id serial, create_time timestamp(0) without time zone, remark char(1)) partition by RANGE(create_time);

/* -- 6.3 创建分区子表 -- */
tdsql=# CREATE TABLE tbl_log_2023_min PARTITION OF tbl_log_2023 FOR VALUES FROM (MINVALUE) TO ('2023-01-01');
tdsql=# CREATE TABLE tbl_log_202301 PARTITION OF tbl_log_2023 FOR VALUES FROM ('2023-01-01') TO ('2023-02-01');
tdsql=# CREATE TABLE tbl_log_202302 PARTITION OF tbl_log_2023 FOR VALUES FROM ('2023-02-01') TO ('2023-03-01');
tdsql=# CREATE TABLE tbl_log_2023_max PARTITION OF tbl_log_2023 FOR VALUES FROM ('2023-03-01') TO (MAXVALUE);

/* -- 6.4 查看分区表信息 -- */
tdsql=# \d+ tbl_log_2023
                                                              Table "public.TBL_LOG_2023"
   Column    |              Type              | Collation | Nullable |                 Default                  | Storage  | Stats target | Description
-------------+--------------------------------+-----------+----------+------------------------------------------+----------+--------------+-------------
 ID          | INTEGER                        |           | not null | NEXTVAL('TBL_LOG_2023_ID_SEQ'::REGCLASS) | plain    |              |
 CREATE_TIME | TIMESTAMP(0) WITHOUT TIME ZONE |           |          |                                          | plain    |              |
 REMARK      | CHARACTER(1)                   |           |          |                                          | extended |              |
Partition key: RANGE (CREATE_TIME)
Partitions: TBL_LOG_202301 FOR VALUES FROM ('2023-01-01 00:00:00') TO ('2023-02-01 00:00:00'),
            TBL_LOG_202302 FOR VALUES FROM ('2023-02-01 00:00:00') TO ('2023-03-01 00:00:00'),
            TBL_LOG_2023_MAX FOR VALUES FROM ('2023-03-01 00:00:00') TO (MAXVALUE),
            TBL_LOG_2023_MIN FOR VALUES FROM (MINVALUE) TO ('2023-01-01 00:00:00')
Distribute By: SHARD(ID)
Location Nodes: ALL DATANODES

/* -- 6.5 验证 MIN && MAX -- */
tdsql=# insert into tbl_log_2023(create_time, remark) values('2022-12-01 13:55:02', 1);
tdsql=# insert into tbl_log_2023(create_time, remark) values('2023-03-01 14:52:01', 1);

tdsql=# select * from tbl_log_2023 partition(TBL_LOG_2023_MIN) ORDER BY ID;
 ID |     CREATE_TIME     | REMARK
----+---------------------+--------
  1 | 2022-12-01 13:55:02 | 1
(1 row)

tdsql=# select * from tbl_log_2023 partition(TBL_LOG_2023_MAX) ORDER BY ID;
 ID |     CREATE_TIME     | REMARK
----+---------------------+--------
  2 | 2023-03-01 14:52:01 | 1
(1 row)

示例 - 整数范围分区

/* -- 1. 创建分区 -- */
CREATE TABLE t_prt_range (a int) PARTITION BY RANGE(a);
CREATE TABLE t_prt_range_p1 PARTITION OF t_prt_range FOR VALUES FROM (0) TO (250);
CREATE TABLE t_prt_range_p2 PARTITION OF t_prt_range FOR VALUES FROM (250) TO (500);
CREATE TABLE t_prt_range_p3 PARTITION OF t_prt_range DEFAULT;

/* -- 2. 查看分区信息 -- */
tdsql=# \d+ t_prt_range
                                Table "public.T_PRT_RANGE"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
 A      | INTEGER |           |          |         | plain   |              |
Partition key: RANGE (A)
Partitions: T_PRT_RANGE_P1 FOR VALUES FROM (0) TO (250),
            T_PRT_RANGE_P2 FOR VALUES FROM (250) TO (500),
            T_PRT_RANGE_P3 DEFAULT
Distribute By: SHARD(A)
Location Nodes: ALL DATANODES

/* -- 3. 写1000条数据 -- */
tdsql=# INSERT INTO t_prt_range SELECT gs FROM generate_series(1,1000) gs;

/* -- 4. 统计各分区数据 -- */
tdsql=# select "tableoid"::regclass, count(*) from t_prt_range group by "tableoid" order by 1;
tableoid    | COUNT
----------------+-------
T_PRT_RANGE_P1 |   249
T_PRT_RANGE_P2 |   250
T_PRT_RANGE_P3 |   501
(3 rows)

列表分区

列表分区通过为分区键指定离散值列表,通过该列表显式控制行映射到分区子表。
列表分区的优点是可以以自然的方式对无序和不相关的数据集进行分组和组织,如以地区列作为分区键的表,分区可能包含值城市a、城市b和城市c等。
语法:

CREATE TABLE table_name (col1, col2, col3 ...) PARTITION BY LIST ( col1 );
CREATE TABLE table_name_part_1 PARTITION OF table_name FOR VALUES IN ( exp1, exp2 ... );
...
CREATE TABLE table_name_part_n PARTITION OF table_name DEFAULT;

说明

  • 分区键由一个或多个列组成, 可指定NULL表示允许分区键列为空。
  • exp1 可以是常量,数字, NULL, TRUE, FALSE 等。

示例

/* -- 1. 创建分区表 -- */
tdsql=# CREATE TABLE tbl_city (id int, city varchar) PARTITION BY LIST (city);
tdsql=# CREATE TABLE tbl_city_super PARTITION OF tbl_city FOR VALUES IN ('城市a', '城市b', '城市c', '城市d');
tdsql=# CREATE TABLE tbl_city_others PARTITION OF tbl_city FOR VALUES IN (NULL);
tdsql=# CREATE TABLE tbl_city_def PARTITION OF tbl_city DEFAULT;

/* -- 2. 查看分区表信息 -- */
tdsql=# \d+ tbl_city
                                       Table "public.TBL_CITY"
 Column |       Type        | Collation | Nullable | Default | Storage  | Stats target | Description
--------+-------------------+-----------+----------+---------+----------+--------------+-------------
 ID     | INTEGER           |           |          |         | plain    |              |
 CITY   | CHARACTER VARYING |           |          |         | extended |              |
Partition key: LIST (CITY)
Partitions: TBL_CITY_OTHERS FOR VALUES IN (NULL),
            TBL_CITY_SUPER FOR VALUES IN ('城市a', '城市b', '城市c', '城市d'),
            TBL_CITY_DEF DEFAULT
Distribute By: SHARD(ID)
Location Nodes: ALL DATANODES

/* -- 3. 验证分区 -- */
tdsql=# INSERT INTO tbl_city VALUES(1, '城市a');		-- 将写入 tbl_city_super 分区
tdsql=# INSERT INTO tbl_city VALUES(2, '城市b');		-- 将写入 tbl_city_super 分区
tdsql=# INSERT INTO tbl_city VALUES(3, '城市e');		-- '城市e' 未在列表范围内 将进入默认分区 tbl_city_def
tdsql=# INSERT INTO tbl_city VALUES(4, '');			-- '' 表示NULL 将写入 tbl_city_others 分区

/* -- 4. 查看各个分区的数据 -- */
tdsql=# select * from tbl_city partition(tbl_city_super);
 ID | CITY
----+------
  1 | 城市a
  2 | 城市b
(2 rows)

tdsql=# select * from tbl_city partition(tbl_city_others);
 ID | CITY
----+------
  4 | null
(1 row)

tdsql=# select * from tbl_city partition(tbl_city_def);
 ID | CITY
----+------
  3 | 城市e
(1 row)

散列分区

散列分区是跨设备均匀分布数据的理想方法,也是范围分区的一种易于使用的替代方案,尤其是当要分区的数据不是历史数据或没有明显的分区键时。散列分区根据指定分区键通过散列算法将数据映射到指定的分区子表,散列算法在分区之间均匀分布行,使分区大小大致相同。通过为每个分区指定模数和余数来对表进行分区,每个分区所持有的行都满足:分区键的值除以其指定的模数将产生为其指定的余数。
语法:

CREATE TABLE table_name (col1, col2, col3 ...) PARTITION BY HASH( col1 );
CREATE TABLE table_name_part_1 PARTITION OF table_name FOR VALUES WITH ( MODULUS x, REMAINDER  y);
CREATE TABLE table_name_part_2 PARTITION OF table_name FOR VALUES WITH ( MODULUS x, REMAINDER  y);
...

说明

  • 当创建哈希分区时,必须指定模数MODULUS 和余数 REMAINER。模数必须是正整数,余数必须是小于模数的非负整数。
  • 通常情况下,当初始设置哈希分区表时,应选择一个与分区数相等的模数,并为每个表分配相同的模数和不同的余数。
  • 散列分区表不支持 DEFAULT 分区。

示例

/* -- 1. 创建分区表 -- */
tdsql=# create table tbl_hash(id integer, product integer) partition by hash (id);
tdsql=# create table tbl_hash_part_0 partition of tbl_hash for values with(modulus 2, remainder 0);
tdsql=# create table tbl_hash_part_1 partition of tbl_hash for values with(modulus 2, remainder 1);

/* -- 2. 查看分区表信息 -- */
tdsql=# \d+ tbl_hash
                                  Table "public.TBL_HASH"
 Column  |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
---------+---------+-----------+----------+---------+---------+--------------+-------------
 ID      | INTEGER |           |          |         | plain   |              |
 PRODUCT | INTEGER |           |          |         | plain   |              |
Partition key: HASH (ID)
Partitions: TBL_HASH_PART_0 FOR VALUES WITH (modulus 2, remainder 0),
            TBL_HASH_PART_1 FOR VALUES WITH (modulus 2, remainder 1)
Distribute By: SHARD(ID)
Location Nodes: ALL DATANODES

/* -- 2.1 查看分区子表表信息 -- */
tdsql=# \d+ tbl_hash_part_0
                              Table "public.TBL_HASH_PART_0"
 Column  |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
---------+---------+-----------+----------+---------+---------+--------------+-------------
 ID      | INTEGER |           |          |         | plain   |              |
 PRODUCT | INTEGER |           |          |         | plain   |              |
Partition of: TBL_HASH FOR VALUES WITH (modulus 2, remainder 0)
Partition constraint: SATISFIES_HASH_PARTITION('17413'::OID, 2, 0, ID)
Distribute By: SHARD(ID)
Location Nodes: ALL DATANODES

/* -- 2.2 查看分区子表表信息 -- */
tdsql=# \d+ tbl_hash_part_1
                              Table "public.TBL_HASH_PART_1"
 Column  |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
---------+---------+-----------+----------+---------+---------+--------------+-------------
 ID      | INTEGER |           |          |         | plain   |              |
 PRODUCT | INTEGER |           |          |         | plain   |              |
Partition of: TBL_HASH FOR VALUES WITH (modulus 2, remainder 1)
Partition constraint: SATISFIES_HASH_PARTITION('17413'::OID, 2, 1, ID)
Distribute By: SHARD(ID)
Location Nodes: ALL DATANODES

/* -- 3. 验证哈希分区 -- */
tdsql=# insert into tbl_hash select generate_series(1,1000),generate_series(1,1000);

/* -- 4. 查看各个分区数据 -- */
tdsql=# select "tableoid"::regclass, count(*) from tbl_hash group by "tableoid" order by 1;
    tableoid     | COUNT
-----------------+-------
 TBL_HASH_PART_0 |   535
 TBL_HASH_PART_1 |   465
(2 rows)

多级分区

一种结合两种或多种分区方法的分区技术,TDSQL PG支持一级分区为RANGE、LIST,二级分区为RANGE、LIST、HASH组合。

示例:

两层分区,第一层按列表分区,第二层按范围分区。

/* -- 1. 建立分区主表 -- */
tdsql=# CREATE TABLE t_prt_list_range (a varchar2, b int) PARTITION BY LIST(a);

/* -- 2. 建立一级分区子表 -- */
tdsql=# CREATE TABLE t_prt_list_range_p1 PARTITION OF t_prt_list_range FOR VALUES IN ('a') PARTITION BY RANGE(b);
tdsql=# CREATE TABLE t_prt_list_range_p2 PARTITION OF t_prt_list_range FOR VALUES IN ('b');
tdsql=# CREATE TABLE t_prt_list_range_p3 PARTITION OF t_prt_list_range DEFAULT;

/* -- 3. 建立二级分区子表 -- */
tdsql=# CREATE TABLE t_prt_list_range_p1_p1 PARTITION OF t_prt_list_range_p1 FOR VALUES FROM (0) TO (200);
tdsql=# CREATE TABLE t_prt_list_range_p1_p2 PARTITION OF t_prt_list_range_p1 FOR VALUES FROM (200) TO (400);
tdsql=# CREATE TABLE t_prt_list_range_p1_p3 PARTITION OF t_prt_list_range_p1 DEFAULT;

/* -- 4. 查看分区信息 -- */
tdsql=# \d+ T_PRT_LIST_RANGE
                              Table "public.T_PRT_LIST_RANGE"
 Column |   Type   | Collation | Nullable | Default | Storage  | Stats target | Description
--------+----------+-----------+----------+---------+----------+--------------+-------------
 A      | VARCHAR2 |           |          |         | extended |              |
 B      | INTEGER  |           |          |         | plain    |              |
Partition key: LIST (A)
Partitions: T_PRT_LIST_RANGE_P1 FOR VALUES IN ('a'), PARTITIONED,
            T_PRT_LIST_RANGE_P2 FOR VALUES IN ('b'),
            T_PRT_LIST_RANGE_P3 DEFAULT
Distribute By: SHARD(A)
Location Nodes: ALL DATANODES

/* -- 4.1 查看子分区信息 -- */
tdsql=# \d+ T_PRT_LIST_RANGE_P1
                             Table "public.T_PRT_LIST_RANGE_P1"
 Column |   Type   | Collation | Nullable | Default | Storage  | Stats target | Description
--------+----------+-----------+----------+---------+----------+--------------+-------------
 A      | VARCHAR2 |           |          |         | extended |              |
 B      | INTEGER  |           |          |         | plain    |              |
Partition of: T_PRT_LIST_RANGE FOR VALUES IN ('a')
Partition constraint: ((A IS NOT NULL) AND ((A)::TEXT = 'a'::VARCHAR2))
Partition key: RANGE (B)
Partitions: T_PRT_LIST_RANGE_P1_P1 FOR VALUES FROM (0) TO (200),
            T_PRT_LIST_RANGE_P1_P2 FOR VALUES FROM (200) TO (400),
            T_PRT_LIST_RANGE_P1_P3 DEFAULT
Distribute By: SHARD(A)
Location Nodes: ALL DATANODES

/* -- 5. 验证多级分区 -- */
tdsql=# INSERT INTO t_prt_list_range SELECT 'a', gs FROM generate_series(1,500) gs;
tdsql=# INSERT INTO t_prt_list_range SELECT 'b', gs FROM generate_series(1,500) gs;
tdsql=# INSERT INTO t_prt_list_range SELECT 'c', gs FROM generate_series(1,500) gs;

/* -- 6. 查看各分区数据分布 -- */
tdsql=# select "tableoid"::regclass, count(*) from t_prt_list_range group by "tableoid" order by 1;
        tableoid        | COUNT
------------------------+-------
 T_PRT_LIST_RANGE_P2    |   500
 T_PRT_LIST_RANGE_P3    |   500
 T_PRT_LIST_RANGE_P1_P1 |   199
 T_PRT_LIST_RANGE_P1_P2 |   200
 T_PRT_LIST_RANGE_P1_P3 |   101
(5 rows)

分区表限制

  • 无法创建跨越所有分区的排除约束,只能是每个子分区的单个约束。
  • 在分区表上支持主键,但不支持引用分区表的外键。
  • 当一个UPDATE导致一行从一个分区移动到另一个分区时,另一个并发的UPDATE或DELETE可能会产生一个串行化错误。假设会话1正在执行一个分区键上的UPDATE,同时一个并发的能看见这个行的会话2执行了对该行的UPDATE或者DELETE操作。在这种情况下,会话2的UPDATE或者DELETE会检测到行的移动,并抛出一个串行化的错误(将总是会返回一个SQLSTATE '40001')。 如果发生这种情况,应用程序可能希望重试该事务。在没有分区表或没有行移动的通常情况下,会话2将识别新更新的行并在新行上执行UPDATE/DELETE。
  • 不允许在同一分区表中混用临时表和持久表,如果分区表是持久的,则其分区也必须是持久的,反之亦然。在使用临时表时,所有的分区都必须来自于同一个会话。