简介
当一个数据库中存在大量外键约束时,您可能发现难以可视化表之间的外键关系。本文将探讨如何编写
SQL 查询来查找 DB2 for Linux, UNIX, and Windows 中的外键关系。
文中将讨论以下变体。
1、给定一个外键父表,返回 RI(参照完整性)子表和后代表,以及从附表到这些子表和后代表的
RI 关系路径。
2、修改所提供的查询,以返回数据库中所有表的结果。
样例模式
清单 1 中所示的样例模式将用于本文中的示例。
清单 1. 样例模式
set schema newton; create table grandparent (i1 int not null primary key, i2 int, i3 int); create table parent (i1 int not null primary key, i2 int); create table parent2 (i1 int not null primary key, i2 int); create table child (i1 int not null primary key, i2 int, i3 int); create table grandchild (i1 int not null primary key, i2 int, i3 int); alter table parent add constraint fkp1 foreign key (i2) references grandparent; alter table parent2 add constraint fkp2 foreign key (i2) references grandparent; alter table child add constraint fk1 foreign key (i2) references parent; alter table child add constraint fk2 foreign key (i3) references parent2; alter table grandchild add constraint fk3 foreign key (i2) references child; alter table grandchild add constraint fk4 foreign key (i3) references parent2; create table gp (i1 int not null, i2 int not null, i3 int, primary key (i1, i2)); create table p1 (i1 int not null primary key, i2 int, i3 int); create table c11 (i1 int not null primary key, i2 int); create table c12 (i1 int not null primary key, i2 int); alter table p1 add constraint fkp1 foreign key (i2, i3) references gp; alter table c11 add constraint fkc11 foreign key (i2) references p1; alter table c12 add constraint fkc12 foreign key (i2) references p1; alter table gp add constraint fkgp1 foreign key (i2) references c12; create table self (i1 int not null primary key, i2 int); alter table self add constraint fk_self foreign key (i2) references self; |
如何显示所有 RI 约束
在最简单的形式中,您可以通过查询目录视图 SYSCAT.REFERENCES
来获取所有外键约束的清单。
ELECT * FROM SYSCAT.REFERENCES |
您可以结合使用结果和 SYSCAT.KEYCOLUSE 来查找外键列。
要生成 RI 约束中使用的外键列的列表(用逗号分隔),可以在将 SYSCAT.REFERENCES
与 SYSCAT.KEYCOLUSE 联结(join)时使用 LISTAGG() 聚合函数,如清单 2
所示。
清单 2. LISTAGG() 聚合函数
select substr(R.reftabschema,1,12) as P_Schema, substr(R.reftabname,1,12) as PARENT,
substr(R.tabschema,1,12) as C_Schema, substr (R.tabname,1,12) as CHILD,
substr(R.constname,1,12) as CONSTNAME,
substr(LISTAGG(C.colname,', ') WITHIN GROUP (ORDER BY C.colname),1,20) as FKCOLS
from syscat.references R, syscat.keycoluse C
where R.constname = C.constname and R.tabschema = C.tabschema and R.tabname = C.tabname
group by R.reftabschema, R.reftabname, R.tabschema, R.tabname, R.constname;
|
结果将类似于清单 3。
清单 3. LISTAGG() 聚合函数的输出
P_SCHEMA PARENT C_SCHEMA CHILD CONSTNAME FKCOLS
------------ ------------ ------------ ------------ ------------ ----------------
NEWTON PARENT NEWTON CHILD FK1 I2
NEWTON PARENT2 NEWTON CHILD FK2 I3
NEWTON CHILD NEWTON GRANDCHILD FK3 I2
NEWTON PARENT2 NEWTON GRANDCHILD FK4 I3
NEWTON P1 NEWTON C11 FKC11 I2
NEWTON P1 NEWTON C12 FKC12 I2
NEWTON C12 NEWTON GP FKGP1 I2
NEWTON GP NEWTON P1 FKP1 I2 , I3
NEWTON GRANDPARENT NEWTON PARENT FKP1 I2
NEWTON GRANDPARENT NEWTON PARENT2 FKP2 I2
NEWTON SELF NEWTON SELF FK_SELF I2
|
但是,在一个复杂数据库中,父表和它的非直接后代之间的关系很难从前面描述的简单查询中看出。
显示一个给定表的所有外键子表和后代表
您可以编写一个递归查询来遍历 RI 关系。但是,如果使用 DB2 for
Linux, UNIX, and Windows, Version 9.7 或更高版本,那么分层查询是递归地遍历
SYSCAT.REFERENCES 中的 RI 关系的更好的替代方法。参阅 参考资料 一节,了解有关的更多信息。
在 DB2 中使用分层查询需要以下 DB2_COMPATIBILITY_VECTOR
设置:
db2set DB2_COMPATIBILITY_VECTOR=08
db2stop
db2start
|
分层查询支持 START WITH 和 CONNECT BY 子句的规范。
例如,清单 4 中所示的查询(查询 1)返回所有 RI 后代和从根表 NEWTON.GRANDPARENT
到它们的惟一路径。
清单 4. 查询 1
with
root_parents (root_parent_schema, root_parent_name) AS
(select * from table(values ('NEWTON', 'GRANDPARENT')))
select
substr(connect_by_root reftabname,1,11) as root,
substr (level, 1,5) as lvl,
-- substr(reftabschema,1,6) as parent_schema,
substr(reftabname,1,11) as parent,
-- substr(tabschema,1,6) as child_schema,
substr(tabname,1,10) as child,
substr(constname,1,5) as cnst,
substr(sys_connect_by_path(reftabname, '->') || '->' ||
substr(tabname,1,20),1,42) as chain
from syscat.references
start with (reftabschema,reftabname) in (select root_parent_schema,
root_parent_name from root_parents)
connect by nocycle prior tabname = reftabname
and tabschema = reftabschema;
|
注意:模式名称已从所有查询的输出中删除,以便得到更好的格式。
查询 1 将返回清单 5 中所示的结果。
清单 5. 查询 1 的输出
ROOT LVL PARENT CHILD CNST CHAIN
----------- ----- ----------- ---------- ----- ------------------------------------------
GRANDPARENT 1 GRANDPARENT PARENT FKP1 ->GRANDPARENT->PARENT
GRANDPARENT 2 PARENT CHILD FK1 ->GRANDPARENT->PARENT->CHILD
GRANDPARENT 3 CHILD GRANDCHILD FK3 ->GRANDPARENT->PARENT->CHILD->GRANDCHILD
GRANDPARENT 1 GRANDPARENT PARENT2 FKP2 ->GRANDPARENT->PARENT2
GRANDPARENT 2 PARENT2 CHILD FK2 ->GRANDPARENT->PARENT2->CHILD
GRANDPARENT 3 CHILD GRANDCHILD FK3 ->GRANDPARENT->PARENT2->CHILD->GRANDCHILD
GRANDPARENT 2 PARENT2 GRANDCHILD FK4 ->GRANDPARENT->PARENT2->GRANDCHILD
|
注意:查询将显示从给定根表到它的每个后代表的所有惟一路径。如果有一个钻石形的
RI 路径,那么两条路径都会显示在结果中。如上面的示例中所示,会显示以下两条路径。
GRANDPARENT->PARENT->CHILD->GRANDCHILD
GRANDPARENT->PARENT2->CHILD->GRANDCHILD
|
以下是使用分层查询相比传统的递归查询的一些优势。
1、递归性由 START WITH 和 CONNECT BY 子句处理,无需编写递归查询。
2、伪列 LEVEL 自动返回从根父表到该表的级数。
3、标量函数 sys_connect_by_path() 构建一个字符串,表示分层结构中从根表到一个节点的一条路径。在前一个示例中,该函数用在
CHAIN 列中来构建来自根表的 RI 关系链。
4、如果存在任何 RI 循环(例如 GP → P1 → C12 → GP),NOCYCLE
子句允许您告诉递归函数在循环中忽略重复行。
显示数据库中所有没有 RI 循环的表的外键子表和后代表关系
如果数据库中没有 RI 循环,那么您可以修改通用表的表达式 root_parents,以包含至少有一个子表但没有父表的所有表,如清单
6 中的查询 2 所示。
清单 6. 查询 2
with
root_parents (root_parent_schema, root_parent_name) AS
(select tabschema, tabname
from syscat.tables
where parents = 0 and children > 0)
select
substr(connect_by_root reftabname,1,11) as root,
substr (level, 1,5) as lvl,
-- substr(reftabschema,1,6) as parent_schema,
substr(reftabname,1,11) as parent,
-- substr(tabschema,1,6) as child_schema,
substr(tabname,1,10) as child,
substr(constname,1,5) as cnst,
substr(sys_connect_by_path(reftabname, '->') || '->' ||
substr(tabname,1,20),1,42) as chain
from syscat.references
start with (reftabschema,reftabname) in (select root_parent_schema,
root_parent_name from root_parents)
connect by prior tabname = reftabname and tabschema = reftabschema;
|
但是,查询 2 不会包含处于 RI 循环中的表,因为这些表拥有的父表和子表数都大于
0。
显示可能具有 RI 循环的外键子表和后代表关系
RI 循环中包含根表的用户输入
从业务逻辑角度讲,RI 循环中的一些表将更适合用作 RI 循环中的根父表。但是,数据库无法轻松地确定
RI 循环中的哪个表应该是根父表。
如果系统中没有太多 RI 循环,您可编写一个半自动查询为每个 RI 循环手动指定一个表作为根父表。
在清单 7 中所示的查询 3 中,通过在通用表的表达式 root_parents
的 UNION ALL 操作中添加一个 VALUES 子句,可以手动指定每个 RI 循环中的根表,查询将返回数据库中的所有
RI 子表和后代表。
清单 7. 查询 3
with
root_parents (root_parent_schema, root_parent_name) AS
(select tabschema, tabname
from syscat.tables
where parents = 0 and children > 0
UNION ALL
select * from table(values ('NEWTON', 'GP'), ('NEWTON', 'SELF')))
select
substr(connect_by_root reftabname,1,11) as root,
substr (level, 1,3) as lvl,
-- substr(reftabschema,1,6) as parent_schema,
substr(reftabname,1,11) as parent,
-- substr(tabschema,1,6) as child_schema,
substr(tabname,1,10) as child,
substr(constname,1,7) as cnstnam,
substr(sys_connect_by_path(reftabname, '->') || '->' ||
substr(tabname,1,20),1,42) as chain
from syscat.references
start with (reftabschema,reftabname) in (select root_parent_schema,
root_parent_name from root_parents)
connect by NOCYCLE prior tabname = reftabname and tabschema = reftabschema;
|
结果将类似于清单 8。
清单 8. 查询 3 的输出
ROOT LVL PARENT CHILD CNSTNAM CHAIN
----------- --- ----------- ---------- ------- ------------------------------------------
SELF 1 SELF SELF FK_SELF ->SELF->SELF
GRANDPARENT 1 GRANDPARENT PARENT FKP1 ->GRANDPARENT->PARENT
GRANDPARENT 2 PARENT CHILD FK1 ->GRANDPARENT->PARENT->CHILD
GRANDPARENT 3 CHILD GRANDCHILD FK3 ->GRANDPARENT->PARENT->CHILD->GRANDCHILD
GRANDPARENT 1 GRANDPARENT PARENT2 FKP2 ->GRANDPARENT->PARENT2
GRANDPARENT 2 PARENT2 CHILD FK2 ->GRANDPARENT->PARENT2->CHILD
GRANDPARENT 3 CHILD GRANDCHILD FK3 ->GRANDPARENT->PARENT2->CHILD->GRANDCHILD
GRANDPARENT 2 PARENT2 GRANDCHILD FK4 ->GRANDPARENT->PARENT2->GRANDCHILD
GP 1 GP P1 FKP1 ->GP->P1
GP 2 P1 C11 FKC11 ->GP->P1->C11
GP 2 P1 C12 FKC12 ->GP->P1->C12
GP 3 C12 GP FKGP1 ->GP->P1->C12->GP
|
RI 循环中不包含根表的用户输入
如果更喜欢全自动方式,不关心 RI 循环中哪个表被选为根表,那么只需在每个
RI 循环中包含一个代表,以下过程就会将每个 RI 循环中的一个代表性的表存储在一个名为 CYCLEROOTS
的临时表中。最后,该过程使用 CYCLEROOTS 中存储的表名称(如清单 8 所示)显示数据库中的所有
RI 链,如清单 9 所示。
清单 9. 显示所有 RI 链的查询 3
-- If needed, create the user temporary tablespace for the temporary table.
CREATE BUFFERPOOL BUFFERPOOL4K PAGESIZE 4K;
CREATE USER TEMPORARY TABLESPACE STMPTSP4 PAGESIZE 4K BUFFERPOOL BUFFERPOOL4K;
-- create the temporary table to store one representative from each RI cycle as root
create GLOBAL TEMPORARY TABLE SESSION.CYCLEROOTS (SCHEMANAME VARCHAR(128),
TABNAME VARCHAR(128));
--#SET TERMINATOR @
-- procedure to display RI chains in the database
CREATE PROCEDURE newton.FIND_RI_CHAINS ()
DETERMINISTIC
NO EXTERNAL ACTION
DYNAMIC RESULT SETS 1
BEGIN
DECLARE CYCLESCHEMA VARCHAR(128);
DECLARE CYCLETABLE VARCHAR(128);
DECLARE ROWS_FETCHED BIGINT;
DECLARE C_CYCLETABLES CURSOR;
-- This query will return the final result set after
-- temporary table SESSION.CYCLEROOTS has been populated.
DECLARE C_RESULTS CURSOR WITH RETURN TO CLIENT FOR
WITH
ROOT_PARENTS (ROOT_PARENT_SCHEMA, ROOT_PARENT_NAME) AS
(SELECT TABSCHEMA, TABNAME
FROM SYSCAT.TABLES
WHERE PARENTS = 0 AND CHILDREN > 0
UNION ALL
SELECT * FROM SESSION.CYCLEROOTS),
HIERARCHY (ROOT, LEVEL, REFTABSCHEMA, REFTABNAME, TABSCHEMA, TABNAME,
CONSTNAME, CHAIN) AS
(SELECT
CONNECT_BY_ROOT REFTABNAME,
LEVEL,
REFTABSCHEMA AS P_SCHEMA,
REFTABNAME AS PARENT,
TABSCHEMA AS C_SCHEMA,
TABNAME AS CHILD,
CONSTNAME AS CNSTNAM,
SUBSTR(SYS_CONNECT_BY_PATH(REFTABNAME, '->') || '->' ||
SUBSTR(TABNAME,1,20),1,42) AS CHAIN
FROM SYSCAT.REFERENCES
START WITH (REFTABSCHEMA,REFTABNAME) IN (SELECT ROOT_PARENT_SCHEMA,
ROOT_PARENT_NAME
FROM ROOT_PARENTS)
CONNECT BY NOCYCLE PRIOR TABNAME = REFTABNAME AND TABSCHEMA =
REFTABSCHEMA)
SELECT
SUBSTR(H.root,1,11) AS ROOT,
CAST (H.LEVEL AS CHAR(2)) as LVL,
-- SUBSTR(H.REFTABSCHEMA,1,6) as P_SCHEMA,
SUBSTR(H.REFTABNAME,1,11) as PARENT,
-- SUBSTR(H.TABSCHEMA,1,6) as C_SCHEMA,
SUBSTR(H.TABNAME,1,10) as CHILD,
SUBSTR(H.CONSTNAME,1,7) as CNSTNAM,
SUBSTR(H.CHAIN,1,42) as CHAIN
FROM HIERARCHY H;
-- initialize temporary table
DELETE FROM SESSION.CYCLEROOTS;
-- this query will return the remaining tables that are in RI cycles
SET C_CYCLETABLES = CURSOR FOR
WITH
ROOT_PARENTS (ROOT_PARENT_SCHEMA, ROOT_PARENT_NAME) AS
(SELECT TABSCHEMA, TABNAME
FROM SYSCAT.TABLES
WHERE PARENTS = 0 AND CHILDREN > 0
UNION ALL
SELECT * FROM SESSION.CYCLEROOTS),
HIERARCHY (ROOT, LEVEL, REFTABSCHEMA, REFTABNAME, TABSCHEMA,
TABNAME, CONSTNAME, CHAIN) AS
(SELECT CONNECT_BY_ROOT REFTABNAME AS ROOT, LEVEL,
REFTABSCHEMA AS PARENT_SCHEMA,
REFTABNAME AS PARENT,
TABSCHEMA AS CHILD_SCHEMA,
TABNAME AS CHILD,
CONSTNAME AS CONSTNAME,
SUBSTR(SYS_CONNECT_BY_PATH(REFTABNAME, '->') || '->' ||
SUBSTR(TABNAME,1,20),1,50) AS CHAIN
FROM SYSCAT.REFERENCES
START WITH (REFTABSCHEMA,REFTABNAME) IN (SELECT ROOT_PARENT_SCHEMA,
ROOT_PARENT_NAME FROM ROOT_PARENTS)
CONNECT BY NOCYCLE PRIOR TABNAME = REFTABNAME AND
TABSCHEMA = REFTABSCHEMA)
SELECT TABSCHEMA, TABNAME
FROM SYSCAT.TABLES
WHERE CHILDREN > 0
EXCEPT
SELECT REFTABSCHEMA, REFTABNAME FROM HIERARCHY H;
OPEN C_CYCLETABLES;
-- Just select the first table as a root table from the remaining tables that are
-- in RI cycles
FETCH C_CYCLETABLES INTO CYCLESCHEMA, CYCLETABLE;
SET ROWS_FETCHED = CURSOR_ROWCOUNT(C_CYCLETABLES);
-- Keep looping until the result set from C_CYCLETABLES is empty.
WHILE (ROWS_FETCHED > 0) DO
-- insert the select representative into temporary table SESSION.CYCLEROOTS
INSERT INTO SESSION.CYCLEROOTS VALUES (CYCLESCHEMA, CYCLETABLE);
CLOSE C_CYCLETABLES;
-- restart the cursor. The result set will be different from the
-- the previous iterations of the loop. All the tables that
-- are in the same RI cycle as the newly added representative
-- will not show up in the result set from the next iteration of the
-- loop.
OPEN C_CYCLETABLES;
FETCH FROM C_CYCLETABLES INTO CYCLESCHEMA, CYCLETABLE;
SET ROWS_FETCHED = CURSOR_ROWCOUNT(C_CYCLETABLES);
END WHILE;
CLOSE C_CYCLETABLES;
-- When the loop exits, one representative from each RI cycle will have
-- been added to the temporary table SESSION.CYCLEROOTS. We will now
-- open C_RESULTS using the populated temporary table SESSION.CYCLEROOTS.
OPEN C_RESULTS;
END@
--#SET TERMINATOR ;
CALL newton.FIND_RI_CHAINS();
DROP TABLE SESSION.CYCLEROOTS;
|
结果将类似于清单 10。
清单 10. 显示所有 RI 链的查询 3 的输出
Result set 1
--------------
ROOT LVL PARENT CHILD CNSTNAM CHAIN
---------- --- ----------- ---------- ------- ------------------------------------------
SELF 1 SELF SELF FK_SELF ->SELF->SELF
GRANDPARENT 1 GRANDPARENT PARENT FKP1 ->GRANDPARENT->PARENT
GRANDPARENT 2 PARENT CHILD FK1 ->GRANDPARENT->PARENT->CHILD
GRANDPARENT 3 CHILD GRANDCHILD FK3 ->GRANDPARENT->PARENT->CHILD->GRANDCHILD
GRANDPARENT 1 GRANDPARENT PARENT2 FKP2 ->GRANDPARENT->PARENT2
GRANDPARENT 2 PARENT2 CHILD FK2 ->GRANDPARENT->PARENT2->CHILD
GRANDPARENT 3 CHILD GRANDCHILD FK3 ->GRANDPARENT->PARENT2->CHILD->GRANDCHILD
GRANDPARENT 2 PARENT2 GRANDCHILD FK4 ->GRANDPARENT->PARENT2->GRANDCHILD
C12 1 C12 GP FKGP1 ->C12->GP
C12 2 GP P1 FKP1 ->C12->GP->P1
C12 3 P1 C11 FKC11 ->C12->GP->P1->C11
C12 3 P1 C12 FKC12 ->C12->GP->P1->C12
12 record(s) selected.
Return Status = 0
|
请注意,表 C12 已被选择为包含 GP->P1->C12->GP
的 RI 循环中的代表。
结束语
本文介绍了如何编写一个分层查询来查找数据库中的外键关系。具体而言,本文提供了一个查询示例来查找给定表的外键子表和后代表。还提供了示例查询和过程来查找数据库中的所有表的所有外键关系,无论
RI 循环中是否包含根表的用户输入。
|