求知 文章 文库 Lib 视频 iPerson 课程 认证 咨询 工具 讲座 Modeler   Code  
会员   
 
  
 
 
     
   
分享到
使用 SQL 查询外键关系的技巧
 

发布于2013-7-10

 

简介

当一个数据库中存在大量外键约束时,您可能发现难以可视化表之间的外键关系。本文将探讨如何编写 SQL 查询来查找 DB2 for Linux, UNIX, and Windows 中的外键关系。

文中将讨论以下变体。

1、给定一个外键父表,返回 RI(参照完整性)子表和后代表,以及从附表到这些子表和后代表的 RI 关系路径。

2、修改所提供的查询,以返回数据库中所有表的结果。

样例模式

清单 1 中所示的样例模式将用于本文中的示例。

清单 1. 样例模式

set schema newton;

create table grandparent (i1 int not null primary key, i2 int, i3 int);
create table parent (i1 int not null primary key, i2 int);
create table parent2 (i1 int not null primary key, i2 int);
create table child (i1 int not null primary key, i2 int, i3 int);
create table grandchild (i1 int not null primary key, i2 int, i3 int);

alter table parent add constraint fkp1 foreign key (i2) references grandparent;
alter table parent2 add constraint fkp2 foreign key (i2) references grandparent;
alter table child add constraint fk1 foreign key (i2) references parent;
alter table child add constraint fk2 foreign key (i3) references parent2;
alter table grandchild add constraint fk3 foreign key (i2) references child;
alter table grandchild add constraint fk4 foreign key (i3) references parent2;

create table gp (i1 int not null, i2 int not null, i3 int, primary key (i1, i2));
create table p1 (i1 int not null primary key, i2 int, i3 int);
create table c11 (i1 int not null primary key, i2 int);
create table c12 (i1 int not null primary key, i2 int);

alter table p1 add constraint fkp1 foreign key (i2, i3) references gp;
alter table c11 add constraint fkc11 foreign key (i2) references p1;
alter table c12 add constraint fkc12 foreign key (i2) references p1;
alter table gp add constraint fkgp1 foreign key (i2) references c12;

create table self (i1 int not null primary key, i2 int);
alter table self add constraint fk_self foreign key (i2) references self;

如何显示所有 RI 约束

在最简单的形式中,您可以通过查询目录视图 SYSCAT.REFERENCES 来获取所有外键约束的清单。

ELECT * FROM SYSCAT.REFERENCES

您可以结合使用结果和 SYSCAT.KEYCOLUSE 来查找外键列。

要生成 RI 约束中使用的外键列的列表(用逗号分隔),可以在将 SYSCAT.REFERENCES 与 SYSCAT.KEYCOLUSE 联结(join)时使用 LISTAGG() 聚合函数,如清单 2 所示。

清单 2. LISTAGG() 聚合函数

select  substr(R.reftabschema,1,12) as P_Schema, substr(R.reftabname,1,12) as PARENT,
        substr(R.tabschema,1,12) as C_Schema, substr (R.tabname,1,12) as CHILD,
        substr(R.constname,1,12) as CONSTNAME, 
        substr(LISTAGG(C.colname,', ') WITHIN GROUP (ORDER BY C.colname),1,20) as FKCOLS 
from syscat.references R, syscat.keycoluse C 
where R.constname = C.constname and R.tabschema = C.tabschema and R.tabname = C.tabname
group by R.reftabschema, R.reftabname, R.tabschema, R.tabname, R.constname;

结果将类似于清单 3。

清单 3. LISTAGG() 聚合函数的输出

P_SCHEMA     PARENT       C_SCHEMA     CHILD        CONSTNAME    FKCOLS              
------------ ------------ ------------ ------------ ------------ ----------------
NEWTON       PARENT       NEWTON       CHILD        FK1          I2                  
NEWTON       PARENT2      NEWTON       CHILD        FK2          I3                  
NEWTON       CHILD        NEWTON       GRANDCHILD   FK3          I2                  
NEWTON       PARENT2      NEWTON       GRANDCHILD   FK4          I3                  
NEWTON       P1           NEWTON       C11          FKC11        I2                  
NEWTON       P1           NEWTON       C12          FKC12        I2                  
NEWTON       C12          NEWTON       GP           FKGP1        I2                  
NEWTON       GP           NEWTON       P1           FKP1         I2  , I3           
NEWTON       GRANDPARENT  NEWTON       PARENT       FKP1         I2                  
NEWTON       GRANDPARENT  NEWTON       PARENT2      FKP2         I2                  
NEWTON       SELF         NEWTON       SELF         FK_SELF      I2  

但是,在一个复杂数据库中,父表和它的非直接后代之间的关系很难从前面描述的简单查询中看出。

显示一个给定表的所有外键子表和后代表

您可以编写一个递归查询来遍历 RI 关系。但是,如果使用 DB2 for Linux, UNIX, and Windows, Version 9.7 或更高版本,那么分层查询是递归地遍历 SYSCAT.REFERENCES 中的 RI 关系的更好的替代方法。参阅 参考资料 一节,了解有关的更多信息。

在 DB2 中使用分层查询需要以下 DB2_COMPATIBILITY_VECTOR 设置:

db2set DB2_COMPATIBILITY_VECTOR=08
db2stop
db2start

分层查询支持 START WITH 和 CONNECT BY 子句的规范。

例如,清单 4 中所示的查询(查询 1)返回所有 RI 后代和从根表 NEWTON.GRANDPARENT 到它们的惟一路径。

清单 4. 查询 1

with 
root_parents (root_parent_schema, root_parent_name) AS
 (select * from table(values ('NEWTON', 'GRANDPARENT')))
select 
 substr(connect_by_root reftabname,1,11) as root,
 substr (level, 1,5) as lvl,
--  substr(reftabschema,1,6) as parent_schema,
 substr(reftabname,1,11) as parent, 
--  substr(tabschema,1,6) as child_schema,
 substr(tabname,1,10) as child, 
 substr(constname,1,5) as cnst, 
 substr(sys_connect_by_path(reftabname, '->') || '->' || 
 substr(tabname,1,20),1,42) as chain
from syscat.references
start with (reftabschema,reftabname) in (select root_parent_schema, 
                                        root_parent_name from root_parents)
connect by nocycle prior  tabname = reftabname 
                      and tabschema = reftabschema; 

注意:模式名称已从所有查询的输出中删除,以便得到更好的格式。

查询 1 将返回清单 5 中所示的结果。

清单 5. 查询 1 的输出

ROOT        LVL   PARENT      CHILD      CNST  CHAIN                                     
----------- ----- ----------- ---------- ----- ------------------------------------------
GRANDPARENT 1     GRANDPARENT PARENT     FKP1  ->GRANDPARENT->PARENT                     
GRANDPARENT 2     PARENT      CHILD      FK1   ->GRANDPARENT->PARENT->CHILD              
GRANDPARENT 3     CHILD       GRANDCHILD FK3   ->GRANDPARENT->PARENT->CHILD->GRANDCHILD  
GRANDPARENT 1     GRANDPARENT PARENT2    FKP2  ->GRANDPARENT->PARENT2                    
GRANDPARENT 2     PARENT2     CHILD      FK2   ->GRANDPARENT->PARENT2->CHILD             
GRANDPARENT 3     CHILD       GRANDCHILD FK3   ->GRANDPARENT->PARENT2->CHILD->GRANDCHILD 
GRANDPARENT 2     PARENT2     GRANDCHILD FK4   ->GRANDPARENT->PARENT2->GRANDCHILD 

注意:查询将显示从给定根表到它的每个后代表的所有惟一路径。如果有一个钻石形的 RI 路径,那么两条路径都会显示在结果中。如上面的示例中所示,会显示以下两条路径。

GRANDPARENT->PARENT->CHILD->GRANDCHILD
GRANDPARENT->PARENT2->CHILD->GRANDCHILD

以下是使用分层查询相比传统的递归查询的一些优势。

1、递归性由 START WITH 和 CONNECT BY 子句处理,无需编写递归查询。

2、伪列 LEVEL 自动返回从根父表到该表的级数。

3、标量函数 sys_connect_by_path() 构建一个字符串,表示分层结构中从根表到一个节点的一条路径。在前一个示例中,该函数用在 CHAIN 列中来构建来自根表的 RI 关系链。

4、如果存在任何 RI 循环(例如 GP → P1 → C12 → GP),NOCYCLE 子句允许您告诉递归函数在循环中忽略重复行。

显示数据库中所有没有 RI 循环的表的外键子表和后代表关系

如果数据库中没有 RI 循环,那么您可以修改通用表的表达式 root_parents,以包含至少有一个子表但没有父表的所有表,如清单 6 中的查询 2 所示。

清单 6. 查询 2

with 
root_parents (root_parent_schema, root_parent_name) AS
 (select tabschema, tabname
    from syscat.tables
    where parents = 0 and children > 0)
select 
  substr(connect_by_root reftabname,1,11) as root,
  substr (level, 1,5) as lvl,
--  substr(reftabschema,1,6) as parent_schema,
  substr(reftabname,1,11) as parent, 
--  substr(tabschema,1,6) as child_schema,
  substr(tabname,1,10) as child, 
  substr(constname,1,5) as cnst, 
  substr(sys_connect_by_path(reftabname, '->') || '->' || 
  substr(tabname,1,20),1,42) as chain
from syscat.references
start with (reftabschema,reftabname) in (select root_parent_schema, 
                                        root_parent_name from root_parents)
connect by prior  tabname = reftabname and tabschema = reftabschema;

但是,查询 2 不会包含处于 RI 循环中的表,因为这些表拥有的父表和子表数都大于 0。

显示可能具有 RI 循环的外键子表和后代表关系

RI 循环中包含根表的用户输入

从业务逻辑角度讲,RI 循环中的一些表将更适合用作 RI 循环中的根父表。但是,数据库无法轻松地确定 RI 循环中的哪个表应该是根父表。

如果系统中没有太多 RI 循环,您可编写一个半自动查询为每个 RI 循环手动指定一个表作为根父表。

在清单 7 中所示的查询 3 中,通过在通用表的表达式 root_parents 的 UNION ALL 操作中添加一个 VALUES 子句,可以手动指定每个 RI 循环中的根表,查询将返回数据库中的所有 RI 子表和后代表。

清单 7. 查询 3

with 
root_parents (root_parent_schema, root_parent_name) AS
 (select tabschema, tabname
    from syscat.tables
    where parents = 0 and children > 0
  UNION ALL
  select * from table(values ('NEWTON', 'GP'), ('NEWTON', 'SELF')))
select 
  substr(connect_by_root reftabname,1,11) as root,
  substr (level, 1,3) as lvl,
--  substr(reftabschema,1,6) as parent_schema,
  substr(reftabname,1,11) as parent, 
--  substr(tabschema,1,6) as child_schema,
  substr(tabname,1,10) as child, 
  substr(constname,1,7) as cnstnam, 
  substr(sys_connect_by_path(reftabname, '->') || '->' || 
  substr(tabname,1,20),1,42) as chain
from syscat.references
start with (reftabschema,reftabname) in (select root_parent_schema, 
                                        root_parent_name from root_parents)
connect by NOCYCLE prior  tabname = reftabname and tabschema = reftabschema;

结果将类似于清单 8。

清单 8. 查询 3 的输出

ROOT        LVL PARENT      CHILD      CNSTNAM CHAIN                                     
----------- --- ----------- ---------- ------- ------------------------------------------
SELF        1   SELF        SELF       FK_SELF ->SELF->SELF                              
GRANDPARENT 1   GRANDPARENT PARENT     FKP1    ->GRANDPARENT->PARENT                     
GRANDPARENT 2   PARENT      CHILD      FK1     ->GRANDPARENT->PARENT->CHILD              
GRANDPARENT 3   CHILD       GRANDCHILD FK3     ->GRANDPARENT->PARENT->CHILD->GRANDCHILD  
GRANDPARENT 1   GRANDPARENT PARENT2    FKP2    ->GRANDPARENT->PARENT2                    
GRANDPARENT 2   PARENT2     CHILD      FK2     ->GRANDPARENT->PARENT2->CHILD             
GRANDPARENT 3   CHILD       GRANDCHILD FK3     ->GRANDPARENT->PARENT2->CHILD->GRANDCHILD 
GRANDPARENT 2   PARENT2     GRANDCHILD FK4     ->GRANDPARENT->PARENT2->GRANDCHILD        
GP          1   GP          P1         FKP1    ->GP->P1                                  
GP          2   P1          C11        FKC11   ->GP->P1->C11                             
GP          2   P1          C12        FKC12   ->GP->P1->C12                             
GP          3   C12         GP         FKGP1   ->GP->P1->C12->GP  

RI 循环中不包含根表的用户输入

如果更喜欢全自动方式,不关心 RI 循环中哪个表被选为根表,那么只需在每个 RI 循环中包含一个代表,以下过程就会将每个 RI 循环中的一个代表性的表存储在一个名为 CYCLEROOTS 的临时表中。最后,该过程使用 CYCLEROOTS 中存储的表名称(如清单 8 所示)显示数据库中的所有 RI 链,如清单 9 所示。

清单 9. 显示所有 RI 链的查询 3

-- If needed, create the user temporary tablespace for the temporary table.
CREATE BUFFERPOOL BUFFERPOOL4K  PAGESIZE 4K;
CREATE USER TEMPORARY TABLESPACE STMPTSP4 PAGESIZE 4K  BUFFERPOOL BUFFERPOOL4K;
                
-- create the temporary table to store one representative from each RI cycle as root
create GLOBAL TEMPORARY TABLE SESSION.CYCLEROOTS (SCHEMANAME VARCHAR(128),  
                                                  TABNAME VARCHAR(128));
                
--#SET TERMINATOR @
-- procedure to display RI chains in the database
CREATE PROCEDURE newton.FIND_RI_CHAINS ()
DETERMINISTIC 
NO EXTERNAL ACTION
DYNAMIC RESULT SETS 1
BEGIN
    DECLARE CYCLESCHEMA VARCHAR(128);
    DECLARE CYCLETABLE VARCHAR(128);
    DECLARE ROWS_FETCHED BIGINT;
    DECLARE C_CYCLETABLES CURSOR;
                
    -- This query will return the final result set after
    -- temporary table SESSION.CYCLEROOTS has been populated.
    DECLARE C_RESULTS CURSOR WITH RETURN TO CLIENT FOR
    WITH 
    ROOT_PARENTS (ROOT_PARENT_SCHEMA, ROOT_PARENT_NAME) AS
        (SELECT TABSCHEMA, TABNAME
            FROM SYSCAT.TABLES
            WHERE PARENTS = 0 AND CHILDREN > 0
        UNION ALL
        SELECT * FROM SESSION.CYCLEROOTS),
    HIERARCHY (ROOT, LEVEL, REFTABSCHEMA, REFTABNAME, TABSCHEMA, TABNAME, 
               CONSTNAME, CHAIN) AS
        (SELECT
            CONNECT_BY_ROOT REFTABNAME,
            LEVEL, 
            REFTABSCHEMA AS P_SCHEMA,
            REFTABNAME AS PARENT, 
            TABSCHEMA AS C_SCHEMA,
            TABNAME AS CHILD, 
            CONSTNAME AS CNSTNAM, 
            SUBSTR(SYS_CONNECT_BY_PATH(REFTABNAME, '->') || '->' || 
            SUBSTR(TABNAME,1,20),1,42) AS CHAIN
        FROM SYSCAT.REFERENCES
        START WITH (REFTABSCHEMA,REFTABNAME) IN (SELECT ROOT_PARENT_SCHEMA,
                                                 ROOT_PARENT_NAME
                                                 FROM ROOT_PARENTS)
        CONNECT BY NOCYCLE PRIOR  TABNAME = REFTABNAME AND TABSCHEMA =
         REFTABSCHEMA)
    SELECT
        SUBSTR(H.root,1,11) AS ROOT,
        CAST (H.LEVEL AS CHAR(2)) as LVL,
--       SUBSTR(H.REFTABSCHEMA,1,6) as P_SCHEMA,
        SUBSTR(H.REFTABNAME,1,11) as PARENT, 
--       SUBSTR(H.TABSCHEMA,1,6) as C_SCHEMA,
        SUBSTR(H.TABNAME,1,10) as CHILD, 
        SUBSTR(H.CONSTNAME,1,7) as CNSTNAM, 
        SUBSTR(H.CHAIN,1,42) as CHAIN
        FROM HIERARCHY H;

        -- initialize temporary table
        DELETE FROM SESSION.CYCLEROOTS;

        -- this query will return the remaining tables that are in RI cycles
        SET C_CYCLETABLES = CURSOR FOR
            WITH 
                ROOT_PARENTS (ROOT_PARENT_SCHEMA, ROOT_PARENT_NAME) AS
                  (SELECT TABSCHEMA, TABNAME
                    FROM SYSCAT.TABLES
                    WHERE PARENTS = 0 AND CHILDREN > 0
                   UNION ALL
                   SELECT * FROM SESSION.CYCLEROOTS),
                HIERARCHY (ROOT, LEVEL, REFTABSCHEMA, REFTABNAME, TABSCHEMA, 
                           TABNAME, CONSTNAME, CHAIN) AS
                (SELECT CONNECT_BY_ROOT REFTABNAME AS ROOT, LEVEL, 
                    REFTABSCHEMA AS PARENT_SCHEMA,
                    REFTABNAME AS PARENT, 
                    TABSCHEMA AS CHILD_SCHEMA,
                    TABNAME AS CHILD, 
                    CONSTNAME AS CONSTNAME, 
                    SUBSTR(SYS_CONNECT_BY_PATH(REFTABNAME, '->') || '->' || 
                    SUBSTR(TABNAME,1,20),1,50) AS CHAIN
                FROM SYSCAT.REFERENCES
                START WITH (REFTABSCHEMA,REFTABNAME) IN (SELECT ROOT_PARENT_SCHEMA,
                                                ROOT_PARENT_NAME FROM ROOT_PARENTS)
                CONNECT BY NOCYCLE PRIOR  TABNAME = REFTABNAME AND 
                                          TABSCHEMA =  REFTABSCHEMA)
            SELECT TABSCHEMA, TABNAME
                FROM SYSCAT.TABLES
                WHERE CHILDREN > 0
            EXCEPT
                SELECT REFTABSCHEMA, REFTABNAME  FROM HIERARCHY H;

        OPEN C_CYCLETABLES;

        -- Just select the first table as a root table from the remaining tables that are 
        -- in RI cycles
        FETCH C_CYCLETABLES INTO CYCLESCHEMA, CYCLETABLE;

        SET ROWS_FETCHED = CURSOR_ROWCOUNT(C_CYCLETABLES);

        -- Keep looping until the result set from C_CYCLETABLES is empty.
        WHILE (ROWS_FETCHED > 0) DO

            -- insert the select representative into temporary table SESSION.CYCLEROOTS 
            INSERT INTO SESSION.CYCLEROOTS VALUES (CYCLESCHEMA, CYCLETABLE);

            CLOSE C_CYCLETABLES;

            -- restart the cursor.  The result set will be different from the 
            -- the previous iterations of the loop.  All the tables that
            -- are in the same RI cycle as the newly added representative
            -- will not show up in the result set from the next iteration of the
            -- loop.
            OPEN C_CYCLETABLES;

            FETCH FROM C_CYCLETABLES INTO CYCLESCHEMA, CYCLETABLE;

            SET ROWS_FETCHED = CURSOR_ROWCOUNT(C_CYCLETABLES);

        END WHILE;

        CLOSE C_CYCLETABLES;

        -- When the loop exits, one representative from each RI cycle will have
        -- been added to the temporary table SESSION.CYCLEROOTS.  We will now 
        -- open C_RESULTS using the populated temporary table SESSION.CYCLEROOTS.

        OPEN C_RESULTS;

END@
--#SET TERMINATOR ;

CALL newton.FIND_RI_CHAINS();

DROP TABLE  SESSION.CYCLEROOTS;

结果将类似于清单 10。

清单 10. 显示所有 RI 链的查询 3 的输出

Result set 1
--------------
                
ROOT       LVL PARENT      CHILD      CNSTNAM CHAIN                                     
---------- --- ----------- ---------- ------- ------------------------------------------
SELF        1  SELF        SELF       FK_SELF ->SELF->SELF                              
GRANDPARENT 1  GRANDPARENT PARENT     FKP1    ->GRANDPARENT->PARENT                     
GRANDPARENT 2  PARENT      CHILD      FK1     ->GRANDPARENT->PARENT->CHILD              
GRANDPARENT 3  CHILD       GRANDCHILD FK3     ->GRANDPARENT->PARENT->CHILD->GRANDCHILD  
GRANDPARENT 1  GRANDPARENT PARENT2    FKP2    ->GRANDPARENT->PARENT2                    
GRANDPARENT 2  PARENT2     CHILD      FK2     ->GRANDPARENT->PARENT2->CHILD             
GRANDPARENT 3  CHILD       GRANDCHILD FK3     ->GRANDPARENT->PARENT2->CHILD->GRANDCHILD 
GRANDPARENT 2  PARENT2     GRANDCHILD FK4     ->GRANDPARENT->PARENT2->GRANDCHILD        
C12         1  C12         GP         FKGP1   ->C12->GP                                 
C12         2  GP          P1         FKP1    ->C12->GP->P1                             
C12         3  P1          C11        FKC11   ->C12->GP->P1->C11                        
C12         3  P1          C12        FKC12   ->C12->GP->P1->C12   
                
12 record(s) selected.
                
Return Status = 0

请注意,表 C12 已被选择为包含 GP->P1->C12->GP 的 RI 循环中的代表。

结束语

本文介绍了如何编写一个分层查询来查找数据库中的外键关系。具体而言,本文提供了一个查询示例来查找给定表的外键子表和后代表。还提供了示例查询和过程来查找数据库中的所有表的所有外键关系,无论 RI 循环中是否包含根表的用户输入。

相关文章

基于EA的数据库建模
数据流建模(EA指南)
“数据湖”:概念、特征、架构与案例
在线商城数据库系统设计 思路+效果
 
相关文档

Greenplum数据库基础培训
MySQL5.1性能优化方案
某电商数据中台架构实践
MySQL高扩展架构设计
相关课程

数据治理、数据架构及数据标准
MongoDB实战课程
并发、大容量、高性能数据库设计与优化
PostgreSQL数据库实战培训
 
分享到
 
 


MySQL索引背后的数据结构
MySQL性能调优与架构设计
SQL Server数据库备份与恢复
让数据库飞起来 10大DB2优化
oracle的临时表空间写满磁盘
数据库的跨平台设计
更多...   


并发、大容量、高性能数据库
高级数据库架构设计师
Hadoop原理与实践
Oracle 数据仓库
数据仓库和数据挖掘
Oracle数据库开发与管理


GE 区块链技术与实现培训
航天科工某子公司 Nodejs高级应用开发
中盛益华 卓越管理者必须具备的五项能力
某信息技术公司 Python培训
某博彩IT系统厂商 易用性测试与评估
中国邮储银行 测试成熟度模型集成(TMMI)
中物院 产品经理与产品管理
更多...