Relational Database Design¶

约 2084 个字 -8 行代码 9 张图片预计阅读时间 7 分钟

Outline:

Features of Good Relational Design
Atomic Domains and First Normal Form
Decomposition Using Functional Dependencies
Functional Dependency Theory
Algorithms for Functional Dependencies
Decomposition Using Multivalued Dependencies
More Normal Forms
Database-Design Process
Modeling Temporal Data

Features of Good Relational Design¶

Consider combining relations


student(id, name, tot_cred)
stud_dept(id, dept_name)

\[ \rightarrow \]


student(id, name, tot_cred, dept_name)

假设我们有一个关系


inst_dept(id, name, salary, dept_name, building, budget)

我们应该怎么把它们拆分成两个关系(instructor和department)呢？

通过函数依赖我们可以看到

\[ \mathrm{id} \rightarrow \mathrm{name, salary, dept_name} \]

\[ \mathrm{dept_name} \rightarrow \mathrm{building, budget} \]

在 inst_dept 关系中，由于 dept_name 不是候选键，所以可以把 building, budget, dept_name 拿出来单独成一个关系 department，而 id, name, salary 作为 instructor

但是有时候这么做可能会导致信息丢失 lose information, 比如

SQL
employee(ID, name, street, citym, salary)

\[ \rightarrow \]

SQL
employee1(ID, name)

employee2(name, street, city, salary)

这就发生了信息丢失，因为可能会有同名的人，导致 employee1 和 employee2 之间的联系丢失了

Lossless-join Decomposition¶

我们说一个分解是lossless-join 的，如果我们通过 $R_1 \cup R_2$ 代替 $R$, 不发生信息丢失

Conversely a decomposition is lossy if

\[ r \subset \pi_{R_1}(r) \bowtie \pi_{R_2}(r) \]

Note

more tuples implies more uncertainty(less information)

A decomposition of $R$ into $R_1$ and $R_2$ is lossless join if at least one of the following holds:

\[ R_1 \cap R_2 \rightarrow R_1 \]

\[ R_1 \cap R_2 \rightarrow R_2 \]

First Normal Form¶

A relation schema is in first normal form if the domains of all attributes of R are atomic

Domain is atomic if its elements are considered to be indivisible units.

一个关系满足第一范式需要以下条件：

原子性：关系中的每个属性都必须是原子值，即不可再分的基本数据单位。
无重复组：关系中的每一行都必须是唯一的，不能有重复的行。
唯一性：关系中的每一行都必须有一个唯一的标识符，即主键。

不符合第一范式的例子：

姓名	科目
张三	数学，物理

上边的“科目”一栏，就违背了“原子性”的原则

Important

Normal Forms(NF):

$\mathrm{1NF} \rightarrow \mathrm{2NF} \rightarrow \mathrm{3NF} \rightarrow \mathrm{BCNF} \rightarrow \mathrm{4NF}$

Functional Dependencies¶

Note

Let $R$ be a relation schema

\[ \alpha \subseteq R \ \mathrm{and} \ \beta \subseteq R \]

The functional dependency $\alpha \rightarrow \beta$ holds on $R$ if for any two tuples $t_1$ and $t_2$ in $R$, if $t_1[\alpha] = t_2[\alpha]$, then $t_1[\beta] = t_2[\beta]$

函数依赖 $\alpha \rightarrow \beta$ 要求从 $\alpha$ 到 $\beta$ 的映射是唯一的。反过来就是要求从 $\beta$ 到 $\alpha$ 的映射是唯一的

Example: Consider $r(A, B)$ with the following instance of $r$.

1	4
1	5
3	7

从 $A \rightarrow B$ 就不满足函数依赖，而从 $B \rightarrow A$ 就满足函数依赖

Superkey¶

K is a superkey for relation schema R if and only if $K \rightarrow R$

如果 K 是一个超键，那么 K 可以决定关系 R 中的所有属性换句话说，K 可以唯一地标识关系 R 中的每一行

K is a candidate key for $R$ is and only if

$K \rightarrow R$, and
for no $\alpha \subset K, \alpha \rightarrow R$

如果 K 是候选键，当且仅当 K 能决定 R，并且没有任何 K 的子集能决定 R

A functional dependency is trivial if it is satisfied by all relations

Example:

$\mathrm{ID, name} \rightarrow \mathrm{ID}$
$\mathrm{name} \rightarrow \mathrm{name}$

in general, $\alpha \rightarrow \beta \ \mathrm{is trivial if} \ \beta \subseteq \alpha$

Closure(闭包) of a set of Functional Dependencies¶

Given a set F of functional dependencies, there are certain other functional dependencies that are logically implied by F.

For example: if $\mathrm{A} \rightarrow \mathrm{B}$ and $\mathrm{B} \rightarrow \mathrm{C}$, then we can infer that $\mathrm{A} \rightarrow \mathrm{C}$

The set of all functional dependencies logically implied by F is the closure of F.

We donate the closure of F by $F^+$, $F^+$ is a superset of F

Armstrong's Axioms

if $\beta \subseteq \alpha$, then $\alpha \rightarrow \beta$ (reflexivity, 自反率)
if $\alpha \rightarrow \beta$ then $\gamma \alpha \rightarrow \gamma \beta$ (augmentation, 增补率)
if $\alpha \rightarrow \beta$ and $\beta \rightarrow \gamma$, then $\alpha \rightarrow \gamma$ (transitivity, 传递率)

Example

$R \ = \ \{A, B, C, G, H, I\}$

$F \ = \ \{A \rightarrow B, A \rightarrow C, CG \rightarrow H, CG \rightarrow I, B \rightarrow H\}$

some members of $F^+$

$A \rightarrow H$
- by transitivity from $A \rightarrow B$ and $B \rightarrow H$
$AG \rightarrow I$
- by augmentation from $A \rightarrow C$ with G, to get $AG \rightarrow CG$ and then transitivity with $CG \rightarrow I$
$CG \rightarrow HI$
- by augmentation from $CG \rightarrow I$ to infer $CG \rightarrow CGI$, and augmentation of $CG \tighrtarrow H$ to infer $CGI \rightarrow HI$, and then transitivity

Additional rules:

If $\alpha \rightarrow \beta$ and $\alpha \rightarrow \gamma$ holds, then $\alpha \rightarrow \beta \gamma$ holds (union, 合并)
If $\alpha \rightarrow \beta \gamma$ holds, then $\alpha \rightarrow \beta$ holds and $\alpha \rightarrow \gamma$ holds (decomposition, 分解)
If $\alpha \rightarrow \beta$ and $\gamma \beta \rightarrow \delta$ holds, then $\alpha \gamma \rightarrow \delta$ holds (pseudotransitivity, 伪传递率)

Closure of Attribute Sets¶

Given a set of attributes a, define the closure of a under F(donate by $a^+$) as the set of attributes that are functionally determined by a under F

Example

$R(A, B, C, D), F \ = \ \{A \rightarrow B, B \rightarrow C, B \rightarrow D\}$

So

\[ A^+ = ABCD \]

\[ B^+ = BCD \]

\[ C^+ = C \]

Uses of Attribute Closure¶

Canonical Cover(正则覆盖)¶

A canonical cover of F is a minimal set of functional dependencies equivalent to F, having no redundant dependencies or redundant parts of dependencies.

正则覆盖是 F 的一个最小超集，没有冗余的依赖关系或冗余的依赖关系部分

Example:

In $\{A \rightarrow B, B \rightarrow C, A \rightarrow C \}$, $A \rightarrow C$ is redundant, so we can remove it and get $\{A \rightarrow B, B \rightarrow C\}$

Example

On RHS:

\[ \{A \rightarrow B, B \rightarrow C, A \rightarrow CD \} \]

can be simplified to

\[ \{A \rightarrow B, B \rightarrow C, A \rightarrow D\} \]

On LHS:

$$ {A \rightarrow B, B \rightarrow C, AC \rightarrow D } $$

can be simplified to

\[ \{A \rightarrow B, B \rightarrow C, A \rightarrow D\} \]

Extraneous Attributes(无关属性)¶

Consider a set F of functional dependencies and the functional dependency $\alpha \rightarrow \beta$ in F.

Attribute A is extraneous in $\alpha$ if $A \in \alpha$ and F logically implies $(F - \{\alpha \rightarrow \beta \}) \cup \{(\alpha - A) \rightarrow \beta\}$
- 如果移除了 A 之后仍然能推导出相同的约束，那么 A 就是无关属性
Attribute A is extraneous in $\beta$ if $A \in \beta$ and the set of functional dependencies $(F - \{\alpha \rightarrow \beta \}) \cup \{\alpha \rightarrow (\beta - A)\}$

Example

Given $F \ = \ \{A \rightarrow C, AB \rightarrow C\}$

B is extraneous in $AB \rightarrow C$ because $\{A \rightarrow C, AB \rightarrow C\}$ logically implies $\{A \rightarrow C\}$

Given $F \ = \ \{A \rightarrow C, AB \rightarrow CD\}$

C is extraneous in $AB \rightarrow CD$ because $\{A \rightarrow C\}$ can be inferred even after deleting C.

Computing a Canonical Cover¶

\[ R \ = \ (A, B, C) \]

\[ F \ = \ \{A \rightarrow BC, B \rightarrow C, A \rightarrow B, AB \rightarrow C\} \]

Combine $A \rightarrow BC$ and $A \rightarrow B$ into $A \rightarrow BC$
Set is now $\{A \rightarrow BC, B \rightarrow C, AB \rightarrow C\}$
A is extraneous in $AB \rightarrow C$
Check if the result of deleting A from $AB \rightarrow C$ is implied by the other dependencies
- Yes: in fact, $B \rightarrow C$ is already in the set
Set is now $\{A \rightarrow BC, B \rightarrow C\}$
C is extraneous in $A \rightarrow BC$
Check if $A \rightarrow C$ is logically implied by $A \rightarrow B$ and the other dependencies
The canonical cover is $\{A \rightarrow B, B \rightarrow C\}$

另一种方法是通过画图：

可以直观地看出正则覆盖是 $\{A \rightarrow B, B \rightarrow C\}$

Example

Boyce-Codd Normal Form (BCNF)¶

一个关系模式满足 BCNF , 如果对于 $F^+$ 的所有形如 $\alpha \rightarrow \beta$ 的函数依赖，至少满足以下条件之一：

$\alpha \rightarrow \beta$ 是平凡的
$\alpha$ 是关系模式的超键

即 BCNF 要求所有的函数依赖要么是平凡的，要么左侧是超键

Decomposing a Schema into BCNF¶

假设我们有 schema R 和一个非平凡的函数依赖 $\alpha \rightarrow \beta$

我们可以把 R 分解为：

\[ \alpha \cup \beta \]

\[ R \ - \ (\beta \ - \ \alpha ) \]

BCNF Decomposition Algorithm¶

BCNF and Dependency Preservation¶

依赖保持：原来的函数依赖都可以在分解后的函数依赖中得到单独检验。如果需要把几个关系连在一起才能检验的，称为依赖不保持。

Note

If it's sufficient to test only those dependencies on each individual relation of a decomposition in order to ensure that all functional dependencies hold, then that decomposition is dependency perservation

如果通过检验单一关系上的函数依赖，就能确保所有的函数依赖成立，那么这样的分解是依赖保持的。或者原来关系 R 上的每一个函数依赖，都可以在分解后的单一关系上得到检验或者推导得到。

Let $F_i$ be the set of all functional dependencies in $F^+$ that include only attributes in $R_i$.($F_i$: the restriction of F on $R_i$)

A decomposition is dependency preserving if $(F_1 \cup F_2 \cup \cdots \cup F_n)^+ \ = \ F^+$
If it is not, then checking updates for violation of functional dependencies may require computing joins, which is expensive.

Example

Example

Third Normal Form (3NF)¶

A relation schema R is in third normal form(3NF) if for all:

\[ \alpha \rightarrow \beta \in F^+ \]

at least ont of the following holds:

$\alpha \rightarrow \beta$ is trivial
$\alpha$ is a superkey for R
Each attribute A in $\beta \ - \ \alpha$ is contained in a candidate key for R

任意一个非平凡的函数依赖，要么左边是超键，要么右边必须包含在一个候选键里面。

Algorithm

Goals of Normalization¶

Let R be a relation scheme with a set F of functional dependencies.Decide whether a relation scheme R is in "good" form.

In the case that a relation scheme R is not in "good" form, decompose it into a set of relation scheme $\{R_1, R_2, \cdots, R_n\}$ such that:

each relation scheme is in good form(3NF or BCNF)
the decomposition is a lossless-join decomposition
Preferably, the decomposition should be dependency preserving

Relational Database Design¶

Features of Good Relational Design¶

Lossless-join Decomposition¶

First Normal Form¶

Functional Dependencies¶

Superkey¶

Closure(闭包) of a set of Functional Dependencies¶

Closure of Attribute Sets¶

Uses of Attribute Closure¶

Canonical Cover(正则覆盖)¶

Extraneous Attributes(无关属性)¶

Computing a Canonical Cover¶

Boyce-Codd Normal Form (BCNF)¶

Decomposing a Schema into BCNF¶

BCNF Decomposition Algorithm¶

BCNF and Dependency Preservation¶

Third Normal Form (3NF)¶

Goals of Normalization¶

评论