The set { A1, A2, … An } is a superkey for a relation R if it functional determinel all attributes for a relation. In other words, any two tuples of R must necessarily agree on A1, A2, … An.
{ A1, A2,
… An } is a
key for a relation R if it is a
minimal superkey
: That is, the set is a superkey but no
proper subset is a superkey.
Example: For Posts, { poster, postdate } is a key. For Users, { userid, passwd } is a superkey but not a key, since there is a proper subset ({ userid }) which is also a superkey.
A relation R is in Boyce-Codd Normal Form (BCNF) if for every nontrivial FD A1 A2 … An → B1 B2 … Bm satisfied by R, the set { A1, A2, … An } is a superkey for R.
Suppose we defined Posts using the following schema instead.
Posts(poster, name, postdate, subject, body)
This would not be in BCNF, since it satisfies the FD poster → name, which is nontrivial, but poster is not a superkey for Posts.
The original definition of Posts is in BCNF.
It's desirable for all databases to have its relations in BCNF. So, given a relation not in BCNF, we would normally attempt to decompose it into two or more BCNF relations.
This is simple: We find a violating FD (i.e., a nontrivial FD for which the left side is not a superkey), expand the right side to include as many attributes as possible (called computing the closure of the left side), and then split the table into:
If any of the resulting relations aren't in BCNF, we recurse on it (them).
Take Posts(poster, name, postdate, subject, body) as an example. Here, poster → name is a violating FD. We can expand the right side slightly by adding poster to it, but we can't really expand the FD further, since poster name doesn't lead to anything else. Thus, we'll split into two tables:
Both are in BCNF, so there's no more work to do.