COS 597A, Fall 2011

Solutions to Problem Set 4


Problem 1:
First note that the first and third conditions of 3NF for a functional dependency X→Y:
(1st) Y is a subset of X
(3rd) each attribute A in Y-X is contained in a candidate key for R
can be combined into one condition because condition 1 just says Y-X is empty.

Suppose we use Armstrong's transitivity rule to obtain X→Z  from X→Y and Y→ Z in Φ and neither X or Y is a superkey.  We need to show that X→Z  satisfies the conditions for 3NF.   The 3rd condition of 3NF must hold for each of  X→Y and Y→ Z since the 2nd condition requires X and Y be superkeys, respectively.  The 3rd condition  applied to Y→Z  tells us that each attribute in Z  is either contained in a candidate key or in Y.   We consider each attribute A in Z as falling into one of two subcases:
c1) A is part of some candidate key:  any such A that is in Z-X then satisfies the 3rd condition of 3NF for X→Z.
c2) A is not part of some candidate key and thus must be in Y:  any such A must satisfy the 3rd 3NF condition for X→Y as well because X is not a superkey.   Therefore, such an A must also be in X and satisfies the 3rd condition of 3NF for  X→Z.

The above proof is part of the proof of the following theorem:

Theorem: Suppose that each functional dependency in a set Φ of functional dependencies on a relational scheme R satisfies the conditions for Third Normal Form (3NF).  Then any functional dependency that is derived from Φ using one application of any of the three rules making up Armstrong's axioms satisfies the conditions for 3NF.  

The theorem allows us to conclude that the functional dependencies in  Φ+ are in 3NF if the functional dependencies in Φ are in 3NF.


Problem 2:
Part a.
  1. A is a candidate key. So, FD 1 satisfies BCNF.
  2. AB is a superkey. So, FD 2 satisfies BCNF.
  3. FD 3 is not trivial, BCD is not a superkey, and E is not part of any candidate key. Therefore, FD 3 is neither BCNF nor 3NF.
Part b.
E->A and A->ABCDE imply that E->ABCDE, that is, E is a key.
BCD->E and E->ABCDE imply that BCD->ABCDE, that is, BCD is a key.
 
FDs 1 and 2 satisfy BCNF as before.
FD 3 now satisfies BCNF because BCD is a candidate key.
E->A also satisfies BCNF because E is a candidate key.

Part c.
Adding E->B does not change the set of candidate keys. Therefore, FDs1 and 2
are still in BCNF. FD 3 is neither in BCNF nor in 3NF. E->B is neither in BCNF
nor in 3NF.

Part d.
Decomposition:  R1:ABCD and R2:BCDE

Formally, we need to consider the closures of sets of dependencies, but we can draw conclusions by examining a smaller set of dependencies:

non-trivial dependences for R implied by the four given dependencies: AE->D, but implied by A->D

non-trivial dependences for R1 implied by the four given dependences: A->ABCD, AB->D (implied by A->D)
non-trivial dependences for R2 implied by the four given dependences: BCD->E, E->B

non-trivial dependency between R1 and R2:  A->E is implied by A->BCD in R1 and BCD->E in R2, so dependency preserving.

A is a candidate key in R1, so R1 in BCNF and 3NF
BCD is a candidate key for R2, so for E->B, B is part of a candidate key and in 3NF.  Therefore R2 is in 3NF.
Thus both R1 and R2 are in 3NF.

Part e.
Decomposition:   R1: ACDE and R2:EB

non-trivial dependences for R implied by the four given dependencies: AE->D, but implied by A->D

non-trivial dependences for R1 implied by the four given dependences: A->ACDE, AE->D from E->B, so AE->AB, and AB->D (but also implied by A->D)
non-trivial dependences for R2 implied by the four given dependences:  E->B

non-trivial dependencies between R1 and R2: 
A->B is implied by A->E in R1 and E->B in R2.
BCD->E is not implied by dependencies in R1 and R2, so not dependency preserving.

A is a candidate key in R1, so R1 is in BCNF and 3NF.
E is a candidate key in R2, so R2 is in BCNF and 3NF.
Thus both R1 and R2 are in BCNF.







Problem 3:

Step 4: 
Query Text:
for $line in doc("plays.xml")/PLAYS/PLAY//SPEECH/LINE
where $line ftcontains 'ghost'
return $line

Result:
<LINE>AEgeon art thou not? or else his ghost?</LINE>
<LINE>Ghost unlaid forbear thee!</LINE>
<LINE>By heaven, I'll make a ghost of him that lets me!</LINE>
<LINE>Alas, poor ghost!</LINE>
<LINE>Ay, thou poor ghost, while memory holds a seat</LINE>
<LINE>There needs no ghost, my lord, come from the grave</LINE>
<LINE>It is an honest ghost, that let me tell you:</LINE>
<LINE>It is a damned ghost that we have seen,</LINE>
<LINE>Never, O never, do his ghost the wrong</LINE>
<LINE>Henry the Fifth, thy ghost I invocate:</LINE>
<LINE>These news would cause him once more yield the ghost.</LINE>
<LINE>I think this upstart is old Talbot's ghost,</LINE>
<LINE>I trust the ghost of Talbot is not there:</LINE>
<LINE>Oft have I seen a timely-parted ghost,</LINE>
<LINE>And do some service to Duke Humphrey's ghost.</LINE>
<LINE>Sometimes he talks as if Duke Humphrey's ghost</LINE>
<LINE>The noble gentleman gave up the ghost.</LINE>
<LINE>And he will look as hollow as a ghost,</LINE>
<LINE>Our army lies, ready to give up the ghost.</LINE>
<LINE>The ghost of Caesar hath appear'd to me</LINE>
<LINE>Vex not his ghost: O, let him pass! he hates him much</LINE>
<LINE>Moves like a ghost. Thou sure and firm-set earth,</LINE>
<LINE>Her brother's ghost his paved bed would break,</LINE>
<LINE>Be it lawful that I invocate thy ghost,</LINE>
<LINE>To yield the ghost: but still the envious flood</LINE>
<LINE>Marry, my uncle Clarence' angry ghost:</LINE>
<LINE>Blind sight, dead life, poor mortal living ghost,</LINE>
<LINE>O, look! methinks I see my cousin's ghost</LINE>
<LINE>Were I the ghost that walk'd, I'ld bid you mark</LINE>
<LINE>As, walk'd your first queen's ghost,</LINE>


Step 5:  The query is:
"For each speech by the Ghost in the play Hamlet,  return the lines of the speech as one element enclosed in the tag <
GhostLines> </GhostLines>
and set
the attribute
total_number of GhostLines equal to the number of lines in the speech."
"let $ghostlines := $ghostspeech/LINE" is used to hold the entire sequence of lines of one speech as one entity.  The value of variable $ghostlines is this sequence of lines for each speech in turn.  Each sequence can thus be counted and returned within one <GhostLines> </GhostLines> tag.

Step 6:  The query is:
"For each speech by the Ghost in the play Hamlet,  return the first line of the speech enclosed in the tag
<FIRSTLINE> </FIRSTLINE>.  Furthermore, for each speech, make <FIRSTLINE> </FIRSTLINE>a child element of the element tagged with <GhostLines> </GhostLines>,  and set the attribute total_number of GhostLines equal to the number of lines in the speech."
"$ghostspeech/LINE[1]/text()" used in the return  for this query because we do not want the <LINE> </LINE> tag from the original XML of the first line of each speech by the Ghost.  We replace this with <FIRSTLINE> </FIRSTLINE> by selecting only the text within the element tagged with <LINE> </LINE> and surrounding the text  with <FIRSTLINE> </FIRSTLINE>.

Step 7:  With this query, the entire sequence of first lines of all speeches by the Ghost  are one element tagged with <GhostLines> </GhostLines>.  The attribute total_number  of  <GhostLines> </GhostLines> now gives the total number of lines in the enter play uttered by the Ghost. 

The value of  $allgspeech is the entire sequence of speeches by the Ghost.  (Recall that each speech is an element and can be represented as a node in the tree model.)  Using it, we can  iterate through each speech in turn in the nested query.  We also use it to specify "$ghostlines", which then takes on the entire sequence of lines spoken by the Ghost in the whole play as one value.  (This is in contrast to the use of "$ghostlines" in Steps 6 and 7. For these earlier Steps, the assignment of values to $ghostlines changes as we iterated over each speech;  $ghostlines holds the sequence of lines for one speech at a time.)

We use a nested query  in the return portion because we want information about each speech by the ghost, namely the first line, after we generate summary information for the whole play, namely the total number of lines spoken by the Ghost.  The nesting allows us to iterate over each speech within the construction of the query result even though we aggregated over all speeches in the outer portion of the query by our declarations of $allgspeech and $ghostlines.

Step 8:  One possibility:
 
Query Text:
for $speaks in doc("plays.xml")/PLAYS/PLAY 
[TITLE ftcontains 'Romeo']//SPEECH
where $speaks/LINE ftcontains 'hate'
return
<SPEAKSHATE>{$speaks/SPEAKER/text()}</SPEAKSHATE>

Result:
<SPEAKSHATE>TYBALT</SPEAKSHATE>
<SPEAKSHATE>PRINCE</SPEAKSHATE>
<SPEAKSHATE>ROMEO</SPEAKSHATE>
<SPEAKSHATE>JULIET</SPEAKSHATE>
<SPEAKSHATE>ROMEO</SPEAKSHATE>
<SPEAKSHATE>TYBALT</SPEAKSHATE>
<SPEAKSHATE>PRINCE</SPEAKSHATE>
<SPEAKSHATE>FRIAR LAURENCE</SPEAKSHATE>
<SPEAKSHATE>JULIET</SPEAKSHATE>
<SPEAKSHATE>JULIET</SPEAKSHATE>
<SPEAKSHATE>PRINCE</SPEAKSHATE>



Step 9:  One possibility:

Query Text:
for $speaks in doc("plays.xml")/PLAYS/PLAY 
[TITLE ftcontains 'Romeo']//SPEECH
where $speaks/LINE ftcontains 'hate'
return
<SPEAKSHATE>{$speaks/SPEAKER} {for $wordline in $speaks/LINE
where $wordline ftcontains 'hate'
return $wordline }
</SPEAKSHATE>

Result:
<SPEAKSHATE>
<SPEAKER>TYBALT</SPEAKER>
<LINE>What, drawn, and talk of peace! I hate the word,</LINE>
<LINE>As I hate hell, all Montagues, and thee:</LINE>
</SPEAKSHATE>
<SPEAKSHATE>
<SPEAKER>PRINCE</SPEAKER>
<LINE>Canker'd with peace, to part your canker'd hate:</LINE>
</SPEAKSHATE>
<SPEAKSHATE>
<SPEAKER>ROMEO</SPEAKER>
<LINE>Here's much to do with hate, but more with love.</LINE>
<LINE>Why, then, O brawling love! O loving hate!</LINE>
</SPEAKSHATE>
<SPEAKSHATE>
<SPEAKER>JULIET</SPEAKER>
<LINE>My only love sprung from my only hate!</LINE>
</SPEAKSHATE>
<SPEAKSHATE>
<SPEAKER>ROMEO</SPEAKER>
<LINE>My life were better ended by their hate,</LINE>
</SPEAKSHATE>
<SPEAKSHATE>
<SPEAKER>TYBALT</SPEAKER>
<LINE>Romeo, the hate I bear thee can afford</LINE>
</SPEAKSHATE>
<SPEAKSHATE>
<SPEAKER>PRINCE</SPEAKER>
<LINE>I have an interest in your hate's proceeding,</LINE>
</SPEAKSHATE>
<SPEAKSHATE>
<SPEAKER>FRIAR LAURENCE</SPEAKER>
<LINE>By doing damned hate upon thyself?</LINE>
</SPEAKSHATE>
<SPEAKSHATE>
<SPEAKER>JULIET</SPEAKER>
<LINE>It shall be Romeo, whom you know I hate,</LINE>
</SPEAKSHATE>
<SPEAKSHATE>
<SPEAKER>JULIET</SPEAKER>
<LINE>Proud can I never be of what I hate;</LINE>
<LINE>But thankful even for hate, that is meant love.</LINE>
</SPEAKSHATE>
<SPEAKSHATE>
<SPEAKER>PRINCE</SPEAKER>
<LINE>See, what a scourge is laid upon your hate,</LINE>
</SPEAKSHATE>