COS 597A, Fall 2008 - Problem Set 4

Due at 5:00pm, Friday  October 24, 2008
You may submit electronically or by hardcopy: bring your hardcopy submission to my office or put it in my mailbox in the CS mailroom on the 2nd floor.


Collaboration Policy

You may discuss problems with other students in the class. However, each student must write up his or her own solution to each problem independently. That is, while you may formulate the solutions to problems in collaboration with classmates, you must be able to articulate the solutions on your own.

Late Penalties

Note atypical  penalties due to fall break and approaching test.


Chapter, exercise and page numbers refer to the course text Database System Concepts, Fifth Edition, by Silberschatz,  Korth, and Sudarshan
Problem 1:
Suppose that each functional dependency in a set Φ of functional dependencies on a relational scheme R satisfies the conditions for Third Normal Form (3NF).  Show that any functional dependency that is derived from one application of any of the three rules making up Armstrong's axioms satisfies the conditions for 3NF.   Be precise.

Problem 2:
Consider relational scheme R = (A,B,C,D,E)  with functional dependencies:
A→ BC
CD→E
B→D
E→A
(taken from Practice Exercise 7.1, pg 306).
The set of dependencies in the closure of these dependencies can be characterized as:
A→ ABCDE
BC→ ABCDE
CD→ ABCDE
E→ ABCDE
All other functional dependencies in the closure are either trivial (right side is subset of left side) or derived from one of the above by use of  Armstrong's Augmentation rule or by taking a subset of attributes on the right-hand side of the dependency (the derived Decomposition rule, pg 280). 

Give a lossless, dependency-preserving decomposition of R into two or more relations satisfying 3NF (Exercise 7.27, pg 309).



Problem 3:
The eXist XML database software is free, open-source software that provides an XQuery processing engine for XML documents.  We are going to use a very very small portion of eXist - an application that allows one to write XQuery queries on a database consisting of 3 of Shakespeare's plays captured as XML documents.  Everyone will run under one guest account using a Web interface.  You will express queries and either print or save to a file the results of your queries.  Since everyone will be using the same account, you should save intermediate work by cutting and pasting to a file.  Then you can cut and paste back into the Web interface when you resume your work.  The eXist Web interface will not save state beyond your session. 

You can use eXist either by accessing a local copy running on the Computer Science Department studentdb server  http://studentdb.cs.princeton.edu:8080/exist/index.xml  or by using the eXist Project public Web site at  http://exist-db.org/.   Either will display a home page that looks essentially the same.  Access to the local copy is restricted to users in the cs.princeton.edu domain.  If you are not a member of the CS department,  I will email you with options, but one option is certainly the public Web site.   The advantage of using the local copy is that it will not be changed during the rest of the semester except to fix problems (as best we can) and response may be faster.   The advantage of the eXist organization site is that you don't need to be in the cs.princeton.edu domain and the bugs may be fixed before we know about them. 

The assignment consists of the following steps.  Note that you do not need to submit anything until Step 5:
  1. Follow one of the links above to the eXist home page.
  2. Looking down the left hand side menu, find "Examples" and click on "XQuery Sandbox" under "Examples".   Note that the use of Sandbox requires that Javascript is enabled.  You will reach a Web page that looks like this (pdf file) (except that color highlighting does not show in the pdf file.  Also,  the drop-down choice box labeled "Paste Saved Query" may or may not have a query description showing.  We are not using these "saved queries" in this assignment, but you may want to look at them for more XQuery examples.)  Sandbox allows you to write XQuery queries on several pre-loaded XML databases.  We will use only the Shakespeare database, which consists of 3 plays:  HamletMacbeth and Romeo and Juliet.   Each play is a separate XML document but the Shakespeare database consists of all 3 plays (each play is a child of the database root).  The default database for queries submitted to Sandbox is the Shakespeare database. 
  3. Type (or cut and paste) the following XML query into the text box below the "Paste Saved Query" drop-down box.   Note that eXist  XQuery is case sensitive.
for $play in /PLAY
where $play/TITLE &= "Hamlet"
return $play

Then click the SEND button. You will see in the result window the full XML text of Hamlet.  The result may take on the order of a minute to appear if you are using the eXist Project public Web site (but "Found 1 in 0... seconds. " will appear almost immediately in the color highlighted top of the result window).  Here (pdf)  is what the beginning of the result looks like.  This gives you an idea of how the plays are tagged.  Note that  "&="  denotes containment and matches full words. (Try replacing "Hamlet" with "Ham"; then replace it with "Ham*", where "*" matches any string.)   As you can see from the text of the TITLE tag, Hamlet is not the complete name of the play. 
  1. Type (or cut and paste) the following XML query into the text box and send it:
for $speech in /PLAY//SPEECH
where LINE &= 'war'
return $speech

(Note that the local site and the eXist Project site may return the results in different order;  this is because the 3 plays are in different orders in the different copies of the database.)
  1. Modify the query of Step 4 to return only the lines containing 'war'.   (Hint: recall that text() returns the text content of a tag.)  Turn in the query and the result of the query as part of the submission of this assignment.  You may print the Sandbox page showing the query and result.  If you wish to submit electronically,  you can either cut and paste the query and result into a text file or you can print the Sandbox page to a pdf file (if you have Adobe Acrobat).
  1. Type (or cut and paste) the following XML query into the text box and send it:
for $ghostspeech in /PLAY [TITLE &= 'Hamlet']//SPEECH [SPEAKER = 'Ghost']
let $ghostlines := $ghostspeech/LINE
return
<GhostLines total_number="{count($ghostlines)}">
{$ghostlines}
</GhostLines>

Describe the query in English.  Why is "
let $ghostlines := $ghostspeech/LINE" used?  Submit your answers to these questionsdo not submit the result of the query.
  1. Type (or cut and paste) the following XML query into the text box and send it:
for $ghostspeech in /PLAY [TITLE &= 'Hamlet']//SPEECH [SPEAKER = 'Ghost']
let $ghostlines := $ghostspeech/LINE
return
<GhostLines total_number="{count($ghostlines)}">
<FIRSTLINE>{$ghostspeech/LINE[1]/text()}</FIRSTLINE>
</GhostLines>

Describe the query in English.  Why is
"$ghostspeech/LINE[1]/text()" used in the return  for this query?  Submit your answers to these questionsdo not submit the result of the query.
Note that TAGNAME[n] is a general form for referring to the nth occurrence of  a child element with tag TAGNAME  in the order the children appear in the document.
  1. Type (or cut and paste)  the following XML query into the text box and send it:
let $allgspeech := /PLAY [TITLE &= 'Hamlet']//SPEECH [SPEAKER = 'Ghost']
let $ghostlines := $allgspeech/LINE
return
<GhostLines total_number="{count($ghostlines)}">
{for $ghostspeech in $allgspeech
return
<FIRSTLINE>{$ghostspeech/LINE[1]/text()}</FIRSTLINE>}
</GhostLines>

How do the results of this query differ from those in Step 7?  Why is "$allgspeech" used?  Why is a nested query used in the return portion?  Submit your answers to these questionsdo not submit the result of the query.
  1. Write an XQuery query that returns the titles of the plays that contain the word "fairies" in one or more of the lines of the play.  Turn in the query and the result of the query as part of the submission of this assignment.



NOTE:  Cognetic Systems Inc. also has an XQuery demo site that offers XML versions of Shakespeare plays (all 37).  Choose "Full-text Search" from the left hand side menu.  If you decide to play with this demo site, you should be aware of differences with the eXist site:
eXist and Cognetic Systems are the two demo sites that I happened to find on the Web;  there may be more and better.  I welcome your findings.