COS 597D, Fall 2013 
        Questions on data distribution in NoSQL papers
       
        Due at 1:30pm, Wednesday October 23, 2013.
        You may hand in a paper copy or email a file to me.
        Keep a copy for your use during class discussion.
        
        No credit for late submission.
    
    
    
    
The sections of the papers
    assigned for Oct. 23 discuss the distributed storage of, and access
    to, data in Bigtable and Cassandra. Sections of the paper
    describing the Google distributed file system (GFS) are included
    because Bigtable relies on that system to take care of some of the
    issues of distributed storage. You should consider GFS to be
    part of Bigtable for the purpose of answering the questions
    below. The questions ask about the main ideas, and
    your answers should be brief. We may wish to dig deeper in
    class discussion.
    
    1.  There are many ways one might organize the distributed
    storage of data structured as rows and columns. What design
    decisions do Bigtable and Cassandra share?
    
    2.  Bigtable uses a "master server" and "tablet servers"
    to manage the reading and writing of data; Cassandra does not
    have a distinguished master node. What are the pros and cons
    of each architecture?
    
    3.  How is replication handled in Bigtable, and how is it
    handled in Cassandra?
    
    4.  What are the main steps in reading and writing data in
    Bigtable?
    
    5.  What are the main steps in reading and writing data in
    Cassandra?