Knowledge Engineering

Seminar NOSQL Databases

Semester: Winter 2013/14
Lecturer: Dr. Lena Wiese
Course type: Seminar
ECTS (SWS): 5 ECTS (2 SWS)
Date: First Meeting: 24.10.2013, 11:00, room 1.101
Presentations: 11.2.2014, 9:30-16:00, room 0.101
Audience: Applied Computer Science MSc
Applied Computer Science BSc
ITIS MSc



Deadlines

Submission of title page: 14.11.2013
Submission of glossary: 28.11.2013
Submission of essay (first version): 16.1.2014
Submission of presentation (first version): 16.1.2014
Registration in FlexNow: 23.1.2014
Block seminar (live presentations): 11.2.2014 (presence in Göttingen required!)
Submission of essay (final version): 18.2.2014
Submission of presentation (final version): 18.2.2014


Description

Modern database management systems face novel requirements such as flexible (non-relational) data structures, extremely fast updating and querying, and distributed storage across multiple servers. Under the term NOSQL (in the sense of Not Only SQL), several new database management systems have emerged that address some of these demands. In this course, we look at selected NOSQL systems and NOSQL data management technologies in detail.

Requirements

Preparation of a presentation (duration 35 minutes) and an essay (length 20 pages) on a topic chosen in the first meeting.

List of topics

-S. Gilbert and N. Lynch.
Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services.
ACM SIGACT News, 33(2):51–59, 2002.

-W. Lloyd, M. J. Freedman, M. Kaminsky, and D. G. Andersen.
Don’t settle for eventual: scalable causal consistency for wide-area storage with COPS.
In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles (pp. 401-416). ACM, 2011.

-P. Bailis, A. Fekete, A. Ghodsi, J. M. Hellerstein, and I. Stoica.
The potential dangers of causal consistency and an explicit solution.
SoCC '12 Proceedings of the Third ACM Symposium on Cloud Computing, Article No. 22, ACM, 2012.

+D. B. Terry, A. J. Demers, K. Petersen, M. J. Spreitzer, M. M. Theimer, and B. B. Welch.
Session guarantees for weakly consistent replicated data.
In Proceedings of the Third International Conference on Parallel and Distributed Information Systems, PDIS 1994 (pp. 140-149). IEEE.

-D. Ongaro, S. M. Rumble, R. Stutsman, J. Ousterhout, and M. Rosenblum.
Fast crash recovery in RAMCloud.
In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles (pp. 29-41). ACM, 2011.

-P. Bailis, A. Fekete, J. M. Hellerstein, and I. Stoica.
HAT, not CAP: Highly Available Transactions.
arXiv preprint arXiv:1302.0309, 2013.

-A. Thomson, T. Diamond, S. C. Weng, K. Ren, P. Shao, and D. J. Abadi.
Calvin: fast distributed transactions for partitioned database systems.
In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data (pp. 1-12). ACM, 2012.

-J. Baker, C. Bond, J. Corbett, J. J. Furman, A. Khorlin, J. Larson, J.-M. Leon, Y. Li, A. Lloyd, and V. Yushprakh.
Megastore: Providing scalable, highly available storage for interactive services.
In CIDR, 2011.

-F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber.
Bigtable: a distributed storage system for structured data.
ACM Transactions on Computer Systems (TOCS), 26(2), 4. ACM, 2008.

-B. F. Cooper, R. Ramakrishnan, U. Srivastava, A. Silberstein, P. Bohannon, H.-A. Jacobsen, N. Puz, D. Weaver, and R. Yerneni.
Pnuts: Yahoo!’s hosted data serving platform.
In Proceedings of the VLDB Endowment, 1(2), 1277-1288, 2008.

-G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels.
Dynamo: Amazon’s highly available key-value store.
SIGOPS, 2007.

+E. P. Jones, D. J. Abadi, and S. Madden.
Low overhead concurrency control for partitioned main memory databases.
In Proceedings of the 2010 ACM SIGMOD International Conference on Management of data (pp. 603-614). ACM, 2010.

-A. Lakshman and P. Malik.
Cassandra: structured storage system on a p2p network.
In Proceedings of the 28th ACM symposium on Principles of distributed computing (pp. 5-5). ACM, 2009.

-L. Lamport.
Paxos made simple.
ACM SIGACT News, 2001.

-J. Rao, E. J. Shekita, and S. Tata.
Using paxos to build a scalable, consistent, and highly available datastore.
In Proceedings of the VLDB Endowment, 4(4), 243-254, 2011.

-J. Levandoski, D. Lomet, and S. Sengupta.
LLAMA: A cache/storage subsystem for modern hardware.
In Proceedings of the VLDB Endowment, 6(10), 2013.

+E. Pacitti, M. T. Ozsu, and C. Coulon.
Preventive multi-master replication in a cluster of autonomous databases.
In Euro-Par 2003 parallel processing (pp. 318-327). Springer, 2003.

-K. Ren, A. Thomson, and D. J. Abadi.
Lightweight locking for main memory database systems.
In Proceedings of the 39th international conference on Very Large Data Bases (pp. 145-156). VLDB Endowment, 2012.

+Sudipto Das, Divyakant Agrawal, Amr El Abbadi.
ElasTraS: An elastic, scalable, and self-managing transactional database for the cloud.
ACM Trans. Database Syst. 38(1): 5 (2013)

+Hoang Tam Vo, Sheng Wang, Divyakant Agrawal, Gang Chen, Beng Chin Ooi.
LogBase: A Scalable Log-structured Database System in the Cloud.
PVLDB 5(10): 1004-1015 (2012)

-Robert Escriva, Bernard Wong, Emin Gün Sirer.
HyperDex: a distributed, searchable key-value store.
SIGCOMM 2012: 25-36

-Jeff Shute, Radek Vingralek, Bart Samwel, Ben Handy, Chad Whipkey, Eric Rollins, Mircea Oancea, Kyle Littlefield, David Menestrina, Stephan Ellner, John Cieslewicz, Ian Rae, Traian Stancescu, Himani Apte.
F1: A Distributed SQL Database That Scales.
PVLDB 6(11): 1068-1079 (2013)

-Daniel J. Abadi, Samuel Madden, Nabil Hachem.
Column-stores vs. row-stores: how different are they really?
SIGMOD Conference 2008: 967-980

-Michael Stonebraker, Daniel J. Abadi, Adam Batkin, Xuedong Chen, Mitch Cherniack, Miguel Ferreira, Edmond Lau, Amerson Lin, Samuel Madden, Elizabeth J. O'Neil, Patrick E. O'Neil, Alex Rasin, Nga Tran, Stanley B. Zdonik.
C-Store: A Column-oriented DBMS.
VLDB 2005: 553-564

-James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, J. J. Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson C. Hsieh, Sebastian Kanthak, Eugene Kogan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan, Rajesh Rao, Lindsay Rolig, Yasushi Saito, Michal Szymaniak, Christopher Taylor, Ruth Wang, Dale Woodford.
Spanner: Google's Globally Distributed Database.
ACM Trans. Comput. Syst. 31(3): 8 (2013)

+Jeffrey Dean, Sanjay Ghemawat.
MapReduce: Simplified Data Processing on Large Clusters.
OSDI 2004: 137-150

+Shoji Nishimura, Sudipto Das, Divyakant Agrawal, Amr El Abbadi.
MD-HBase: design and implementation of an elastic data infrastructure for cloud-scale location services.
Distributed and Parallel Databases 31(2): 289-319 (2013)

+Sudipto Das, Divyakant Agrawal, Amr El Abbadi.
G-Store: a scalable data store for transactional multi key access in the cloud.
SoCC 2010: 163-174

-Thorsten Schütt, Florian Schintke, Alexander Reinefeld.
Scalaris: reliable transactional p2p key/value store.
Proceedings of the 7th ACM SIGPLAN workshop on ERLANG. ACM, 2008.