Query Processing Techniques for Solid State Drives

    Publication properties
    Title: Query Processing Techniques for Solid State Drives
    Date: 2009
    Publication type: Conference paper
    Authors:
    No. First name Last name
    1. Dimitris Tsirogiannis
    2. Stavros Harizopoulos
    3. Mehul Shah
    4. Janet Wiener
    5. Goetz Graefe
    Download (by DOI): 10.1145/1559845.1559854
    BibTeX: conf/sigmod/TsirogiannisHSWG09
    DBLP: db/conf/sigmod/sigmod2009.html#TsirogiannisHSWG09

    Conference Track
    Conference Name: ACM SIGMOD International Conference on Management of Data, SIGMOD 2009, Providence, Rhode Island, USA, June 29 - July 2, 2009
    Track Name: Research
    URL: http://www.sigmod09.org/

    Abstract

    Solid state drives perform random reads more than 100x faster than traditional magnetic hard disks, while offering comparable sequential read and write bandwidth. Because of their potential to speed up applications, as well as their reduced power consumption, these new drives are expected to gradually replace hard disks as the primary permanent storage media in large data centers. However, although they may immediately benefit applications that stress random reads, they may not improve database applications, especially those running long data analysis queries. Database query processing engines have been designed around the speed mismatch between random and sequential I/O on hard disks, and their algorithms currently emphasize sequential accesses for disk-resident data.


    In this paper, we investigate data structures and algorithms that leverage fast random reads to speed up selection, projection, and join operations in relational query processing. We first demonstrate how a column-based layout within each page reduces the amount of data read during selections and projections. We then introduce FlashJoin, a general pipelined join algorithm that minimizes accesses to base and intermediate relational data. FlashJoin's binary join kernel accesses only the join attributes, producing partial results in the form of a join index. Subsequently, its fetch kernel retrieves the attributes for later nodes in the query plan as they are needed. FlashJoin significantly reduces memory and I/O requirements for each join in the query. We implemented these techniques inside Postgres and experimented with an enterprise SSD drive. Our techniques improved query runtimes by up to 6x for queries ranging from simple relational scans and joins to full TPC-H queries.
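
The two-phase idea described in the abstract (a join kernel that reads only the join attributes and emits a join index, followed by a fetch kernel that retrieves other attributes when they are needed) can be illustrated with a minimal in-memory sketch. This is not the authors' Postgres implementation: the function names `join_kernel` and `fetch_kernel`, the use of a simple hash join, and the toy column dictionaries are all assumptions made for illustration, and Python lists stand in for column data stored on an SSD.

```python
from collections import defaultdict


def join_kernel(left_keys, right_keys):
    """Join on key values only; return a join index of (left_rid, right_rid) pairs.

    Hypothetical stand-in for a binary join kernel: it never touches
    non-join attributes. A plain hash join is assumed here.
    """
    buckets = defaultdict(list)
    for left_rid, key in enumerate(left_keys):
        buckets[key].append(left_rid)

    join_index = []
    for right_rid, key in enumerate(right_keys):
        for left_rid in buckets.get(key, ()):
            join_index.append((left_rid, right_rid))
    return join_index


def fetch_kernel(join_index, left_cols, right_cols, left_attrs, right_attrs):
    """Materialize only the requested attributes for each joined row-id pair.

    Each lookup by row id models a random read, which is the access
    pattern that is cheap on an SSD.
    """
    for left_rid, right_rid in join_index:
        row = {attr: left_cols[attr][left_rid] for attr in left_attrs}
        row.update({attr: right_cols[attr][right_rid] for attr in right_attrs})
        yield row


# Toy column-wise tables (illustrative data, not from the paper).
orders = {"cust_id": [10, 20, 10], "total": [5.0, 7.5, 1.25]}
customers = {"cust_id": [10, 30, 20], "name": ["Ann", "Bob", "Cy"]}

join_index = join_kernel(orders["cust_id"], customers["cust_id"])
rows = list(fetch_kernel(join_index, orders, customers, ["total"], ["name"]))
```

In this sketch the join phase holds only key values and row ids in memory, and the wider attributes (`total`, `name`) are read once per result row, after the join has already discarded non-matching tuples.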