Many modern relational databases now support set-valued attributes. Although such attributes do not fit classical relational theory, they expand the possibilities for data storage and manipulation. A search query on a set-valued attribute can be represented by specific search predicates, which are easily expressed in terms of set-theoretic operations. Sufficiently accurate selectivity estimation for search predicates on set-valued attributes is as essential to the query optimizer as selectivity estimation for regular search predicates. This paper introduces a probabilistic model for estimating the selectivity of search predicates on set-valued attributes. The model uses the occurrence frequencies of set elements as well as a histogram of set cardinalities. The model parameters are estimated during a preliminary analysis of the database contents. The model was implemented for the array types of the PostgreSQL DBMS, which are an implementation of ordered set-valued attributes. Experimental verification of this implementation showed that the proposed model provides highly accurate selectivity estimates.
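To make the notion of selectivity for set-valued predicates concrete, the following is a minimal sketch and not the model proposed in this paper. It assumes that set elements occur independently of one another and uses only per-element frequencies; the cardinality histogram mentioned above is deliberately left out. The function names and the `freqs` dictionary are hypothetical. The two predicates correspond to the "contains" (`@>`) and "overlaps" (`&&`) operators on PostgreSQL arrays.

```python
from math import prod

def estimate_contains(freqs, query):
    """Estimated fraction of rows whose set contains every element of `query`.

    Assumption (illustrative only): elements occur independently, so the
    probabilities of the individual elements being present are multiplied.
    `freqs` maps an element to the fraction of rows that contain it.
    """
    return prod(freqs.get(e, 0.0) for e in set(query))

def estimate_overlap(freqs, query):
    """Estimated fraction of rows whose set shares at least one element with `query`.

    Computed as the complement of "none of the query elements is present",
    again under the independence assumption.
    """
    return 1.0 - prod(1.0 - freqs.get(e, 0.0) for e in set(query))

# Hypothetical element frequencies gathered during a preliminary analysis pass.
freqs = {"a": 0.5, "b": 0.2}
contains_sel = estimate_contains(freqs, ["a", "b"])  # 0.5 * 0.2
overlap_sel = estimate_overlap(freqs, ["a", "b"])    # 1 - 0.5 * 0.8
```

Under these assumptions an element missing from `freqs` is treated as never occurring, so a "contains" query that includes it is estimated to match no rows.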
1. Introduction
2. Proposed Model
3. Selectivity Estimation
4. Experimental Evaluation
5. Conclusion