An Empirical Performance Evaluation of Relational Keyword Search Techniques

Project Name: An Empirical Performance Evaluation of Relational Keyword Search Techniques
Front End 
Back End
Software

Extending the keyword search paradigm to relational data has been an active area of research within the database and IR community during the past decade. Many approaches have been proposed, but despite numerous publications, there remains a severe lack of standardization for the evaluation of proposed search techniques. Lack of standardization has resulted in contradictory results from different evaluations, and the numerous discrepancies muddle what advantages are proffered by different approaches.

In this project, we present the most extensive empirical performance evaluation of relational keyword search techniques to appear to date in the literature. Our results indicate that many existing search techniques do not provide acceptable performance for realistic retrieval tasks. In particular, memory consumption precludes many search techniques from scaling beyond small data sets with tens of thousands of vertices.

We also explore the relationship between execution time and factors varied in previous evaluations; our analysis indicates that most of these factors have relatively little impact on performance. In summary, our work confirms previous claims regarding the unacceptable performance of these search techniques and underscores the need for standardization in evaluations—standardization exemplified by the IR community.

  • We explore the relationship between execution time and factors varied in previous evaluations; our analysis indicates that most of these factors have relatively little impact on performance.
  • Our benchmark is the only one to date in the literature that satisfies the minimum criteria established by the IR community for the evaluation of retrieval systems.
  • Schema-based approaches support keyword search over relational databases via direct execution of SQL commands.
  • The database’s full text indexes identify all tuples that contain search terms, and a join expression is created for each possible relationship between these tuples.
  • The objective of graph-based approaches is to minimize the weight of result trees.
  • The benchmark’s query workload is derived from 50 information needs for each data set.
  • The query workload does not use real user queries extracted from a search engine log for two reasons.
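
The schema-based approach described in the list above (find the tuples that contain each search term, then run one join expression per relationship between them) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy `author`/`paper`/`writes` schema, the `TEXT_COLUMNS` and `JOIN_SQL` tables, and the use of `LIKE` as a stand-in for a real full-text index. It is not the implementation of any evaluated system.

```python
import sqlite3

# Hypothetical toy schema: which text column to search in each table.
TEXT_COLUMNS = {"author": "name", "paper": "title"}

# Hypothetical join expressions, one per relationship between searchable tables.
JOIN_SQL = {
    ("author", "paper"): (
        "SELECT author.name, paper.title FROM author "
        "JOIN writes ON writes.author_id = author.id "
        "JOIN paper ON paper.id = writes.paper_id "
        "WHERE author.name LIKE ? AND paper.title LIKE ?"
    ),
}


def build_demo_db():
    """Create a small in-memory database used only for this sketch."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE paper (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE writes (author_id INTEGER, paper_id INTEGER);
        INSERT INTO author VALUES (1, 'Alice Smith'), (2, 'Bob Jones');
        INSERT INTO paper VALUES (10, 'Keyword search over relational data'),
                                 (11, 'Graph indexing');
        INSERT INTO writes VALUES (1, 10), (2, 11);
        """
    )
    return conn


def tables_matching(conn, term):
    """Return the tables with at least one tuple containing the term.

    LIKE stands in for the database's full-text index (an assumption).
    """
    hits = []
    for table, column in TEXT_COLUMNS.items():
        row = conn.execute(
            f"SELECT 1 FROM {table} WHERE {column} LIKE ? LIMIT 1",
            (f"%{term}%",),
        ).fetchone()
        if row is not None:
            hits.append(table)
    return hits


def keyword_search(conn, term_a, term_b):
    """Run every join expression that connects tuples matching both terms."""
    results = []
    for table_a in tables_matching(conn, term_a):
        for table_b in tables_matching(conn, term_b):
            if (table_a, table_b) in JOIN_SQL:
                params = (f"%{term_a}%", f"%{term_b}%")
                results += conn.execute(JOIN_SQL[(table_a, table_b)], params).fetchall()
            elif (table_b, table_a) in JOIN_SQL:
                params = (f"%{term_b}%", f"%{term_a}%")
                results += conn.execute(JOIN_SQL[(table_b, table_a)], params).fetchall()
    return results
```

Calling `keyword_search(build_demo_db(), "alice", "keyword")` joins the matching author tuple to the matching paper tuple through `writes`. A real system would enumerate the join expressions from the schema instead of listing them by hand.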

In this paper, we use two metrics to measure runtime performance. The first is execution time, which is the time elapsed from issuing a query until an algorithm terminates. Because there are a large number of potential results for each query, search techniques typically return only the top-k results, where k specifies the desired retrieval depth.
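
As a rough illustration of the execution-time metric and top-k retrieval described above, the sketch below times a search call from query issue to termination and keeps only the k highest-scoring results. The names `timed_top_k` and `toy_search`, and the scoring scheme, are hypothetical and exist only for this example.

```python
import heapq
import time


def timed_top_k(search, query, k):
    """Return the k best-scoring results and the elapsed execution time.

    `search` is assumed to be a callable yielding (result, score) pairs.
    """
    start = time.perf_counter()  # query is issued
    top = heapq.nlargest(k, search(query), key=lambda pair: pair[1])
    elapsed = time.perf_counter() - start  # algorithm has terminated
    return top, elapsed


def toy_search(query):
    """Hypothetical search technique producing 1000 scored results."""
    for i in range(1000):
        yield (f"{query}-{i}", (i * 37) % 101)
```

For example, `timed_top_k(toy_search, "q", 5)` returns five results ordered by descending score together with the time in seconds.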
