With the explosion of data and information, researchers and practitioners have started to rethink how users can best interact with the massive text data found on the World Wide Web and in relational data warehouses. Traditional structured query languages, such as SQL and XQuery, are too inflexible and cumbersome for the general public. In recent years, we have seen a resurgence of information retrieval and natural language processing techniques in information management for structured data. In this article, we focus on patents that use natural language queries and keyword queries as a means of information retrieval. Lacking a declarative grammar structure, natural language and keyword queries pose unique challenges for query interpretation and document retrieval, and the interpretation and evaluation of such queries have received much attention in the past decade. We have selected four inventions that claim methods for improving natural language and keyword query processing. We review the technical details and features of these patents and compare them in a unified context.
Information retrieval, text analysis, natural language processing, keyword queries, web data, unstructured data, machine learning, language queries, Structured Query Language (SQL), XQuery, user interface, community-natural language, vector space model, World Wide Web, probabilistic model, term frequency, inverse document frequency, semantic vector, lexical ambiguity, POS tagging, logical-form triples (LFT), Cyclic Redundancy Check (CRC), category-based search, bins, topical coherence, latent semantic space, latent semantic analysis, petabytes, natural language text, probabilistic latent semantic analysis
University of Ontario Institute of Technology, Oshawa, ON, Canada.