Database Land

August 5, 2008

Database Views

Filed under: processing, query, rewrite, tables, views — Amr El-Helw @ 10:11 am

A database view is a stored query accessible as a virtual table. The contents of this virtual table are the results of that query. Views are only logically stored in the database (i.e. the contents are not physically stored, but dynamically computed whenever needed). Materialized views are an exception to this rule. Changing the data in a base table automatically alters any view that references that table.

A view can be used in exactly the same way as a base table. Views can be used for the following reasons:

  • Simplify queries by creating views to represnts parts of these queries
  • Hiding the complexity of the data, by creating views that reflect on the parts of the data relevant to a particular user
  • Providing extra security by giving users access only to the views, not to the base tables, thus restricting the data they can access
  • Since the actual contents of the views are not physically stored, they do not require much space

Views can be seen in analogy to modules (or functions) in programming. They can be used to create abstraction. They can be nested (a query can reference a view, and a view can reference other views, etc.)

For example, consider the following two queries:

Query 1: Query 2:
SELECT name, balance
FROM (SELECT name, money_received, money_sent,
(money_received - money_sent) AS balance, address
FROM table_customers c, accounts_table a
WHERE a.customerid = c.customer_id
)
WHERE money_sent > 10000
SELECT name, address
FROM (SELECT name, money_received, money_sent,
(money_received - money_sent) AS balance, address
FROM table_customers c, accounts_table a
WHERE a.customerid = c.customer_id
)
WHERE balance between 10000 and 20000

Both these queries contain the same sub-query. To simplify these queries, it is possible to define a view (v_cust_account) as follows:

CREATE VIEW v_cust_account AS
SELECT name, money_received, money_sent,
(money_received - money_sent) AS balance, address
FROM table_customers c, accounts_table a
WHERE a.customerid = c.customer_id

Now this view can be treated as a table containing the results of the subquery. Now the 2 initial queries can be simplified as follows:

Query 1a: Query 2a:
SELECT name, balance
FROM v_cust_account
WHERE money_sent > 10000
SELECT name, address
FROM v_cust_account
WHERE balance between 10000 and 20000

Views are usually read-only, i.e. one cannot directly update the data in a view, but can only update the underlying tables. However, in some cases, a view can be updatable, but only if the database system is able to determine the reverse mapping from the view schema to the schema of the underlying base tables. In this case, INSERT, UPDATE, and DELETE operations can be performed. Read-only views do not support such operations because the DBMS is not able to map the changes to the underlying base tables.

During query processing, queries that reference views are usually rewritten to reference the base tables. For example, if a user issues Query 1a to the DBMS, the query processor will determine that the query references the view (v_cust_account). It will then proceed to rewrite this query to reference the base tables (thus it will look like the original query (Query 1).

February 14, 2008

The Basics (VI) - Overview on Query Processing

Filed under: catalog, cost, execution, optimization, parsing, plan, processing, query, rewrite, semantic, sql — Amr El-Helw @ 2:20 pm

I mentioned before that SQL is a declarative language, where the user only specifies WHAT they want to get from the database, not HOW to get it. The how part is the responsibility of the database engine. Query processing is a complex task that involves several steps in order to translate a user’s query to something that the computer can actually execute in order to retrieve the results of that query.

Query processing usually consists of the following phases:

  1. Parsing
    In this phase, the engine analyzes the text of the SQL query, to make sure that it is written correctly, and reports any syntax errors (misspellings, unidentified operators, etc.). If the query has no errors, it is translated into an internal representation that is to be used for later processing.
  2. Semantic Checking
    Now that the text of the query has no errors, the semantics are checked. The engine makes sure that any referenced tables, columns, or views are valid. This phase is sometimes considered part of the parsing phase.
  3. Query Rewrite
    First of all, note that usually the same query may have multiple representations in SQL. Complex queries often result in redundancy, especially with views. The Query rewriting phase rewrites a given SQL query into another query that is equivalent (produces the same result), but may be processed more efficiently.
  4. Query Optimization
    This is probably the most important phase in query processing, in which the engine tries to figure out the best way to evaluate the query (the “how” part). The same query can usually be evaluated in multiple ways, depending on which tables are joined first, what join methods are used, whether the predicates are tested at the beginning or at the end, etc. Each such way is called a query execution plan (or just a query plan). For the same query, some plans are more efficient than others. The job of the query optimizer is:

    1. Determine all possible query plans for the query in hand
    2. Estimate the execution cost for each of these plans. This is achieved using the statistics available in the database catalog
    3. Pick the plan with the least estimated cost
  5. Code Generation
    Once a query plan is chosen, the system translates this plan into executable machine code.
  6. Plan Execution
    Finally the query is executed, and the results are returned.

Blog at WordPress.com.