Retrieval Queries
- Basic form of the Retrieval Queries:
-
is a list of attribute names whose values are to be retrieved by the query
-
is a list of the relation names to process query
-
is a conditional (Boolean) expression that identifies the tuples to be retrieved by the query
Simple Query : Single Table
- Retrieve the birthdate and address of employees whose sex is ‘male’ and whose salary is greater than 30,000.
- SELECT-clause specifies PROJECT($\pi$);
- WHERE-clause specifies SELECT ($\sigma$)
- (참고: SQL의 SELECT는 relational algebra의 $\sigma$와 다름; $\pi$ 를 의미)
- This SQL expression is equivalent to:
- $\pi$ bdate, address ($\sigma$ (salary > 30000) AND (sex = ‘male’) EMP)
Optional WHERE / Use of *
- ‘WHERE’ is optional; That means “no condition” is applied; Thus, all tuples of the table are retrieved.
- To retrieve all the attribute values of the selected tuples, the symbol * is used, which means all the attributes
DISTINCT
- SQL does not remove duplicate tuples in query result: Why?
- Duplicates removal is expensive; (Sort and elimination cost!)
- Users want to see duplicate tuples in the result of query.
- When aggregate function (ie., sum, avg) is applied, we need to keep duplicated tuples.
- To remove duplicates in a query result, DISTINCT is used.
- Q1 may have duplicate values, but Q2 does not have any duplicates.
- What do think about the following query Q3?
Multiple Table Query
- Retrieve SSN and age of employees who work for ‘research’ dept.
- (DNO = DNUMBER) condition specifies the JOIN;
- This SQL expression is equivalent to :
- For project name ‘product X’, list the project number, controlling department number, and the department manager’s salary
- Notice that there are two join conditions
- (DNUM = DNUMBER) relates a project to its controlling department
- (Mgr-SSN = SSN) relates the controlling department to the employee who manages that department
Rename Operation
- We can rename for both relation and attributes, taking the form;
- The AS clause can appear in both the SELECT and FROM clause.
Use of Tuple Variables
- We can use tuple variables to refer any tables in SQL query;
- Retrieve name and dept. phone of employees who work for ‘research’ dept.
- The alternative table names e and d are called “tuple variables”
- This SQL expression is equivalent to :
- Tuple variables are very useful for comparing two tuples in the same relation.
- In this case, two tuple variables are given to the same relation.
- For each employee, retrieve the employee’s SSN, and the name of his (or her) direct supervisor.
- The alternative table names e and s are called “tuple variables”
- We can think of e and s as two different copies of EMPLOYEE;
- e represents employees in role of ’workers’
- s represents employees in role of ‘bosses
- Retrieve SSN and names of all employees whose salary greater than Smith’s salary.
- We can also think of e and b as two different copies of EMPLOYEE;
- e represents in the role of employees
- s represents in the role of employee(s) with name = ‘smith’
- Retrieve the names of two employees (i.e., married) who share the same address.
- The second condition (e1.name < e2.name) means that the name of first employee precedes the name of second employee alphabetically.
- If this condition is omitted, tuple variables e1 and e2 both could refer to the same tuple. Thus, the query result produces each employee name paired with itself.
- Question: If use “<>” instead of “< “,’ what will happen?
Substring Pattern Matching
- SQL provides LIKE comparison operator for pattern matching. We describe patterns by using two special characters.
- Percent(%): The % matches any “substring”.
- Underscore(_): The _ matches any “character”.
- Patterns are case sensitive; that is, upper case characters do not match lower case characters, or vice versa; Some examples are;
- ‘Blue%’ matches any string beginning with ‘Blue’.
- ‘%idge% matches any string containing “idge” as substring.
- Examples: Blueridge, Ridgeriver, ‘Rockbridge, Ridgeway, . .
- ‘_ _ _’ matches any string of exactly 3 characters.
- ‘_ _ _%’ matches any string of at least 3 characters.
- Retrieve all employees whose address is in Houston, Texas
- Retrieve all employees who were born during the 1970s.
- Note that NOT LIKE also can be used in a similar way
Arithmetic Operations
- We can use arithmetic operations in query; The standard operators +, -, *, and / can be applied to attributes with numeric data types;
- Show resulting salaries for every employee working for the research department is given 10% raise.
- SQL provides BETWEEN operator to simplify comparison operators.
- For example, retrieve all employees whose salary is between 30,000 and 40,000. (that is, salary >= AND salary <= 40000)
Set Operators
- UNION, INTERSECT, EXCEPT
- Each of the above operations automatically removes duplicated tuples;
- To retain all duplicated tuples, use the corresponding operations UNION ALL, INTERSECT ALL, EXCEPT ALL.
- Suppose a tuple t occurs m times in R and n times in S;
- R UNION ALL S : (m + n) times occurs.
- R INTERSECT ALL S : Min(m, n) times occurs.
- R EXCEPT ALL S : Max(0, m – n) times occurs
- Retrieve SSN and name of employees with age > 40 or sex = ‘male’.
- Retrieve SSN of employees with any no dependents.
- Retrieve all project numbers for projects that involve an employee whose name is ‘Bob’ as a worker, and as a manager of the department that controls the project.
NULL Values
- NULL represents missing values; It has 3 different meanings :
- (1) Unknown (exists, but unknown):
- Someone has ‘blood-type’, but currently is not known.
- (2) Unavailable (exists, but is withheld):
- Someone has ‘phone-number’ but does not want it to be listed.
- (3) No applicable (undefined or non-existent):
- ‘spouse’ may be NULL for unmarried someone.
- It is not possible to distinguish these 3 meanings: For example, Suppose many persons have “NULL value for phone-number”; How can we interpret this?
- Let X have the value NULL;
- (1) Any arithmetic operation with NULL returns “NULL”.
- X + 5, X – 5, X * 5, . . .
- (2) Any comparison with NULL returns “UNKOWN” :
- X < 5, X = 5, X = NULL, X <> 5, . . .
- Note that “UNKOWN” is another truth value like TRUE and FALSE.
- SQL provides IS NULL (or IS NOT NULL) for dealing NULL values;
- “X IS NULL” returns “TRUE” if X has NULL value; It returns “FALSE” otherwise.
- SQL uses a three-valued logic: TRUE, FALSE, UNKNOWN
- To understand how 3-valued logic works:
- Let T = 1, F = 0, U = ½,
- AND = MIN, OR = MAX, NOT(X) = 1 - X
- Example:
- T AND (F OR NOT(U)) = MIN(1, MAX(0, (1 - ½ )))
- = MIN(1, MAX(0, ½ ) = MIN(1, ½ ) = ½ (= U).
- Result of WHERE condition is treated as FALSE if it evaluates to UNKNOWN.
- Tuples that are evaluated to only TRUE are selected.
- Thus, tuples with either FALSE or UNKNOWN values are not selected from the result.
- Is this tuple selected? Answer: NO!
- In this case, we need to use IS NULL to check whether the salary attribute has NULL or not.
- Now, all tuples are selected!
Use of JOIN Operators
- SQL permits users to specify join operators in FROM clause; This is easier to understand than mixing together select and join conditions in WHERE;
Nested Queries (= Subqueries)
- A query that is part of another query is called a subquery (or nested query);
- Subquery can have many nested subquries as we want;
- There are many other ways that subqury can be used;
- (1) Subquery can return constant value(s), and this value can be compared with another value in WHERE clause;
- (2) Subquery can return relation (a set of tuples) that can be used in various ways in WHERE clause;
- (3) Subquery can have their relations appear in FROM clause;
- A nested query can be specified within the WHERE-clause of another query, called the outer query
- Single query :
- Nested Query : Having executed nested query returns ‘research’ number; Then, outer query returns SSN(s) only if its DNO is equal to result of nested query.
- Question: What if there are more than one research dept numbers?
Nested Queries : IN
- A nested query can be compared with outer query by using IN operator;
- (v IN V ) compares a value v (in outer query) with the value(s) V (in nested query), and returns “true” if v is included in V. Otherwise, return “false”;
- SQL allows use of (sub)tuples of values in comparisons by placing them within parenthesis;
- Get SSNs of employees who work for the same (project, hours) on some project that employee with SSN = ‘123456789’ works on;
- Get names of employees who work for the same (project, hours) on some project that employee with SSN = ‘123456789’ works on;
Nested Queries : NOT IN
- (v NOT IN V ) compares a value v (in outer query) with the value(s) V (in nested query), and returns “true” if v is not included in V. Otherwise, return “false”;
- Retrieve names of employees with no dependents.
- Retrieve names of employees whose names are neither Smith nor Jones.
Nested Queries : Use of Tuple Variables
- Nested Queries also can use tuple variables;
- Retrieve names of employees who have a dependent with the same sex as the employees.
Nested Queries : SOME (= ANY)
- SOME (= ANY) can be used with {<, >, =, <=, >=, <>} operators:
- For example, (v > SOME V ) returns “true” if the value v is greater than some of the values in a set V.
- Get SSNs of employees whose salary is greater than salary of some employees in department 5:
Example : Meaning of SOME
- Example
- (5 < SOME {0, 5, 7} ) = true ) = false
- (5 <> SOME {0, 5, 7}) = true
- (5 < SOME {1, 3, 5}
- (5 = SOME {0, 5, 7} ) = true
- 참조: (= SOME)은 IN 과 동일한 표현임. 반면에, (<> SOME)은 NOT IN 과 동일하지 않음.
Nested Queries : ALL
- ALL can be used with {<, >, =, <=, >=, <>} operators:
- For example, (v > ALL V ) returns “true” if the value v is greater than all the values in a set V
- Get SSNs of employees whose salary is greater than salary of all the employees in department 5:
Example : Meaning of ALL
- Example
- (5 < ALL {0, 5, 6} ) = false) = true
- (5 <> ALL {4, 6, 8} ) = true
- (5 > ALL {0, 3, 4}
- (5 = ALL {4, 5, 7} ) = false
- 참조: (<> ALL)은 NOT IN 과 동일한 표현임. 반면에, (= ALL)은 IN 과 동일하지 않음.
Nested Queries : EXISTS
- EXISTS(Q) returns “true” if there is at least one tuple (= not empty) in the result of a nested query Q. Otherwise, it returns “false”
- Retrieve SSNs of employees who have dependent(s).
- Retrieve names of employees who have a dependent with the same sex as the employees
Nested Queries : NOT EXISTS
- NOT EXISTS(Q) returns “true” if there no tuple (= empty) in the result of a nested query Q. Otherwise, it returns “false”.
- Retrieve SSNs of employees who have no dependent.
- Retrieve names of employees whose sex are not the same with any dependent or with no dependent.
- Retrieve names of employees whose salary are greater than salary of all the employees with age < 25.
- Question: Suppose that age of all employees are >= 25; Then, what is the query result?
Division Query : NOT EXISTS and EXCEPT
- Retrieve name of each employee who work on all the projects controlled by department 5.
Aggregate Functions
댓글남기기