[Database] SQL (Structured Query Language) - 3
Aggregate Functions
- Aggregate Functions
- COUNT : count the number of values
- SUM : sum of values
- AVG : average of values
- MAX : maximum of values
- MIN : minimum of values
GROUP BY
- We want to apply the aggregate functions to subgroups of tuples in a relation
- GROUP BY partitions the table into disjoint groups of tuples based on grouping attributes(s).
- Each group consists of the set of tuples that have the same value for the grouping attribute(s)
- We can then apply the aggregate function to each group independently.
- The grouping attributes must also appear in SELECT-clause.
- For each department, retrieve DNO, number of employees in the department, and their average salary
- Each employee group has the same value for the grouping attribute DNO.
- Note : Group attribute ‘DNO’ must appear in SELECT clause
- COUNT and AVG functions are applied to each such group.
- For each project, retrieve the project number, project name, and the number of employees who work on that project.
- In this case, a join condition is used together with grouping.
-
The grouping and functions are applied after the joining of the two relations.
- If null values exist in the grouping attribute, then separate group is created for all tuples with a NULL value for the grouping attribute.
HAVING
- Sometimes, we want to retrieve the values of these functions for only those groups that satisfy certain conditions
- HAVING is used for specifying a selection condition on groups. (rather than on individual tuples)
- For each project on which more than two employees work, retrieve the project number, project name, and the number of employees who work on that project.
HAVING, GROUP BY and Nested Queries
- Find the project number with the highest average working hours.
- SQL cannot compose of aggregate functions (function of a function),so it must be written in a certain way.
ORDER BY
- ORDER BY is used to sort the tuples in a query result based on the values of some attribute(s);
- ASC (usually, omitted) : increasing order
- DESC : decreasing order
Summary of SQL Queries
- SQL Query consists of 6 clauses. [ ] is optional.
- Order of query is evaluated conceptually as follows:
- First, evaluate the FROM-clause,
- Then, evaluate the WHERE-clause,
- Then, GROUP BY and HAVING, and
- Finally, the SELECT-clause
- However, this method may be inefficient in real systems;
- Each DBMS has its own query optimization;
INSERT
- Single tuple Insertion
- Attribute value들은 table에 정의된 attribute들의 순서와 일치가 되어야 함.
- Multiple tuples Insertion
- 이 경우 SELECT query 결과를 저장하기 위한 새로운 임시 table (CREATE TABLE을 이용)를 미리 만들어 야 함.
INSERT : Single Tuple
- Insert a new project tuple into PROJECT table;
- Insert another new project tuple into PROJECT table;
- 단, 이 project의 location은 알려지지 않았음
- 이 경우 각 값들과 상응하는 attribute들을 명시해야 함.
INSERT : Multiple Tuples
- Insert DEPT tuples with the name, number of employees, and total salaries for each department; In this case, we have to prepare a temporary table.
- We can now query DEPT-INFO as any other tables;
- Note that if we update the tuples in EMP or DEPT, DEPT-INFO may not be up-to-date.
DELETE
- Removes tuple(s) from a table.
- A missing WHERE-clause specifies that all tuples in the table are to be deleted;
- The table then becomes an empty, but the table itself still exists;
- Tuples are deleted from only one table at a time. (unless CASCADE is specified on a referential constraint)
- Referential integrity should be verified.
- All employees tuples in ‘research’ department are deleted.
- All employee tuples are deleted, but, the EMP table itself (= definition) still exists
UPDATE
- Used to modify attribute values of one or more selected tuples
- WHERE-clause selects the tuples to be modified.
- SET-clause specifies the attributes to be modified and their new values
- Each update command modifies tuples in a single relation.
- If primary key value is modified, referential constraints should be verified
- Change the location and controlling department number of project number 10 to ‘Chicago’ and 5, respectively.
- Give employees in the ‘research’ department a 10% raise in salary.
- Salary column on the right side of = refers to the old SALARY value before update.
- SALARY column on the left side of = refers to the new SALARY value after update
View
- A view is a single table that is derived from other table(s);
- These other tables can be base tables, or previously defined views;
- A view is defined using CREATE VIEW as follows;
- We can define a view as a table that we reference “frequently”;
- A view is considered a virtual table because view does not necessary physically exists;
- The view is realized at request time for querying the view.
Views : From Single Table
- We only want to keep information with SSN, name, salary, and age of employees with high salaried (say, > $50,000) employees. (We don’t want about all salaries.)
Querying Views
- The view “High-Sal-EMP” does not contain tuples in the usual sense. Rather, if we query this view, the corresponding tuples are obtained from the base table “EMP”.
- Retrieve the names of high salaried employees with age < 45.
- This query is translated into underlying base table as follows.
- As a result, we can ask the same query about “High-Sal-EMP” twice or more, we may get different answers, because base table “EMP” may have changed in the interim.
Views : From Multiple Tables
- We only want to keep information with DNO, dept. name, number of employees, and total salary for each department;
- Note: COUNT(*), SUM(salary) are renamed as no-of-emps, total-sal.
- We only want to keep WORK-ON information with employee name, project name, and hours.
Advantages of Views
- By using views, we can specify queries “more concisely”.
- Retrieve names of employees who work on the project ‘notebook’:
- This query is equivalent to the following query;
- In case of not using view, we need to specify two joins on the base tables; This is more complex!
- Retrieve the number of working employees and total salary for department 5.
Modifying Views
- Sometimes, users need to modify (insert, delete, update) views;
- Insert a new tuple into a view
- Delete some tuple from a view
- Update some tuple from a view
- Is it possible to modify views? (Note : Views does not exist physically, but base table does.
- Answer : Yes, but only for some cases; Very restricted!
- For simple views (called, “updatable views”), it is possible to translate the modification of the view into equivalent modification on the underlying base table.
- Consider the following view;
- Increase salary for high salaried employees with age >= 60 by 5%.
- Is this view update possible? “Yes”, because we can actually modify base table EMP through the view; Thus, this view update is translated as follows;
- Consider the following view;
- Delete all high salaried employees with salary >= 100,000.
- Is this view update possible? “Yes”, because we can actually modify base table EMP through the view; Thus, this view update is translated as follows;
- Consider the following view;
- Insert a new high salaried employee as follows;
- Is this view update possible? “No”. Why? This view update is translated as follows; But this insertion is impossible, because primary key SSN has a NULL value.
View Update : Multiple Tables
- Update bob’s current project name from ‘printer’ to ‘laptop’.
- Is view update possible? No! It has two possible translations:
- 1) Find bob’s current PNO (say, p1), then find its pname (say, printer), then, change it by ‘laptop’ or
- 2) Find laptop’s PNUM (say, p3) and then find bob’s current working project (say, printer, p1), and then, change it by ‘p3’.
- This view update is ambiguous because it can not be translated uniquely:
- DBMS seems not smart to decide which one is better! In this example, option 2) looks better.
Un-updatable Views
- View update는 일반적으로 다음의 경우에는 update 불가.
- Aggregate 함수(sum, avg, group by 등)를 사용한 view
- 2 개 이상의 table들로 (join 등을 이용) 부터 유도된 view
- View의 SELECT clause에 Primary key가 명시되지 않은 경우
- 기타의 경우 : 사례별로 봐야 함.
- We can specify with CHECK option:
- Must be added to the view definition if the view can be updated
- To allow check for updatability and to plan for an execution strategy
View Implementation
(1) Query Modification (Most widely used)
- A view is not physically stored.
- Present the view query in terms of a query on the underlying base tables
- A view is computed at the time users specify a query on the view.
- DBMS must modify the user’s view query into a query on its underlying base tables.
- Query processing is inefficient for views defined via complex queries:
- Especially if additional queries are to be applied to the view within a short time period
(2) View Materialization
- A view is pre-computed and physically stored as keeping a temporary table.
- When a query is requested on the view, the unmodified query is executed directly on the pre-computed result.
- This is much faster than query modification; Faster access for expensive and complex joins
- Support applications that do not require current data or require data valid at a specific point in time (snapshot data).
- A major drawback is that we must maintain the consistency between the base table and the view when the base table is updated; Cost of maintaining view is high!
댓글남기기