跳转至

Introduction to SQL

约 1942 个字 172 行代码 4 张图片 预计阅读时间 9 分钟

Outline:

  • Overview of the SQL Query Language
  • Date Definition
  • Basic Query Structure
  • Additional Basic Operations
  • Set Operations
  • Null Values
  • Aggregate Functions
  • Nested Subqueries
  • Modification of the Database

Date Definition Language

The SQL data-definition language(DDL) allows the specification of information about relations, including:

  • The shcema for each relation.
  • The domain of values associated with each attribute.
  • Integrity constraints.
  • And as we will see later, also other information such as
    • The set of indices to be maintained for each relations.
    • Security and authorization information for each relation.
    • The physical storage structure of each relation on disk.

Domain Types in SQL

  • char(n). Fixed length character string, with user-specified length n.
  • varchar(n). Variable length character string, with user-specified maximum length n.
  • int. Integer (typically 4 bytes).
  • smallint. Small integer (typically 2 bytes).
  • tinyint: A very small integer (typically 0-255).
  • numeric(p,d). Fixed point number, with user-specified precision of \(p\) digits, with \(d\) digits to the right of decimal point.
    • 最多有\(p\)位数字,其中\(d\)位在小数点右边。
  • real, double precision. Floating point number and double-precision floating point numbers, with machine-dependent precision.
  • float(n). Floating point number with user-specified precision of at least \(n\) digits.

Built-in Data Types in SQL

  • date. Dates, containing a (4 digit) year, month and date.
    • Example: date '2005-7-27'.
  • time. Time of day, in hours, minutes and seconds.
    • Example: time '14:30:00' or time '14:30:00.45'.
  • timestamp. Date plus time of day.
    • Example: timestamp '2005-7-27 14:30:00.45'.
  • interval: period of time
    • Example: interval '1' day
    • Subtracting a date/time/timestamp value from another gives an interval value.
    • Interval values can be added to date/time/timestamp values.
  • date, time functions:
    • current_date(), current_time()
    • year(x), month(x), day(x), hour(x), minute(x), second(x)

Creat Table Construct

  • An SQL relation is defined using the create table command.
SQL
1
2
3
4
5
6
create teble r(A1 D1, A2 D2, ..., An Dn, 
            (integrity-constraint1),
            (integrity-constraint2),
            ...,
            (integrity-constraintk));
);
  • \(r\) is the name of relation
  • each \(A_i\) is an attribute name in the schema of relation \(r\)
  • \(D_i\) is the data type of values in the domain of attribute \(A_i\)

Example

SQL
1
2
3
4
5
6
create table instructor(
    ID        char(5),
    name      varchar(20) not null,
    dept_name varchar(20),
    salary    numeric(8,2)
);
  • We can do insert into instructor values('10101', 'Srinivasan', 'Comp. Sci.', 65000);
  • But we can't do insert into instructor values('10101', null, 'Comp. Sci.', 65000); because of the not null constraint.

Integrity Constraints in Create Table

  • not null
  • primary key(\(A_1, A_2, ..., A_n\))
  • foreign key(\(A_m, \dots, A_n\)) references \(r\)

Declare ID as the primary key for instructor

SQL
1
2
3
4
5
6
7
8
create table instructor(
    ID        char(5),
    name      varchar(20) not null,
    dept_name varchar(20),
    salary    numeric(8,2),
    primary key(ID)
    foreign key(dept_name) references department
);

primary key declaration on an attribute automatically ensures not null

And a few more relation definitions:

SQL
1
2
3
4
5
6
7
8
create table student(
    ID        char(5),
    name      varchar(20) not null,
    dept_name varchar(20),
    tot_cred  numeric(3,0) default 0,
    primary key(ID)
    foreign key(dept_name) references department
);

Here default 0 means that if no value is specified for the tot_cred attribute, it is assumed to be 0.

SQL
create table takes(
    ID        varchar(5),
    course_id varchar(8),
    sec_id    varchar(8),
    semester  varchar(6),
    year      numeric(4,0),
    grade     char(2),
    primary key(ID, course_id, sec_id, semester, year),
    foreign key(ID) references student
    foreign key(course_id, sec_id, semester, year) references section
);

Note: sec_id can be dropped from primary key above, to ensure a student cannot be registered for two sections of the same course in the same semester.

And more still

SQL
1
2
3
4
5
6
7
create table course(
    course_id   varchar(8) primary key,
    title       varchar(50),
    dept_name   varchar(20),
    credits     numeric(2,0),
    foreign key(dept_name) references department(dept_name)
);

Foreign key

foreign key(dept_name) references department

on delete cascade | set null | restrict | set default on update cascade | set null | restrict | set default

  • 如果被引用的元组该属性被删除,那么引用该属性的元组可以有相应变换规则
  • update 是指当被引用的属性被更新时,引用该属性的元组可以有相应变换规则,而delete 就是删除。
  • on update / delete 后边可以跟四个词语:
    • cascade:删除或更新被引用的属性后,引用该属性的元组也被删除或更新。
    • set null:删除或更新被引用的属性后,引用该属性的元组的该属性被设置为null。
    • restrict:如果有元组引用被引用的属性,那么被引用的属性不能被删除或更新。
    • set default:删除或更新被引用的属性后,引用该属性的元组的该属性被设置为默认值。

Foreign key

Drop and Alter Table Constructs

  • drop table \(r\); 删除表\(r\)
  • delete from \(r\); 删除表\(r\)中的所有元组
  • alter table:
    • alter table \(r\) add \(A \ D\); 添加属性\(A\)到表\(r\)中,其中\(D\)\(A\)的值域
    • Example: alter table student add resume varchar(256);
    • alter table \(r\) drop \(A\); 从表\(r\)中删除属性\(A\),但是许多数据库并不支持这个操作。

SQL and Relational Algebra

SQL
1
2
3
select A1, A2, ..., An
from r1, r2, ..., rm
where P;

is equivalent to the following expression in multiset relational algebra:

\[ \pi_{A_1, \ldots, A_n}(\sigma_{P}(r_1 \times r_2 \times \ldots \times r_m)) \]

SQL
1
2
3
4
select A1, A2, sum(A3)
from r1, r2
where P
group by A1, A2;

is equivalent to the following expression in multiset relational algebra:

\[ A_1, A_2 \mathcal{G}_{sum(A_3)}(\sigma_P(r_1 \times r_2 \times \ldots \times r_m)) \]

SQL
1
2
3
4
select A1, sum(A3)
from r1, r2, ..., rm
where P
group by A1, A2;

is equivalent to the following expression in multiset relational algebra:

\[ \pi_{A_1, sumA3}(_{A_1, A_2}\mathcal{G}_{sum(A_3) \ as \ sumA3}(\sigma_P(r_1 \times r_2 \times \ldots \times r_m))) \]

Basic Query Structure

  • The SQL data-manipulation language(DML) provides the ability to query information, and insert, delete and update tuples.
  • A typical SQL query has the form:
SQL
1
2
3
select A1, A2, ..., An
from r1, r2, ..., rm
where P;
  • \(A_i\) represents an attribute
  • \(r_i\) represents a relation
  • \(P\) is a predicate
  • The result of an SQL query is a relation.

The select Clause

  • Corresponds to the projection operation in relational algebra.
  • NOTE: SQL names are case insensitive(i.e., you may use upper or lower case letters)
  • E.g. \(Name \equiv NAME \equiv name\)

In our last course, we konw that the select clause allows duplicates by default. If we want to eliminate duplicates, we can use the distinct keyword.

Example

Find the names of all departments with instructor, and remove duplicates

SQL
select distinct dept_name
from instructor

And the keyword all can be used to include duplicates.

SQL
select all dept_name
from instructor

The where Clause

  • The where clause specifies conditions that the result must satisfy.
  • Corresponds to the selection operation in relational algebra.
  • To find all instructors in the Comp. Sci. department with salary > 70000:
SQL
1
2
3
select name
from instructor
where dept_name = 'Comp. Sci.' and salary > 70000;
  • SQL includes a between comparison operator
  • Example: Find the names of all instructors with salary between 90,000 and 100,000
SQL
1
2
3
select name
from instructor
where salary between 90000 and 100000;
  • Tuple comparison
SQL
1
2
3
select name, course_id
from instructor, teaches
where (instructor.ID, dept_name) = (teaches.ID, 'Biology');

The from Clause

  • Corresponds to the cartesian product operation in relational algebra.
  • Find the Cartesian product instructor X teaches.
SQL
select *
from instructor, teaches

Additional Basic Operations

Joins

  • For all instructors who have taught some course, find their names and the course ID of the course they taught.
SQL
1
2
3
select name, course_id
from instructor, teaches
where instructor.ID = teaches.ID;
  • Find the course ID, semester, year and title of each course offered by the Comp. Sci. department
SQL
1
2
3
select course_id, semester, year, title
from section, course
where section.course_id = course.course_id and course.dept_name = 'Comp. Sci.';

Natural Join

  • List the names of instructors along with the course ID of the courses that they taught.
SQL
1
2
3
select name, course_id
from instructor, teaches
where instructor.ID = teaches.ID;

or

SQL
select name, course_id
from instructor natural join teaches;

Important

Beware of unrelated attributes with same name which get equated incorrectly.

  • List the names of instructors along with the titles of courses that they teach.
    • course(course_id, title, dept_name, credits)
    • teaches(ID, course_id, sec_id, semester, year)
    • instructor(ID, name, dept_name, salary)
  • Incorrect version(makes course.dept_name = instructor.dept_name)
SQL
select name, title
from instructor natural join teaches natural join course;
  • Correct version
SQL
1
2
3
select name, title
from instructor natural join teaches, course
where teaches.course_id = course.course_id;

or

SQL
select name, title
from (instructor natural join teaches) join course using (course_id);

or

SQL
1
2
3
select name, title
from instructor, teaches, course
where instructor.ID = teaches.ID and teaches.course_id = course.course_id;

Another example: find students who takes courses across his/her department

SQL
1
2
3
select distinct student.id
from (student natural join takes) join course using (course_id)
where student.dept_name <> course.dept_name;

The Rename Operation

  • The SQL allows renaming relations and attributes using the as clause:
SQL
old-name as new-name

E.g.

SQL
select ID, name, salary/12 as monthly_salary
from instructor;
  • Find the names of all instructors who have a higher salary than some instructor in 'Comp. Sci'.
SQL
1
2
3
select distinct T.name
from instructor as T, instructor as S
where T.salary > S.salary and S.dept_name = 'Comp. Sci.';
  • Keyword as is optional and may be omitted: \(instructor as T \equiv instructor T\)

String Operations

  • percent(%). The % character matches any substring.
  • underscore(_). The _ character matches any character.

  • Find the names of all instructors whose name includes the substring "dar".

SQL
1
2
3
select name
from instructor
where name like '%dar%';
  • Match the string "100%"
SQL
1
2
3
like '100\%' escape '\'
like '100\%'
like '100 #%' escape '#'

Note

  • '_ _ _' matches any 3-character string
  • '_ _ _ %' matches any string of at least 3 characters

Ordering the Display of Tuples

  • List in alphabetic order the names of all instructors
SQL
1
2
3
select distinct name 
from instructor
order by dept_name, name desc

从上边的代码可以看到,我们可以同时对多个属性进行排序,并且可以指定升序或者降序。

The limit Clause

  • Limit clause takes one or two numeric arguments, which must both be nonnegative integer constants:
SQL
limit offset, row_count

or

SQL
limit row_count == limit 0, row_count
  • List names of instructors whose salary is among top 3.
SQL
1
2
3
4
select name
from instructor
order by salary desc
limit 3;

Set Operations

  • Set operations union, intersect, except are supported in SQL.
    • Each of the above operations automatically eliminates duplicates
  • To retain duplicates, use union all, intersect all, except all

Null Values

  • The result of any arithmetic operation with a null operand is null.
  • Example: Find all instructors whose salary is null.
SQL
1
2
3
select name
from instructor
where salary is null

Note

  • comparisons with null values return the special truth value: unknown
  • Three-valued logic using the truth value unknown:
  • unknown or true = true
  • unknown or false = unknown
  • unknown or unknown = unknown
  • true and unknown = unknown
  • false and unknown = false
  • unknown and unknown = unknown
  • not unknown = unknown

Aggregate Functions

  • These functions operate on the multiset of values of a column of a relation, and return a value

    • avg: average value
    • min: minimum value
    • max: maximum value
    • sum: sum of values
    • count: number of values
  • Find the number of tuples in the course relation

SQL
select count(*)
from course;

Having Clause

  • 对我们的聚合函数进行过滤
  • 例:找到所有的部门,以及部门的平均工资大于 42000 的部门
SQL
1
2
3
4
select dept_name, avg(salary)
from instructor
group by dept_name
having avg(salary) > 42000;
  • All aggregate operations except count(*) ignore tuples with null values on the aggregated attributes
  • If collection has only null values, then count(*) returns 0, while other aggregate functions return null.

Nested Subqueries

  • A subquery is a select-from-where expression that is nested within another query.
  • A common use of subqueries is to perform tests for:
    • set membership
    • set comparison
    • set cardinality

Set Membership

  • Find courses offered in Fall 2009 and in Spring 2010
SQL
1
2
3
4
5
6
select distinct course_id
from section
where semester = 'Fall' and year = 2009 and
    course_id in (select course_id
                  from section
                  where semester = 'Spring' and year = 2010);
  • Find courses offered in Fall 2009 but not in Spring 2010
SQL
1
2
3
4
5
6
select distinct course_id
from section
where semester = 'Fall' and year = 2009 and 
    course_id not in (select course_id
                      from section
                      where semester = 'Spring' and year = 2010);
  • Find the total number of (distinct) students who have taken course sections taught by the instructor with ID 10101
SQL
1
2
3
4
5
6
select count(distinct ID)
from takes
where (course_id, sec_id, semester, year) in 
    (select course_id, sec_id, semester, year
    from teaches
    where ID = '10101');

Set Comparison

  • Find names of instructors with salary greater than that of all instructors in the Biology department.
SQL
1
2
3
4
5
select name
from instructor
where salary > all (select salary
                    from instructor
                    where dept_name = 'Biology');

Scalar Subquery

Scalar(标量) subquery is one which is used where a single value is expected.

Example

SQL
1
2
3
4
5
select name
from instructor
where salary * 10 >
    (select budget from department
    where department.dept_name = instructor.dept_name);

这里的dept_name是这个表的主键,只返回一个元组,所以我们不能用some, all这样的关键字。

Runtime error if subquery returns more than one result value.

Test for Empty Relations

The exists construct returns the value true if the argument subquery is nonempty.

  • \(\mathrm{Exists} \ r \ \Leftrightarrow \ r \neq \emptyset\)
  • \(\mathrm{Not \ exists} \ r \ \Leftrightarrow \ r = \emptyset\)

Example

找到所有在2009年秋学期和2010年春学期都有课的课程

SQL
1
2
3
4
5
6
7
select course_id
from section as S
where semester = 'Fall' and year = 2009 and
    exists(select *
           from section as T
           where semester = 'Spring' and year = 2010 and
               S.course_id = T.course_id);

Example

找到所有参加了生物系所有课程的学生

SQL
1
2
3
4
5
6
7
8
9
select distinct S.ID, S.name
from student as S
where not exists((select course_id
                  from course
                  where dept_name = 'Biology')
                  except
                  (select course_id
                  from takes as T
                  where S.ID = T.ID));

逆向考虑,找到这样的学生,不存在他没选过的生物系的课。

Test for Absence of Duplicate Tuples

The unique construct tests whether a subquery has any duplicate tuples in its result.

Unique关键字用于测试子查询的结果是否有重复的元组。

可以将其理解为最多一次。

Example

Find all courses that were offered at most once in 2009.

SQL
1
2
3
4
5
select T.course_id
from course as T
where unique (select R.course_id
              from section as R
              where R.course_id = T.course_id and R.year = 2009);

Another Example

Find all courses that were offered once in 2009

SQL
1
2
3
4
5
6
7
8
select T.course_id
from course as T
where unqiue (select R.course_id
              from section as R
              where R.course_id = T.course_id and R.year = 2009);
        and exists (select R.course_id
                    from section as R
                    where R.course_id = T.course_id and R.year = 2009);

由于unqiue是找一次或两次,所以我们必须加上后边的exists来保证存在一次,否则有不开的课也会被算进去。

Modification of the Database

Deletion

Deletion

Insertion

Insertion

insert后还可以接查询语句,用于插入特定的元组。

SQL
1
2
3
insert into student
select ID, name, dept_name, 0
from instructor

Update

Update

还可以与case配合使用。

SQL
1
2
3
4
5
update instructor
set salary = case 
                when salary <= 100000 then salary * 1.1
                else salary * 1.05
            end;

评论