
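PostgreSQL Exercises

This is a compilation of all the questions and answers on Alisdair Owen's PostgreSQL Exercises. Keep in mind that actually solving these problems will take you further than skimming through this guide, so make sure to pay PostgreSQL Exercises a visit.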

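Getting Started
- I want to use my own Postgres system
- Schema
Simple SQL Queries
- Retrieve everything from a table
- Retrieve specific columns from a table
- Control which rows are retrieved
- Control which rows are retrieved - part 2
- Basic string searches
- Matching against multiple possible values
- Classify results into buckets
- Working with dates
- Removing duplicates, and ordering results
- Combining results from multiple queries
- Simple aggregation
- More aggregation
Joins and Subqueries
- Retrieve the start times of members' bookings
- Work out the start times of bookings for tennis courts
- Produce a list of all members who have recommended another member
- Produce a list of all members, along with their recommender
- Produce a list of all members who have used a tennis court
- Produce a list of costly bookings
- Produce a list of all members, along with their recommender, using no joins.
- Produce a list of costly bookings, using a subquery
Modifying Data
- Insert some data into a table
- Insert multiple rows of data into a table
- Insert calculated data into a table
- Update some existing data
- Update multiple rows and columns at the same time
- Update a row based on the contents of another row
- Delete all bookings
- Delete a member from the cd.members table
- Delete based on a subquery
Aggregation
- Count the number of facilities
- Count the number of expensive facilities
- Count the number of recommendations each member makes
- List the total slots booked per facility
- List the total slots booked per facility in a given month
- List the total slots booked per facility per month
- Find the count of members who have made at least one booking
- List facilities with more than 1000 slots booked
- Find the total revenue of each facility
- Find facilities with a total revenue less than 1000
- Output the facility id that has the highest number of slots booked
- List the total slots booked per facility per month, part 2
- List the total hours booked per named facility
- List each member's first booking after September 1st 2012
- Produce a list of member names, with each row containing the total member count
- Produce a numbered list of members
- Output the facility id that has the highest number of slots booked, again
- Rank members by hours used
- Find the top three revenue generating facilities
- Classify facilities by value
- Calculate the payback time for each facility
- Calculate a rolling average of total revenue
Working with Timestamps
- Produce a timestamp for 1 a.m. on the 31st of August 2012
- Subtract timestamps from each other
- Generate a list of all the dates in October 2012
- Get the day of the month from a timestamp
- Work out the number of seconds between timestamps
- Work out the number of days in each month of 2012
- Work out the number of days remaining in the month
- Work out the end time of bookings
- Return a count of bookings for each month
- Work out the utilisation percentage for each facility by month
String Operations
- Format the names of members
- Find facilities by a name prefix
- Perform a case-insensitive search
- Find telephone numbers with parentheses
- Pad zip codes with leading zeroes
- Count the number of members whose surname starts with each letter of the alphabet
- Clean up telephone numbers
Recursive Queries
- Find the upward recommendation chain for member ID 27
- Find the downward recommendation chain for member ID 1
- Produce a CTE that can return the upward recommendation chain for any member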

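Getting Started

It's pretty simple to get going with the exercises: all you have to do is open the exercises, take a look at the questions, and try to answer them!

The dataset for these exercises is for a newly created country club, with a set of members, facilities such as tennis courts, and booking history for those facilities. Amongst other things, the club wants to understand how they can use their information to analyse facility usage/demand. Please note: this dataset is designed purely for supporting an interesting array of exercises, and the database schema is flawed in several aspects. Please don't take it as an example of good design. We'll start off with a look at the members table:

CREATE TABLE cd.members
(
    memid integer NOT NULL,
    surname character varying(200) NOT NULL,
    firstname character varying(200) NOT NULL,
    address character varying(300) NOT NULL,
    zipcode integer NOT NULL,
    telephone character varying(20) NOT NULL,
    recommendedby integer,
    joindate timestamp NOT NULL,
    CONSTRAINT members_pk PRIMARY KEY (memid),
    CONSTRAINT fk_members_recommendedby FOREIGN KEY (recommendedby)
        REFERENCES cd.members(memid) ON DELETE SET NULL
);

Each member has an ID (not guaranteed to be sequential), basic address information, a reference to the member that recommended them (if any), and a timestamp for when they joined. The addresses in the dataset are entirely (and unrealistically) fabricated.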

CREATE TABLE cd.facilities
(
    facid integer NOT NULL,
    name character varying(100) NOT NULL,
    membercost numeric NOT NULL,
    guestcost numeric NOT NULL,
    initialoutlay numeric NOT NULL,
    monthlymaintenance numeric NOT NULL,
    CONSTRAINT facilities_pk PRIMARY KEY (facid)
);

The facilities table lists all the bookable facilities that the country club possesses. The club stores id/name information, the cost to book both members and guests, the initial cost to build the facility, and estimated monthly upkeep costs. They hope to use this information to track how financially worthwhile each facility is.

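CREATE TABLE cd.bookings
(
    bookid integer NOT NULL,
    facid integer NOT NULL,
    memid integer NOT NULL,
    starttime timestamp NOT NULL,
    slots integer NOT NULL,
    CONSTRAINT bookings_pk PRIMARY KEY (bookid),
    CONSTRAINT fk_bookings_facid FOREIGN KEY (facid) REFERENCES cd.facilities(facid),
    CONSTRAINT fk_bookings_memid FOREIGN KEY (memid) REFERENCES cd.members(memid)
);

Finally, there's a table tracking bookings of facilities. This stores the facility id, the member who made the booking, the start of the booking, and how many half-hour 'slots' the booking was made for. This idiosyncratic design will make certain queries more difficult, but should provide you with some interesting challenges - as well as prepare you for the horror of working with some real-world databases :-).

Okay, that should be all the information you need. You can select a category of query to try from the menu above, or alternatively start from the beginning.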

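I want to use my own Postgres system

No problem! Getting up and running isn't too hard. First, you'll need to install PostgreSQL. Once it's started, download the SQL (clubdata.sql).

Finally, run psql -U <username> -f clubdata.sql -d postgres -x -q to create the 'exercises' database and the Postgres 'pgexercises' user, create the tables, and load the data in. Note that your results may sort differently from those shown here if your Postgres uses a locale other than the one used by PGExercises (the C locale).

When you're running queries, you may find psql a little clunky. If so, I recommend trying out pgAdmin or the Eclipse database development tools.

Schema

Simple SQL Queries

This category deals with the basics of SQL. It covers select and where clauses, case expressions, unions, and a few other odds and ends. If you're already educated in SQL you will probably find these exercises fairly easy. If not, you should find them a good point to start learning for the more difficult categories ahead!

If you struggle with these questions, I strongly recommend Learning SQL, by Alan Beaulieu, as a concise and well-written book on the subject. If you're interested in the fundamentals of database systems (as opposed to just how to use them), you should also investigate An Introduction to Database Systems by C.J. Date.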

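Retrieve everything from a table

How can you retrieve all the information from the cd.facilities table?

Expected results:

facid	name	membercost	guestcost	initialoutlay	monthlymaintenance
0	Tennis Court 1	5	25	10000	200
1	Tennis Court 2	5	25	8000	200
2	Badminton Court	0	15.5	4000	50
3	Table Tennis	0	5	320	10
4	Massage Room 1	35	80	4000	3000
5	Massage Room 2	35	80	4000	3000
6	Squash Court	3.5	17.5	5000	80
7	Snooker Table	0	5	450	15
8	Pool Table	0	5	400	15

Answer:

select * from cd.facilities;

The SELECT statement is the basic starting block for queries that read information out of the database. A minimal select statement is generally comprised of select [some set of columns] from [some table or group of tables].

In this case, we want all of the information from the facilities table. The from section is easy - we just need to specify the cd.facilities table. 'cd' is the table's schema - a term used for a logical grouping of related information in the database.

Next, we need to specify that we want all the columns. Conveniently, there's a shorthand for 'all columns' - *. We can use this instead of laboriously specifying all the column names.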

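Retrieve specific columns from a table

You want to print out a list of all of the facilities and their cost to members. How would you retrieve a list of only facility names and costs?

Expected results:

name	membercost
Tennis Court 1	5
Tennis Court 2	5
Badminton Court	0
Table Tennis	0
Massage Room 1	35
Massage Room 2	35
Squash Court	3.5
Snooker Table	0
Pool Table	0

Answer:

select name, membercost from cd.facilities;

For this question, we need to specify the columns that we want. We can do that with a simple comma-delimited list of column names specified to the select statement. All the database does is look at the columns available in the FROM clause, and return the ones we asked for, as illustrated below.

Generally speaking, for non-throwaway queries it's considered desirable to specify the names of the columns you want in your queries rather than using *. This is because your application might not be able to cope if more columns get added to the table.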

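Control which rows are retrieved

How can you produce a list of facilities that charge a fee to members?

Expected results:

facid	name	membercost	guestcost	initialoutlay	monthlymaintenance
0	Tennis Court 1	5	25	10000	200
1	Tennis Court 2	5	25	8000	200
4	Massage Room 1	35	80	4000	3000
5	Massage Room 2	35	80	4000	3000
6	Squash Court	3.5	17.5	5000	80

Answer:

select * from cd.facilities where membercost > 0;

The FROM clause is used to build up a set of candidate rows to read results from. In our examples so far, this set of rows has simply been the contents of a table. In future we will explore joining, which allows us to create much more interesting candidates.

Once we've built up our set of candidate rows, the WHERE clause allows us to filter for the rows we're interested in - in this case, those with a membercost of more than zero. As you will see in later exercises, WHERE clauses can have multiple components combined with boolean logic - it's possible to, for instance, search for facilities with a cost greater than 0 and less than 10. The filtering action of the WHERE clause on the facilities table is illustrated below: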

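Control which rows are retrieved - part 2

How can you produce a list of facilities that charge a fee to members, and that fee is less than 1/50th of the monthly maintenance cost? Return the facid, facility name, member cost, and monthly maintenance of the facilities in question.

Expected results:

facid	name	membercost	monthlymaintenance
4	Massage Room 1	35	3000
5	Massage Room 2	35	3000

Answer:

select facid, name, membercost, monthlymaintenance
    from cd.facilities
    where
        membercost > 0 and
        (membercost < monthlymaintenance / 50.0);

The WHERE clause allows us to filter for the rows we're interested in - in this case, those with a membercost of more than zero, and less than 1/50th of the monthly maintenance cost. As you can see, the massage rooms are very expensive to run thanks to staffing costs!

When we want to test for two or more conditions, we use AND to combine them. We can, as you might expect, use OR to test whether either of a pair of conditions is true.

You might have noticed that this is our first query that combines a WHERE clause with selecting specific columns. The intersection of the selected columns and the selected rows gives us the data to return. This may not seem too interesting now, but as we add in more complex operations like joins later, you'll see the simple elegance of this behaviour.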
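To see AND and OR side by side, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for Postgres (an assumption, chosen so the example runs without a server), with a cut-down copy of cd.facilities; the OR condition is a made-up example rather than part of the exercise.

```python
import sqlite3

# In-memory SQLite stands in for Postgres here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table facilities (facid integer, name text, "
    "membercost numeric, monthlymaintenance numeric)"
)
conn.executemany(
    "insert into facilities values (?, ?, ?, ?)",
    [
        (0, "Tennis Court 1", 5, 200),
        (2, "Badminton Court", 0, 50),
        (4, "Massage Room 1", 35, 3000),
        (6, "Squash Court", 3.5, 80),
    ],
)

# AND: both conditions must hold for a row to be returned.
both = conn.execute(
    "select name from facilities "
    "where membercost > 0 and membercost < monthlymaintenance / 50.0 "
    "order by facid"
).fetchall()

# OR: a row is returned if either condition holds (hypothetical conditions).
either = conn.execute(
    "select name from facilities "
    "where membercost = 0 or monthlymaintenance > 1000 "
    "order by facid"
).fetchall()

print(both)
print(either)
```

Only the name column comes back, and only for the rows that pass the filter: the same intersection of selected columns and selected rows described above.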

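Basic string searches

How can you produce a list of all facilities with the word 'Tennis' in their name?

Expected results:

facid	name	membercost	guestcost	initialoutlay	monthlymaintenance
0	Tennis Court 1	5	25	10000	200
1	Tennis Court 2	5	25	8000	200
3	Table Tennis	0	5	320	10

Answer:

select *
    from cd.facilities
    where
        name like '%Tennis%';

SQL's LIKE operator provides simple pattern matching on strings. It's pretty much universally implemented, and is nice and simple to use - it just takes a string with the % character matching any string, and _ matching any single character. In this case, we're looking for names containing the word 'Tennis', so putting a % on either side fits the bill.

There are other ways to accomplish this task: Postgres supports regular expressions with the ~ operator, for example. Use whatever makes you feel comfortable, but do be aware that the LIKE operator is much more portable between systems.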
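The two wildcards can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for Postgres; one difference worth knowing is that SQLite's LIKE ignores ASCII case by default, whereas Postgres's LIKE is case-sensitive, so the patterns below are written to behave the same on both.

```python
import sqlite3

# In-memory SQLite stands in for Postgres here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("create table facilities (name text)")
conn.executemany(
    "insert into facilities values (?)",
    [("Tennis Court 1",), ("Table Tennis",), ("Pool Table",), ("Squash Court",)],
)


def names_like(pattern):
    """Return the facility names matching a LIKE pattern, sorted by name."""
    rows = conn.execute(
        "select name from facilities where name like ? order by name", (pattern,)
    )
    return [row[0] for row in rows]


contains = names_like("%Tennis%")      # % matches any string, including none
prefix = names_like("Tennis%")         # only names that start with 'Tennis'
single_char = names_like("_ool Table") # _ matches exactly one character

print(contains, prefix, single_char)
```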

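Matching against multiple possible values

How can you retrieve the details of facilities with ID 1 and 5? Try to do it without using the OR operator.

Expected results:

facid	name	membercost	guestcost	initialoutlay	monthlymaintenance
1	Tennis Court 2	5	25	8000	200
5	Massage Room 2	35	80	4000	3000

Answer:

select *
    from cd.facilities
    where
        facid in (1, 5);

The obvious answer to this question is to use a WHERE clause that looks like where facid = 1 or facid = 5. An alternative that is easier with large numbers of possible matches is the IN operator. The IN operator takes a list of possible values, and matches them against (in this case) the facid. If one of the values matches, the where clause is true for that row, and the row is returned.

The IN operator is a good early demonstrator of the elegance of the relational model. The argument it takes is not just a list of values - it's actually a table with a single column. Since queries also return tables, if you create a query that returns a single column, you can feed those results into an IN operator. To give a toy example:

select *
    from cd.facilities
    where
        facid in (
            select facid from cd.facilities
        );

This example is functionally equivalent to just selecting all the facilities, but shows you how to feed the results of one query into another. The inner query is called a subquery.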
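A slightly less trivial use of IN with a subquery is sketched below: it lists only the facilities that appear in a bookings table. The sketch assumes Python's built-in sqlite3 module as a stand-in for Postgres, and the handful of rows is illustrative rather than the real dataset.

```python
import sqlite3

# In-memory SQLite stands in for Postgres here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("create table facilities (facid integer, name text)")
conn.execute("create table bookings (bookid integer, facid integer)")
conn.executemany(
    "insert into facilities values (?, ?)",
    [(0, "Tennis Court 1"), (1, "Tennis Court 2"), (2, "Badminton Court")],
)
# Illustrative bookings: nobody has booked facility 1.
conn.executemany("insert into bookings values (?, ?)", [(0, 0), (1, 0), (2, 2)])

# The subquery yields a single-column table of facids, which IN matches against.
booked = [
    row[0]
    for row in conn.execute(
        "select name from facilities "
        "where facid in (select facid from bookings) "
        "order by facid"
    )
]
print(booked)
```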

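Classify results into buckets

How can you produce a list of facilities, with each labelled as 'cheap' or 'expensive' depending on if their monthly maintenance cost is more than $100? Return the name and monthly maintenance of the facilities in question.

Expected results:

name	cost
Tennis Court 1	expensive
Tennis Court 2	expensive
Badminton Court	cheap
Table Tennis	cheap
Massage Room 1	expensive
Massage Room 2	expensive
Squash Court	cheap
Snooker Table	cheap
Pool Table	cheap

Answer:

select name,
    case when (monthlymaintenance > 100) then
        'expensive'
    else
        'cheap'
    end as cost
    from cd.facilities;

This exercise contains a few new concepts. The first is the fact that we're doing computation in the area of the query between SELECT and FROM. Previously we've only used this to select columns that we want to return, but you can put anything in here that will produce a single result per returned row - including subqueries.

The second new concept is the CASE statement itself. CASE is effectively like if/switch statements in other languages, with a form as shown in the query. To add a 'middling' option, we would simply insert another when...then section.

Finally, there's the AS operator. This is simply used to label columns or expressions, to make them display more nicely or to make them easier to reference when used as part of a subquery.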
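As a sketch of that extra when...then section, the query below adds a 'medium' bucket. The 1000 threshold is a hypothetical value picked for illustration, and Python's built-in sqlite3 module is assumed as a stand-in for Postgres.

```python
import sqlite3

# In-memory SQLite stands in for Postgres here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("create table facilities (name text, monthlymaintenance numeric)")
conn.executemany(
    "insert into facilities values (?, ?)",
    [("Tennis Court 1", 200), ("Table Tennis", 10), ("Massage Room 1", 3000)],
)

# CASE branches are tested top to bottom; the first true condition wins.
# The 1000 cut-off for 'expensive' is made up for this example.
buckets = conn.execute(
    "select name, "
    "    case when monthlymaintenance > 1000 then 'expensive' "
    "         when monthlymaintenance > 100 then 'medium' "
    "         else 'cheap' "
    "    end as cost "
    "from facilities order by name"
).fetchall()
print(buckets)
```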

Working with dates

How can you produce a list of members who joined after the start of September 2012? Return the memid, surname, firstname, and joindate of the members in question.

Expected results:

memid	surname	firstname	joindate
24	Sarwin	Ramnaresh	2012-09-01 08:44:42
26	Jones	Douglas	2012-09-02 18:43:05
27	Rumney	Henrietta	2012-09-05 08:42:35
28	Farrell	David	2012-09-15 08:22:05
29	Worthington-Smyth	Henry	2012-09-17 12:27:15
30	Purview	Millicent	2012-09-18 19:04:01
33	Tupperware	Hyacinth	2012-09-18 19:32:05
35	Hunt	John	2012-09-19 11:32:45
36	Crumpet	Erica	2012-09-22 08:36:38
37	Smith	Darren	2012-09-26 18:08:45

Answer:

select memid, surname, firstname, joindate
    from cd.members
    where joindate >= '2012-09-01';

This is our first look at SQL timestamps. They're formatted in descending order of magnitude: YYYY-MM-DD HH:MM:SS.nnnnnn. We can compare them just like we might a unix timestamp, although getting the differences between dates is a little more involved (and powerful!). In this case, we've just specified the date portion of the timestamp. This gets automatically cast by Postgres into the full timestamp 2012-09-01 00:00:00.

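Removing duplicates, and ordering results

How can you produce an ordered list of the first 10 surnames in the members table? The list must not contain duplicates.

Expected results:

surname
Bader
Baker
Boothe
Butters
Coplin
Crumpet
Dare
Farrell
GUEST
Genting

Answer:

select distinct surname
    from cd.members
order by surname
limit 10;

There are three new concepts here, but they're all pretty simple.

- Specifying DISTINCT after SELECT removes duplicate rows from the result set. Note that this applies to rows: if row A has multiple columns, row B is only equal to it if the values in all columns are the same. As a general rule, don't use DISTINCT in a willy-nilly fashion - it's not free to remove duplicates from large query result sets, so do it as-needed.
- Specifying ORDER BY (after the FROM and WHERE clauses, near the end of the query) allows results to be ordered by a column or set of columns (comma separated).
- The LIMIT keyword allows you to limit the number of results retrieved. This is useful for getting results a page at a time, and can be combined with the OFFSET keyword to get following pages. This is the same approach used by MySQL and is very convenient - you may, unfortunately, find that this process is a little more complicated in other DBs.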
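The paging idea from the last point can be sketched as follows. The helper name page and the page size of 2 are made up for illustration, and Python's built-in sqlite3 module is assumed as a stand-in for Postgres (both accept LIMIT ... OFFSET).

```python
import sqlite3

# In-memory SQLite stands in for Postgres here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("create table members (surname text)")
conn.executemany(
    "insert into members values (?)",
    [(s,) for s in ["Smith", "Baker", "Smith", "Jones", "Bader", "Farrell"]],
)


def page(number, size=2):
    """Hypothetical helper: return one page of distinct, ordered surnames."""
    rows = conn.execute(
        "select distinct surname from members "
        "order by surname limit ? offset ?",
        (size, number * size),
    )
    return [row[0] for row in rows]


first_page = page(0)
second_page = page(1)
third_page = page(2)
print(first_page, second_page, third_page)
```

The ORDER BY matters here: without a stable ordering, successive pages are not guaranteed to line up.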

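Combining results from multiple queries

You, for some reason, want a combined list of all surnames and all facility names. Yes, this is a contrived example :-). Produce that list!

Expected results:

surname
Tennis Court 2
Worthington-Smyth
Badminton Court
Pinker
Dare
Bader
Mackenzie
Crumpet
Massage Room 1
Squash Court

Answer:

select surname
    from cd.members
union
select name
    from cd.facilities;

The UNION operator does what you might expect: combines the results of two SQL queries into a single table. The caveat is that both results from the two queries must have the same number of columns and compatible data types.

UNION removes duplicate rows, while UNION ALL does not. Use UNION ALL by default, unless you care about duplicate results.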
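The difference between the two operators shows up as soon as the inputs contain a repeated value. This is a minimal sketch on a few illustrative rows, assuming Python's built-in sqlite3 module as a stand-in for Postgres.

```python
import sqlite3

# In-memory SQLite stands in for Postgres here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("create table members (surname text)")
conn.execute("create table facilities (name text)")
conn.executemany(
    "insert into members values (?)", [("Smith",), ("Smith",), ("Baker",)]
)
conn.execute("insert into facilities values ('Spa')")

# UNION de-duplicates the combined rows; UNION ALL keeps every row.
# Results are sorted in Python because neither query has an ORDER BY.
union_rows = sorted(
    row[0]
    for row in conn.execute(
        "select surname from members union select name from facilities"
    )
)
union_all_rows = sorted(
    row[0]
    for row in conn.execute(
        "select surname from members union all select name from facilities"
    )
)
print(union_rows)
print(union_all_rows)
```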

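Simple aggregation

You'd like to get the signup date of your last member. How can you retrieve this information?

Expected results:

latest
2012-09-26 18:08:45

Answer:

select max(joindate) as latest
    from cd.members;

This is our first foray into SQL's aggregate functions. They're used to extract information about whole groups of rows, and allow us to easily ask questions like:

- What's the most expensive facility to maintain on a monthly basis?
- Who has recommended the most new members?
- How much time has each member spent at our facilities?

The MAX aggregate function here is very simple: it receives all the possible values for joindate, and outputs the one that's biggest. There's a lot more power to aggregate functions, which you will come across in future exercises.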

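More aggregation

You'd like to get the first and last name of the last member(s) who signed up - not just the date. How can you do that?

Expected results:

firstname	surname	joindate
Darren	Smith	2012-09-26 18:08:45

Answer:

select firstname, surname, joindate
    from cd.members
    where joindate =
        (select max(joindate)
            from cd.members);

In the suggested approach above, you use a subquery to find out what the most recent joindate is. This subquery returns a scalar table - that is, a table with a single column and a single row. Since we have just a single value, we can substitute the subquery anywhere we might put a single constant value. In this case, we use it to complete the WHERE clause of a query to find a given member.

You might hope that you'd be able to do something like below:

select firstname, surname, max(joindate)
    from cd.members

Unfortunately, this doesn't work. The MAX function doesn't restrict rows like the WHERE clause does - it simply takes in a bunch of values and returns the biggest one. The database is then left wondering how to pair up a long list of names with the single join date that's come out of the max function, and fails. Instead, you're left having to say 'find me the row(s) which have a join date that's the same as the maximum join date'.

As mentioned in the hint, there are other ways to get this job done - one example is below. In this approach, rather than explicitly finding out what the last joined date is, we simply order our members table in descending order of join date, and pick off the first one. Note that this approach does not cover the extremely unlikely eventuality of two people joining at the exact same time :-).

select firstname, surname, joindate
    from cd.members
order by joindate desc
limit 1;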
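The tie case mentioned above is easy to reproduce. The sketch below uses a hypothetical pair of members who joined at exactly the same time (not true of the real dataset) and assumes Python's built-in sqlite3 module as a stand-in for Postgres, with timestamps stored as ISO-formatted text.

```python
import sqlite3

# In-memory SQLite stands in for Postgres here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("create table members (firstname text, surname text, joindate text)")
conn.executemany(
    "insert into members values (?, ?, ?)",
    [
        ("Tim", "Rownam", "2012-07-03 09:32:15"),
        ("Darren", "Smith", "2012-09-26 18:08:45"),
        ("Erica", "Crumpet", "2012-09-26 18:08:45"),  # hypothetical tie
    ],
)

# Subquery approach: every row sharing the maximum joindate is returned.
by_subquery = conn.execute(
    "select firstname, surname from members "
    "where joindate = (select max(joindate) from members) "
    "order by surname"
).fetchall()

# ORDER BY ... LIMIT 1 approach: exactly one row, whichever sorts first.
by_limit = conn.execute(
    "select firstname, surname from members order by joindate desc limit 1"
).fetchall()

print(by_subquery)
print(by_limit)
```

Which of the tied rows the LIMIT 1 query returns is not something to rely on, which is why the subquery form is the safer of the two.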

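Joins and Subqueries

This category deals primarily with a foundational concept in relational database systems: joining. Joining allows you to combine related information from multiple tables to answer a question. This isn't just beneficial for ease of querying: a lack of join capability encourages denormalisation of data, which increases the complexity of keeping your data internally consistent.

This topic covers inner, outer, and self joins, as well as spending a little time on subqueries (queries within queries). If you struggle with these questions, I strongly recommend Learning SQL, by Alan Beaulieu, as a concise and well-written book on the subject.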

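Retrieve the start times of members' bookings

How can you produce a list of the start times for bookings by members named 'David Farrell'?

Expected results:

starttime
2012-09-18 09:00:00
2012-09-18 17:30:00
2012-09-18 13:30:00
2012-09-18 20:00:00
2012-09-19 09:30:00
2012-09-19 15:00:00
2012-09-19 12:00:00
2012-09-20 15:30:00
2012-09-20 11:30:00
2012-09-20 14:00:00

Answer:

select bks.starttime
    from
        cd.bookings bks
        inner join cd.members mems
            on mems.memid = bks.memid
    where
        mems.firstname = 'David'
        and mems.surname = 'Farrell';

The most commonly used kind of join is the INNER JOIN. What this does is combine two tables based on a join expression - in this case, for each member id in the members table, we're looking for matching values in the bookings table. Where we find a match, a row combining the values for each table is returned. Note that we've given each table an alias (bks and mems). This is used for two reasons: firstly, it's convenient, and secondly we might join to the same table several times, requiring us to distinguish between columns from each different time the table was joined in.

Let's ignore our select and where clauses for now, and focus on what the FROM statement produces. In all our previous examples, FROM has just been a simple table. What is it now? Another table! This time, it's produced as a composite of bookings and members. You can see a subset of the output of the join below.

For each member in the members table, the join has found all the matching member ids in the bookings table. For each match, it's then produced a row combining the row from the members table and the row from the bookings table.

Obviously, this is too much information on its own, and any useful question will want to filter it down. In our query, we use the start of the SELECT clause to pick columns, and the WHERE clause to pick rows, as illustrated below.

That's all we need to find David's bookings! In general, I encourage you to remember that the output of the FROM clause is essentially one big table that you then filter information out of. This may sound inefficient - but don't worry, under the covers the DB will be behaving much more intelligently :-).

One final note: there are two different syntaxes for inner joins. I've shown you the one I prefer, which I find more consistent with other join types. You'll commonly see a different syntax, shown below:

select bks.starttime
    from
        cd.bookings bks,
        cd.members mems
    where
        mems.firstname = 'David'
        and mems.surname = 'Farrell'
        and mems.memid = bks.memid;

This is functionally exactly the same as the approved answer. If you feel more comfortable with this syntax, feel free to use it!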
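A quick way to convince yourself that the two syntaxes are equivalent is to run both on the same data and compare the results. The sketch below does that on a few illustrative rows, assuming Python's built-in sqlite3 module as a stand-in for Postgres; an ORDER BY is added to both queries only to make the comparison deterministic.

```python
import sqlite3

# In-memory SQLite stands in for Postgres here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("create table members (memid integer, firstname text, surname text)")
conn.execute("create table bookings (memid integer, starttime text)")
conn.executemany(
    "insert into members values (?, ?, ?)",
    [(1, "Darren", "Smith"), (28, "David", "Farrell")],
)
conn.executemany(
    "insert into bookings values (?, ?)",
    [
        (28, "2012-09-18 09:00:00"),
        (1, "2012-09-18 10:00:00"),
        (28, "2012-09-19 09:30:00"),
    ],
)

# Explicit INNER JOIN ... ON syntax.
explicit = conn.execute(
    "select bks.starttime "
    "from bookings bks inner join members mems on mems.memid = bks.memid "
    "where mems.firstname = 'David' and mems.surname = 'Farrell' "
    "order by bks.starttime"
).fetchall()

# Comma-separated tables with the join condition in the WHERE clause.
implicit = conn.execute(
    "select bks.starttime "
    "from bookings bks, members mems "
    "where mems.firstname = 'David' and mems.surname = 'Farrell' "
    "and mems.memid = bks.memid "
    "order by bks.starttime"
).fetchall()

print(explicit == implicit, explicit)
```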

Work out the start times of bookings for tennis courts

How can you produce a list of the start times for bookings for tennis courts, for the date '2012-09-21'? Return a list of start time and facility name pairings, ordered by the time.

Expected results:

start	name
2012-09-21 08:00:00	Tennis Court 1
2012-09-21 08:00:00	Tennis Court 2
2012-09-21 09:30:00	Tennis Court 1
2012-09-21 10:00:00	Tennis Court 2
2012-09-21 11:30:00	Tennis Court 2
2012-09-21 12:00:00	Tennis Court 1
2012-09-21 13:30:00	Tennis Court 1
2012-09-21 14:00:00	Tennis Court 2
2012-09-21 15:30:00	Tennis Court 1
2012-09-21 16:00:00	Tennis Court 2
2012-09-21 17:00:00	Tennis Court 1
2012-09-21 18:00:00	Tennis Court 2

Answer:

select bks.starttime as start, facs.name as name
    from
        cd.facilities facs
        inner join cd.bookings bks
            on facs.facid = bks.facid
    where
        facs.facid in (0, 1) and
        bks.starttime >= '2012-09-21' and
        bks.starttime < '2012-09-22'
order by bks.starttime;

This is another INNER JOIN query. The FROM part of the query is easy - we're simply joining the facilities and bookings tables together. This produces a table where, for each row in bookings, we've attached detailed information about the facility being booked.

On to the WHERE component of the query. The checks on starttime are fairly self-explanatory - we're making sure that all the bookings start between the specified dates. Since we're only interested in tennis courts, we're also using the IN operator to tell the database system to only give us back facility IDs 0 or 1 - the IDs of the courts. There are other ways to express this: we could have used where facs.facid = 0 or facs.facid = 1, or even where facs.name like 'Tennis%'.

The rest is pretty simple: we SELECT the columns we're interested in, and ORDER BY the start time.

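Produce a list of all members who have recommended another member

How can you output a list of all members who have recommended another member? Ensure that there are no duplicates in the list, and that results are ordered by (surname, firstname).

Expected results:

firstname	surname
Florence	Bader
Timothy	Baker
Gerald	Butters
Jemima	Farrell
Matthew	Genting
David	Jones
Janice	Joplette
Millicent	Purview
Tim	Rownam
Darren	Smith
Tracy	Smith
Ponder	Stibbons
Burton	Tracy

Answer:

select distinct recs.firstname as firstname, recs.surname as surname
    from
        cd.members mems
        inner join cd.members recs
            on recs.memid = mems.recommendedby
order by surname, firstname;

Here's a concept that some people find confusing: you can join a table to itself! This is really useful if you have columns that reference data in the same table, like we do with recommendedby in cd.members.

If you're having trouble visualising this, remember that this works just the same as any other inner join. Our join takes each row in members that has a recommendedby value, and looks in members again for the row which has a matching member id. It then generates an output row combining the two members entries. This looks like the diagram below.

Note that while we might have two 'surname' columns in the output set, they can be distinguished by their table aliases. Once we've selected the columns that we want, we simply use DISTINCT to ensure that there are no duplicates.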
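The self join can be tried on a small subset of the members table. The sketch below assumes Python's built-in sqlite3 module as a stand-in for Postgres and keeps only four members, so its output is much shorter than the expected results above.

```python
import sqlite3

# In-memory SQLite stands in for Postgres here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table members (memid integer, firstname text, "
    "surname text, recommendedby integer)"
)
conn.executemany(
    "insert into members values (?, ?, ?, ?)",
    [
        (1, "Darren", "Smith", None),
        (4, "Janice", "Joplette", 1),
        (5, "Gerald", "Butters", 1),
        (20, "Matthew", "Genting", 5),
    ],
)

# The same table appears twice under different aliases: 'mems' is the member
# being recommended and 'recs' is the member who recommended them.
recommenders = conn.execute(
    "select distinct recs.firstname as firstname, recs.surname as surname "
    "from members mems "
    "inner join members recs on recs.memid = mems.recommendedby "
    "order by surname, firstname"
).fetchall()
print(recommenders)
```

Darren Smith recommended two members here but appears once, which is the DISTINCT at work.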

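memfname	memsname	recfname	recsname
Florence	Bader	Ponder	Stibbons
Anne	Baker	Ponder	Stibbons
Timothy	Baker	Jemima	Farrell
Tim	Boothe	Tim	Rownam
Gerald	Butters	Darren	Smith
Joan	Coplin	Timothy	Baker
Erica	Crumpet	Tracy	Smith
Nancy	Dare	Janice	Joplette
David	Farrell
Jemima	Farrell
GUEST	GUEST
Matthew	Genting	Gerald	Butters
John	Hunt	Millicent	Purview
David	Jones	Janice	Joplette
Douglas	Jones	David	Jones
Janice	Joplette	Darren	Smith
Anna	Mackenzie	Darren	Smith
Charles	Owen	Darren	Smith
David	Pinker	Jemima	Farrell
Millicent	Purview	Tracy	Smith
Tim	Rownam
Henrietta	Rumney	Matthew	Genting
Ramnaresh	Sarwin	Florence	Bader
Darren	Smith
Darren	Smith
Jack	Smith	Darren	Smith
Tracy	Smith
Ponder	Stibbons	Burton	Tracy
Burton	Tracy
Hyacinth	Tupperware
Henry	Worthington-Smyth	Tracy	Smith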

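Produce a list of all members who have used a tennis court

How can you produce a list of all members who have used a tennis court? Include in your output the name of the court, and the name of the member formatted as a single column. Ensure no duplicate data, and order by the member name.

Expected results:

member	facility
Anne Baker	Tennis Court 2
Anne Baker	Tennis Court 1
Burton Tracy	Tennis Court 2
Burton Tracy	Tennis Court 1
Charles Owen	Tennis Court 2
Charles Owen	Tennis Court 1
Darren Smith	Tennis Court 2
David Farrell	Tennis Court 2
David Farrell	Tennis Court 1
David Jones	Tennis Court 1
David Jones	Tennis Court 2
David Pinker	Tennis Court 1
Douglas Jones	Tennis Court 1
Erica Crumpet	Tennis Court 1
Florence Bader	Tennis Court 1
Florence Bader	Tennis Court 2
GUEST GUEST	Tennis Court 2
GUEST GUEST	Tennis Court 1
Gerald Butters	Tennis Court 1
Gerald Butters	Tennis Court 2
Henrietta Rumney	Tennis Court 2
Jack Smith	Tennis Court 1
Jack Smith	Tennis Court 2
Janice Joplette	Tennis Court 1
Janice Joplette	Tennis Court 2
Jemima Farrell	Tennis Court 2
Jemima Farrell	Tennis Court 1
Joan Coplin	Tennis Court 1
John Hunt	Tennis Court 1
John Hunt	Tennis Court 2
Matthew Genting	Tennis Court 1
Millicent Purview	Tennis Court 2
Nancy Dare	Tennis Court 2
Nancy Dare	Tennis Court 1
Ponder Stibbons	Tennis Court 2
Ponder Stibbons	Tennis Court 1
Ramnaresh Sarwin	Tennis Court 2
Ramnaresh Sarwin	Tennis Court 1
Tim Boothe	Tennis Court 1
Tim Boothe	Tennis Court 2
Tim Rownam	Tennis Court 1
Tim Rownam	Tennis Court 2
Timothy Baker	Tennis Court 2
Timothy Baker	Tennis Court 1
Tracy Smith	Tennis Court 2
Tracy Smith	Tennis Court 1

Answer:

select distinct mems.firstname || ' ' || mems.surname as member, facs.name as facility
    from
        cd.members mems
        inner join cd.bookings bks
            on mems.memid = bks.memid
        inner join cd.facilities facs
            on bks.facid = facs.facid
    where
        bks.facid in (0, 1)
order by member;

This exercise is largely a more complex application of what you've learned in prior questions. It's also the first time we've used more than one join, which may be a little confusing for some. When reading join expressions, remember that a join is effectively a function that takes two tables, one labelled the left table, and the other the right. This is easy to visualise with just one join in the query, but a little more confusing with two.

Our second INNER JOIN in this query has a right hand side of cd.facilities. That's easy enough to grasp. The left hand side, however, is the table returned by joining cd.members to cd.bookings. It's important to emphasise this: the relational model is all about tables. The output of any join is another table. The output of a query is a table. Single-columned lists are tables. Once you grasp that, you've grasped the fundamental beauty of the model.

As a final note, we do introduce one new thing here: the || operator is used to concatenate strings.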

Produce a list of costly bookings

How can you produce a list of bookings on the day of 2012-09-14 which will cost the member (or guest) more than $30? Remember that guests have different costs to members (the listed costs are per half-hour 'slot'), and the guest user is always ID 0. Include in your output the name of the facility, the name of the member formatted as a single column, and the cost. Order by descending cost, and do not use any subqueries.

Expected results:

member	facility	cost
GUEST GUEST	Massage Room 2	320
GUEST GUEST	Massage Room 1	160
GUEST GUEST	Massage Room 1	160
GUEST GUEST	Massage Room 1	160
GUEST GUEST	Tennis Court 2	150
Jemima Farrell	Massage Room 1	140
GUEST GUEST	Tennis Court 1	75
GUEST GUEST	Tennis Court 2	75
GUEST GUEST	Tennis Court 1	75
Matthew Genting	Massage Room 1	70
Florence Bader	Massage Room 2	70
GUEST GUEST	Squash Court	70.0
Jemima Farrell	Massage Room 1	70
Ponder Stibbons	Massage Room 1	70
Burton Tracy	Massage Room 1	70
Jack Smith	Massage Room 1	70
GUEST GUEST	Squash Court	35.0
GUEST GUEST	Squash Court	35.0

Answer:

select mems.firstname || ' ' || mems.surname as member,
    facs.name as facility,
    case
        when mems.memid = 0 then
            bks.slots * facs.guestcost
        else
            bks.slots * facs.membercost
    end as cost
    from
        cd.members mems
        inner join cd.bookings bks
            on mems.memid = bks.memid
        inner join cd.facilities facs
            on bks.facid = facs.facid
    where
        bks.starttime >= '2012-09-14' and
        bks.starttime < '2012-09-15' and (
            (mems.memid = 0 and bks.slots * facs.guestcost > 30) or
            (mems.memid != 0 and bks.slots * facs.membercost > 30)
        )
order by cost desc;

This is a bit of a complicated one! While it's more complex logic than we've used previously, there's not an awful lot to remark upon. The WHERE clause restricts our output to sufficiently costly rows on 2012-09-14, remembering to distinguish between guests and others. We then use a CASE statement in the column selections to output the correct cost for the member or guest.

Produce a list of all members, along with their recommender, using no joins.

How can you output a list of all members, including the individual who recommended them (if any), without using any joins? Ensure that there are no duplicates in the list, and that each firstname + surname pairing is formatted as a column and ordered.

Expected results:

member	recommender
Anna Mackenzie	Darren Smith
Anne Baker	Ponder Stibbons
Burton Tracy
Charles Owen	Darren Smith
Darren Smith
David Farrell
David Jones	Janice Joplette
David Pinker	Jemima Farrell
Douglas Jones	David Jones
Erica Crumpet	Tracy Smith
Florence Bader	Ponder Stibbons
GUEST GUEST
Gerald Butters	Darren Smith
Henrietta Rumney	Matthew Genting
Henry Worthington-Smyth	Tracy Smith
Hyacinth Tupperware
Jack Smith	Darren Smith
Janice Joplette	Darren Smith
Jemima Farrell
Joan Coplin	Timothy Baker
John Hunt	Millicent Purview
Matthew Genting	Gerald Butters
Millicent Purview	Tracy Smith
Nancy Dare	Janice Joplette
Ponder Stibbons	Burton Tracy
Ramnaresh Sarwin	Florence Bader
Tim Boothe	Tim Rownam
Tim Rownam
Timothy Baker	Jemima Farrell
Tracy Smith

Answer:

select distinct mems.firstname || ' ' || mems.surname as member,
    (select recs.firstname || ' ' || recs.surname as recommender
        from cd.members recs
        where recs.memid = mems.recommendedby
    )
    from
        cd.members mems
order by member;

This exercise marks the introduction of subqueries. Subqueries are, as the name implies, queries within a query. They're commonly used with aggregates, to answer questions like 'get me all the details of the member who has spent the most hours on Tennis Court 1'.

In this case, we're simply using the subquery to emulate an outer join. For every value of member, the subquery is run once to find the name of the individual who recommended them (if any). A subquery that uses information from the outer query in this way (and thus has to be run for each row in the result set) is known as a correlated subquery .
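To make the 'emulating an outer join' point concrete, the sketch below runs the correlated subquery and an equivalent LEFT OUTER JOIN on the same four-member subset and checks that they agree. It assumes Python's built-in sqlite3 module as a stand-in for Postgres; missing recommenders come back as None (SQL NULL).

```python
import sqlite3

# In-memory SQLite stands in for Postgres here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table members (memid integer, firstname text, "
    "surname text, recommendedby integer)"
)
conn.executemany(
    "insert into members values (?, ?, ?, ?)",
    [
        (1, "Darren", "Smith", None),
        (4, "Janice", "Joplette", 1),
        (5, "Gerald", "Butters", 1),
        (20, "Matthew", "Genting", 5),
    ],
)

# Correlated subquery: the inner query refers to mems.recommendedby, so it is
# conceptually evaluated once per outer row.
with_subquery = conn.execute(
    "select mems.firstname || ' ' || mems.surname as member, "
    "    (select recs.firstname || ' ' || recs.surname "
    "       from members recs "
    "      where recs.memid = mems.recommendedby) as recommender "
    "from members mems order by member"
).fetchall()

# The same result expressed as a left outer join.
with_outer_join = conn.execute(
    "select mems.firstname || ' ' || mems.surname as member, "
    "       recs.firstname || ' ' || recs.surname as recommender "
    "from members mems "
    "left outer join members recs on recs.memid = mems.recommendedby "
    "order by member"
).fetchall()

print(with_subquery)
```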

Produce a list of costly bookings, using a subquery

The Produce a list of costly bookings exercise contained some messy logic: we had to calculate the booking cost in both the WHERE clause and the CASE statement. Try to simplify this calculation using subqueries. For reference, the question was:

How can you produce a list of bookings on the day of 2012-09-14 which will cost the member (or guest) more than $30? Remember that guests have different costs to members (the listed costs are per half-hour 'slot'), and the guest user is always ID 0. Include in your output the name of the facility, the name of the member formatted as a single column, and the cost. Order by descending cost.

Expected results:

member	facility	cost
GUEST GUEST	Massage Room 2	320
GUEST GUEST	Massage Room 1	160
GUEST GUEST	Massage Room 1	160
GUEST GUEST	Massage Room 1	160
GUEST GUEST	Tennis Court 2	150
Jemima Farrell	Massage Room 1	140
GUEST GUEST	Tennis Court 1	75
GUEST GUEST	Tennis Court 2	75
GUEST GUEST	Tennis Court 1	75
Matthew Genting	Massage Room 1	70
Florence Bader	Massage Room 2	70
GUEST GUEST	Squash Court	70.0
Jemima Farrell	Massage Room 1	70
Ponder Stibbons	Massage Room 1	70
Burton Tracy	Massage Room 1	70
Jack Smith	Massage Room 1	70
GUEST GUEST	Squash Court	35.0
GUEST GUEST	Squash Court	35.0

Answer:

select member, facility, cost from (
    select
        mems.firstname || ' ' || mems.surname as member,
        facs.name as facility,
        case
            when mems.memid = 0 then
                bks.slots * facs.guestcost
            else
                bks.slots * facs.membercost
        end as cost
        from
            cd.members mems
            inner join cd.bookings bks
                on mems.memid = bks.memid
            inner join cd.facilities facs
                on bks.facid = facs.facid
        where
            bks.starttime >= '2012-09-14' and
            bks.starttime < '2012-09-15'
    ) as bookings
    where cost > 30
order by cost desc;

This answer provides a mild simplification to the previous iteration: in the no-subquery version, we had to calculate the member or guest's cost in both the WHERE clause and the CASE statement. In our new version, we produce an inline query that calculates the total booking cost for us, allowing the outer query to simply select the bookings it's looking for. For reference, you may also see subqueries in the FROM clause referred to as inline views .
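The inline-view pattern can be sketched on a reduced schema: the cost is computed once in the inner query, and the outer query filters on it by name. The example assumes Python's built-in sqlite3 module as a stand-in for Postgres, drops the members table and the date filter for brevity, and uses a few illustrative bookings.

```python
import sqlite3

# In-memory SQLite stands in for Postgres here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table facilities (facid integer, name text, "
    "membercost numeric, guestcost numeric)"
)
conn.execute("create table bookings (memid integer, facid integer, slots integer)")
conn.executemany(
    "insert into facilities values (?, ?, ?, ?)",
    [(0, "Tennis Court 1", 5, 25), (4, "Massage Room 1", 35, 80)],
)
# Illustrative bookings; memid 0 is the guest user.
conn.executemany(
    "insert into bookings values (?, ?, ?)",
    [(0, 4, 2), (13, 4, 2), (13, 0, 3), (0, 0, 1)],
)

# The inner query names the computed column 'cost'; the outer WHERE can then
# use it directly instead of repeating the CASE logic.
costly = conn.execute(
    "select facility, cost from ( "
    "    select facs.name as facility, "
    "        case when bks.memid = 0 then bks.slots * facs.guestcost "
    "             else bks.slots * facs.membercost "
    "        end as cost "
    "    from bookings bks "
    "    inner join facilities facs on bks.facid = facs.facid "
    ") as priced "
    "where cost > 30 "
    "order by cost desc"
).fetchall()
print(costly)
```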

Modifying Data

Querying data is all well and good, but at some point you're probably going to want to put data into your database! This section deals with inserting, updating, and deleting information. Operations that alter your data like this are collectively known as Data Manipulation Language, or DML.

In previous sections, we returned to you the results of the query you've performed. Since modifications like the ones we're making in this section don't return any query results, we instead show you the updated content of the table you're supposed to be working on. You can compare this with the table shown in 'Expected Results' to see how you've done.

If you struggle with these questions, I strongly recommend Learning SQL, by Alan Beaulieu.

Insert some data into a table

The club is adding a new facility - a spa. We need to add it into the facilities table. Use the following values:

facid: 9, Name: 'Spa', membercost: 20, guestcost: 30, initialoutlay: 100000, monthlymaintenance: 800.

Expected results:

facid	name	membercost	guestcost	initialoutlay	monthlymaintenance
0	Tennis Court 1	5	25	10000	200
1	Tennis Court 2	5	25	8000	200
2	Badminton Court	0	15.5	4000	50
3	Table Tennis	0	5	320	10
4	Massage Room 1	35	80	4000	3000
5	Massage Room 2	35	80	4000	3000
6	Squash Court	3.5	17.5	5000	80
7	Snooker Table	0	5	450	15
8	Pool Table	0	5	400	15
9	Spa	20	30	100000	800

Answer:

insert into cd.facilities
    (facid, name, membercost, guestcost, initialoutlay, monthlymaintenance)
    values (9, 'Spa', 20, 30, 100000, 800);

INSERT INTO ... VALUES is the simplest way to insert data into a table. There's not a whole lot to discuss here: VALUES is used to construct a row of data, which the INSERT statement inserts into the table. It's as simple as that.

You can see that there are two sections in parentheses. The first is part of the INSERT statement, and specifies the columns that we're providing data for. The second is part of VALUES, and specifies the actual data we want to insert into each column.

If we're inserting data into every column of the table, as in this example, explicitly specifying the column names is optional. As long as you fill in data for all columns of the table, in the order they were defined when you created the table, you can do something like the following:

insert into cd.facilities values (9, 'Spa', 20, 30, 100000, 800);

Generally speaking, for SQL that's going to be reused I tend to prefer being explicit and specifying the column names.

Insert multiple rows of data into a table

In the previous exercise, you learned how to add a facility. Now you're going to add multiple facilities in one command. Use the following values:

facid: 9, Name: 'Spa', membercost: 20, guestcost: 30, initialoutlay: 100000, monthlymaintenance: 800.
facid: 10, Name: 'Squash Court 2', membercost: 3.5, guestcost: 17.5, initialoutlay: 5000, monthlymaintenance: 80.

Expected results:

facid	name	membercost	guestcost	initialoutlay	monthlymaintenance
0	Tennis Court 1	5	25	10000	200
1	Tennis Court 2	5	25	8000	200
2	Badminton Court	0	15.5	4000	50
3	Table Tennis	0	5	320	10
4	Massage Room 1	35	80	4000	3000
5	Massage Room 2	35	80	4000	3000
6	Squash Court	3.5	17.5	5000	80
7	Snooker Table	0	5	450	15
8	Pool Table	0	5	400	15
9	Spa	20	30	100000	800
10	Squash Court 2	3.5	17.5	5000	80

Answer:

insert into cd.facilities
    (facid, name, membercost, guestcost, initialoutlay, monthlymaintenance)
    values
        (9, 'Spa', 20, 30, 100000, 800),
        (10, 'Squash Court 2', 3.5, 17.5, 5000, 80);

VALUES can be used to generate more than one row to insert into a table, as seen in this example. Hopefully it's clear what's going on here: the output of VALUES is a table, and that table is copied into cd.facilities, the table specified in the INSERT command.

While you'll most commonly see VALUES when inserting data, Postgres allows you to use VALUES wherever you might use a SELECT . This makes sense: the output of both commands is a table, it's just that VALUES is a bit more ergonomic when working with constant data.

Similarly, it's possible to use SELECT wherever you see a VALUES. This means that you can INSERT the results of a SELECT. For example:

insert into cd.facilities
    (facid, name, membercost, guestcost, initialoutlay, monthlymaintenance)
    select 9, 'Spa', 20, 30, 100000, 800
    union all
        select 10, 'Squash Court 2', 3.5, 17.5, 5000, 80;

In later exercises you'll see us using INSERT ... SELECT to generate data to insert based on the information already in the database.
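Both ideas, VALUES producing a table on its own and INSERT ... SELECT deriving new rows from existing ones, can be sketched as follows. The example assumes Python's built-in sqlite3 module as a stand-in for Postgres (SQLite also accepts a standalone VALUES), and the 'second squash court' row derived here is a made-up illustration, not one of the exercises.

```python
import sqlite3

# In-memory SQLite stands in for Postgres here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("create table facilities (facid integer, name text)")

# VALUES on its own is a query that returns a table of constant rows.
constant_rows = conn.execute(
    "values (9, 'Spa'), (10, 'Squash Court 2')"
).fetchall()

conn.executemany(
    "insert into facilities values (?, ?)",
    [(6, "Squash Court"), (7, "Snooker Table")],
)

# INSERT ... SELECT: the rows to insert are computed from data already stored.
# Adding 10 to the facid and appending ' 2' to the name is purely illustrative.
conn.execute(
    "insert into facilities (facid, name) "
    "select facid + 10, name || ' 2' from facilities "
    "where name = 'Squash Court'"
)

all_rows = conn.execute(
    "select facid, name from facilities order by facid"
).fetchall()
print(constant_rows)
print(all_rows)
```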

Insert calculated data into a table

Let's try adding the spa to the facilities table again. This time, though, we want to automatically generate the value for the next facid, rather than specifying it as a constant. Use the following values for everything else:

Name: 'Spa', membercost: 20, guestcost: 30, initialoutlay: 100000, monthlymaintenance: 800.

Expected results:

facid	name	membercost	guestcost	initialoutlay	monthlymaintenance
0	Tennis Court 1	5	25	10000	200
1	Tennis Court 2	5	25	8000	200
2	Badminton Court	0	15.5	4000	50
3	Table Tennis	0	5	320	10
4	Massage Room 1	35	80	4000	3000
5	Massage Room 2	35	80	4000	3000
6	Squash Court	3.5	17.5	5000	80
7	Snooker Table	0	5	450	15
8	Pool Table	0	5	400	15
9	Spa	20	30	100000	800

Answer:

insert into cd.facilities
    (facid, name, membercost, guestcost, initialoutlay, monthlymaintenance)
    select (select max(facid) from cd.facilities) + 1, 'Spa', 20, 30, 100000, 800;

In the previous exercises we used VALUES to insert constant data into the facilities table. Here, though, we have a new requirement: a dynamically generated ID. This gives us a real quality of life improvement, as we don't have to manually work out what the current largest ID is: the SQL command does it for us.

Since the VALUES clause is only used to supply constant data, we need to replace it with a query instead. The SELECT statement is fairly simple: there's an inner subquery that works out the next facid based on the largest current id, and the rest is just constant data. The output of the statement is a row that we insert into the facilities table.

While this works fine in our simple example, it's not how you would generally implement an incrementing ID in the real world. Postgres provides SERIAL types that are auto-filled with the next ID when you insert a row. As well as saving us effort, these types are also safer: unlike the answer given in this exercise, there's no need to worry about concurrent operations generating the same ID.

Update some existing data

We made a mistake when entering the data for the second tennis court. The initial outlay was 10000 rather than 8000: you need to alter the data to fix the error.

Expected results:

facid	name	membercost	guestcost	initialoutlay	monthlymaintenance
0	Tennis Court 1	5	25	10000	200
1	Tennis Court 2	5	25	10000	200
2	Badminton Court	0	15.5	4000	50
3	Table Tennis	0	5	320	10
4	Massage Room 1	35	80	4000	3000
5	Massage Room 2	35	80	4000	3000
6	Squash Court	3.5	17.5	5000	80
7	Snooker Table	0	5	450	15
8	Pool Table	0	5	400	15

Answer:

update cd.facilities
    set initialoutlay = 10000
    where facid = 1;

The UPDATE statement is used to alter existing data. If you're familiar with SELECT queries, it's pretty easy to read: the WHERE clause works in exactly the same fashion, allowing us to filter the set of rows we want to work with. These rows are then modified according to the specifications of the SET clause: in this case, setting the initial outlay.

The WHERE clause is extremely important. It's easy to get it wrong or even omit it, with disastrous results. Consider the following command:

update cd.facilities
    set initialoutlay = 10000;

There's no WHERE clause to filter for the rows we're interested in. The result of this is that the update runs on every row in the table! This is rarely what we want to happen.
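One way to guard against this kind of mistake, offered here as a suggestion rather than as part of the original exercise, is to run the update inside a transaction and inspect the affected rows before committing:

```sql
begin;

-- RETURNING lists every row the statement touched.
update cd.facilities
    set initialoutlay = 10000
    where facid = 1
    returning facid, name, initialoutlay;

-- If more rows came back than expected, undo the change instead of committing.
rollback;
-- Otherwise, replace the rollback above with: commit;
```

Nothing is made permanent until the commit, so a forgotten WHERE clause shows up as an unexpectedly long list of returned rows while it can still be undone.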

Update multiple rows and columns at the same time

We want to increase the price of the tennis courts for both members and guests. Update the costs to be 6 for members, and 30 for guests.

Expected results:

facid	name	membercost	guestcost	initialoutlay	monthlymaintenance
0	Tennis Court 1	6	30	10000	200
1	Tennis Court 2	6	30	8000	200
2	Badminton Court	0	15.5	4000	50
3	Table Tennis	0	5	320	10
4	Massage Room 1	35	80	4000	3000
5	Massage Room 2	35	80	4000	3000
6	Squash Court	3.5	17.5	5000	80
7	Snooker Table	0	5	450	15
8	Pool Table	0	5	400	15

Answer:

update cd.facilities
    set
        membercost = 6,
        guestcost = 30
    where facid in (0, 1);

The SET clause accepts a comma separated list of values that you want to update.
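PostgreSQL also accepts a parenthesised column-list form of SET, which some people find easier to read when several columns change together. A minimal sketch, equivalent in effect to the answer above:

```sql
-- Assign both columns in one go; the values are matched to the columns by position.
update cd.facilities
    set (membercost, guestcost) = (6, 30)
    where facid in (0, 1);
```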

Update a row based on the contents of another row

We want to alter the price of the second tennis court so that it costs 10% more than the first one. Try to do this without using constant values for the prices, so that we can reuse the statement if we want to.

Expected results:

facid	name	membercost	guestcost	initialoutlay	monthlymaintenance
0	Tennis Court 1	5	25	10000	200
1	Tennis Court 2	5.5	27.5	8000	200
2	Badminton Court	0	15.5	4000	50
3	Table Tennis	0	5	320	10
4	Massage Room 1	35	80	4000	3000
5	Massage Room 2	35	80	4000	3000
6	Squash Court	3.5	17.5	5000	80
7	Snooker Table	0	5	450	15
8	Pool Table	0	5	400	15

Answer:

update cd.facilities facs
    set
        membercost = (select membercost * 1.1 from cd.facilities where facid = 0),
        guestcost = (select guestcost * 1.1 from cd.facilities where facid = 0)
    where facs.facid = 1;

Updating columns based on calculated data is not too intrinsically difficult: we can do so pretty easily using subqueries. You can see this approach in our selected answer.

As the number of columns we want to update increases, standard SQL can start to get pretty awkward: you don't want to be specifying a separate subquery for each of 15 different column updates. Postgres provides a nonstandard extension to SQL called UPDATE...FROM that addresses this: it allows you to supply a FROM clause to generate values for use in the SET clause. Example below:

update cd.facilities facs
    set
        membercost = facs2.membercost * 1.1,
        guestcost = facs2.guestcost * 1.1
    from (select * from cd.facilities where facid = 0) facs2
    where facs.facid = 1;

Delete all bookings

As part of a clearout of our database, we want to delete all bookings from the cd.bookings table. How can we accomplish this?

Expected results:

bookid	facid	memid	starttime	slots

Answer:

delete from cd.bookings;

The DELETE statement does what it says on the tin: deletes rows from the table. Here, we show the command in its simplest form, with no qualifiers. In this case, it deletes everything from the table. Obviously, you should be careful with your deletes and make sure they're always limited - we'll see how to do that in the next exercise.

An alternative to unqualified DELETEs is the following:

truncate cd.bookings;

TRUNCATE also deletes everything in the table, but does so using a quicker underlying mechanism. It's not perfectly safe in all circumstances, though, so use it judiciously. When in doubt, use DELETE.

Delete a member from the cd.members table

We want to remove member 37, who has never made a booking, from our database. How can we achieve that?

Expected results:

memid	surname	firstname	address	zipcode	telephone	recommendedby	joindate
0	GUEST	GUEST	GUEST	0	(000) 000-0000		2012-07-01 00:00:00
1	Smith	Darren	8 Bloomsbury Close, Boston	4321	555-555-5555		2012-07-02 12:02:05
2	Smith	Tracy	8 Bloomsbury Close, New York	4321	555-555-5555		2012-07-02 12:08:23
3	Rownam	Tim	23 Highway Way, Boston	23423	(844) 693-0723		2012-07-03 09:32:15
4	Joplette	Janice	20 Crossing Road, New York	234	(833) 942-4710	1	2012-07-03 10:25:05
5	Butters	Gerald	1065 Huntingdon Avenue, Boston	56754	(844) 078-4130	1	2012-07-09 10:44:09
6	Tracy	Burton	3 Tunisia Drive, Boston	45678	(822) 354-9973		2012-07-15 08:52:55
7	Dare	Nancy	6 Hunting Lodge Way, Boston	10383	(833) 776-4001	4	2012-07-25 08:59:12
8	Boothe	Tim	3 Bloomsbury Close, Reading, 00234	234	(811) 433-2547	3	2012-07-25 16:02:35
9	Stibbons	Ponder	5 Dragons Way, Winchester	87630	(833) 160-3900	6	2012-07-25 17:09:05
10	Owen	Charles	52 Cheshire Grove, Winchester, 28563	28563	(855) 542-5251	1	2012-08-03 19:42:37
11	Jones	David	976 Gnats Close, Reading	33862	(844) 536-8036	4	2012-08-06 16:32:55
12	Baker	Anne	55 Powdery Street, Boston	80743	844-076-5141	9	2012-08-10 14:23:22
13	Farrell	Jemima	103 Firth Avenue, North Reading	57392	(855) 016-0163		2012-08-10 14:28:01
14	Smith	Jack	252 Binkington Way, Boston	69302	(822) 163-3254	1	2012-08-10 16:22:05
15	Bader	Florence	264 Ursula Drive, Westford	84923	(833) 499-3527	9	2012-08-10 17:52:03
16	Baker	Timothy	329 James Street, Reading	58393	833-941-0824	13	2012-08-15 10:34:25
17	Pinker	David	5 Impreza Road, Boston	65332	811 409-6734	13	2012-08-16 11:32:47
20	Genting	Matthew	4 Nunnington Place, Wingfield, Boston	52365	(811) 972-1377	5	2012-08-19 14:55:55
21	Mackenzie	Anna	64 Perkington Lane, Reading	64577	(822) 661-2898	1	2012-08-26 09:32:05
22	Coplin	Joan	85 Bard Street, Bloomington, Boston	43533	(822) 499-2232	16	2012-08-29 08:32:41
24	Sarwin	Ramnaresh	12 Bullington Lane, Boston	65464	(822) 413-1470	15	2012-09-01 08:44:42
26	Jones	Douglas	976 Gnats Close, Reading	11986	844 536-8036	11	2012-09-02 18:43:05
27	Rumney	Henrietta	3 Burkington Plaza, Boston	78533	(822) 989-8876	20	2012-09-05 08:42:35
28	Farrell	David	437 Granite Farm Road, Westford	43532	(855) 755-9876		2012-09-15 08:22:05
29	Worthington-Smyth	Henry	55 Jagbi Way, North Reading	97676	(855) 894-3758	2	2012-09-17 12:27:15
30	Purview	Millicent	641 Drudgery Close, Burnington, Boston	34232	(855) 941-9786	2	2012-09-18 19:04:01
33	Tupperware	Hyacinth	33 Cheerful Plaza, Drake Road, Westford	68666	(822) 665-5327		2012-09-18 19:32:05
35	Hunt	John	5 Bullington Lane, Boston	54333	(899) 720-6978	30	2012-09-19 11:32:45
36	Crumpet	Erica	Crimson Road, North Reading	75655	(811) 732-4816	2	2012-09-22 08:36:38

Answer:

delete from cd.members where memid = 37;

This exercise is a small increment on our previous one. Instead of deleting all bookings, this time we want to be a bit more targeted, and delete a single member that has never made a booking. To do this, we simply have to add a WHERE clause to our command, specifying the member we want to delete. You can see the parallels with SELECT and UPDATE statements here.

There's one interesting wrinkle here. Try this command out, but substituting in member id 0 instead. This member has made many bookings, and you'll find that the delete fails with an error about a foreign key constraint violation. This is an important concept in relational databases, so let's explore a little further.

Foreign keys are a mechanism for defining relationships between columns of different tables. In our case we use them to specify that the memid column of the bookings table is related to the memid column of the members table. The relationship (or 'constraint') specifies that for a given booking, the member specified in the booking must exist in the members table. It's useful to have this guarantee enforced by the database: it means that code using the database can rely on the presence of the member. It's hard (even impossible) to enforce this at higher levels: concurrent operations can interfere and leave your database in a broken state.

PostgreSQL supports several kinds of constraints that allow you to enforce structure upon your data. For more information on constraints, check out the PostgreSQL documentation on foreign keys.
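To make the mechanism concrete, here is a minimal sketch of how such a constraint can be declared and how it reacts to deletes. The tables members_demo and bookings_demo are hypothetical stand-ins, not the real cd.members and cd.bookings definitions:

```sql
-- Hypothetical demo tables mirroring the members/bookings relationship.
create table members_demo (
    memid integer primary key
);

create table bookings_demo (
    bookid integer primary key,
    memid  integer not null references members_demo (memid)  -- the foreign key
);

insert into members_demo values (0), (37);
insert into bookings_demo values (1, 0);

-- Succeeds: no booking refers to member 37.
delete from members_demo where memid = 37;

-- Fails with a foreign key violation: booking 1 still refers to member 0.
delete from members_demo where memid = 0;
```

The second delete is rejected because, with the default behaviour, a referenced row cannot be removed while rows that point at it still exist.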

Delete based on a subquery

In our previous exercise, we deleted a specific member who had never made a booking. How can we make that more general, to delete all members who have never made a booking?

Expected results:

memid	surname	firstname	address	zipcode	telephone	recommendedby	joindate
0	GUEST	GUEST	GUEST	0	(000) 000-0000		2012-07-01 00:00:00
1	Smith	Darren	8 Bloomsbury Close, Boston	4321	555-555-5555		2012-07-02 12:02:05
2	Smith	Tracy	8 Bloomsbury Close, New York	4321	555-555-5555		2012-07-02 12:08:23
3	Rownam	Tim	23 Highway Way, Boston	23423	(844) 693-0723		2012-07-03 09:32:15
4	Joplette	Janice	20 Crossing Road, New York	234	(833) 942-4710	1	2012-07-03 10:25:05
5	Butters	Gerald	1065 Huntingdon Avenue, Boston	56754	(844) 078-4130	1	2012-07-09 10:44:09
6	Tracy	Burton	3 Tunisia Drive, Boston	45678	(822) 354-9973		2012-07-15 08:52:55
7	Dare	Nancy	6 Hunting Lodge Way, Boston	10383	(833) 776-4001	4	2012-07-25 08:59:12
8	Boothe	Tim	3 Bloomsbury Close, Reading, 00234	234	(811) 433-2547	3	2012-07-25 16:02:35
9	Stibbons	Ponder	5 Dragons Way, Winchester	87630	(833) 160-3900	6	2012-07-25 17:09:05
10	Owen	Charles	52 Cheshire Grove, Winchester, 28563	28563	(855) 542-5251	1	2012-08-03 19:42:37
11	Jones	David	976 Gnats Close, Reading	33862	(844) 536-8036	4	2012-08-06 16:32:55
12	Baker	Anne	55 Powdery Street, Boston	80743	844-076-5141	9	2012-08-10 14:23:22
13	Farrell	Jemima	103 Firth Avenue, North Reading	57392	(855) 016-0163		2012-08-10 14:28:01
14	Smith	Jack	252 Binkington Way, Boston	69302	(822) 163-3254	1	2012-08-10 16:22:05
15	Bader	Florence	264 Ursula Drive, Westford	84923	(833) 499-3527	9	2012-08-10 17:52:03
16	Baker	Timothy	329 James Street, Reading	58393	833-941-0824	13	2012-08-15 10:34:25
17	Pinker	David	5 Impreza Road, Boston	65332	811 409-6734	13	2012-08-16 11:32:47
20	Genting	Matthew	4 Nunnington Place, Wingfield, Boston	52365	(811) 972-1377	5	2012-08-19 14:55:55
21	Mackenzie	Anna	64 Perkington Lane, Reading	64577	(822) 661-2898	1	2012-08-26 09:32:05
22	Coplin	Joan	85 Bard Street, Bloomington, Boston	43533	(822) 499-2232	16	2012-08-29 08:32:41
24	Sarwin	Ramnaresh	12 Bullington Lane, Boston	65464	(822) 413-1470	15	2012-09-01 08:44:42
26	Jones	Douglas	976 Gnats Close, Reading	11986	844 536-8036	11	2012-09-02 18:43:05
27	Rumney	Henrietta	3 Burkington Plaza, Boston	78533	(822) 989-8876	20	2012-09-05 08:42:35
28	Farrell	David	437 Granite Farm Road, Westford	43532	(855) 755-9876		2012-09-15 08:22:05
29	Worthington-Smyth	Henry	55 Jagbi Way, North Reading	97676	(855) 894-3758	2	2012-09-17 12:27:15
30	Purview	Millicent	641 Drudgery Close, Burnington, Boston	34232	(855) 941-9786	2	2012-09-18 19:04:01
33	Tupperware	Hyacinth	33 Cheerful Plaza, Drake Road, Westford	68666	(822) 665-5327		2012-09-18 19:32:05
35	Hunt	John	5 Bullington Lane, Boston	54333	(899) 720-6978	30	2012-09-19 11:32:45
36	Crumpet	Erica	Crimson Road, North Reading	75655	(811) 732-4816	2	2012-09-22 08:36:38

Answer:

delete from cd.members where memid not in (select memid from cd.bookings);

We can use subqueries to determine whether a row should be deleted or not. There are a couple of standard ways to do this. In our featured answer, the subquery produces a list of all the different member ids in the cd.bookings table. If a row in the table isn't in the list generated by the subquery, it gets deleted.

An alternative is to use a correlated subquery. Where our previous example runs a large subquery once, the correlated approach instead specifies a smaller subquery to run against every row.

delete from cd.members mems where not exists (select 1 from cd.bookings where memid = mems.memid);

The two different forms can have different performance characteristics. Under the hood, your database engine is free to transform your query to execute it in a correlated or uncorrelated fashion, though, so things can be a little hard to predict.
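If you want to see what the planner actually decides for each form, EXPLAIN prints the chosen plan without executing the statement. A small sketch (the plans you get will depend on your PostgreSQL version and table statistics):

```sql
-- Plain EXPLAIN only plans the statement; no rows are deleted.
explain
delete from cd.members
    where memid not in (select memid from cd.bookings);

explain
delete from cd.members mems
    where not exists (select 1 from cd.bookings where memid = mems.memid);
```

Comparing the two outputs is usually more reliable than guessing which form will be faster.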

Aggregation

Aggregation is one of those capabilities that really make you appreciate the power of relational database systems. It allows you to move beyond merely persisting your data, into the realm of asking truly interesting questions that can be used to inform decision making. This category covers aggregation at length, making use of standard grouping as well as more recent window functions.

If you struggle with these questions, I strongly recommend Learning SQL, by Alan Beaulieu and SQL Cookbook by Anthony Molinaro. In fact, get the latter anyway - it'll take you beyond anything you find on this site, and on multiple different database systems to boot.

Count the number of facilities

For our first foray into aggregates, we're going to stick to something simple. We want to know how many facilities exist - simply produce a total count.

Expected results:

count
9

Answer:

select count(*) from cd.facilities;

Aggregation starts out pretty simply! The SQL above selects everything from our facilities table, and then counts the number of rows in the result set. The count function has a variety of uses:

COUNT(*) simply returns the number of rows
COUNT(address) counts the number of non-null addresses in the result set.
Finally, COUNT(DISTINCT address) counts the number of different addresses in the facilities table.

The basic idea of an aggregate function is that it takes in a column of data, performs some function upon it, and outputs a scalar (single) value. There are a bunch more aggregation functions, including MAX, MIN, SUM, and AVG. These all do pretty much what you'd expect from their names :-).
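The three COUNT variants can be compared side by side. This sketch runs them against cd.members and its nullable recommendedby column; that choice of table and column, and the output aliases, are only for illustration:

```sql
select
    count(*)                      as all_rows,              -- every row, NULLs included
    count(recommendedby)          as with_recommender,      -- rows where recommendedby is not NULL
    count(distinct recommendedby) as distinct_recommenders  -- number of different recommenders
from cd.members;
```

The three numbers differ whenever the column contains NULLs or repeated values, which is the case for recommendedby.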

One aspect of aggregate functions that people often find confusing is in queries like the below:

select facid, count(*) from cd.facilities;

Try it out, and you'll find that it doesn't work. This is because count(*) wants to collapse the facilities table into a single value - unfortunately, it can't do that, because there's a lot of different facids in cd.facilities - Postgres doesn't know which facid to pair the count with.

Instead, if you wanted a query that returns all the facids along with a count on each row, you can break the aggregation out into a subquery as below:

select facid,
	(select count(*) from cd.facilities)
	from cd.facilities;

When we have a subquery that returns a scalar value like this, Postgres knows to simply repeat the value for every row in cd.facilities.

Count the number of expensive facilities

Produce a count of the number of facilities that have a cost to guests of 10 or more.

Expected results:

count
6

Answer:

select count(*) from cd.facilities where guestcost >= 10;

This one is only a simple modification to the previous question: we need to weed out the inexpensive facilities. This is easy to do using a WHERE clause. Our aggregation can now only see the expensive facilities.

Count the number of recommendations each member makes

Produce a count of the number of recommendations each member has made. Order by member ID.

Expected results:

recommendedby	count
1	5
2	3
3	1
4	2
5	1
6	1
9	2
11	1
13	2
15	1
16	1
20	1
30	1

Answer:

select recommendedby, count(*)
	from cd.members
	where recommendedby is not null
	group by recommendedby
order by recommendedby;

Previously, we've seen that aggregation functions are applied to a column of values, and convert them into an aggregated scalar value. This is useful, but we often find that we don't want just a single aggregated result: for example, instead of knowing the total amount of money the club has made this month, I might want to know how much money each different facility has made, or which times of day were most lucrative.

In order to support this kind of behaviour, SQL has the GROUP BY construct. What this does is batch the data together into groups, and run the aggregation function separately for each group. When you specify a GROUP BY, the database produces an aggregated value for each distinct value in the supplied columns. In this case, we're saying 'for each distinct value of recommendedby, get me the number of times that value appears'.

List the total slots booked per facility

Produce a list of the total number of slots booked per facility. For now, just produce an output table consisting of facility id and slots, sorted by facility id.

Expected results:

facid	Total Slots
0	1320
1	1278
2	1209
3	830
4	1404
5	228
6	1104
7	908
8	911

Answer:

select facid, sum(slots) as "Total Slots"
	from cd.bookings
	group by facid
order by facid;

Other than the fact that we've introduced the SUM aggregate function, there's not a great deal to say about this exercise. For each distinct facility id, the SUM function adds together everything in the slots column.

List the total slots booked per facility in a given month

Produce a list of the total number of slots booked per facility in the month of September 2012. Produce an output table consisting of facility id and slots, sorted by the number of slots.

Expected results:

facid	Total Slots
5	122
3	422
7	426
8	471
6	540
2	570
1	588
0	591
4	648

Answer:

select facid, sum(slots) as "Total Slots"
	from cd.bookings
	where
		starttime >= '2012-09-01'
		and starttime < '2012-10-01'
	group by facid
order by sum(slots);

This is only a minor alteration of our previous example. Remember that aggregation happens after the WHERE clause is evaluated: we thus use the WHERE to restrict the data we aggregate over, and our aggregation only sees data from a single month.

List the total slots booked per facility per month

Produce a list of the total number of slots booked per facility per month in the year of 2012. Produce an output table consisting of facility id and slots, sorted by the id and month.

Expected results:

facid	month	Total Slots
0	7	270
0	8	459
0	9	591
1	7	207
1	8	483
1	9	588
2	7	180
2	8	459
2	9	570
3	7	104
3	8	304
3	9	422
4	7	264
4	8	492
4	9	648
5	7	24
5	8	82
5	9	122
6	7	164
6	8	400
6	9	540
7	7	156
7	8	326
7	9	426
8	7	117
8	8	322
8	9	471

Answer:

select facid, extract(month from starttime) as month, sum(slots) as "Total Slots"
	from cd.bookings
	where
		starttime >= '2012-01-01'
		and starttime < '2013-01-01'
	group by facid, month
order by facid, month;

The main piece of new functionality in this question is the EXTRACT function. EXTRACT allows you to get individual components of a timestamp, like day, month, year, etc. We group by the output of this function to provide per-month values. An alternative, if we needed to distinguish between the same month in different years, is to make use of the DATE_TRUNC function, which truncates a date to a given granularity.
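A minimal sketch of the DATE_TRUNC alternative mentioned above. Unlike EXTRACT(month ...), the truncated timestamp still carries the year, so the same month in different years would end up in different groups:

```sql
select facid, date_trunc('month', starttime) as month, sum(slots) as "Total Slots"
	from cd.bookings
	group by facid, date_trunc('month', starttime)
order by facid, month;
```

Here the month column comes back as a timestamp for the first instant of each month rather than as a bare month number.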

It's also worth noting that this is the first time we've truly made use of the ability to group by more than one column.

Find the count of members who have made at least one booking

Find the total number of members who have made at least one booking.

Expected results:

count
30

Answer:

select count(distinct memid) from cd.bookings;

Your first instinct may be to go for a subquery here. Something like the below:

select count(*) from
	(select distinct memid from cd.bookings) as mems;

This does work perfectly well, but we can simplify a touch with the help of a little extra knowledge in the form of COUNT DISTINCT. This does what you might expect, counting the distinct values in the passed column.

List facilities with more than 1000 slots booked

Produce a list of facilities with more than 1000 slots booked. Produce an output table consisting of facility id and slots, sorted by facility id.

Expected results:

facid	Total Slots
0	1320
1	1278
2	1209
4	1404
6	1104

Answer:

select facid, sum(slots) as "Total Slots"
        from cd.bookings
        group by facid
        having sum(slots) > 1000
        order by facid;

It turns out that there's actually an SQL keyword designed to help with the filtering of output from aggregate functions. This keyword is HAVING.

The behaviour of HAVING is easily confused with that of WHERE. The best way to think about it is that in the context of a query with an aggregate function, WHERE is used to filter what data gets input into the aggregate function, while HAVING is used to filter the data once it is output from the function. Try experimenting to explore this difference!
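One way to experiment is to use both clauses in the same query. In this sketch the date range and the threshold of 500 are arbitrary values chosen for illustration:

```sql
select facid, sum(slots) as "Total Slots"
	from cd.bookings
	where starttime >= '2012-09-01'     -- filters rows before they reach sum()
		and starttime < '2012-10-01'
	group by facid
	having sum(slots) > 500             -- filters groups after sum() has run
order by facid;
```

Going by the September totals listed in the earlier per-month exercise, this should leave facilities 0, 1, 2, 4 and 6.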

Find the total revenue of each facility

Produce a list of facilities along with their total revenue. The output table should consist of facility name and revenue, sorted by revenue. Remember that there's a different cost for guests and members!

Expected results:

name	revenue
Table Tennis	180
Snooker Table	240
Pool Table	270
Badminton Court	1906.5
Squash Court	13468.0
Tennis Court 1	13860
Tennis Court 2	14310
Massage Room 2	15810
Massage Room 1	72540

Answer:

select facs.name, sum(slots * case
			when memid = 0 then facs.guestcost
			else facs.membercost
		end) as revenue
	from cd.bookings bks
	inner join cd.facilities facs
		on bks.facid = facs.facid
	group by facs.name
order by revenue;

The only real complexity in this query is that guests (member ID 0) have a different cost to everyone else. We use a CASE expression to produce the cost for each session, and then sum each of those sessions, grouped by facility.

Find facilities with a total revenue less than 1000

Produce a list of facilities with a total revenue less than 1000. Produce an output table consisting of facility name and revenue, sorted by revenue. Remember that there's a different cost for guests and members!

Expected results:

name	revenue
Table Tennis	180
Snooker Table	240
Pool Table	270

Answer:

select name, revenue from (
	select facs.name, sum(case
				when memid = 0 then slots * facs.guestcost
				else slots * membercost
			end) as revenue
		from cd.bookings bks
		inner join cd.facilities facs
			on bks.facid = facs.facid
		group by facs.name
	) as agg where revenue < 1000
order by revenue;

You may well have tried to use the HAVING keyword we introduced in an earlier exercise, producing something like below:

select facs.name, sum(case
		when memid = 0 then slots * facs.guestcost
		else slots * membercost
	end) as revenue
	from cd.bookings bks
	inner join cd.facilities facs
		on bks.facid = facs.facid
	group by facs.name
	having revenue < 1000
order by revenue;

Unfortunately, this doesn't work! You'll get an error along the lines of ERROR: column "revenue" does not exist. Postgres, unlike some other RDBMSs such as MySQL, doesn't allow output column aliases to be referenced in the HAVING clause. This means that for this query to work, you'd have to produce something like below:

select facs.name, sum(case
		when memid = 0 then slots * facs.guestcost
		else slots * membercost
	end) as revenue
	from cd.bookings bks
	inner join cd.facilities facs
		on bks.facid = facs.facid
	group by facs.name
	having sum(case
		when memid = 0 then slots * facs.guestcost
		else slots * membercost
	end) < 1000
order by revenue;

Having to repeat significant calculation code like this is messy, so our anointed solution instead just wraps the main query body as a subquery, and selects from it using a WHERE clause. In general, I recommend using HAVING for simple queries, as it increases clarity. Otherwise, this subquery approach is often easier to use.

Output the facility id that has the highest number of slots booked

Output the facility id that has the highest number of slots booked. For bonus points, try a version without a LIMIT clause. This version will probably look messy!

Expected results:

facid	Total Slots
4	1404

Answer:

select facid, sum(slots) as "Total Slots"
	from cd.bookings
	group by facid
order by sum(slots) desc
LIMIT 1;

Let's start off with what's arguably the simplest way to do this: produce a list of facility IDs and the total number of slots used, order by the total number of slots used, and pick only the top result.

It's worth realising, though, that this method has a significant weakness. In the event of a tie, we will still only get one result! To get all the relevant results, we might try using the MAX aggregate function, something like below:

select facid, max(totalslots) from (
	select facid, sum(slots) as totalslots
		from cd.bookings
		group by facid
	) as sub group by facid

The intent of this query is to get the highest totalslots value and its associated facid(s). Unfortunately, this just won't work! MAX collapses its input into a single (or scalar) value, and it would be ambiguous which facid should be paired with that value, so Postgres insists that facid appear in a GROUP BY clause. Once facid is grouped, as in the query above, MAX is computed separately for each facility, so we simply get every facility's total back rather than the overall maximum, which isn't the result we're looking for.

Let's take a first stab at a working query:

select facid, sum(slots) as totalslots
	from cd.bookings
	group by facid
	having sum(slots) = (select max(sum2.totalslots) from
		(select sum(slots) as totalslots
		from cd.bookings
		group by facid
		) as sum2);

The query produces a list of facility IDs and number of slots used, and then uses a HAVING clause that works out the maximum totalslots value. We're essentially saying: 'produce a list of facids and their number of slots booked, and filter out all the ones that don't have a number of slots booked equal to the maximum.'

Useful as HAVING is, however, our query is pretty ugly. To improve on that, let's introduce another new concept: Common Table Expressions (CTEs). CTEs can be thought of as allowing you to define a database view inline in your query. It's really helpful in situations like this, where you're having to repeat yourself a lot.

CTEs are declared in the form WITH CTEName as (SQL-Expression). You can see our query redefined to use a CTE below:

with sum as (select facid, sum(slots) as totalslots
	from cd.bookings
	group by facid
)
select facid, totalslots
	from sum
	where totalslots = (select max(totalslots) from sum);

You can see that we've factored out our repeated selections from cd.bookings into a single CTE, and made the query a lot simpler to read in the process!

BUT WAIT. There's more. It's also possible to complete this problem using Window Functions. We'll leave these until later, but even better solutions to problems like these are available.

That's a lot of information for a single exercise. Don't worry too much if you don't get it all right now - we'll reuse these concepts in later exercises.

List the total slots booked per facility per month, Part 2

Produce a list of the total number of slots booked per facility per month in the year of 2012. In this version, include output rows containing totals for all months per facility, and a total for all months for all facilities. The output table should consist of facility id, month and slots, sorted by the id and month. When calculating the aggregated values for all months and all facids, return null values in the month and facid columns.

Expected results:

facid	month	slots
0	7	270
0	8	459
0	9	591
0		1320
1	7	207
1	8	483
1	9	588
1		1278
2	7	180
2	8	459
2	9	570
2		1209
3	7	104
3	8	304
3	9	422
3		830
4	7	264
4	8	492
4	9	648
4		1404
5	7	24
5	8	82
5	9	122
5		228
6	7	164
6	8	400
6	9	540
6		1104
7	7	156
7	8	326
7	9	426
7		908
8	7	117
8	8	322
8	9	471
8		910
		9191

Answer:

select facid, extract(month from starttime) as month, sum(slots) as slots
	from cd.bookings
	where
		starttime >= '2012-01-01'
		and starttime < '2013-01-01'
	group by rollup(facid, month)
order by facid, month;

When we are doing data analysis, we sometimes want to perform multiple levels of aggregation to allow ourselves to 'zoom' in and out to different depths. In this case, we might be looking at each facility's overall usage, but then want to dive in to see how they've performed on a per-month basis. Using the SQL we know so far, it's quite cumbersome to produce a single query that does what we want - we effectively have to resort to concatenating multiple queries using UNION ALL:

select facid, extract(month from starttime) as month, sum(slots) as slots
    from cd.bookings
    where
        starttime >= '2012-01-01'
        and starttime < '2013-01-01'
    group by facid, month
union all
select facid, null, sum(slots) as slots
    from cd.bookings
    where
        starttime >= '2012-01-01'
        and starttime < '2013-01-01'
    group by facid
union all
select null, null, sum(slots) as slots
    from cd.bookings
    where
        starttime >= '2012-01-01'
        and starttime < '2013-01-01'
order by facid, month;

As you can see, each subquery performs a different level of aggregation, and we just combine the results. We can clean this up a lot by factoring out commonalities using a CTE:

with bookings as (
	select facid, extract(month from starttime) as month, slots
	from cd.bookings
	where
		starttime >= '2012-01-01'
		and starttime < '2013-01-01'
)
select facid, month, sum(slots) from bookings group by facid, month
union all
select facid, null, sum(slots) from bookings group by facid
union all
select null, null, sum(slots) from bookings
order by facid, month;

This version is not excessively hard on the eyes, but it becomes cumbersome as the number of aggregation columns increases. Fortunately, PostgreSQL 9.5 introduced support for the ROLLUP operator, which we've used to simplify our accepted answer.

ROLLUP produces a hierarchy of aggregations in the order passed into it: for example, ROLLUP(facid, month) outputs aggregations on (facid, month), (facid), and (). If we wanted an aggregation of all facilities for a month (instead of all months for a facility) we'd have to reverse the order, using ROLLUP(month, facid). Alternatively, if we instead want all possible combinations of the columns we pass in, we can use CUBE rather than ROLLUP. This will produce (facid, month), (month), (facid), and ().

ROLLUP and CUBE are special cases of GROUPING SETS. GROUPING SETS allow you to specify the exact aggregation combinations you want: you could, for example, ask for just (facid, month) and (facid), skipping the top-level aggregation.
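As a sketch of that last case, the query below asks only for the (facid, month) and (facid) levels. The grouping expression is written out in full here rather than relying on the month alias; that is a stylistic choice for this example, not a requirement taken from the exercise:

```sql
select facid, extract(month from starttime) as month, sum(slots) as slots
	from cd.bookings
	where
		starttime >= '2012-01-01'
		and starttime < '2013-01-01'
	group by grouping sets (
		(facid, extract(month from starttime)),  -- per facility, per month
		(facid)                                  -- per facility, all months
	)
order by facid, month;
```

Compared with the ROLLUP answer, the only row that should disappear is the final grand total with null in both facid and month.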

List the total hours booked per named facility

Produce a list of the total number of hours booked per facility, remembering that a slot lasts half an hour. The output table should consist of the facility id, name, and hours booked, sorted by facility id. Try formatting the hours to two decimal places.

Expected results:

facid	name	Total Hours
0	Tennis Court 1	660.00
1	Tennis Court 2	639.00
2	Badminton Court	604.50
3	Table Tennis	415.00
4	Massage Room 1	702.00
5	Massage Room 2	114.00
6	Squash Court	552.00
7	Snooker Table	454.00
8	Pool Table	455.50

Answer:

select facs.facid, facs.name,
	trim(to_char(sum(bks.slots)/2.0, '9999999999999999D99')) as "Total Hours"
	from cd.bookings bks
	inner join cd.facilities facs
		on facs.facid = bks.facid
	group by facs.facid, facs.name
order by facs.facid;

There are a few little pieces of interest in this question. Firstly, you can see that our aggregation works just fine when we join to another table on a 1:1 basis. Also note that we group by both facs.facid and facs.name. This might seem odd: after all, since facid is the primary key of the facilities table, each facid has exactly one name, and grouping by both fields is the same as grouping by facid alone. In fact, you'll find that if you remove facs.name from the GROUP BY clause, the query works just fine: Postgres works out that this 1:1 mapping exists, and doesn't insist that we group by both columns.

Unfortunately, depending on which database system we use, validation might not be so smart, and may not realise that the mapping is strictly 1:1. That being the case, if there were multiple names for each facid and we hadn't grouped by name, the DBMS would have to choose between multiple (equally valid) choices for the name. Since this is invalid, the database system will insist that we group by both fields. In general, I recommend grouping by all columns you don't have an aggregate function on: this will ensure better cross-platform compatibility.

Next up is the division. Those of you familiar with MySQL may be aware that integer divisions are automatically cast to floats. Postgres is a little more traditional in this respect, and expects you to tell it if you want a floating point division. You can do that easily in this case by dividing by 2.0 rather than 2.
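A quick way to see the difference for yourself is to divide the same numbers with and without a non-integer operand. The column aliases below are just illustrative names:

```sql
select
    7 / 2          as integer_division,  -- both operands are integers, so the result is 3
    7 / 2.0        as numeric_division,  -- 2.0 is numeric, so the fractional part is kept
    7 / 2::numeric as cast_division;     -- an explicit cast on one operand has the same effect
```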

Finally, let's take a look at formatting. The TO_CHAR function converts values to character strings. It takes a formatting string, which we specify as (up to) lots of numbers before the decimal place, decimal place, and two numbers after the decimal place. The output of this function can be prepended with a space, which is why we include the outer TRIM function.
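To see why the TRIM is needed, the sketch below formats the same value three ways. The shorter format string and the column aliases are illustrative choices, not part of the original answer. The FM prefix is an alternative offered by Postgres that suppresses the padding inside TO_CHAR itself.

```sql
select to_char(660.0, '9999D99')       as padded,    -- left-padded with spaces for the sign and unused digits
       trim(to_char(660.0, '9999D99')) as trimmed,   -- padding stripped afterwards
       to_char(660.0, 'FM9999D00')     as fill_mode; -- FM suppresses the padding directly
```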

List each member's first booking after September 1st 2012

Produce a list of each member name, id, and their first booking after September 1st 2012. Order by member ID.

Expected results:

surname	firstname	memid	starttime
GUEST	GUEST	0	2012-09-01 08:00:00
Smith	Darren	1	2012-09-01 09:00:00
Smith	Tracy	2	2012-09-01 11:30:00
Rownam	Tim	3	2012-09-01 16:00:00
Joplette	Janice	4	2012-09-01 15:00:00
Butters	Gerald	5	2012-09-02 12:30:00
Tracy	Burton	6	2012-09-01 15:00:00
Dare	Nancy	7	2012-09-01 12:30:00
Boothe	Tim	8	2012-09-01 08:30:00
Stibbons	Ponder	9	2012-09-01 11:00:00
Owen	Charles	10	2012-09-01 11:00:00
Jones	David	11	2012-09-01 09:30:00
Baker	Anne	12	2012-09-01 14:30:00
Farrell	Jemima	13	2012-09-01 09:30:00
Smith	Jack	14	2012-09-01 11:00:00
Bader	Florence	15	2012-09-01 10:30:00
Baker	Timothy	16	2012-09-01 15:00:00
Pinker	David	17	2012-09-01 08:30:00
Genting	Matthew	20	2012-09-01 18:00:00
Mackenzie	Anna	21	2012-09-01 08:30:00
Coplin	Joan	22	2012-09-02 11:30:00
Sarwin	Ramnaresh	24	2012-09-04 11:00:00
Jones	Douglas	26	2012-09-08 13:00:00
Rumney	Henrietta	27	2012-09-16 13:30:00
Farrell	David	28	2012-09-18 09:00:00
Worthington-Smyth	Henry	29	2012-09-19 09:30:00
Purview	Millicent	30	2012-09-19 11:30:00
Tupperware	Hyacinth	33	2012-09-20 08:00:00
Hunt	John	35	2012-09-23 14:00:00
Crumpet	Erica	36	2012-09-27 11:30:00

Answer:

select mems.surname, mems.firstname, mems.memid, min(bks.starttime) as starttime
	from cd.bookings bks
	inner join cd.members mems on
		mems.memid = bks.memid
	where bks.starttime >= '2012-09-01'
	group by mems.surname, mems.firstname, mems.memid
order by mems.memid;

This answer demonstrates the use of aggregate functions on dates. MIN works exactly as you'd expect, pulling out the lowest possible date in the result set. To make this work, we need to ensure that the result set only contains dates from September onwards. We do this using the WHERE clause.

You might typically use a query like this to find a customer's next booking. You can do this by replacing the date '2012-09-01' with the function now().
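A sketch of that "next booking" variant follows. It is the same query with the fixed date swapped for now(). Note that the exercise data only covers 2012, so against that dataset it would be expected to return no rows.

```sql
-- Earliest booking from the current moment onwards, per member.
select mems.surname, mems.firstname, mems.memid, min(bks.starttime) as starttime
	from cd.bookings bks
	inner join cd.members mems
		on mems.memid = bks.memid
	where bks.starttime >= now()
	group by mems.surname, mems.firstname, mems.memid
order by mems.memid;
```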

Produce a list of member names, with each row containing the total member count

Produce a list of member names, with each row containing the total member count. Order by join date.

Expected results:

count	firstname	surname
31	GUEST	GUEST
31	Darren	Smith
31	Tracy	Smith
31	Tim	Rownam
31	Janice	Joplette
31	Gerald	Butters
31	Burton	Tracy
31	Nancy	Dare
31	Tim	Boothe
31	Ponder	Stibbons
31	Charles	Owen
31	David	Jones
31	Anne	Baker
31	Jemima	Farrell
31	Jack	Smith
31	Florence	Bader
31	Timothy	Baker
31	David	Pinker
31	Matthew	Genting
31	Anna	Mackenzie
31	Joan	Coplin
31	Ramnaresh	Sarwin
31	Douglas	Jones
31	Henrietta	Rumney
31	David	Farrell
31	Henry	Worthington-Smyth
31	Millicent	Purview
31	Hyacinth	Tupperware
31	John	Hunt
31	Erica	Crumpet
31	Darren	Smith

Answer:

select count(*) over(), firstname, surname
	from cd.members
order by joindate;

Using the knowledge we've built up so far, the most obvious answer to this is below. We use a subquery because otherwise SQL will require us to group by firstname and surname, producing a different result to what we're looking for.

select (select count(*) from cd.members) as count, firstname, surname
	from cd.members
order by joindate;

There's nothing at all wrong with this answer, but we've chosen a different approach to introduce a new concept called window functions. Window functions provide enormously powerful capabilities, in a form often more convenient than the standard aggregation functions. While this exercise is only a toy, we'll be working on more complicated examples in the near future.

Window functions operate on the result set of your (sub-)query, after the WHERE clause and all standard aggregation. They operate on a window of data. By default this is unrestricted: the entire result set, but it can be restricted to provide more useful results. For example, suppose instead of wanting the count of all members, we want the count of all members who joined in the same month as that member:

select count(*) over(partition by date_trunc('month', joindate)),
	firstname, surname
	from cd.members
order by joindate;

In this example, we partition the data by month. For each row the window function operates over, the window is any rows that have a joindate in the same month. The window function thus produces a count of the number of members who joined in that month.

You can go further. Imagine if, instead of the total number of members who joined that month, you want to know what number joinee they were that month. You can do this by adding in an ORDER BY to the window function:

select count(*) over(partition by date_trunc('month', joindate) order by joindate),
	firstname, surname
	from cd.members
order by joindate;

The ORDER BY changes the window again. Instead of the window for each row being the entire partition, the window goes from the start of the partition to the current row, and not beyond. Thus, for the first member who joins in a given month, the count is 1. For the second, the count is 2, and so on.

One final thing that's worth mentioning about window functions: you can have multiple unrelated ones in the same query. Try out the query below for an example - you'll see the numbers for the members going in opposite directions! This flexibility can lead to more concise, readable, and maintainable queries.

select count(*) over(partition by date_trunc('month', joindate) order by joindate asc),
	count(*) over(partition by date_trunc('month', joindate) order by joindate desc),
	firstname, surname
	from cd.members
order by joindate;

Window functions are extraordinarily powerful, and they will change the way you write and think about SQL. Make good use of them!

Produce a numbered list of members

Produce a monotonically increasing numbered list of members, ordered by their date of joining. Remember that member IDs are not guaranteed to be sequential.

Expected results:

row_number	firstname	surname
1	GUEST	GUEST
2	Darren	Smith
3	Tracy	Smith
4	Tim	Rownam
5	Janice	Joplette
6	Gerald	Butters
7	Burton	Tracy
8	Nancy	Dare
9	Tim	Boothe
10	Ponder	Stibbons
11	Charles	Owen
12	David	Jones
13	Anne	Baker
14	Jemima	Farrell
15	Jack	Smith
16	Florence	Bader
17	Timothy	Baker
18	David	Pinker
19	Matthew	Genting
20	Anna	Mackenzie
21	Joan	Coplin
22	Ramnaresh	Sarwin
23	Douglas	Jones
24	Henrietta	Rumney
25	David	Farrell
26	Henry	Worthington-Smyth
27	Millicent	Purview
28	Hyacinth	Tupperware
29	John	Hunt
30	Erica	Crumpet
31	Darren	Smith

Answer:

select row_number() over(order by joindate), firstname, surname
	from cd.members
order by joindate;

This exercise is a simple bit of window function practise! You could just as easily use count(*) over(order by joindate) here, so don't worry if you used that instead.

In this query, we don't define a partition, meaning that the partition is the entire dataset. Since we define an order for the window function, for any given row the window is: start of the dataset -> current row.
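One caveat with the count(*) alternative, sketched below on made-up data (the names and dates are hypothetical, not taken from the exercise schema): when an ordered window has no explicit frame, rows that tie on the ORDER BY value are treated as peers and are all included in the window. count(*) therefore repeats for tied rows, while row_number() keeps increasing.

```sql
-- Hypothetical data: 'b' and 'c' share the same joindate.
select name, joindate,
	row_number() over(order by joindate) as row_number,
	count(*)     over(order by joindate) as running_count
	from (values
		('a', date '2012-07-01'),
		('b', date '2012-07-02'),
		('c', date '2012-07-02'),
		('d', date '2012-07-03')
	) as m(name, joindate)
order by joindate, row_number;
```

With this data row_number runs 1, 2, 3, 4 while running_count runs 1, 3, 3, 4. Which of the two tied rows receives row number 2 is not determined by this ordering. The two approaches agree in the exercise only as long as join dates are unique.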

Output the facility id that has the highest number of slots booked, again

Output the facility id that has the highest number of slots booked. Ensure that in the event of a tie, all tieing results get output.

Expected results:

facid	total
4	1404

Answer:

select facid, total from (
	select facid, sum(slots) total, rank() over (order by sum(slots) desc) rank
		from cd.bookings
		group by facid
	) as ranked
	where rank = 1;

You may recall that this is a problem we've already solved in an earlier exercise. We came up with an answer something like below, which we then cut down using CTEs:

select facid, sum(slots) as totalslots
	from cd.bookings
	group by facid
	having sum(slots) = (select max(sum2.totalslots) from
		(select sum(slots) as totalslots
		from cd.bookings
		group by facid
		) as sum2);

Once we've cleaned it up, this solution is perfectly adequate. Explaining how the query works makes it seem a little odd, though - 'find the number of slots booked by the best facility. Calculate the total slots booked for each facility, and return only the rows where the slots booked are the same as for the best'. Wouldn't it be nicer to be able to say 'calculate the number of slots booked for each facility, rank them, and pick out any at rank 1'?

Fortunately, window functions allow us to do this - although it's fair to say that doing so is not trivial to the untrained eye. The first key piece of information is the existence of the RANK function. This ranks values based on the ORDER BY that is passed to it. If there's a tie for (say) second place, the next gets ranked at position 4. So, what we need to do is get the number of slots for each facility, rank them, and pick off the ones at the top rank. A first pass at this might look something like the below:

select facid, total from (
	select facid, total, rank() over (order by total desc) rank from (
		select facid, sum(slots) total
			from cd.bookings
			group by facid
		) as sumslots
	) as ranked
where rank = 1;

The inner query calculates the total slots booked, the middle one ranks them, and the outer one creams off the top ranked. We can actually tidy this up a little: recall that window functions get applied pretty late in the SELECT statement, after aggregation. That being the case, we can move the aggregation into the ORDER BY part of the window function, as shown in the approved answer.

While the window function approach isn't massively simpler in terms of lines of code, it arguably makes more semantic sense.

Rank members by (rounded) hours used

Produce a list of members, along with the number of hours they've booked in facilities, rounded to the nearest ten hours. Rank them by this rounded figure, producing output of first name, surname, rounded hours, rank. Sort by rank, surname, and first name.

Expected results:

firstname	surname	hours	rank
GUEST	GUEST	1200	1
Darren	Smith	340	2
Tim	Rownam	330	3
Tim	Boothe	220	4
Tracy	Smith	220	4
Gerald	Butters	210	6
Burton	Tracy	180	7
Charles	Owen	170	8
Janice	Joplette	160	9
Anne	Baker	150	10
Timothy	Baker	150	10
David	Jones	150	10
Nancy	Dare	130	13
Florence	Bader	120	14
Anna	Mackenzie	120	14
Ponder	Stibbons	120	14
Jack	Smith	110	17
Jemima	Farrell	90	18
David	Pinker	80	19
Ramnaresh	Sarwin	80	19
Matthew	Genting	70	21
Joan	Coplin	50	22
David	Farrell	30	23
Henry	Worthington-Smyth	30	23
John	Hunt	20	25
Douglas	Jones	20	25
Millicent	Purview	20	25
Henrietta	Rumney	20	25
Erica	Crumpet	10	29
Hyacinth	Tupperware	10	29

Answer:

select firstname, surname,
	((sum(bks.slots)+10)/20)*10 as hours,
	rank() over (order by ((sum(bks.slots)+10)/20)*10 desc) as rank

	from cd.bookings bks
	inner join cd.members mems
		on bks.memid = mems.memid
	group by mems.memid
order by rank, surname, firstname;

This answer isn't a great stretch over our previous exercise, although it does illustrate the function of RANK better. You can see that some of the clubgoers have an equal rounded number of hours booked in, and their rank is the same. If position 2 is shared between two members, the next one along gets position 4. There's a different function, DENSE_RANK , that would assign that member position 3 instead.
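The difference between the two functions is easiest to see side by side. The sketch below uses a handful of made-up rows (the names and hours are hypothetical) rather than the exercise tables.

```sql
-- 'b' and 'c' tie for second place.
select name, hours,
	rank()       over (order by hours desc) as rank,        -- 1, 2, 2, 4
	dense_rank() over (order by hours desc) as dense_rank   -- 1, 2, 2, 3
	from (values
		('a', 340),
		('b', 220),
		('c', 220),
		('d', 210)
	) as t(name, hours)
order by rank, name;
```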

It's worth noting the technique we use to do rounding here. Adding 5, dividing by 10, and multiplying by 10 has the effect (thanks to integer arithmetic cutting off fractions) of rounding a number to the nearest 10. In our case, because slots are half an hour, we need to add 10, divide by 20, and multiply by 10. One could certainly make the argument that we should do the slots -> hours conversion independently of the rounding, which would increase clarity.
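As a sketch of both approaches, the query below applies the integer-arithmetic trick from the answer and, alongside it, the "convert to hours first, then round" alternative using ROUND with a negative scale. The slot counts are made-up values chosen to sit on either side of a rounding boundary, and the column aliases are illustrative.

```sql
select slots,
	((slots + 10) / 20) * 10 as rounded_hours,     -- integer arithmetic, as in the answer
	round(slots / 2.0, -1)   as rounded_hours_alt  -- slots -> hours, then round to the nearest 10
	from (values (29), (30), (449), (450)) as t(slots);
```

For these inputs both columns should come out as 10, 20, 220, and 230. The second form states the intent more directly, at the cost of producing a numeric rather than an integer.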

Talking of clarity, this rounding malarky is starting to introduce a noticeable amount of code repetition. At this point it's a judgement call, but you may wish to factor it out using a subquery as below:

select firstname, surname, hours, rank() over (order by hours desc) as rank from
	(select firstname, surname,
		((sum(bks.slots)+10)/20)*10 as hours

		from cd.bookings bks
		inner join cd.members mems
			on bks.memid = mems.memid
		group by mems.memid
	) as subq
order by rank, surname, firstname;

Find the top three revenue generating facilities

Produce a list of the top three revenue generating facilities (including ties). Output facility name and rank, sorted by rank and facility name.

Expected results:

name	rank
Massage Room 1	1
Massage Room 2	2
Tennis Court 2	3

Answer:

select name, rank from (
	select facs.name as name, rank() over (order by sum(case
				when memid = 0 then slots * facs.guestcost
				else slots * membercost
			end) desc) as rank
		from cd.bookings bks
		inner join cd.facilities facs
			on bks.facid = facs.facid
		group by facs.name
	) as subq
	where rank <= 3
order by rank, name;

This question doesn't introduce any new concepts, and is just intended to give you the opportunity to practise what you already know. We use the CASE statement to calculate the revenue for each slot, and aggregate that on a per-facility basis using SUM . We then use the RANK window function to produce a ranking, wrap it all up in a subquery, and extract everything with a rank less than or equal to 3.

Classify facilities by value

Classify facilities into equally sized groups of high, average, and low based on their revenue. Order by classification and facility name.

Expected results:

name	revenue
Massage Room 1	high
Massage Room 2	high
Tennis Court 2	high
Badminton Court	average
Squash Court	average
Tennis Court 1	average
Pool Table	low
Snooker Table	low
Table Tennis	low

Answer:

select name, case when class = 1 then 'high'
		when class = 2 then 'average'
		else 'low'
		end revenue
	from (
		select facs.name as name, ntile(3) over (order by sum(case
				when memid = 0 then slots * facs.guestcost
				else slots * membercost
			end) desc) as class
		from cd.bookings bks
		inner join cd.facilities facs
			on bks.facid = facs.facid
		group by facs.name
	) as subq
order by class, name;

This exercise should mostly use familiar concepts, although we do introduce the NTILE window function. NTILE groups values into a passed-in number of groups, as evenly as possible. It outputs a number from 1->number of groups. We then use a CASE statement to turn that number into a label!
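To see what "as evenly as possible" means when the rows don't divide exactly, here is a small sketch that splits the integers 1 to 10 into three groups. It uses generate_series purely as a convenient source of rows, and the alias bucket is illustrative.

```sql
-- 10 rows into 3 groups: the extra row goes to the first group,
-- giving group sizes of 4, 3 and 3.
select n, ntile(3) over (order by n) as bucket
	from generate_series(1, 10) as n
order by n;
```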

Calculate the payback time for each facility

Based on the 3 complete months of data so far, calculate the amount of time each facility will take to repay its cost of ownership. Remember to take into account ongoing monthly maintenance. Output facility name and payback time in months, order by facility name. Don't worry about differences in month lengths, we're only looking for a rough value here!

Expected results:

name	months
Badminton Court	6.8317677198975235
Massage Room 1	0.18885741265344664778
Massage Room 2	1.7621145374449339
Pool Table	5.3333333333333333
Snooker Table	6.9230769230769231
Squash Court	1.1339582703356516
Table Tennis	6.4000000000000000
Tennis Court 1	2.2624434389140271
Tennis Court 2	1.7505470459518600

Answer:

select 	facs.name as name,
	facs.initialoutlay/((sum(case
			when memid = 0 then slots * facs.guestcost
			else slots * membercost
		end)/3) - facs.monthlymaintenance) as months
	from cd.bookings bks
	inner join cd.facilities facs
		on bks.facid = facs.facid
	group by facs.facid
order by name;

In contrast to all our recent exercises, there's no need to use window functions to solve this problem: it's just a bit of maths involving monthly revenue, initial outlay, and monthly maintenance. Again, for production code you might want to clarify what's going on a little here using a subquery (although since we've hard-coded the number of months, putting this into production is unlikely!). A tidied-up version might look like:

select 	name,
	initialoutlay / (monthlyrevenue - monthlymaintenance) as repaytime
	from
		(select facs.name as name,
			facs.initialoutlay as initialoutlay,
			facs.monthlymaintenance as monthlymaintenance,
			sum(case
				when memid = 0 then slots * facs.guestcost
				else slots * membercost
			end)/3 as monthlyrevenue
		from cd.bookings bks
		inner join cd.facilities facs
			on bks.facid = facs.facid
		group by facs.facid
	) as subq
order by name;

But, I hear you ask, what would an automatic version of this look like? One that didn't need to have a hard-coded number of months in it? That's a little more complicated, and involves some date arithmetic. I've factored that out into a CTE to make it a little more clear.

with monthdata as (
	select 	mincompletemonth,
		maxcompletemonth,
		(extract(year from maxcompletemonth)*12) +
			extract(month from maxcompletemonth) -
			(extract(year from mincompletemonth)*12) -
			extract(month from mincompletemonth) as nummonths
	from (
		select 	date_trunc('month',
				(select max(starttime) from cd.bookings)) as maxcompletemonth,
			date_trunc('month',
				(select min(starttime) from cd.bookings)) as mincompletemonth
	) as subq
)
select 	name,
	initialoutlay / (monthlyrevenue - monthlymaintenance) as repaytime

	from
		(select facs.name as name,
			facs.initialoutlay as initialoutlay,
			facs.monthlymaintenance as monthlymaintenance,
			sum(case
				when memid = 0 then slots * facs.guestcost
				else slots * membercost
			end)/(select nummonths from monthdata) as monthlyrevenue

			from cd.bookings bks
			inner join cd.facilities facs
				on bks.facid = facs.facid
			where bks.starttime < (select maxcompletemonth from monthdata)
			group by facs.facid
		) as subq
order by name;

This code restricts the data that goes in to complete months. It does this by selecting the maximum date, rounding down to the month, and stripping out all dates larger than that. Even this code is not completely-complete. It doesn't handle the case of a facility making a loss. Fixing that is not too hard, and is left as (another) exercise for the reader!

Calculate a rolling average of total revenue

For each day in August 2012, calculate a rolling average of total revenue over the previous 15 days. Output should contain date and revenue columns, sorted by the date. Remember to account for the possibility of a day having zero revenue. This one's a bit tough, so don't be afraid to check out the hint!

Expected results:

date	revenue
2012-08-01	1126.8333333333333333
2012-08-02	1153.0000000000000000
2012-08-03	1162.9000000000000000
2012-08-04	1177.3666666666666667
2012-08-05	1160.9333333333333333
2012-08-06	1185.4000000000000000
2012-08-07	1182.8666666666666667
2012-08-08	1172.6000000000000000
2012-08-09	1152.4666666666666667
2012-08-10	1175.0333333333333333
2012-08-11	1176.6333333333333333
2012-08-12	1195.6666666666666667
2012-08-13	1218.0000000000000000
2012-08-14	1247.4666666666666667
2012-08-15	1274.1000000000000000
2012-08-16	1281.2333333333333333
2012-08-17	1324.4666666666666667
2012-08-18	1373.7333333333333333
2012-08-19	1406.0666666666666667
2012-08-20	1427.0666666666666667
2012-08-21	1450.3333333333333333
2012-08-22	1539.7000000000000000
2012-08-23	1567.3000000000000000
2012-08-24	1592.3333333333333333
2012-08-25	1615.0333333333333333
2012-08-26	1631.2000000000000000
2012-08-27	1659.4333333333333333
2012-08-28	1687.0000000000000000
2012-08-29	1684.6333333333333333
2012-08-30	1657.9333333333333333
2012-08-31	1703.4000000000000000

Answer:

select 	dategen.date,
	(
		-- correlated subquery that, for each day fed into it,
		-- finds the average revenue for the last 15 days
		select sum(case
			when memid = 0 then slots * facs.guestcost
			else slots * membercost
		end) as rev

		from cd.bookings bks
		inner join cd.facilities facs
			on bks.facid = facs.facid
		where bks.starttime > dategen.date - interval '14 days'
			and bks.starttime < dategen.date + interval '1 day'
	)/15 as revenue
	from
	(
		-- generates a list of days in august
		select 	cast(generate_series(timestamp '2012-08-01',
			'2012-08-31', '1 day') as date) as date
	)  as dategen
order by dategen.date;

There are at least two equally good solutions to this question. I've put the simplest to write as the answer, but there's also a more flexible solution that uses window functions.

Let's look at the selected answer first. When I read SQL queries, I tend to read the SELECT part of the statement last - the FROM and WHERE parts tend to be more interesting. So, what do we have in our FROM ? A call to the GENERATE_SERIES function. This does pretty much what it says on the tin - generates a series of values. You can specify a start value, a stop value, and an increment. It works for integer types and dates - although, as you can see, we need to be explicit about what types are going into and out of the function. Try removing the casts, and seeing the result!
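For readers who would rather see the effect of the casts than experiment, here is a short sketch of GENERATE_SERIES over a three-day range, with and without the cast, plus an integer series for comparison. The shortened date range and the aliases are illustrative.

```sql
-- With the cast: one date per row.
select cast(generate_series(timestamp '2012-08-01', '2012-08-03', '1 day') as date) as date;

-- Without the cast: the same days, but as timestamps at midnight.
select generate_series(timestamp '2012-08-01', '2012-08-03', '1 day') as ts;

-- Integers work too: start, stop, step.
select generate_series(1, 10, 3) as n;   -- 1, 4, 7, 10
```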

So, we've generated a timestamp for each day in August. Now, for each day, we need to generate our average. We can do this using a correlated subquery . If you remember, a correlated subquery is a subquery that uses values from the outer query. This means that it gets executed once for each result row in the outer query. This is in contrast to an uncorrelated subquery, which only has to be executed once.

If we look at our correlated subquery, we can see that it's correlated on the dategen.date field. It produces a sum of revenue for this day and the 14 days prior to it, and then divides that sum by 15. This produces the output we're looking for!

I mentioned that there's a window function-based solution for this problem as well - you can see it below. The approach we use for this is generating a list of revenue for each day, and then using window function aggregation over that list. The nice thing about this method is that once you have the per-day revenue, you can produce a wide range of results quite easily - you might, for example, want rolling averages for the previous month, 15 days, and 5 days. This is easy to do using this method, and rather harder using conventional aggregation.

select date, avgrev from (
	-- AVG over this row and the 14 rows before it.
	-- coalesce turns days with no bookings into 0 revenue, so that
	-- they still count towards the average rather than being skipped.
	select 	dategen.date as date,
		avg(coalesce(revdata.rev, 0)) over(order by dategen.date rows 14 preceding) as avgrev
	from
		-- generate a list of days.  This ensures that a row gets generated
		-- even if the day has 0 revenue.  Note that we generate days before
		-- the start of august - this is because our window function needs
		-- to know the revenue for those days for its calculations.
		(select
			cast(generate_series(timestamp '2012-07-10', '2012-08-31', '1 day') as date) as date
		)  as dategen
		left outer join
			-- left join to a table of per-day revenue
			(select cast(bks.starttime as date) as date,
				sum(case
					when memid = 0 then slots * facs.guestcost
					else slots * membercost
				end) as rev

				from cd.bookings bks
				inner join cd.facilities facs
					on bks.facid = facs.facid
				group by cast(bks.starttime as date)
			) as revdata
			on dategen.date = revdata.date
	) as subq
	where date >= '2012-08-01'
order by date;

You'll note that we've been wanting to work out daily revenue quite frequently. Rather than inserting that calculation into all our queries, which is rather messy (and will cause us a big headache if we ever change our schema), we probably want to store that information somewhere. Your first thought might be to calculate information and store it somewhere for later use. This is a common tactic for large data warehouses, but it can cause us some problems - if we ever go back and edit our data, we need to remember to recalculate. For non-enormous-scale data like we're looking at here, we can just create a view instead. A view is essentially a stored query that looks exactly like a table. Under the covers, the DBMS just substitutes in the relevant portion of the view definition when you select data from it. They're very easy to create, as you can see below:

create or replace view cd.dailyrevenue as
	select 	cast(bks.starttime as date) as date,
		sum(case
			when memid = 0 then slots * facs.guestcost
			else slots * membercost
		end) as rev

		from cd.bookings bks
		inner join cd.facilities facs
			on bks.facid = facs.facid
		group by cast(bks.starttime as date);

You can see that this makes our query an awful lot simpler!

select date, avgrev from (
	-- coalesce makes days with no bookings count as 0 revenue
	select  dategen.date as date,
		avg(coalesce(revdata.rev, 0)) over(order by dategen.date rows 14 preceding) as avgrev
	from
		(select
			cast(generate_series(timestamp '2012-07-10', '2012-08-31', '1 day') as date) as date
		)  as dategen
		left outer join
			cd.dailyrevenue as revdata on dategen.date = revdata.date
		) as subq
	where date >= '2012-08-01'
order by date;

As well as storing frequently-used query fragments, views can be used for a variety of purposes, including restricting access to certain columns of a table.
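Returning to the earlier claim that several rolling averages are easy once per-day revenue is available, here is a minimal sketch that computes 5-day and 15-day averages in one pass. It assumes the cd.dailyrevenue view defined above has been created; the aliases avg5 and avg15 are illustrative names.

```sql
select date, avg5, avg15 from (
	select dategen.date as date,
		-- this row and the 4 rows before it
		avg(coalesce(revdata.rev, 0)) over(order by dategen.date rows 4 preceding) as avg5,
		-- this row and the 14 rows before it
		avg(coalesce(revdata.rev, 0)) over(order by dategen.date rows 14 preceding) as avg15
	from
		(select
			cast(generate_series(timestamp '2012-07-10', '2012-08-31', '1 day') as date) as date
		) as dategen
		left outer join
			cd.dailyrevenue as revdata on dategen.date = revdata.date
	) as subq
	where date >= '2012-08-01'
order by date;
```

Only the frame size differs between the two columns, which is what makes this approach easier to extend than the correlated subquery.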

Working with Timestamps

Dates/Times in SQL are a complex topic, deserving of a category of their own. They're also fantastically powerful, making it easier to work with variable-length concepts like 'months' than many programming languages.

Before getting started on this category, it's probably worth taking a look over the PostgreSQL docs page on date/time functions. You might also want to complete the aggregate functions category, since we'll use some of those capabilities in this section.

Produce a timestamp for 1 am on the 31st of August 2012

Produce a timestamp for 1 am on the 31st of August 2012.

Expected results:

timestamp
2012-08-31 01:00:00

Answer:

select timestamp '2012-08-31 01:00:00';

Here's a pretty easy question to start off with! SQL has a bunch of different date and time types, which you can peruse at your leisure over at the excellent Postgres documentation. These basically allow you to store dates, times, or timestamps (date+time).

The approved answer is the best way to create a timestamp under normal circumstances. You can also use casts to change a correctly formatted string into a timestamp, for example:

select '2012-08-31 01:00:00'::timestamp;
select cast('2012-08-31 01:00:00' as timestamp);

The former approach is a Postgres extension, while the latter is SQL-standard. You'll note that in many of our earlier questions, we've used bare strings without specifying a data type. This works because when Postgres is working with a value coming out of a timestamp column of a table (say), it knows to cast our strings to timestamps.

Timestamps can be stored with or without time zone information. We've chosen not to here, but if you like you could format the timestamp like "2012-08-31 01:00:00 +00:00", assuming UTC. Note that timestamp with time zone is a different type to timestamp - when you're declaring it, you should use TIMESTAMP WITH TIME ZONE '2012-08-31 01:00:00 +00:00'.
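A minimal sketch of the two literal forms side by side follows; the column aliases are illustrative. Bear in mind that a timestamp with time zone is displayed in the session's time zone, so the second column may not read 01:00:00 unless the session is set to UTC.

```sql
select timestamp '2012-08-31 01:00:00' as without_tz,
	timestamp with time zone '2012-08-31 01:00:00 +00:00' as with_tz;
```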

Finally, have a bit of a play around with some of the different date/time serialisations described in the Postgres docs. You'll find that Postgres is extremely flexible with the formats it accepts, although my recommendation to you would be to use the standard serialisation we've used here - you'll find it unambiguous and easy to port to other DBs.

Subtract timestamps from each other

Find the result of subtracting the timestamp '2012-07-30 01:00:00' from the timestamp '2012-08-31 01:00:00'

Expected results:

interval
32 days

Answer:

select timestamp '2012-08-31 01:00:00' - timestamp '2012-07-30 01:00:00' as interval;

Subtracting timestamps produces an INTERVAL data type. INTERVAL s are a special data type for representing the difference between two TIMESTAMP types. When subtracting timestamps, Postgres will typically give an interval in terms of days, hours, minutes, seconds, without venturing into months. This generally makes life easier, since months are of variable lengths.

One of the useful things about intervals, though, is the fact that they can encode months. Let's imagine that I want to schedule something to occur in exactly one month's time, regardless of the length of my month. To do this, I could use [timestamp] + interval '1 month' .
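The sketch below contrasts adding a month with adding a fixed 30 days. The starting timestamp is an arbitrary example chosen to fall at the end of a month, and the aliases are illustrative.

```sql
select timestamp '2012-01-31 01:00:00' + interval '1 month' as one_month_later,    -- 2012-02-29 01:00:00 (clamped to the end of February)
	timestamp '2012-01-31 01:00:00' + interval '30 days' as thirty_days_later;  -- 2012-03-01 01:00:00
```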

Intervals stand in contrast to SQL's treatment of DATE types. Dates don't use intervals - instead, subtracting two dates will return an integer representing the number of days between the two dates. You can also add integer values to dates. This is sometimes more convenient, depending on how much intelligence you require in the handling of your dates!
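For comparison with the timestamp answer above, here is the same subtraction done on DATE values, along with integer addition. The aliases are illustrative.

```sql
select date '2012-08-31' - date '2012-07-30' as days_between,  -- integer: 32
	date '2012-08-31' + 1 as next_day;                      -- date: 2012-09-01
```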

Generate a list of all the dates in October 2012

Produce a list of all the dates in October 2012. They can be output as a timestamp (with time set to midnight) or a date.

Expected results:

TS
2012-10-01 00:00:00
2012-10-02 00:00:00
2012-10-03 00:00:00
2012-10-04 00:00:00
2012-10-05 00:00:00
2012-10-06 00:00:00
2012-10-07 00:00:00
2012-10-08 00:00:00
2012-10-09 00:00:00
2012-10-10 00:00:00
2012-10-11 00:00:00
2012-10-12 00:00:00
2012-10-13 00:00:00
2012-10-14 00:00:00
2012-10-15 00:00:00
2012-10-16 00:00:00
2012-10-17 00:00:00
2012-10-18 00:00:00
2012-10-19 00:00:00
2012-10-20 00:00:00
2012-10-21 00:00:00
2012-10-22 00:00:00
2012-10-23 00:00:00
2012-10-24 00:00:00
2012-10-25 00:00:00
2012-10-26 00:00:00
2012-10-27 00:00:00
2012-10-28 00:00:00
2012-10-29 00:00:00
2012-10-30 00:00:00
2012-10-31 00:00:00

Answer:

select generate_series(timestamp '2012-10-01', timestamp '2012-10-31', interval '1 day') as ts;

One of the best features of Postgres over other DBs is a simple function called GENERATE_SERIES . This function allows you to generate a list of dates or numbers, specifying a start, an end, and an increment value. It's extremely useful for situations where you want to output, say, sales per day over the course of a month. A typical way to do that on a table containing a list of sales might be to use a SUM aggregation, grouping by the date and product type. Unfortunately, this approach has a flaw: if there are no sales for a given day, it won't show up! To make it work properly, you need to left join from a sequential list of timestamps to the aggregated data to fill in the blank spaces.
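The left join pattern described above might look like the sketch below, which counts slots booked per day in September 2012 using the exercise's cd.bookings table in place of a sales table. The aliases days, d and slots are illustrative names.

```sql
-- One row per day, even for days with no bookings at all.
select days.d as day, coalesce(sum(bks.slots), 0) as slots
	from generate_series(timestamp '2012-09-01', timestamp '2012-09-30', interval '1 day') as days(d)
	left outer join cd.bookings bks
		on bks.starttime >= days.d
		and bks.starttime < days.d + interval '1 day'
	group by days.d
order by days.d;
```

The coalesce turns the NULL produced by an unmatched day into a zero, which is usually what a report wants.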

On other database systems, it's not uncommon to keep a 'calendar table' full of dates, with which you can perform these joins. Alternatively, on some systems you can write an analogue to generate_series using recursive CTEs. Fortunately for us, Postgres makes our lives a lot easier!
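For the curious, a recursive CTE analogue of the October series might look like the sketch below. It is written in Postgres syntax; other systems differ in details (some omit the RECURSIVE keyword, for instance), so treat it as an illustration rather than portable code. The names dates and ts are illustrative.

```sql
with recursive dates(ts) as (
	-- anchor row: the first day of the month
	select timestamp '2012-10-01'
	union all
	-- recursive step: add a day until the end of the month is reached
	select ts + interval '1 day'
		from dates
		where ts < timestamp '2012-10-31'
)
select ts from dates
order by ts;
```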

Get the day of the month from a timestamp

Get the day of the month from the timestamp '2012-08-31' as an integer.

Expected results:

date_part
31

Answer:

select extract(day from timestamp '2012-08-31');

The EXTRACT function is used for getting sections of a timestamp or interval. You can get the value of any field in the timestamp as an integer.
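A few more fields, as a sketch; the timestamp, the interval and the aliases are arbitrary examples. The dow field is one that Postgres supports and may not be available elsewhere.

```sql
select extract(year from timestamp '2012-08-31 13:45:10')  as yr,           -- 2012
	extract(month from timestamp '2012-08-31 13:45:10') as mon,          -- 8
	extract(hour from timestamp '2012-08-31 13:45:10')  as hr,           -- 13
	extract(dow from timestamp '2012-08-31 13:45:10')   as day_of_week,  -- 0 = Sunday ... 6 = Saturday
	extract(day from interval '3 days 4 hours')         as interval_days; -- 3
```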

Work out the number of seconds between timestamps

Work out the number of seconds between the timestamps '2012-08-31 01:00:00' and '2012-09-02 00:00:00'

Expected results:

date_part
169200

Answer:

select extract(epoch from (timestamp '2012-09-02 00:00:00' - '2012-08-31 01:00:00'));

The above answer is a Postgres-specific trick. Extracting the epoch converts an interval or timestamp into a number of seconds, or the number of seconds since epoch (January 1st, 1970) respectively. If you want the number of minutes, hours, etc you can just divide the number of seconds appropriately.
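Dividing the epoch value gives the other units, as in this short sketch based on the same two timestamps. The aliases are illustrative.

```sql
select extract(epoch from (timestamp '2012-09-02 00:00:00' - timestamp '2012-08-31 01:00:00')) / 60 as minutes,  -- 169200 / 60 = 2820
	extract(epoch from (timestamp '2012-09-02 00:00:00' - timestamp '2012-08-31 01:00:00')) / 3600 as hours;  -- 169200 / 3600 = 47
```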

If you want to write more portable code, you will unfortunately find that you cannot use extract epoch . Instead you will need to use something like:

select 	extract(day from ts.int)*60*60*24 +
	extract(hour from ts.int)*60*60 +
	extract(minute from ts.int)*60 +
	extract(second from ts.int)
	from
		(select timestamp '2012-09-02 00:00:00' - '2012-08-31 01:00:00' as int) ts;

This is, as you can observe, rather awful. If you're planning to write cross platform SQL, I would consider having a library of common user defined functions for each DBMS, allowing you to normalise any common requirements like this. This keeps your main codebase a lot cleaner.

Work out the number of days in each month of 2012

For each month of the year in 2012, output the number of days in that month. Format the output as an integer column containing the month of the year, and a second column containing an interval data type.

Expected results:

month	length
1	31 days
2	29 days
3	31 days
4	30 days
5	31 days
6	30 days
7	31 days
8	31 days
9	30 days
10	31 days
11	30 days
12	31 days

Answer:

 select 	extract(month from cal.month) as month,
	(cal.month + interval '1 month') - cal.month as length
	from
	(
		select generate_series(timestamp '2012-01-01', timestamp '2012-12-01', interval '1 month') as month
	) cal
order by month;

This answer shows several of the concepts we've learned. We use the GENERATE_SERIES function to produce a year's worth of timestamps, incrementing a month at a time. We then use the EXTRACT function to get the month number. Finally, we subtract each timestamp from the same timestamp plus one month, which gives us the length of that month.

It's worth noting that subtracting two timestamps will always produce an interval in terms of days (or portions of a day). You won't just get an answer in terms of months or years, because the length of those time periods is variable.
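To see the difference, here is a small comparison. It uses the AGE function, which isn't used elsewhere in this guide; it is a standard Postgres function that produces a 'symbolic' result in years and months instead.

```sql
select	timestamp '2012-03-01' - timestamp '2012-01-01' as subtracted,	-- 60 days
	age(timestamp '2012-03-01', timestamp '2012-01-01') as aged;		-- 2 mons
```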

Work out the number of days remaining in the month

For any given timestamp, work out the number of days remaining in the month. The current day should count as a whole day, regardless of the time. Use '2012-02-11 01:00:00' as an example timestamp for the purposes of making the answer. Format the output as a single interval value.

Expected results:

remaining
19 days

Answer:

 select (date_trunc('month', ts.testts) + interval '1 month')
		- date_trunc('day', ts.testts) as remaining
	from (select timestamp '2012-02-11 01:00:00' as testts) ts

The star of this particular show is the DATE_TRUNC function. It does pretty much what you'd expect - truncates a date to a given minute, hour, day, month, and so on. The way we've solved this problem is to truncate our timestamp to find the month we're in, add a month to that, and subtract our timestamp. To ensure partial days get treated as whole days, the timestamp we subtract is truncated to the start of its day.

Note the way we've put the timestamp into a subquery. This isn't required, but it does mean you can give the timestamp a name, rather than having to list the literal repeatedly.
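As a quick illustration of what DATE_TRUNC does at different precisions, here is a sketch using an arbitrary example timestamp (the column aliases are just labels chosen for this example):

```sql
select	date_trunc('hour', ts.testts) as to_hour,	-- 2012-02-11 01:00:00
	date_trunc('day', ts.testts) as to_day,		-- 2012-02-11 00:00:00
	date_trunc('month', ts.testts) as to_month,	-- 2012-02-01 00:00:00
	date_trunc('year', ts.testts) as to_year	-- 2012-01-01 00:00:00
	from (select timestamp '2012-02-11 01:23:45' as testts) ts;
```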

Work out the end time of bookings

Return a list of the start and end time of the last 10 bookings (ordered by the time at which they end, followed by the time at which they start) in the system.

Expected results:

starttime	endtime
2013-01-01 15:30:00	2013-01-01 16:00:00
2012-09-30 19:30:00	2012-09-30 20:30:00
2012-09-30 19:00:00	2012-09-30 20:30:00
2012-09-30 19:30:00	2012-09-30 20:00:00
2012-09-30 19:00:00	2012-09-30 20:00:00
2012-09-30 19:00:00	2012-09-30 20:00:00
2012-09-30 18:30:00	2012-09-30 20:00:00
2012-09-30 18:30:00	2012-09-30 20:00:00
2012-09-30 19:00:00	2012-09-30 19:30:00
2012-09-30 18:30:00	2012-09-30 19:30:00

Answer:

 select starttime, starttime + slots * (interval '30 minutes') endtime
	from cd.bookings
	order by endtime desc, starttime desc
	limit 10

This question simply returns the start time for a booking, and a calculated end time which is equal to start time + (30 minutes * slots) . Note that it's perfectly okay to multiply intervals.

The other thing you'll notice is the use of order by and limit to get the last ten bookings. All this does is order the bookings by the (descending) time at which they end, and pick off the top ten.

Return a count of bookings for each month

Return a count of bookings for each month, sorted by month

Expected results:

month	count
2012-07-01 00:00:00	658
2012-08-01 00:00:00	1472
2012-09-01 00:00:00	1913
2013-01-01 00:00:00	1

Answer:

 select date_trunc('month', starttime) as month, count(*)
	from cd.bookings
	group by month
	order by month

This one is a fairly simple reuse of concepts we've seen before. We simply count the number of bookings, and aggregate by the booking's start time, truncated to the month.

Work out the utilisation percentage for each facility by month

Work out the utilisation percentage for each facility by month, sorted by name and month, rounded to 1 decimal place. Opening time is 8am, closing time is 8.30pm. You can treat every month as a full month, regardless of whether there were some dates the club was not open.

Expected results:

name	month	utilisation
Badminton Court	2012-07-01 00:00:00	23.2
Badminton Court	2012-08-01 00:00:00	59.2
Badminton Court	2012-09-01 00:00:00	76.0
Massage Room 1	2012-07-01 00:00:00	34.1
Massage Room 1	2012-08-01 00:00:00	63.5
Massage Room 1	2012-09-01 00:00:00	86.4
Massage Room 2	2012-07-01 00:00:00	3.1
Massage Room 2	2012-08-01 00:00:00	10.6
Massage Room 2	2012-09-01 00:00:00	16.3
Pool Table	2012-07-01 00:00:00	15.1
Pool Table	2012-08-01 00:00:00	41.5
Pool Table	2012-09-01 00:00:00	62.8
Pool Table	2013-01-01 00:00:00	0.1
Snooker Table	2012-07-01 00:00:00	20.1
Snooker Table	2012-08-01 00:00:00	42.1
Snooker Table	2012-09-01 00:00:00	56.8
Squash Court	2012-07-01 00:00:00	21.2
Squash Court	2012-08-01 00:00:00	51.6
Squash Court	2012-09-01 00:00:00	72.0
Table Tennis	2012-07-01 00:00:00	13.4
Table Tennis	2012-08-01 00:00:00	39.2
Table Tennis	2012-09-01 00:00:00	56.3
Tennis Court 1	2012-07-01 00:00:00	34.8
Tennis Court 1	2012-08-01 00:00:00	59.2
Tennis Court 1	2012-09-01 00:00:00	78.8
Tennis Court 2	2012-07-01 00:00:00	26.7
Tennis Court 2	2012-08-01 00:00:00	62.3
Tennis Court 2	2012-09-01 00:00:00	78.4

Answer:

 select name, month,
	-- 8am to 8.30pm is 12.5 hours, i.e. 25 half-hour slots per day
	round((100 * slots) /
		cast(
			25 * (cast((month + interval '1 month') as date)
			- cast(month as date)) as numeric), 1) as utilisation
	from (
		select facs.name as name, date_trunc('month', starttime) as month, sum(slots) as slots
			from cd.bookings bks
			inner join cd.facilities facs
				on bks.facid = facs.facid
			group by facs.facid, month
	) as inn
order by name, month

The meat of this query (the inner subquery) is really quite simple: an aggregation to work out the total number of slots used per facility per month. If you've covered the rest of this section and the category on aggregates, you likely didn't find this bit too challenging.

This query does, unfortunately, have some other complexity in it: working out the number of days in each month. We can calculate the number of days between two months by subtracting two timestamps with a month between them. This, unfortunately, gives us back an interval data type, which we can't use to do mathematics. In this case we've worked around that limitation by converting our timestamps into dates before subtracting. Subtracting date types gives us an integer number of days.

An alternative to this workaround is to convert the interval into an epoch value: that is, a number of seconds. To do this, use EXTRACT(EPOCH FROM <interval>)/(24*60*60), where <interval> is the difference between the two timestamps. This is arguably a much nicer way to do things, but is much less portable to other database systems.
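A sketch of the epoch approach in isolation, computing the number of days in the first three months of 2012. The alias months(m) is just a name chosen for this example.

```sql
select	m as month,
	-- interval -> seconds -> days; gives 31, 29 and 31 for these three months
	extract(epoch from ((m + interval '1 month') - m)) / (24 * 60 * 60) as days
	from generate_series(timestamp '2012-01-01', timestamp '2012-03-01', interval '1 month') as months(m);
```

The result is a plain number rather than an interval, so it can be used directly in arithmetic such as the utilisation calculation.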

String Operations

String operations in most RDBMSs are, arguably, needlessly painful. Fortunately, Postgres is better than most in this regard, providing strong regular expression support. This section covers basic string manipulation, use of the LIKE operator, and use of regular expressions. I also make an effort to show you some alternative approaches that work reliably in most RDBMSs. Be sure to check out Postgres' string function docs page if you're not confident about these exercises.

Anthony Molinaro's SQL Cookbook provides some excellent documentation of (difficult) cross-DBMS compliant SQL string manipulation. I'd strongly recommend his book, particularly as it's published by O'Reilly, whose ethical policy of DRM-free ebook distribution deserves rich rewards.

Format the names of members

Output the names of all members, formatted as 'Surname, Firstname'

Expected results:

name
GUEST, GUEST
Smith, Darren
Smith, Tracy
Rownam, Tim
Joplette, Janice
Butters, Gerald
Tracy, Burton
Dare, Nancy
Boothe, Tim
Stibbons, Ponder
Owen, Charles
Jones, David
Baker, Anne
Farrell, Jemima
Smith, Jack
Bader, Florence
Baker, Timothy
Pinker, David
Genting, Matthew
Mackenzie, Anna
Coplin, Joan
Sarwin, Ramnaresh
Jones, Douglas
Rumney, Henrietta
Farrell, David
Worthington-Smyth, Henry
Purview, Millicent
Tupperware, Hyacinth
Hunt, John
Crumpet, Erica
Smith, Darren

Answer:

 select surname || ', ' || firstname as name from cd.members

Building strings in SQL is similar to other languages, with the exception of the concatenation operator: ||. Some systems (like SQL Server) use +, but || is the SQL standard.
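One thing to watch for with || is NULL handling: if any operand is NULL, the whole result is NULL. The sketch below contrasts it with Postgres' CONCAT and CONCAT_WS functions, which are not part of the original answer and are not portable to every DBMS. The sample values are made up.

```sql
select	'Smith' || ', ' || null as with_operator,	-- NULL: one NULL operand wipes out the result
	concat('Smith', ', ', null) as with_concat,	-- 'Smith, ': NULL arguments are ignored
	concat_ws(', ', 'Smith', null) as with_concat_ws;	-- 'Smith': separator only between non-NULL values
```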

Find facilities by a name prefix

Find all facilities whose name begins with 'Tennis'. Retrieve all columns.

Expected results:

facid	name	membercost	guestcost	initialoutlay	monthlymaintenance
0	Tennis Court 1	5	25	10000	200
1	Tennis Court 2	5	25	8000	200

Answer:

 select * from cd.facilities where name like 'Tennis%';

The SQL LIKE operator is a highly standard way of searching for a string using basic matching. The % character matches any string, while _ matches any single character.

One point that's worth considering when you use LIKE is how it uses indexes. If you're using the 'C' locale, any LIKE string with a fixed beginning (as in our example here) can use an index. If you're using any other locale, LIKE will not use any index by default. See the Postgres documentation on operator classes (such as text_pattern_ops) for details on how to change that.
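As a rough sketch of one way to make prefix searches indexable outside the 'C' locale, you can build the index with a pattern operator class. The table facilities_demo and the index name are hypothetical, created here only so the example runs on its own.

```sql
-- Hypothetical stand-in for the facilities table.
create temporary table facilities_demo (name text);
insert into facilities_demo values ('Tennis Court 1'), ('Tennis Court 2'), ('Squash Court');

-- text_pattern_ops compares strings character by character,
-- which is what a left-anchored LIKE needs.
create index facilities_demo_name_prefix_idx
	on facilities_demo (name text_pattern_ops);

select * from facilities_demo where name like 'Tennis%';
```

Whether the planner actually chooses the index depends on table size and statistics; on a table this small it will most likely just scan it.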

Perform a case-insensitive search

Perform a case-insensitive search to find all facilities whose name begins with 'tennis'. Retrieve all columns.

Expected results:

facid	name	membercost	guestcost	initialoutlay	monthlymaintenance
0	Tennis Court 1	5	25	10000	200
1	Tennis Court 2	5	25	8000	200

Answer:

 select * from cd.facilities where upper(name) like 'TENNIS%';

There's no direct operator for case-insensitive comparison in standard SQL. Fortunately, we can take a page from many other languages' books, and simply force all values into upper case when we do our comparison. This renders case irrelevant, and gives us our result.

Alternatively, Postgres does provide the ILIKE operator, which performs case-insensitive searches. This isn't standard SQL, but it's arguably clearer.

You should realise that running a function like UPPER over a column value prevents Postgres from making use of any indexes on the column (the same is true for ILIKE). Fortunately, Postgres has got your back: rather than simply creating indexes over columns, you can also create indexes over expressions. If you created an index over UPPER(name), this query could use it quite happily.
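A minimal sketch of such an expression index. As before, facilities_upper_demo and the index name are hypothetical, used so the example is self-contained.

```sql
-- Hypothetical stand-in for the facilities table.
create temporary table facilities_upper_demo (name text);
insert into facilities_upper_demo values ('Tennis Court 1'), ('Tennis Court 2'), ('Squash Court');

-- Index the result of the expression rather than the raw column.
create index facilities_upper_demo_name_idx
	on facilities_upper_demo (upper(name));

-- The WHERE clause must use the same expression for the index to be considered.
select * from facilities_upper_demo where upper(name) like 'TENNIS%';
```

The locale caveat from the previous exercise still applies to LIKE, so outside the 'C' locale you would probably want to combine the expression with a pattern operator class.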

Find telephone numbers with parentheses

You've noticed that the club's member table has telephone numbers with very inconsistent formatting. You'd like to find all the telephone numbers that contain parentheses, returning the member ID and telephone number sorted by member ID.

Expected results:

memid	telephone
0	(000) 000-0000
3	(844) 693-0723
4	(833) 942-4710
5	(844) 078-4130
6	(822) 354-9973
7	(833) 776-4001
8	(811) 433-2547
9	(833) 160-3900
10	(855) 542-5251
11	(844) 536-8036
13	(855) 016-0163
14	(822) 163-3254
15	(833) 499-3527
20	(811) 972-1377
21	(822) 661-2898
22	(822) 499-2232
24	(822) 413-1470
27	(822) 989-8876
28	(855) 755-9876
29	(855) 894-3758
30	(855) 941-9786
33	(822) 665-5327
35	(899) 720-6978
36	(811) 732-4816
37	(822) 577-3541

Answer:

 select memid, telephone from cd.members where telephone ~ '[()]';

We've chosen to answer this using regular expressions, although Postgres does provide other string functions like POSITION that would do the job at least as well. Postgres implements POSIX regular expression matching via the ~ operator. If you've used regular expressions before, the functionality of the operator will be very familiar to you.

As an alternative, you can use the SQL standard SIMILAR TO operator. The regular expressions for this have similarities to the POSIX standard, but a lot of differences as well. Some of the most notable differences are:

- As in the LIKE operator, SIMILAR TO uses the '_' character to mean 'any character', and the '%' character to mean 'any string'.
- A SIMILAR TO expression must match the whole string, not just a substring as in POSIX regular expressions. This means that you'll typically end up bracketing an expression in '%' characters.
- The '.' character does not mean 'any character' in SIMILAR TO regexes: it's just a plain character.

The SIMILAR TO equivalent of the given answer is shown below:

 select memid, telephone from cd.members where telephone similar to '%[()]%';
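To see the whole-string rule in action, here is a small side-by-side sketch of the approaches mentioned in this exercise. The two telephone values are samples supplied inline, and the column aliases are just labels.

```sql
select	phone,
	phone ~ '[()]' as posix_match,				-- matches anywhere in the string
	phone similar to '%[()]%' as similar_to_match,		-- wrapped in % to allow surrounding text
	phone similar to '[()]' as similar_to_unwrapped,	-- false for both rows: must match the whole string
	position('(' in phone) > 0 as position_match		-- plain string function, opening bracket only
	from (values ('(844) 693-0723'), ('555-555-5555')) as t(phone);
```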

Finally, it's worth noting that regular expressions usually don't use indexes. Generally you don't want your regex to be responsible for doing heavy lifting in your query, because it will be slow. If you need fuzzy matching that works fast, consider working out if your needs can be met by full text search.

Pad zip codes with leading zeroes

The zip codes in our example dataset have had leading zeroes removed from them by virtue of being stored as a numeric type. Retrieve all zip codes from the members table, padding any zip codes less than 5 characters long with leading zeroes. Order by the new zip code.

Expected results:

zip
00000
00234
00234
04321
04321
10383
11986
23423
28563
33862
34232
43532
43533
45678
52365
54333
56754
57392
58393
64577
65332
65464
66796
68666
69302
75655
78533
80743
84923
87630
97676

Answer:

 select lpad(cast(zipcode as char(5)), 5, '0') zip from cd.members order by zip

Postgres' LPAD function is the star of this particular show. It does basically what you'd expect: allow us to produce a padded string. We need to remember to cast the zipcode to a string for it to be accepted by the LPAD function.

When inheriting an old database, it's not that unusual to find wonky decisions having been made over data types. You may wish to fix mistakes like these, but have a lot of code that would break if you changed datatypes. In that case, one option (depending on performance requirements) is to create a view over your table which presents the data in a fixed-up manner, and gradually migrate.
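A sketch of what such a view might look like. The table members_zip_demo and the view members_zip_fixed are hypothetical names, and the table is a cut-down stand-in for cd.members.

```sql
-- Hypothetical stand-in for the members table, with the zip code stored as a number.
create temporary table members_zip_demo (memid integer, zipcode integer);
insert into members_zip_demo values (1, 4321), (2, 234), (3, 45678);

-- The view presents the zip code as a properly padded string.
create temporary view members_zip_fixed as
	select memid, lpad(zipcode::text, 5, '0') as zipcode
		from members_zip_demo;

select * from members_zip_fixed order by zipcode;
```

Old code keeps reading the table, while new code reads the view; once everything has moved over, the underlying column type can be changed.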

Count the number of members whose surname starts with each letter of the alphabet

You'd like to produce a count of how many members you have whose surname starts with each letter of the alphabet. Sort by the letter, and don't worry about printing out a letter if the count is 0.

Expected results:

letter	count
B	5
C	2
D	1
F	2
G	2
H	1
J	3
M	1
O	1
P	2
R	2
S	6
T	2
W	1

Answer:

 select substr(mems.surname, 1, 1) as letter, count(*) as count
    from cd.members mems
    group by letter
    order by letter

This exercise is fairly straightforward. You simply need to retrieve the first letter of the member's surname, and do some basic aggregation to achieve a count. We use the SUBSTR function here, but there's a variety of other ways you can achieve the same thing. The LEFT function, for example, returns you the first n characters from the left of the string. Alternatively, you could use the SUBSTRING function, which allows you to use regular expressions to extract a portion of the string.
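The three approaches mentioned above can be compared directly. This sketch uses two sample surnames supplied inline; all three expressions return the first letter.

```sql
select	surname,
	substr(surname, 1, 1) as with_substr,		-- start position 1, length 1
	left(surname, 1) as with_left,			-- first n characters
	substring(surname from '^.') as with_regex	-- POSIX regex: first character of the string
	from (values ('Smith'), ('Owen')) as t(surname);
```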

One point worth noting: as you can see, string functions in SQL are based on 1-indexing, not the 0-indexing that you're probably used to. This will likely trip you up once or twice before you get used to it :-)

Clean up telephone numbers

The telephone numbers in the database are very inconsistently formatted. You'd like to print a list of member ids and numbers that have had '-','(',')', and ' ' characters removed. Order by member id.

Expected results:

memid	telephone
0	0000000000
1	5555555555
2	5555555555
3	8446930723
4	8339424710
5	8440784130
6	8223549973
7	8337764001
8	8114332547
9	8331603900
10	8555425251
11	8445368036
12	8440765141
13	8550160163
14	8221633254
15	8334993527
16	8339410824
17	8114096734
20	8119721377
21	8226612898
22	8224992232
24	8224131470
26	8445368036
27	8229898876
28	8557559876
29	8558943758
30	8559419786
33	8226655327
35	8997206978
36	8117324816
37	8225773541

Answer:

 select memid, translate(telephone, '-() ', '') as telephone
    from cd.members
    order by memid;

The most direct solution is probably the TRANSLATE function, which can be used to replace characters in a string. You pass it three strings: the value you want altered, the characters to replace, and the characters you want them replaced with. In our case, we want all the characters deleted, so our third parameter is an empty string.

As is often the way with strings, we can also use regular expressions to solve our problem. The REGEXP_REPLACE function provides what we're looking for: we simply pass a regex that matches all non-digit characters, and replace them with nothing, as shown below. The 'g' flag tells the function to replace as many instances of the pattern as it can find. This solution is perhaps more robust, as it cleans out more bad formatting.

 select memid, regexp_replace(telephone, '[^0-9]', '', 'g') as telephone
    from cd.members
    order by memid;

Making automated use of free-formatted text data can be a chore. Ideally you want to avoid having to constantly write code to clean up the data before using it, so you should consider having your database enforce correct formatting for you. You can do this using a CHECK constraint on your column, which allows you to reject any poorly-formatted entry. It's tempting to perform this kind of validation in the application layer, and this is certainly a valid approach. As a general rule, if your database is getting used by multiple applications, favour pushing more of your checks down into the database to ensure consistent behaviour between the apps.
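A minimal sketch of such a constraint. The table members_checked is hypothetical, and the rule (exactly ten digits) is an assumption chosen to match the cleaned-up numbers above rather than anything the dataset enforces.

```sql
-- Hypothetical table that only accepts telephone numbers made of exactly ten digits.
create temporary table members_checked (
	memid integer primary key,
	telephone varchar(20) not null
		check (telephone ~ '^[0-9]{10}$')
);

-- Accepted: already in the expected format.
insert into members_checked values (3, '8446930723');

-- Rejected with a check constraint violation if uncommented:
-- insert into members_checked values (4, '(833) 942-4710');
```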

Occasionally, adding a constraint isn't feasible. You may, for example, have two different legacy applications asserting differently formatted information. If you're unable to alter the applications, you have a couple of options to consider. Firstly, you can define a trigger on your table. This allows you to intercept data before (or after) it gets asserted to your table, and normalise it into a single format. Alternatively, you could build a view over your table that cleans up information on the fly, as it's read out. Newer applications can read from the view and benefit from more reliably formatted information.
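A sketch of the trigger option. The table members_trigger_demo, the function normalise_telephone and the trigger name are all hypothetical. The trigger strips non-digits before each row is stored, reusing the REGEXP_REPLACE call from the answer above.

```sql
-- Hypothetical stand-in for the members table.
create temporary table members_trigger_demo (memid integer, telephone varchar(20));

-- Trigger function: rewrite the incoming row before it is stored.
create or replace function normalise_telephone() returns trigger
language plpgsql
as $$
begin
	new.telephone := regexp_replace(new.telephone, '[^0-9]', '', 'g');
	return new;
end;
$$;

-- "execute function" needs PostgreSQL 11 or later; older versions spell it "execute procedure".
create trigger members_normalise_telephone
	before insert or update on members_trigger_demo
	for each row execute function normalise_telephone();

insert into members_trigger_demo values (3, '(844) 693-0723');
select * from members_trigger_demo;	-- telephone is stored as 8446930723
```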

Recursive Queries

Common Table Expressions allow us to, effectively, create our own temporary tables for the duration of a query - they're largely a convenience to help us make more readable SQL. Using the WITH RECURSIVE modifier, however, it's possible for us to create recursive queries. This is enormously advantageous for working with tree and graph-structured data - imagine retrieving all of the relations of a graph node to a given depth, for example.

This category shows you some basic recursive queries that are possible using our dataset.

Find the upward recommendation chain for member ID 27

Find the upward recommendation chain for member ID 27: that is, the member who recommended them, and the member who recommended that member, and so on. Return member ID, first name, and surname. Order by descending member id.

Expected results:

recommender	firstname	surname
20	Matthew	Genting
5	Gerald	Butters
1	Darren	Smith

Answer:

with recursive recommenders(recommender) as (
	select recommendedby from cd.members where memid = 27
	union all
	select mems.recommendedby
		from recommenders recs
		inner join cd.members mems
			on mems.memid = recs.recommender
)
select recs.recommender, mems.firstname, mems.surname
	from recommenders recs
	inner join cd.members mems
		on recs.recommender = mems.memid
order by memid desc

WITH RECURSIVE is a fantastically useful piece of functionality that many developers are unaware of. It allows you to perform queries over hierarchies of data, which is very difficult by other means in SQL. Such scenarios often leave developers resorting to multiple round trips to the database system.

You've seen WITH before. The Common Table Expressions (CTEs) defined by WITH give you the ability to produce inline views over your data. This is normally just a syntactic convenience, but the RECURSIVE modifier adds the ability to join against results already produced to produce even more. A recursive WITH takes the basic form of:

WITH RECURSIVE NAME(columns) as (
	< initial statement >
	UNION ALL 
	< recursive statement >
)

The initial statement populates the initial data, and then the recursive statement runs repeatedly to produce more. Each step of the recursion can access the CTE, but it sees within it only the data produced by the previous iteration. It repeats until an iteration produces no additional data.

The most simple example of a recursive WITH might look something like this:

with recursive increment(num) as (
	select 1
	union all
	select increment.num + 1 from increment where increment.num < 5
)
select * from increment;

The initial statement produces '1'. The first iteration of the recursive statement sees this as the content of increment, and produces '2'. The next iteration sees the content of increment as '2', and so on. Execution terminates when the recursive statement produces no additional data.

With the basics out of the way, it's fairly easy to explain our answer here. The initial statement gets the ID of the person who recommended the member we're interested in. The recursive statement takes the results of the initial statement, and finds the ID of the person who recommended them. This value gets forwarded on to the next iteration, and so on.

Now that we've constructed the recommenders CTE, all our main SELECT statement has to do is get the member IDs from recommenders, and join to the members table to find out their names.

Find the downward recommendation chain for member ID 1

Find the downward recommendation chain for member ID 1: that is, the members they recommended, the members those members recommended, and so on. Return member ID and name, and order by ascending member id.

Expected results:

memid	firstname	surname
4	Janice	Joplette
5	Gerald	Butters
7	Nancy	Dare
10	Charles	Owen
11	David	Jones
14	Jack	Smith
20	Matthew	Genting
21	Anna	Mackenzie
26	Douglas	Jones
27	Henrietta	Rumney

Answer:

with recursive recommendeds(memid) as (
	select memid from cd.members where recommendedby = 1
	union all
	select mems.memid
		from recommendeds recs
		inner join cd.members mems
			on mems.recommendedby = recs.memid
)
select recs.memid, mems.firstname, mems.surname
	from recommendeds recs
	inner join cd.members mems
		on recs.memid = mems.memid
order by memid

This is a pretty minor variation on the previous question. The essential difference is that we're now heading in the opposite direction. One interesting point to note is that unlike the previous example, this CTE produces multiple rows per iteration, by virtue of the fact that we're heading down the recommendation tree (following all branches) rather than up it.
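If you only want to follow the tree down to a given depth, one approach is to carry a depth counter through the recursion. The sketch below is not part of the original answer: it uses a small, made-up subset of recommendation data supplied inline, and the names refs, chain and depth are assumptions.

```sql
with recursive refs(memid, recommendedby) as (
	-- Hypothetical sample: 1 recommended 4 and 5, 5 recommended 20, 20 recommended 27.
	values (1, null::integer), (4, 1), (5, 1), (20, 5), (27, 20)
),
chain(memid, depth) as (
	select memid, 1 from refs where recommendedby = 1
	union all
	select refs.memid, chain.depth + 1
		from chain
		inner join refs
			on refs.recommendedby = chain.memid
		where chain.depth < 2	-- stop descending after two levels
)
select memid, depth from chain order by depth, memid;
```

With this sample data the query returns members 4 and 5 at depth 1 and member 20 at depth 2. Member 27 sits at depth 3, so it is never reached.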

Produce a CTE that can return the upward recommendation chain for any member

Produce a CTE that can return the upward recommendation chain for any member. You should be able to select recommender from recommenders where member=x. Demonstrate it by getting the chains for members 12 and 22. Results table should have member and recommender, ordered by member ascending, recommender descending.

Expected results:

member	recommender	firstname	surname
12	9	Ponder	Stibbons
12	6	Burton	Tracy
22	16	Timothy	Baker
22	13	Jemima	Farrell

Answer:

with recursive recommenders(recommender, member) as (
	select recommendedby, memid
		from cd.members
	union all
	select mems.recommendedby, recs.member
		from recommenders recs
		inner join cd.members mems
			on mems.memid = recs.recommender
)
select recs.member member, recs.recommender, mems.firstname, mems.surname
	from recommenders recs
	inner join cd.members mems
		on recs.recommender = mems.memid
	where recs.member = 22 or recs.member = 12
order by recs.member asc, recs.recommender desc

This question requires us to produce a CTE that can calculate the upward recommendation chain for any user. Most of the complexity of working out the answer is in realising that we now need our CTE to produce two columns: one to contain the member we're asking about, and another to contain the members in their recommendation tree. Essentially what we're doing is producing a table that flattens out the recommendation hierarchy.

Since we're looking to produce the chain for every user, our initial statement needs to select data for each user: their ID and who recommended them. Subsequently, we want to pass the member field through each iteration without changing it, while getting the next recommender. You can see that the recursive part of our statement hasn't really changed, except to pass through the 'member' field.
