12-8
题目:统计每个学校的答过题的用户的平均答题数
咋一看无从下手,其实很简单。每个学校,则说明按学校分组,平均答题数,则是一个学校的所有学生的答题总数/学生总数。注意前缀,我加了一个学校的,那么按学校分组也是刚刚好,注意一点就是学生总数需要去重,题目总数不必去重
1 2 3 4 |
select university, count(question_id) / count(distinct(q.device_id)) as asavg_answer_cnt from user_profile u join question_practice_detail q on u.device_id = q.device_id group by university |
题目:统计每个学校各难度的用户平均刷题数
从题目分析,这是多分组条件,即根据学校和题目难度两个条件分组,然后求用户平均刷题数,也就是用户的回答题目总数/用户总数。用户总数需要去重
1 2 3 4 5 |
select university,difficult_level, count(answer_cnt) / count(DISTINCT u.device_id) as avg_answer_cnt from user_profile u,question_practice_detail qpd,question_detail qd where u.device_id = qpd.device_id and qpd.question_id = qd.question_id group by university,difficult_level |
这里,三张表连接join我不太会了,参考一下下面的表连接join查询方式
1 2 3 4 5 6 7 8 9 10 11 12 |
SELECT university, difficult_level, count(q.question_id )/count( distinct(q.device_id)) avg_answer_cnt FROM user_profile u JOIN question_practice_detail q ON u.device_id = q.device_id JOIN question_detail qd ON q.question_id = qd.question_id GROUP BY university, difficult_level; |
题目:统计每个用户的平均刷题数
现在做这题目也就是得心应手了,山东大学的用户在不同难度下的平均答题题目数嘛,分组嘛,根据难度分,where过滤山东大学,平均答题数则是总答题数/总人数
1 2 3 4 5 6 7 |
select university,difficult_level, count(qpd.question_id) / count(distinct qpd.device_id) avg_answer_cnt from user_profile u join question_practice_detail qpd on u.device_id = qpd.device_id join question_detail qd on qpd.question_id = qd.question_id where university = "山东大学" group by difficult_level |
12-10
题目:计算25岁以上和以下的用户数量
这里,考了一个新知识点,之前我是没见过的:
条件函数if。if(x=n,a,b)表示如果x=n,则返回a,否则就是b了。
1 2 3 4 |
select age_cut,count(device_id)as number from(Select if(age>=25,'25岁及以上','25岁以下' )as age_cut,device_id From user_profile)u2 Group by age_cut |
但仔细分析题目,就是有一个明显特点,就是我可以改成两句,查询!没错,所有也可以使用联合查询来实现
1 2 3 4 5 6 7 |
select '25岁以下' as age_cut,count(device_id) as number from user_profile where age<25 or age is null union all select '25岁及以上' as age_cut,count(device_id) as number from user_profile where age>=25; |
题目:查看不同年龄段的用户明细
使用case来查询即可,掌握case 的用法
1 2 3 4 5 6 7 8 |
SELECT device_id,gender, (CASE WHEN age >= 25 THEN '25岁及以上' WHEN age BETWEEN 20 AND 24 THEN '20-24岁' WHEN age < 20 THEN '20岁以下' ELSE '其他' END) AS age_cut FROM user_profile; |
题目:计算用户8月每天的练题数量
mysql中的日期是可以分隔的,可能你看到了2021-08-15,陷入了深思,如何分组根据2021-08.其实,mysql早就想好了,可以只查询8月份的,然后分组
1 2 3 4 |
Select day(date) as day, count(question_id) as question_cnt From question_practice_detail Where year(date)=2021 and month(date)=08 Group by day |
题目:计算用户的平均次日留存率
太难,放个答案。思路是计算前一天的用户刷题数量,当天的用户刷题数量,然后相除
1 2 3 4 5 6 7 8 9 10 11 |
select avg(if(b.device_id is not null,1,0)) as avg_ret from (select distinct device_id,date from question_practice_detail )a left join ( select distinct device_id,date_sub(date,interval 1 day) as date from question_practice_detail )b on a.device_id = b.device_id and a.date = b.date |
题目:统计每种性别的人数
1、LOCATE(substr , str ):返回子串 substr 在字符串 str 中第一次出现的位置,如果字符substr在字符串str中不存在,则返回0;
2、POSITION(substr IN str ):返回子串 substr 在字符串 str 中第一次出现的位置,如果字符substr在字符串str中不存在,与LOCATE函数作用相同;
3、LEFT(str, length):从左边开始截取str,length是截取的长度;
4、RIGHT(str, length):从右边开始截取str,length是截取的长度;
5、SUBSTRING_INDEX(str ,substr ,n):返回字符substr在str中第n次出现位置之前的字符串;
6、SUBSTRING(str ,n ,m):返回字符串str从第n个字符截取到第m个字符;
7、REPLACE(str, n, m):将字符串str中的n字符替换成m字符;
8、LENGTH(str):计算字符串str的长度。
1 2 3 |
select substring_index(profile,',',-1) as gender,count(device_id) from user_submit group by gender |
1-4
题目:截取年龄
考察字符串的练习,对函数substring_index和函数substring的掌握
1 2 3 |
select substring_index(SUBSTRING_INDEX(profile,",",3),",",-1) as age,count(device_id) FROM user_submit group by age |
题目:找出每个学校GPA最低的同学
如果只是找出gpq最低的,那么很简单,直接分组,select min(gpa)就行了,但是现在还要关联学校,我之前写的一个答案是错的:
错误的:
1 2 3 4 5 6 7 8 9 |
select device_id,university,gpa FROM user_profile WHERE gpa IN( select min(gpa) from user_profile group by university ) GROUP BY university #保证学校名字不重复 order BY university #保证与题目要求输出一致 |
这个错在哪里?我就不多说了,你肯定需要对某一行数据进行关联,看下面的正确答案
正确的:
1 2 3 4 5 6 7 8 9 10 |
SELECT device_id, university, gpa FROM user_profile u WHERE gpa = ( SELECT MIN( gpa ) FROM user_profile WHERE university = u.university ) ORDER BY university |
题目:统计复旦用户8月练题情况
表连接,然后筛选即可
1 2 3 4 5 6 7 8 9 10 11 12 13 |
SELECT u.device_id, university, SUM(IF(result IS NOT NULL, 1, 0)) AS questino_cnt, SUM(IF(result = "right", 1, 0)) AS right_question_cnt FROM user_profile u LEFT JOIN question_practice_detail q ON u.device_id = q.device_id AND MONTH(q.`date`) = "08" WHERE university = "复旦大学" GROUP BY u.device_id; |
题目:浙大不同难度题目的正确率
很难,但是得去想,我先把正确率求出来吧?不看浙大学校,但我遇到了困难
如何对一个字段中进行统计啊?这里我只会拿到result中的数量,我不会拿到当他等于right的数量,然后相除得到正确率。
1 2 3 4 |
SELECT difficult_level,COUNT(result) FROM question_detail as qd,question_practice_detail as qpd WHERE qd.question_id = qpd.question_id GROUP BY difficult_level |
后来,我会了,原来是使用if关键字….我会了,可她已经不在了…啊偏题了,拿到正确率,接下来就是绑定浙江大学
1 2 3 4 |
SELECT difficult_level,sum(if(result = 'right', 1, 0)) / count(*) as correct_rate FROM question_detail as qd,question_practice_detail as qpd WHERE qd.question_id = qpd.question_id GROUP BY difficult_level |
最后很简单了,只需要关联浙江大学,但是tmd结果还是错了,
1 2 3 4 5 6 7 8 9 10 11 12 |
SELECT difficult_level,sum(if(result = 'right', 1, 0)) / count(*) as correct_rate FROM question_detail as qd, question_practice_detail as qpd, user_profile as up WHERE qd.question_id = qpd.question_id and up.device_id = qpd.device_id and university = "浙江大学" GROUP BY difficult_level |
别慌,题目么看清,加个升序就行了。
总结,对于这种难sql,可以先写一部分拆分,但是有些知识不会,怎么拆都写不出来,比如先前的if,就得记住,得会。拆完之后,绑定浙江大学就十分的轻松了
题目:21年8月份练题总数
1 2 |
select count(DISTINCT device_id),count(question_id) from question_practice_detail where date like "2021-08%" |
到此,基本练习的题目都写完了,加油!