14 August 2019

Tough SQL Problems: Comprehensive Guide with Explanations for Interviews

SQL (Structured Query Language) is a critical skill for data professionals, including data analysts, data scientists, and database administrators. In interviews, SQL questions can range from basic queries to complex problems that test your understanding of database concepts and your ability to write efficient queries. This comprehensive guide covers some tough SQL problems, their solutions, and detailed explanations to help you prepare for your next interview.

1. Finding the Nth Highest Salary

One of the classic SQL problems is finding the Nth highest salary from a table of employees.

Problem

Given a table Employees with columns id and salary, write a query to find the Nth highest salary.

Solution

SELECT DISTINCT salary
FROM Employees
ORDER BY salary DESC
LIMIT 1 OFFSET N-1;

Explanation

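This query sorts the salaries in descending order with ORDER BY. The DISTINCT keyword collapses duplicate salaries, so each salary value counts once. OFFSET N-1 skips the N-1 highest distinct salaries and LIMIT 1 returns the next one, which is the Nth highest. N is a placeholder: substitute a number (for example, OFFSET 2 for the third highest), since some databases, MySQL among them, do not accept an arithmetic expression after OFFSET. If the table has fewer than N distinct salaries, the query returns no rows.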
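The query above can be tried out with a minimal sketch like the following. It assumes Python's built-in sqlite3 module and a few made-up rows; the helper name nth_highest_salary is hypothetical. The offset is computed in Python and bound as a parameter instead of being written as N-1 inside the SQL.

```python
import sqlite3


def nth_highest_salary(conn, n):
    """Return the n-th highest distinct salary, or None if there are fewer than n."""
    row = conn.execute(
        "SELECT DISTINCT salary FROM Employees "
        "ORDER BY salary DESC LIMIT 1 OFFSET ?",
        (n - 1,),  # skip the n-1 highest distinct salaries
    ).fetchone()
    return row[0] if row else None


# Hypothetical sample data: the salary 300 appears twice on purpose.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (id INTEGER PRIMARY KEY, salary INTEGER)")
conn.executemany(
    "INSERT INTO Employees (id, salary) VALUES (?, ?)",
    [(1, 100), (2, 300), (3, 300), (4, 200)],
)

second = nth_highest_salary(conn, 2)  # 200, because DISTINCT collapses the two 300s
fourth = nth_highest_salary(conn, 4)  # None, only three distinct salaries exist
print(second, fourth)
```

Returning None when the offset runs past the last row is a choice made for this sketch; the SQL itself simply returns an empty result.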

2. Finding Duplicates in a Table

Another common problem is identifying duplicate records in a table.

Problem

Given a table Users with columns id and email, write a query to find duplicate email addresses.

Solution

SELECT email, COUNT(*)
FROM Users
GROUP BY email
HAVING COUNT(*) > 1;

Explanation

This query groups the records by the email column and counts the number of occurrences of each email. The HAVING clause filters the results to include only those groups with a count greater than one, indicating duplicate email addresses.
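As a quick check of the grouping logic, here is a small sketch that runs the same query through Python's built-in sqlite3 module. The sample addresses are invented for illustration, and the occurrences alias is an addition.

```python
import sqlite3

# Hypothetical sample data: one address appears three times.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO Users (id, email) VALUES (?, ?)",
    [
        (1, "a@example.com"),
        (2, "b@example.com"),
        (3, "a@example.com"),
        (4, "c@example.com"),
        (5, "a@example.com"),
    ],
)

# Group by email and keep only the groups that contain more than one row.
duplicates = conn.execute(
    "SELECT email, COUNT(*) AS occurrences "
    "FROM Users GROUP BY email HAVING COUNT(*) > 1"
).fetchall()
print(duplicates)  # [('a@example.com', 3)]
```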

3. Finding Employees with Salaries Greater Than Their Managers

This problem is solved with a self-join.

Problem

Given a table Employees with columns id, name, salary, and manager_id, write a query to find employees whose salary is greater than their manager's salary.

Solution

SELECT e1.name
FROM Employees e1
JOIN Employees e2 ON e1.manager_id = e2.id
WHERE e1.salary > e2.salary;

Explanation

This query uses a self-join to compare each employee's salary with their manager's salary. The JOIN clause joins the table Employees with itself based on the manager_id and id columns. The WHERE clause filters the results to include only those employees whose salary is greater than their manager's salary.
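To see the self-join at work, the sketch below runs it with Python's built-in sqlite3 module on a four-person hierarchy. The names and salaries are made up, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Hypothetical hierarchy: Ana manages Ben and Cy, and Ben manages Di.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees ("
    "id INTEGER PRIMARY KEY, name TEXT, salary INTEGER, manager_id INTEGER)"
)
conn.executemany(
    "INSERT INTO Employees (id, name, salary, manager_id) VALUES (?, ?, ?, ?)",
    [(1, "Ana", 100, None), (2, "Ben", 120, 1), (3, "Cy", 90, 1), (4, "Di", 130, 2)],
)

# e1 is the employee and e2 is that employee's manager.
rows = conn.execute(
    "SELECT e1.name FROM Employees e1 "
    "JOIN Employees e2 ON e1.manager_id = e2.id "
    "WHERE e1.salary > e2.salary "
    "ORDER BY e1.id"
).fetchall()
higher_paid = [name for (name,) in rows]
print(higher_paid)  # ['Ben', 'Di']
```

Ana has no manager (manager_id is NULL), so the inner join drops her row without any extra filter.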

4. Finding the Second Highest Salary Without Using LIMIT

Finding the second highest salary can also be done using a subquery.

Problem

Given a table Employees with columns id and salary, write a query to find the second highest salary without using the LIMIT clause.

Solution

SELECT MAX(salary)
FROM Employees
WHERE salary < (SELECT MAX(salary) FROM Employees);

Explanation

The subquery finds the highest salary. The outer query then takes the maximum of the salaries that are strictly lower than it, which is the second highest salary. If every row has the same salary, no row passes the filter and the query returns NULL.
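The behaviour with ties and with too few distinct salaries is easy to check. This is a sketch using Python's built-in sqlite3 module; the helper second_highest_salary and its input lists are hypothetical.

```python
import sqlite3


def second_highest_salary(salaries):
    """Load the given salaries into a throwaway table and run the query on it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Employees (id INTEGER PRIMARY KEY, salary INTEGER)")
    conn.executemany(
        "INSERT INTO Employees (salary) VALUES (?)", [(s,) for s in salaries]
    )
    (result,) = conn.execute(
        "SELECT MAX(salary) FROM Employees "
        "WHERE salary < (SELECT MAX(salary) FROM Employees)"
    ).fetchone()
    conn.close()
    return result  # SQL NULL arrives in Python as None


with_ties = second_highest_salary([100, 300, 300, 200])  # 200
all_equal = second_highest_salary([300, 300])            # None
print(with_ties, all_equal)
```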

5. Ranking Employees by Salary

Ranking employees by their salary is a common problem that can be solved using window functions.

Problem

Given a table Employees with columns id, name, and salary, write a query to rank employees by their salary.

Solution

SELECT id, name, salary,
RANK() OVER (ORDER BY salary DESC) as salary_rank
FROM Employees;

Explanation

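This query uses the RANK() window function to assign a rank to each employee based on their salary in descending order. The OVER clause specifies the ordering of the rows. Employees with equal salaries share a rank, and RANK() then skips the following rank (1, 1, 3); DENSE_RANK() assigns consecutive ranks instead (1, 1, 2).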
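How RANK() treats ties is easiest to see next to DENSE_RANK(). The following sketch uses Python's built-in sqlite3 module with made-up rows and assumes the bundled SQLite is version 3.25 or newer, the first release with window functions. The dense_salary_rank column is an addition for comparison.

```python
import sqlite3

# Hypothetical sample data: Ana and Ben share the top salary.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO Employees (id, name, salary) VALUES (?, ?, ?)",
    [(1, "Ana", 300), (2, "Ben", 300), (3, "Cy", 200), (4, "Di", 100)],
)

ranked = conn.execute(
    """
    SELECT name,
           RANK()       OVER (ORDER BY salary DESC) AS salary_rank,
           DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_salary_rank
    FROM Employees
    ORDER BY salary DESC, id
    """
).fetchall()

for row in ranked:
    print(row)
# ('Ana', 1, 1)
# ('Ben', 1, 1)
# ('Cy', 3, 2)   RANK skips 2 after the tie, DENSE_RANK does not
# ('Di', 4, 3)
```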

6. Finding the Department with the Highest Average Salary

This problem involves grouping data and calculating averages.

Problem

Given a table Employees with columns id, name, salary, and department_id, and a table Departments with columns id and name, write a query to find the department with the highest average salary.

Solution

SELECT d.name
FROM Departments d
JOIN Employees e ON d.id = e.department_id
GROUP BY d.id, d.name
ORDER BY AVG(e.salary) DESC
LIMIT 1;

Explanation

This query joins the Departments and Employees tables on the department ID. It groups the rows by department, using d.id as well as d.name so that two departments that happen to share a name are not merged, and calculates the average salary for each group. Finally, it orders the groups by average salary in descending order and keeps the first row, which is the department with the highest average salary. If several departments tie for the highest average, LIMIT 1 returns only one of them.
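A small end-to-end run makes the join, the grouping and the ordering concrete. This sketch uses Python's built-in sqlite3 module; the department names and salaries are invented. Grouping by d.id as well as d.name and returning the average alongside the name are choices made here so that same-named departments stay separate and the result is easy to inspect.

```python
import sqlite3

# Hypothetical sample data: Eng averages 150, Sales averages 180.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Departments (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE Employees ("
    "id INTEGER PRIMARY KEY, name TEXT, salary INTEGER, department_id INTEGER)"
)
conn.executemany(
    "INSERT INTO Departments (id, name) VALUES (?, ?)", [(1, "Eng"), (2, "Sales")]
)
conn.executemany(
    "INSERT INTO Employees (id, name, salary, department_id) VALUES (?, ?, ?, ?)",
    [(1, "Ana", 100, 1), (2, "Ben", 200, 1), (3, "Cy", 180, 2)],
)

top_department = conn.execute(
    """
    SELECT d.name, AVG(e.salary) AS avg_salary
    FROM Departments d
    JOIN Employees e ON d.id = e.department_id
    GROUP BY d.id, d.name
    ORDER BY avg_salary DESC
    LIMIT 1
    """
).fetchone()
print(top_department)  # ('Sales', 180.0)
```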

7. Finding Consecutive Days of Attendance

This problem involves using window functions to identify patterns in data.

Problem

Given a table Attendance with columns employee_id and date, write a query to find all employees who have attended for three consecutive days or more.

Solution

WITH RankedAttendance AS (
    SELECT employee_id, date,
    ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY date) as row_num
    FROM Attendance
)
SELECT employee_id, MIN(date) as start_date, MAX(date) as end_date, COUNT(*) as consecutive_days
FROM RankedAttendance
GROUP BY employee_id, DATEADD(DAY, -row_num, date)
HAVING COUNT(*) >= 3;

Explanation

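This query first uses a CTE (Common Table Expression) to number each employee's attendance records in date order. Within a run of consecutive days, the date and the row number both increase by one per row, so subtracting the row number from the date gives the same value for every row in the run. Grouping by that value therefore isolates each run, and the HAVING clause keeps only the runs of three or more days. DATEADD as written is SQL Server syntax, so other databases need their own date arithmetic in its place, and the approach assumes at most one attendance row per employee per day.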
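The grouping trick can be checked on a tiny data set. SQLite has no DATEADD, so the sketch below, which uses Python's built-in sqlite3 module, swaps in julianday(date) - row_num as the grouping key; that substitution, the ISO-formatted text dates and the sample rows are all assumptions made for illustration. It also assumes SQLite 3.25 or newer for window functions.

```python
import sqlite3

# Hypothetical sample data: employee 1 attends Jan 1-3 and Jan 5,
# employee 2 attends Jan 1 and Jan 3-4.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Attendance (employee_id INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO Attendance (employee_id, date) VALUES (?, ?)",
    [
        (1, "2024-01-01"), (1, "2024-01-02"), (1, "2024-01-03"), (1, "2024-01-05"),
        (2, "2024-01-01"), (2, "2024-01-03"), (2, "2024-01-04"),
    ],
)

streaks = conn.execute(
    """
    WITH RankedAttendance AS (
        SELECT employee_id, date,
               ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY date) AS row_num
        FROM Attendance
    )
    SELECT employee_id, MIN(date) AS start_date, MAX(date) AS end_date,
           COUNT(*) AS consecutive_days
    FROM RankedAttendance
    -- day number minus row number is constant within a run of consecutive days
    GROUP BY employee_id, julianday(date) - row_num
    HAVING COUNT(*) >= 3
    ORDER BY employee_id, start_date
    """
).fetchall()
print(streaks)  # [(1, '2024-01-01', '2024-01-03', 3)]
```

Employee 2's longest run is two days, so only employee 1 appears in the result.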

8. Finding Top N Records for Each Group

This problem involves using window functions to rank records within groups.

Problem

Given a table Sales with columns salesperson_id, date, and amount, write a query to find the top 3 sales amounts for each salesperson.

Solution

WITH RankedSales AS (
    SELECT salesperson_id, date, amount,
    ROW_NUMBER() OVER (PARTITION BY salesperson_id ORDER BY amount DESC) as sales_rank
    FROM Sales
)
SELECT salesperson_id, date, amount
FROM RankedSales
WHERE sales_rank <= 3;

Explanation

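This query first uses a CTE (Common Table Expression) to number the sales records of each salesperson, ordered by the sales amount in descending order. It then filters the results to include only the top 3 sales amounts for each salesperson. Because ROW_NUMBER() gives every row a different number, at most three rows come back per salesperson even when amounts tie; RANK() or DENSE_RANK() would keep the tied rows instead.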
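The per-group numbering can be tried out with a sketch like this one, which uses Python's built-in sqlite3 module and assumes SQLite 3.25 or newer for window functions. The helper top_n_sales, the sales_rank alias and the sample rows are hypothetical; the cut-off is passed as a parameter so that values other than 3 can be tested.

```python
import sqlite3


def top_n_sales(conn, n):
    """Return (salesperson_id, amount) for each salesperson's n largest sales."""
    return conn.execute(
        """
        WITH RankedSales AS (
            SELECT salesperson_id, date, amount,
                   ROW_NUMBER() OVER (
                       PARTITION BY salesperson_id ORDER BY amount DESC
                   ) AS sales_rank
            FROM Sales
        )
        SELECT salesperson_id, amount
        FROM RankedSales
        WHERE sales_rank <= ?
        ORDER BY salesperson_id, sales_rank
        """,
        (n,),
    ).fetchall()


# Hypothetical sample data: salesperson 1 has four sales, salesperson 2 has two.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (salesperson_id INTEGER, date TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO Sales (salesperson_id, date, amount) VALUES (?, ?, ?)",
    [
        (1, "2024-01-01", 50), (1, "2024-01-02", 70),
        (1, "2024-01-03", 20), (1, "2024-01-04", 90),
        (2, "2024-01-01", 10), (2, "2024-01-02", 30),
    ],
)

top3 = top_n_sales(conn, 3)
print(top3)  # [(1, 90), (1, 70), (1, 50), (2, 30), (2, 10)]
```

A salesperson with fewer than three sales simply contributes fewer rows, as salesperson 2 does here.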

9. Finding Employees Who Never Received a Bonus

This problem can be solved with a left join that keeps only the unmatched rows.

Problem

Given a table Employees with columns id and name, and a table Bonuses with columns employee_id and bonus, write a query to find all employees who never received a bonus.

Solution

SELECT e.name
FROM Employees e
LEFT JOIN Bonuses b ON e.id = b.employee_id
WHERE b.employee_id IS NULL;

Explanation

This query uses a left join to include all employees and any matching records from the Bonuses table. The WHERE clause filters the results to include only those employees who do not have a matching record in the Bonuses table, indicating that they never received a bonus.
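The same result can also be written with NOT EXISTS, and running both forms side by side is a handy sanity check. This sketch uses Python's built-in sqlite3 module with made-up rows; the NOT EXISTS variant is an alternative added here, not part of the solution above.

```python
import sqlite3

# Hypothetical sample data: Ana has two bonuses, Cy has one, Ben and Di have none.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE Bonuses (employee_id INTEGER, bonus INTEGER)")
conn.executemany(
    "INSERT INTO Employees (id, name) VALUES (?, ?)",
    [(1, "Ana"), (2, "Ben"), (3, "Cy"), (4, "Di")],
)
conn.executemany(
    "INSERT INTO Bonuses (employee_id, bonus) VALUES (?, ?)",
    [(1, 500), (1, 200), (3, 100)],
)

# Left join, then keep the employees for whom no bonus row was found.
via_left_join = conn.execute(
    "SELECT e.name FROM Employees e "
    "LEFT JOIN Bonuses b ON e.id = b.employee_id "
    "WHERE b.employee_id IS NULL ORDER BY e.id"
).fetchall()

# Equivalent formulation with a correlated NOT EXISTS subquery.
via_not_exists = conn.execute(
    "SELECT e.name FROM Employees e "
    "WHERE NOT EXISTS (SELECT 1 FROM Bonuses b WHERE b.employee_id = e.id) "
    "ORDER BY e.id"
).fetchall()

print(via_left_join)  # [('Ben',), ('Di',)]
```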

10. Finding Employees with the Same Salary

This problem involves identifying records with duplicate values.

Problem

Given a table Employees with columns id, name, and salary, write a query to find all employees who have the same salary as another employee.

Solution

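SELECT DISTINCT e1.id, e1.name, e1.salary
FROM Employees e1
JOIN Employees e2 ON e1.salary = e2.salary AND e1.id <> e2.id;

Explanation

This query uses a self-join to compare each employee's salary with the salaries of other employees. The JOIN clause matches employees with the same salary and different IDs. An employee who shares a salary with two or more colleagues matches once per colleague, so DISTINCT is needed to list each employee only once. Selecting e1.id as well keeps two different employees who share both a name and a salary from being collapsed into a single row.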
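A self-join on salary returns one row per matching pair, so an employee who shares a salary with two colleagues shows up twice unless the result is de-duplicated. The sketch below, using Python's built-in sqlite3 module and invented rows, compares the plain join with a DISTINCT version that also selects the id.

```python
import sqlite3

# Hypothetical sample data: three employees earn 100, one earns 200.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO Employees (id, name, salary) VALUES (?, ?, ?)",
    [(1, "Ana", 100), (2, "Ben", 100), (3, "Cy", 100), (4, "Di", 200)],
)

# Plain self-join: one output row per (employee, colleague) pair.
pairs = conn.execute(
    "SELECT e1.name, e1.salary FROM Employees e1 "
    "JOIN Employees e2 ON e1.salary = e2.salary AND e1.id <> e2.id"
).fetchall()

# De-duplicated: each employee with a shared salary appears once.
unique_rows = conn.execute(
    "SELECT DISTINCT e1.id, e1.name, e1.salary FROM Employees e1 "
    "JOIN Employees e2 ON e1.salary = e2.salary AND e1.id <> e2.id "
    "ORDER BY e1.id"
).fetchall()

print(len(pairs))   # 6: each of the three employees matches two colleagues
print(unique_rows)  # [(1, 'Ana', 100), (2, 'Ben', 100), (3, 'Cy', 100)]
```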

Conclusion

SQL is a powerful language for managing and querying relational databases. Working through these problems and understanding why each solution works will help you perform well in interviews and write more efficient queries. Practice them regularly, ideally against a real database with small test tables, and you'll be much better prepared for the SQL questions you encounter.
