There are several ways to select 10 random rows from a table of 600K rows in MySQL. On large datasets the choice matters: ORDER BY RAND() is the simplest option, while approaches based on an indexed primary key, a random OFFSET, or user variables avoid sorting the whole table. This article walks through each technique for selecting 10 random rows from 600K rows in MySQL, with an example for each method.
Table of Contents:
- Techniques to Choose Random 10 Rows from 600k Rows in MySQL
- Performance Evaluation
- Real-Life Scenarios
- Optimal Practices
- Final Thoughts
We will create a table named your_table and fill it with sample rows to demonstrate the methods that follow.
-- Create table
CREATE TABLE your_table (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(50),
age INT
);
INSERT INTO your_table (name, age)
SELECT
CONCAT('User', id) AS name,
FLOOR(18 + (RAND() * 42)) AS age
FROM
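(SELECT @row := @row + 1 AS id FROM
-- Six 10-row derived tables cross-joined yield 1,000,000 combinations, enough for LIMIT 600000
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t1,
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t2,
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t3,
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t4,
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t5,
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t6,
(SELECT @row := 0) init) temp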
LIMIT 600000;
SELECT * FROM your_table;
Results:

The table will contain 600k entries…
Techniques to Choose Random 10 Rows from 600k Rows in MySQL
ORDER BY RAND(), an indexed primary key, a random OFFSET, and user variables can each be used to retrieve 10 random rows. They differ in speed and in how random the result is.
Technique 1: Using ORDER BY RAND() in MySQL
ORDER BY RAND() generates a random value for every row and sorts the whole table by it, so LIMIT 10 returns 10 uniformly random rows. This technique is inefficient on larger datasets because it has to scan and sort the complete table.
Example:
SELECT * FROM your_table
ORDER BY RAND()
LIMIT 10;
Explanation: ORDER BY RAND() returns 10 uniformly random rows out of the 600K, but only after MySQL has generated a random value for every row and sorted the full set.
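When ORDER BY RAND() is kept for its uniform randomness, one common way to reduce its cost is to shuffle only the id column and then join back for the full rows. The following is a minimal sketch against the your_table schema used in this article; how much it helps depends on how wide the rows are, so treat the benefit as an assumption to verify on your own data.

```sql
-- Sort only the primary-key values randomly, then fetch the 10 full rows by id.
-- The random sort still touches every id, but it carries far less data per row.
SELECT t.*
FROM your_table AS t
JOIN (
    SELECT id
    FROM your_table
    ORDER BY RAND()
    LIMIT 10
) AS picked ON picked.id = t.id;
```

The alias picked is only an illustrative name. The result is as random as a plain ORDER BY RAND() because the same shuffle decides which ids are chosen.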
Technique 2: Employing Indexed Primary Key in MySQL
This approach works best when the table has an auto-increment primary key (id), because the lookup goes through the primary-key index instead of sorting the table. However, if the ids are not sequential, rows that follow a gap are more likely to be picked, and the 10 rows returned always have consecutive ids rather than being 10 independent picks.
Example:
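SELECT t.*
FROM your_table AS t
-- Derive the random starting id once in a joined subquery so every row is compared against the same value
JOIN (SELECT FLOOR(1 + RAND() * (SELECT MAX(id) FROM your_table)) AS start_id) AS r
ON t.id >= r.start_id
ORDER BY t.id
LIMIT 10;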
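Explanation: The query computes a random starting id no greater than MAX(id) and returns the 10 rows at or after it in id order, using the primary-key index. If the starting id falls within the last 9 ids, fewer than 10 rows come back.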
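If 10 independent rows are needed rather than 10 neighbours, one option is to generate several random candidate ids and look each of them up through the primary key. The sketch below assumes MySQL 8.0 or later (for recursive CTEs); the names @max_id, picks, and candidate_id, and the candidate count of 30, are illustrative choices and not part of the original example.

```sql
-- Assumes MySQL 8.0+ (recursive CTEs). Names and the count of 30 are illustrative.
SET @max_id = (SELECT MAX(id) FROM your_table);

WITH RECURSIVE picks (n, candidate_id) AS (
    -- First candidate id, drawn from 1..@max_id
    SELECT 1, CAST(FLOOR(1 + RAND() * @max_id) AS UNSIGNED)
    UNION ALL
    -- 29 more candidates, each with its own RAND() draw
    SELECT n + 1, CAST(FLOOR(1 + RAND() * @max_id) AS UNSIGNED)
    FROM picks
    WHERE n < 30
)
SELECT DISTINCT t.*
FROM picks
JOIN your_table AS t ON t.id = picks.candidate_id
LIMIT 10;
```

Drawing 30 candidates for 10 rows leaves room for duplicates and for ids that fall into gaps. On a table with many gaps the query can still return fewer than 10 rows, in which case the candidate count would need to be raised.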
Technique 3: Using a Random OFFSET in MySQL
A random OFFSET skips a random number of rows and returns the next 10, without sorting the complete table.
This method is usually faster than ORDER BY RAND(), although MySQL still has to read and discard the skipped rows, and the 10 rows returned are adjacent ones.
Example:
-- Pick a random offset that leaves at least 10 rows after it
SET @rand_offset = FLOOR(RAND() * GREATEST((SELECT COUNT(*) FROM your_table) - 9, 1));
-- Prepare and execute the query with the computed offset
SET @query = CONCAT('SELECT * FROM your_table LIMIT 10 OFFSET ', @rand_offset);
PREPARE stmt FROM @query;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
Explanation: LIMIT and OFFSET do not accept a user variable or an expression in a plain SQL statement, so the random offset is first stored in @rand_offset and then concatenated into the query text, which is run as a prepared statement. The rows returned are the 10 that follow the random offset.
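The same idea can be written without string concatenation, because MySQL allows ? placeholders for LIMIT and OFFSET inside prepared statements. The following is a minimal sketch under the assumption that the offset is held in an integer-typed user variable; the ORDER BY id is added here so that the offset refers to a stable row order.

```sql
-- Compute the offset as an integer; CAST keeps the user variable integer-typed for OFFSET.
SET @rand_offset = CAST(
    FLOOR(RAND() * GREATEST((SELECT COUNT(*) FROM your_table) - 9, 1)) AS UNSIGNED
);

-- Placeholders are allowed for LIMIT and OFFSET inside prepared statements.
PREPARE stmt FROM 'SELECT * FROM your_table ORDER BY id LIMIT 10 OFFSET ?';
EXECUTE stmt USING @rand_offset;
DEALLOCATE PREPARE stmt;
```

Passing the value with USING avoids building SQL text by hand, which is the safer habit whenever part of a query comes from a variable.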
Technique 4: Utilizing User Variable Technique in MySQL
The user variable method assigns a sequential row number to each row as it is read and then returns rows starting from a random row number, so no sort is needed and gaps in id do not matter.
Example:
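SELECT t.* FROM (
SELECT y.*, (@row_num := @row_num + 1) AS row_num
FROM your_table AS y, (SELECT @row_num := 0) AS init
) AS t
-- Compute the random starting row once so every row is compared against the same value
JOIN (SELECT FLOOR(1 + RAND() * GREATEST((SELECT COUNT(*) FROM your_table) - 9, 1)) AS start_row) AS r
ON t.row_num >= r.start_row
ORDER BY t.row_num
LIMIT 10;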
Explanation: The expression (@row_num := @row_num + 1) assigns consecutive row numbers 1, 2, 3, ... to the rows as they are read, not random numbers. Because the numbering has no gaps, the method is unaffected by gaps in id, but it still reads the entire table to number the rows.
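Assigning user variables inside expressions is deprecated in MySQL 8.0, so on newer servers the same numbering can be done with the ROW_NUMBER() window function. The sketch below assumes MySQL 8.0 or later; the aliases numbered, r, and start_row are illustrative names.

```sql
-- Assumes MySQL 8.0+, where window functions are available.
SELECT numbered.id, numbered.name, numbered.age
FROM (
    SELECT id, name, age,
           ROW_NUMBER() OVER (ORDER BY id) AS row_num
    FROM your_table
) AS numbered
JOIN (
    -- Random starting row, computed once; leaves room for 10 rows after it.
    SELECT FLOOR(1 + RAND() * GREATEST((SELECT COUNT(*) FROM your_table) - 9, 1)) AS start_row
) AS r ON numbered.row_num BETWEEN r.start_row AND r.start_row + 9
ORDER BY numbered.row_num;
```

Like the user variable version, this numbers every row, so it reads the whole table. Its advantage is a well-defined numbering order rather than better speed.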
Performance Evaluation
Technique | Performance Factor | Effectiveness | Optimal Use Case |
ORDER BY RAND() | Scans the entire table, assigns a random value to every row, and sorts before returning 10 rows. | Slowest on large tables, but the 10 rows are uniformly random. | Small tables, or when true randomness matters more than speed. |
Indexed Primary Key | Seeks to a random id through the primary-key index, so no full scan or sort is needed. | Fastest, but it returns 10 consecutive ids and is biased when ids have gaps. | Large tables with dense auto-increment ids. |
Random OFFSET | Runs COUNT(*) and then reads and discards the skipped rows before returning 10. | Intermediate, since the cost grows with the size of the offset. | Tables with gaps in ids, because the offset counts rows rather than ids. |
User Variable Method | Numbers every row with a user variable, which requires reading the whole table but no sort. | Typically slower than the primary-key method and faster than a full random sort. | Tables with gaps in ids, where a primary-key seek would be biased. |
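These differences can be checked on your own server with EXPLAIN rather than taken on trust. The two statements below are a small sketch for comparing a full random sort with a primary-key range read; the starting id 300000 is an arbitrary example value, and the exact plan output varies by MySQL version and configuration.

```sql
-- Plan for the full random sort; on a typical setup the Extra column is expected
-- to mention a temporary table and a filesort.
EXPLAIN SELECT * FROM your_table ORDER BY RAND() LIMIT 10;

-- Plan for a primary-key range read from a fixed starting id (arbitrary example value).
EXPLAIN SELECT * FROM your_table WHERE id >= 300000 ORDER BY id LIMIT 10;
```

Comparing the estimated rows and the Extra column of the two plans gives a quick indication of why the indexed approach scales better.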
Real-Life Scenarios
Example 1: To select 10 random film titles from a collection.
Illustration:
-- Create a movies table
CREATE TABLE movies (
id INT AUTO_INCREMENT PRIMARY KEY,
title VARCHAR(255),
genre VARCHAR(50),
release_year INT
);
-- Insert sample films (15 rows)
INSERT INTO movies (title, genre, release_year) VALUES
('Inception', 'Sci-Fi', 2010),
('Titanic', 'Romance', 1997),
('The Matrix', 'Action', 1999),
('Interstellar', 'Sci-Fi', 2014),
('Joker', 'Drama', 2019),
('Gladiator', 'Action', 2000),
('The Dark Knight', 'Action', 2008),
('Forrest Gump', 'Drama', 1994),
('Parasite', 'Thriller', 2019),
('The Godfather', 'Crime', 1972),
('Avengers: Endgame', 'Superhero', 2019),
('Pulp Fiction', 'Crime', 1994),
('Schindler''s List', 'Historical', 1993),
('The Lion King', 'Animation', 1994),
('Fight Club', 'Drama', 1999);
-- Query to select 10 random films
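SELECT m.*
FROM movies AS m
-- Derive the random starting id once so every row is compared against the same value
JOIN (SELECT FLOOR(1 + RAND() * (SELECT MAX(id) FROM movies)) AS start_id) AS r
ON m.id >= r.start_id
ORDER BY m.id
LIMIT 10;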
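Explanation: The indexed primary key method picks a random starting id and returns up to 10 films with consecutive ids from that point. With only 15 rows, a starting id above 6 yields fewer than 10 films, so on a table this small ORDER BY RAND() is the more reliable choice.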
Example 2: To randomly retrieve usernames from a social media platform.
Illustration:
-- Create a user table
CREATE TABLE users (
id INT AUTO_INCREMENT PRIMARY KEY,
username VARCHAR(50),
country VARCHAR(50)
);
-- Insert sample users (15 rows)
INSERT INTO users (username, country) VALUES
('user_01', 'USA'),
('user_02', 'Canada'),
('user_03', 'India'),
('user_04', 'Germany'),
('user_05', 'France'),
('user_06', 'Japan'),
('user_07', 'Australia'),
('user_08', 'Brazil'),
('user_09', 'UK'),
('user_10', 'Mexico'),
('user_11', 'Italy'),
('user_12', 'Spain'),
('user_13', 'South Korea'),
('user_14', 'Netherlands'),
('user_15', 'Sweden');
-- Query to choose 10 random users using user variables
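SELECT t.* FROM (
SELECT u.*, (@row_num := @row_num + 1) AS row_num
FROM users AS u, (SELECT @row_num := 0) AS init
) AS t
-- Compute the random starting row once so every row is compared against the same value
JOIN (SELECT FLOOR(1 + RAND() * GREATEST((SELECT COUNT(*) FROM users) - 9, 1)) AS start_row) AS r
ON t.row_num >= r.start_row
ORDER BY t.row_num
LIMIT 10;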
Explanation: The user-variable method numbers the users from 1 to 15 and returns 10 of them starting at a random row number. Because it numbers rows rather than relying on id values, gaps in the ids do not affect the result.
Example 3: To collect 10 random quiz questions from the online learning platform Intellipaat.
Illustration:
-- Create quiz questions table
CREATE TABLE quiz_questions (
id INT AUTO_INCREMENT PRIMARY KEY,
question TEXT,
difficulty VARCHAR(10)
);
-- Insert sample quiz questions (15 rows)
INSERT INTO quiz_questions (question, difficulty) VALUES
('What is SQL and what does it represent?', 'Easy'),
('What distinguishes SQL from MySQL?', 'Easy'),
('What various forms of SQL commands exist?', 'Easy'),
('What is the function of the GROUP BY clause in SQL?', 'Medium'),
('What is the difference between WHERE and HAVING clauses?', 'Medium'),
('How is the ORDER BY clause utilized in SQL?', 'Easy'),
('What constitutes a primary key and why is it crucial?', 'Easy'),
('What are the different types of joins in SQL?', 'Medium'),
('What does normalization mean in SQL? Elaborate on various normal forms.', 'Hard'),
('What is an index in SQL, and how does it enhance performance?', 'Medium'),
('Differentiate between DELETE, TRUNCATE, and DROP.', 'Medium'),
('What is a stored procedure in SQL?', 'Medium'),
('What are the properties of ACID in a database?', 'Hard'),
('What is a foreign key in SQL?', 'Easy'),
('Define the subquery concept with an example.', 'Hard');
SELECT * FROM (
SELECT * FROM quiz_questions ORDER BY RAND() LIMIT 1000
) AS subset ORDER BY RAND()
LIMIT 10;
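Explanation: The 10 questions are chosen by taking a random subset of up to 1,000 rows with ORDER BY RAND() and then shuffling that subset again. With only 15 rows the subset is the whole table, and on a large table the inner ORDER BY RAND() still sorts every row, so this form is no faster than a single ORDER BY RAND() LIMIT 10.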
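A subset only saves work if it is built without a sort. One way to sketch that on the 600K-row your_table is to keep each row with a small probability and shuffle only the survivors; the fraction 0.001 below is an illustrative value and is not taken from the example above.

```sql
-- Keep roughly 0.1% of the rows (about 600 of 600K), then shuffle only those.
-- The fraction must leave comfortably more than 10 rows for the LIMIT to be filled.
SELECT *
FROM your_table
WHERE RAND() < 0.001
ORDER BY RAND()
LIMIT 10;
```

This still reads the table once, but the sort handles a few hundred rows instead of all of them. On a small table such as quiz_questions the filter would usually leave no rows at all, so it is only suitable for large tables.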
Optimal Practices
- Avoid ORDER BY RAND() on large datasets: it sorts the entire table, which makes it slow.
- Use indexed columns. With an auto-increment primary key, a random id can be located through the index without scanning the table.
- Choose the selection technique that fits your dataset: its size, whether the ids have gaps, and how random the result must be.
- For exceptionally large tables, consider maintaining a secondary table of precomputed random row selections and reading from it.
- Cache frequently requested random sets instead of regenerating them, which reduces repeated work on the database.
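The secondary-table idea from the list above could look like the following sketch. The table random_picks, its columns, and the batch size of 1000 are hypothetical names and values chosen for illustration; how often the table is refreshed is left to the application.

```sql
-- random_picks is a hypothetical helper table holding precomputed random ids.
CREATE TABLE random_picks (
    pick_no INT AUTO_INCREMENT PRIMARY KEY,
    row_id  INT NOT NULL
);

-- Refresh step (run periodically): pay for the random sort once.
TRUNCATE TABLE random_picks;
INSERT INTO random_picks (row_id)
SELECT id FROM your_table ORDER BY RAND() LIMIT 1000;

-- Read step: take 10 precomputed ids and join back for the full rows.
SELECT t.*
FROM random_picks AS p
JOIN your_table AS t ON t.id = p.row_id
ORDER BY p.pick_no
LIMIT 10;
```

Later requests can read the next 10 picks by adding an OFFSET to the read step. The trade-off is that the rows are only as fresh as the last refresh, and ids deleted from your_table in the meantime simply drop out of the join.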
Final Thoughts
To summarize, 10 random rows can be selected quickly from 600K rows in MySQL using ORDER BY RAND(), an indexed primary key, a random OFFSET, or user variables. Choose the method based on the size of the dataset, the speed you need, and how random the result must be. In this blog, you have learned how to select random rows from a large table.
The article How to Select 10 Random Rows from 600K Rows Fast in MySQL? was first seen on Intellipaat Blog.