Fastest way to test if there are any duplicates in a Java list/array
The fastest practical way to detect duplicates in a Java list or array is to use a hash-based set—it gives you O(n) time with very low overhead.
Best general solution (early exit, O(n))
import java.util.HashSet;
import java.util.List;
public static <T> boolean hasDuplicates(List<T> list) {
HashSet<T> seen = new HashSet<>();
for (T item : list) {
if (!seen.add(item)) {
return true; // duplicate found
}
}
return false;
}
Why this is fast:
• HashSet.add() is O(1) average
• Stops immediately when a duplicate is found (no wasted work)
Even shorter (but no early exit)
boolean hasDuplicates = list.size() != new HashSet<>(list).size();
This is clean, but:
• Always processes the whole list
• Slightly more memory churn
For primitive arrays (even faster)
If you're working with int[], avoid boxing:
import java.util.HashSet;
public static boolean hasDuplicates(int[] arr) {
HashSet<Integer> seen = new HashSet<>();
for (int n : arr) {
if (!seen.add(n)) return true;
}
return false;
}
If the value range is small (e.g., 0–1,000,000), a boolean array beats everything:
public static boolean hasDuplicates(int[] arr) {
boolean[] seen = new boolean[1_000_001]; // adjust range
for (int n : arr) {
if (seen[n]) return true;
seen[n] = true;
}
return false;
}
Comparison of Three Different Methods
We will try to determine if there are any duplicates in an array or a list in Java programming language. We will compare three different algorithms and compare their durations.
1. The first algorithm runs on an array. We have implemented two loops to compare the elements of the list to other elements. The complexity is O(n2).
2. The second algorithms is similar to the first one. Instead of using an array, we take a list as an input.
3. And the last algorithm uses Java HashSets. The complexity of inserting an element to a HashSet is O(1), but the worst case performance might increase up to O(n). Thus, the complexity of inserting all the elements may be somewhere between O(n) and O(n2).
Here is the test code and comparison results.
public class TestDuplicate {
public static boolean hasDuplicateArray(double[] arr)
{
int count = arr.length;
for (int i = 1; i < count; ++i)
for (int j = 0; j < i; ++j)
if (arr[i] == arr[j])
return true;
return false;
}
public static boolean hasDuplicateList(List list)
{
int count = list.size();
for (int i = 1; i < count; ++i)
{
Double elem1 = list.get(i);
for (int j = 0; j < i; ++j)
{
Double elem2 = list.get(j);
if (elem1 == elem2)
return true;
}
}
return false;
}
public static boolean hasDuplicateMap(List list)
{
HashSet set = new HashSet();
int count = list.size();
for (int i = 0; i < count; ++i)
{
Double elem = list.get(i);
if (set.contains(elem))
return true;
set.add(elem);
}
return false;
}
public static void main(String[] args)
{
Random rand = new Random(2020);
double[] arr = new double[100000];
List list = new ArrayList();
for (int i = 0; i < arr.length; ++i)
{
arr[i] = rand.nextDouble();
list.add(arr[i]);
}
long startTime = System.currentTimeMillis();
boolean result = hasDuplicateArray(arr);
long duration = System.currentTimeMillis() - startTime;
System.out.println("Java Array: \t" + result + "\t" + duration + " ms");
startTime = System.currentTimeMillis();
result = hasDuplicateList(list);
duration = System.currentTimeMillis() - startTime;
System.out.println("Java List: \t" + result + "\t" + duration + " ms");
startTime = System.currentTimeMillis();
result = hasDuplicateMap(list);
duration = System.currentTimeMillis() - startTime;
System.out.println("Java HashSet: \t" + result + "\t" + duration + " ms");
}
}
The test results are obtained on Intel Core i5-8250U processor. Here are the results for 100K elements:
Java Array: false 2910 ms
Java List: false 3976 ms
Java HashSet: false 27 ms
If you want to repeat the test for 1M elements, the result is:
Java Array: false 409085 ms
Java List: false 1793858 ms
Java HashSet: false 605 ms
According to the results, the fastest way to understand if there are any duplicates in a Java collection is to use a HashSet.