The most fundamental data structure in computing. If you only master one thing, make it this.
โฌก BeginnerAn array is a contiguous block of memory that stores elements of the same type, one after another, with no gaps. Think of it like a row of numbered lockers in a hallway. Each locker has a number (the index โ a zero-based position identifier), and you can walk directly to locker #47 without opening any of the others.
That "walk directly to any locker" property is called random access โ you can read or write any element in constant time, O(1), because the computer just calculates the memory address: start_address + index ร element_size. No searching required.
Arrays are as old as programming itself. The concept dates back to the 1940s and 1950s, when early languages like Fortran (1957) needed a way to represent mathematical vectors and matrices. John Backus and his team at IBM baked arrays directly into Fortran's design โ they called them "subscripted variables." The idea was borrowed straight from mathematics, where you'd write ai to mean the i-th element of sequence a.
The reason arrays became the default structure is hardware. CPUs are built around sequential memory access. When you read element [5] of an array, the CPU's cache prefetcher guesses you'll want [6] next and loads it speculatively. This prediction works because arrays store elements contiguously. That hardware-level optimization is why arrays remain the fastest general-purpose data structure 70 years later.
A string is just an array of characters. In C, it's literally a char[] with a null terminator (\0) marking the end. In Python and JavaScript, strings are immutable โ meaning you can't change individual characters in place. Every "modification" actually creates a new string, which matters for performance.
When someone says "array problem" in an interview context, they usually mean arrays, strings, or both. The techniques overlap almost completely: two pointers, sliding window, prefix sums โ all work on both.
A static array has a fixed size set at creation time. C arrays and Java arrays work this way โ once you declare int arr[10], you get exactly 10 slots. No more, no less. A dynamic array (Python's list, JavaScript's Array, Java's ArrayList, C++'s vector) can grow and shrink. Under the hood, a dynamic array starts with some initial capacity and doubles when it runs out of room โ a strategy called amortized doubling.
Here's why doubling works so well: if you start with capacity 4 and keep appending, you'll resize at 4, 8, 16, 32... elements. Each resize copies everything, but the total copies across all resizes is roughly 2n. Spread across n appends, that's O(1) per append on average. Contrast this with growing by 1 each time โ you'd copy 1 + 2 + 3 + ... + n = O(nยฒ) total.
When you create an array of 5 integers, the OS hands you a block of 5 ร 4 = 20 bytes (assuming 32-bit ints). Element 0 starts at the beginning. Element 1 starts 4 bytes later. Element 4 starts 16 bytes later. That's it โ no pointers, no metadata between elements, just raw data packed tight.
This layout is why arrays are so fast for reading. Your CPU's cache (a small, fast memory sitting between the CPU and RAM) loads data in chunks called cache lines, typically 64 bytes at a time. Since array elements sit next to each other, accessing one element often "pre-loads" its neighbors for free. This is called cache locality (also known as spatial locality), and it's the main reason arrays beat linked lists in practice even when Big-O says they should be equal.
To put numbers on it: accessing data in the L1 cache takes about 1 nanosecond. Fetching from main RAM takes 50-100 nanoseconds. That's a 50-100x difference, and it's entirely due to data layout in memory.
Here's the catch. To insert at index 2 in a 5-element array, you need to shift elements 2, 3, and 4 one position to the right to make room. That's O(n) in the worst case. Deleting works the same way in reverse โ shift everything left to fill the gap.
Inserting at the beginning is the worst case: every element moves. Inserting at the end (appending) is the best case. If there's room, it's O(1). This asymmetry is important โ it means array-based data structures naturally favor operations at the end.
Appending to the end of a dynamic array is different from inserting in the middle. If there's room in the underlying buffer, it's O(1). If the dynamic array is full, it allocates a new block (usually 2ร the size), copies everything over, then appends. This copy is O(n), but it happens so rarely that the amortized (averaged-over-time) cost per append is still O(1).
In Python and JavaScript, building a string by repeated concatenation (s += char) creates a new string object each time. If you do this n times, you're copying 1 + 2 + 3 + ... + n characters = O(nยฒ) total work. The fix: collect characters in a list/array, then join once at the end.
This trap catches people in interviews constantly. If you see yourself building a string character by character, stop and switch to a list.
A 2D array (matrix) is an array of arrays. In memory, most languages store them in row-major order โ all of row 0 first, then row 1, and so on. This means iterating row-by-row is fast (good cache locality), but column-by-column is slow (each access jumps to a different row). Fortran uses column-major order instead, which is why scientific code in Fortran and NumPy (which inherited the convention) iterates columns first.
When processing a 2D array, always iterate in the order that matches your language's memory layout. In C, Java, Python: outer loop over rows, inner loop over columns. Getting this wrong can cause a 10x slowdown on large matrices because of constant cache misses.
Click the buttons to see how arrays handle insertion, deletion, and access. Watch how elements shift during insert and delete operations.
| Operation | Time (Average) | Time (Worst) | Notes |
|---|---|---|---|
| Access by index | O(1) | O(1) | Direct address calculation |
| Search (unsorted) | O(n) | O(n) | Linear scan |
| Search (sorted) | O(log n) | O(log n) | Binary search |
| Insert at end | O(1)* | O(n) | *Amortized for dynamic arrays |
| Insert at index | O(n) | O(n) | Must shift elements right |
| Delete at end | O(1) | O(1) | Just decrement length |
| Delete at index | O(n) | O(n) | Must shift elements left |
| Append (dynamic) | O(1)* | O(n) | *Amortized; resize copies everything |
Space complexity: O(n) โ one slot per element, no overhead per element.
You rarely implement arrays from scratch (the language gives you one), but understanding how dynamic arrays work under the hood is fair game in interviews.
class DynamicArray:
def __init__(self):
self._capacity = 4 # Start with space for 4 elements
self._size = 0 # Nothing stored yet
self._data = [None] * 4 # The underlying fixed-size storage
def __len__(self):
return self._size
def __getitem__(self, index):
# Bounds check โ Python lists do this too, you just don't see it
if index < 0 or index >= self._size:
raise IndexError("index out of range")
return self._data[index]
def append(self, value):
# If we've filled every slot, double the capacity
# This is why append is O(1) amortized โ the doubling
# happens so rarely that it averages out
if self._size == self._capacity:
self._resize(self._capacity * 2)
self._data[self._size] = value
self._size += 1
def insert(self, index, value):
if index < 0 or index > self._size:
raise IndexError("index out of range")
if self._size == self._capacity:
self._resize(self._capacity * 2)
# Shift everything from index onward one slot right
# This is the O(n) cost of mid-array insertion
for i in range(self._size, index, -1):
self._data[i] = self._data[i - 1]
self._data[index] = value
self._size += 1
def delete(self, index):
if index < 0 or index >= self._size:
raise IndexError("index out of range")
# Shift everything after index one slot left
for i in range(index, self._size - 1):
self._data[i] = self._data[i + 1]
self._data[self._size - 1] = None # Help garbage collector
self._size -= 1
# Optional: shrink if usage drops below 25% of capacity
if self._size > 0 and self._size <= self._capacity // 4:
self._resize(self._capacity // 2)
def _resize(self, new_capacity):
# Allocate a bigger block and copy everything over
# This is the expensive part, but it happens O(log n) times
new_data = [None] * new_capacity
for i in range(self._size):
new_data[i] = self._data[i]
self._data = new_data
self._capacity = new_capacity
class DynamicArray {
constructor() {
this._capacity = 4;
this._size = 0;
// TypedArrays give us real fixed-size storage
// (JS regular arrays are already dynamic, so this is educational)
this._data = new Array(4).fill(undefined);
}
get length() { return this._size; }
get(index) {
if (index < 0 || index >= this._size) throw new RangeError("Index out of bounds");
return this._data[index];
}
push(value) {
if (this._size === this._capacity) this._resize(this._capacity * 2);
this._data[this._size] = value;
this._size++;
}
insertAt(index, value) {
if (index < 0 || index > this._size) throw new RangeError("Index out of bounds");
if (this._size === this._capacity) this._resize(this._capacity * 2);
// Shift right โ same O(n) cost as Python version
for (let i = this._size; i > index; i--) {
this._data[i] = this._data[i - 1];
}
this._data[index] = value;
this._size++;
}
deleteAt(index) {
if (index < 0 || index >= this._size) throw new RangeError("Index out of bounds");
// Shift left to fill the gap
for (let i = index; i < this._size - 1; i++) {
this._data[i] = this._data[i + 1];
}
this._data[this._size - 1] = undefined;
this._size--;
}
_resize(newCapacity) {
const newData = new Array(newCapacity).fill(undefined);
for (let i = 0; i < this._size; i++) newData[i] = this._data[i];
this._data = newData;
this._capacity = newCapacity;
}
}
# BAD โ O(nยฒ) because strings are immutable
# Each += creates a brand new string object
result = ""
for char in some_list:
result += char # copies entire string each time
# GOOD โ O(n) total, one allocation at the end
parts = []
for char in some_list:
parts.append(char)
result = "".join(parts)
# Even better โ use a generator expression
result = "".join(char for char in some_list)
The sliding window technique maintains a "window" (a contiguous subarray defined by two pointers) and moves it across the array. Here's a concrete example โ finding the maximum sum of any subarray of size k:
def max_subarray_sum(arr, k):
"""Find the maximum sum of any contiguous subarray of size k.
Uses the sliding window technique:
1. Compute the sum of the first window (indices 0..k-1)
2. Slide the window right by one position at a time
3. Subtract the element leaving the window, add the one entering
4. Track the maximum sum seen
Time: O(n) โ each element is added and removed exactly once
Space: O(1) โ just a few variables
"""
if len(arr) < k:
return None
# Step 1: sum of the first window
window_sum = sum(arr[:k])
best = window_sum
# Step 2-4: slide the window
for i in range(k, len(arr)):
window_sum += arr[i] - arr[i - k] # add new, remove old
best = max(best, window_sum)
return best
# Example: daily temperatures, find hottest 3-day streak
temps = [72, 68, 75, 80, 78, 73, 71, 77]
print(max_subarray_sum(temps, 3)) # 233 (75 + 80 + 78)
Two pointers converging from both ends of a sorted array โ the foundation of dozens of interview problems:
def two_sum_sorted(arr, target):
"""Find two numbers in a SORTED array that add to target.
Left pointer starts at beginning, right pointer at end.
If the sum is too small, move left pointer right (increase sum).
If the sum is too big, move right pointer left (decrease sum).
Time: O(n) โ each pointer moves at most n times total
Space: O(1) โ no extra data structures
"""
left, right = 0, len(arr) - 1
while left < right:
current_sum = arr[left] + arr[right]
if current_sum == target:
return [left, right]
elif current_sum < target:
left += 1 # need a bigger sum
else:
right -= 1 # need a smaller sum
return [] # no pair found
# Example
print(two_sum_sorted([1, 3, 5, 7, 11, 15], 16)) # [2, 4] โ 5 + 11
s += char allocates a new string and copies everything. On a string of length 10,000 you're doing 50 million character copies. Use "".join(list) or array.join("") instead.list.pop(0) in Python is O(n). It shifts every remaining element left. If you need frequent removal from the front, use collections.deque instead, which gives O(1) popleft.== compares element-by-element (correct). In JavaScript, === compares object references, not contents โ [1,2] === [1,2] is false. Use JSON.stringify() or a manual loop.Here's a practical problem you'd face in production: build a rate limiter that allows at most N requests per second from a given user. The sliding window approach uses an array (or deque) of timestamps.
from collections import deque
import time
class RateLimiter:
"""Sliding window rate limiter using a deque of timestamps.
For each user, we store the timestamps of their recent requests.
When a new request comes in, we:
1. Remove timestamps older than the window
2. Check if the count is under the limit
3. If yes, record the timestamp and allow
"""
def __init__(self, max_requests, window_seconds):
self.max_requests = max_requests
self.window = window_seconds
self.user_requests = {} # user_id -> deque of timestamps
def allow_request(self, user_id):
now = time.time()
cutoff = now - self.window
if user_id not in self.user_requests:
self.user_requests[user_id] = deque()
timestamps = self.user_requests[user_id]
# Remove expired timestamps from the front
while timestamps and timestamps[0] < cutoff:
timestamps.popleft()
if len(timestamps) < self.max_requests:
timestamps.append(now)
return True
return False
# Allow 5 requests per 10 seconds
limiter = RateLimiter(max_requests=5, window_seconds=10)
# Simulate requests
for i in range(7):
allowed = limiter.allow_request("user_123")
print(f"Request {i+1}: {'โ allowed' if allowed else 'โ blocked'}")
# First 5 allowed, last 2 blocked
This is the same logic behind API rate limiting in services like Stripe, GitHub, and AWS. The deque acts as a sliding window over recent request timestamps. Old entries fall off the left side as time passes.
Arrays and strings appear in more interview problems than any other topic. Here are the patterns that show up over and over.
Use two index variables that move toward each other (or in the same direction) to solve problems in O(n) instead of O(nยฒ). Classic examples: checking palindromes, removing duplicates from a sorted array, container with most water.
The key insight: if the array is sorted (or has some order property), two pointers let you skip checking pairs that can't possibly be the answer.
Maintain a "window" (a contiguous subarray) and slide it across the array, expanding or shrinking as needed. Perfect for substring/subarray problems with constraints like "longest substring without repeating characters" or "minimum window substring."
The pattern: expand the right pointer until you have a valid window, then shrink from the left to minimize it. Track the best answer as you go.
Many array/string problems boil down to counting. "Are these two strings anagrams?" โ compare character frequency maps. "Find the first unique character?" โ count occurrences, then scan again.
Time complexity drops from O(nยฒ) brute force to O(n) with a hash map. Space trade-off: O(k) where k is the number of distinct elements.
Pre-compute cumulative sums so you can answer "what's the sum of elements from index i to j?" in O(1). Build it once in O(n), query any subarray sum instantly: prefix[j+1] - prefix[i].
Shows up in: subarray sum equals k, range sum queries, product of array except self (using prefix and suffix products).
When an interviewer says "do it in O(1) extra space," they want you to rearrange the array without allocating a new one. Common trick: use the array itself as a hash map by marking visited indices (e.g., negating values). Examples: find the duplicate, find all disappeared numbers.
Sometimes the brute-force approach is O(nยฒ) because you're searching for matches. Sorting the array first (O(n log n)) can reduce the inner search to O(log n) with binary search or O(1) with two pointers. Three Sum is the classic: sort, fix one element, then two-pointer the rest.
When the interviewer doesn't say "in-place" or "preserve order," sorting is almost always worth considering.
Track the maximum sum ending at each position. At each element, decide: extend the current subarray or start fresh from this element. The recurrence: max_ending_here = max(arr[i], max_ending_here + arr[i]). Update the global max at each step.
This runs in O(n) time and O(1) space. It shows up directly (LC #53) and as a building block inside harder problems like maximum product subarray and maximum circular subarray.
Start from the top and work down. Each problem builds on techniques from the ones above it.
| # | Problem | Difficulty | Key Pattern |
|---|---|---|---|
| 1 | Two Sum | Easy | Hash map lookup |
| 121 | Best Time to Buy and Sell Stock | Easy | Track min so far |
| 238 | Product of Array Except Self | Medium | Prefix/suffix products |
| 3 | Longest Substring Without Repeating Characters | Medium | Sliding window + hash set |
| 15 | 3Sum | Medium | Sort + two pointers |
| 53 | Maximum Subarray | Medium | Kadane's algorithm |
| 76 | Minimum Window Substring | Hard | Sliding window + frequency map |
| 42 | Trapping Rain Water | Hard | Two pointers or prefix max |
| 41 | First Missing Positive | Hard | In-place cyclic sort |