DSA Lab | Arrays & Strings

What Is an Array?

An array is a contiguous block of memory that stores elements of the same type, one after another, with no gaps. Think of it like a row of numbered lockers in a hallway. Each locker has a number (the index — a zero-based position identifier), and you can walk directly to locker #47 without opening any of the others.

That "walk directly to any locker" property is called random access — you can read or write any element in constant time, O(1), because the computer just calculates the memory address: start_address + index × element_size. No searching required.

A Brief History

Arrays are as old as programming itself. The concept dates back to the 1940s and 1950s, when early languages like Fortran (1957) needed a way to represent mathematical vectors and matrices. John Backus and his team at IBM baked arrays directly into Fortran's design — they called them "subscripted variables." The idea was borrowed straight from mathematics, where you'd write a_i to mean the i-th element of sequence a.

The reason arrays became the default structure is hardware. CPUs are built around sequential memory access. When you read element [5] of an array, the CPU's cache prefetcher guesses you'll want [6] next and loads it speculatively. This prediction works because arrays store elements contiguously. That hardware-level optimization is why arrays remain the fastest general-purpose data structure 70 years later.

Arrays vs. Strings

A string is just an array of characters. In C, it's literally a char[] with a null terminator (\0) marking the end. In Python and JavaScript, strings are immutable — meaning you can't change individual characters in place. Every "modification" actually creates a new string, which matters for performance.

When someone says "array problem" in an interview context, they usually mean arrays, strings, or both. The techniques overlap almost completely: two pointers, sliding window, prefix sums — all work on both.

Static vs. Dynamic Arrays

A static array has a fixed size set at creation time. C arrays and Java arrays work this way — once you declare int arr[10], you get exactly 10 slots. No more, no less. A dynamic array (Python's list, JavaScript's Array, Java's ArrayList, C++'s vector) can grow and shrink. Under the hood, a dynamic array starts with some initial capacity and doubles when it runs out of room — a strategy called amortized doubling.

Here's why doubling works so well: if you start with capacity 4 and keep appending, you'll resize at 4, 8, 16, 32... elements. Each resize copies everything, but the total copies across all resizes is roughly 2n. Spread across n appends, that's O(1) per append on average. Contrast this with growing by 1 each time — you'd copy 1 + 2 + 3 + ... + n = O(n²) total.

Real-World Uses

Buffers — network packets, audio samples, pixel data all land in arrays
Lookup tables — hash maps use arrays internally for O(1) access
Matrices — images, spreadsheets, game boards are 2D arrays
Strings everywhere — URLs, JSON, SQL queries, user input
GPU computing — graphics pipelines process arrays of vertices, pixels, and textures in parallel
Time-series data — stock prices, sensor readings, log entries stored as ordered arrays

How It Works

Memory Layout

When you create an array of 5 integers, the OS hands you a block of 5 × 4 = 20 bytes (assuming 32-bit ints). Element 0 starts at the beginning. Element 1 starts 4 bytes later. Element 4 starts 16 bytes later. That's it — no pointers, no metadata between elements, just raw data packed tight.

This layout is why arrays are so fast for reading. Your CPU's cache (a small, fast memory sitting between the CPU and RAM) loads data in chunks called cache lines, typically 64 bytes at a time. Since array elements sit next to each other, accessing one element often "pre-loads" its neighbors for free. This is called cache locality (also known as spatial locality), and it's the main reason arrays beat linked lists in practice even when Big-O says they should be equal.

To put numbers on it: accessing data in the L1 cache takes about 1 nanosecond. Fetching from main RAM takes 50-100 nanoseconds. That's a 50-100x difference, and it's entirely due to data layout in memory.

Insertion and Deletion

Here's the catch. To insert at index 2 in a 5-element array, you need to shift elements 2, 3, and 4 one position to the right to make room. That's O(n) in the worst case. Deleting works the same way in reverse — shift everything left to fill the gap.

Inserting at the beginning is the worst case: every element moves. Inserting at the end (appending) is the best case. If there's room, it's O(1). This asymmetry is important — it means array-based data structures naturally favor operations at the end.

Appending to the end of a dynamic array is different from inserting in the middle. If there's room in the underlying buffer, it's O(1). If the dynamic array is full, it allocates a new block (usually 2× the size), copies everything over, then appends. This copy is O(n), but it happens so rarely that the amortized (averaged-over-time) cost per append is still O(1).

String Immutability

In Python and JavaScript, building a string by repeated concatenation (s += char) creates a new string object each time. If you do this n times, you're copying 1 + 2 + 3 + ... + n characters = O(n²) total work. The fix: collect characters in a list/array, then join once at the end.

This trap catches people in interviews constantly. If you see yourself building a string character by character, stop and switch to a list.

Multi-Dimensional Arrays

A 2D array (matrix) is an array of arrays. In memory, most languages store them in row-major order — all of row 0 first, then row 1, and so on. This means iterating row-by-row is fast (good cache locality), but column-by-column is slow (each access jumps to a different row). Fortran uses column-major order instead, which is why scientific code in Fortran and NumPy (which inherited the convention) iterates columns first.

🔧 Performance Tip

When processing a 2D array, always iterate in the order that matches your language's memory layout. In C, Java, Python: outer loop over rows, inner loop over columns. Getting this wrong can cause a 10x slowdown on large matrices because of constant cache misses.

Interactive Visualization

Click the buttons to see how arrays handle insertion, deletion, and access. Watch how elements shift during insert and delete operations.

Array Operations

Index: Value:

Operations & Complexity

Operation	Time (Average)	Time (Worst)	Notes
Access by index	O(1)	O(1)	Direct address calculation
Search (unsorted)	O(n)	O(n)	Linear scan
Search (sorted)	O(log n)	O(log n)	Binary search
Insert at end	O(1)*	O(n)	*Amortized for dynamic arrays
Insert at index	O(n)	O(n)	Must shift elements right
Delete at end	O(1)	O(1)	Just decrement length
Delete at index	O(n)	O(n)	Must shift elements left
Append (dynamic)	O(1)*	O(n)	*Amortized; resize copies everything

Space complexity: O(n) — one slot per element, no overhead per element.

Implementation

You rarely implement arrays from scratch (the language gives you one), but understanding how dynamic arrays work under the hood is fair game in interviews.

Dynamic Array in Python

Python

class DynamicArray:
    def __init__(self):
        self._capacity = 4          # Start with space for 4 elements
        self._size = 0              # Nothing stored yet
        self._data = [None] * 4     # The underlying fixed-size storage

    def __len__(self):
        return self._size

    def __getitem__(self, index):
        # Bounds check — Python lists do this too, you just don't see it
        if index < 0 or index >= self._size:
            raise IndexError("index out of range")
        return self._data[index]

    def append(self, value):
        # If we've filled every slot, double the capacity
        # This is why append is O(1) amortized — the doubling
        # happens so rarely that it averages out
        if self._size == self._capacity:
            self._resize(self._capacity * 2)
        self._data[self._size] = value
        self._size += 1

    def insert(self, index, value):
        if index < 0 or index > self._size:
            raise IndexError("index out of range")
        if self._size == self._capacity:
            self._resize(self._capacity * 2)
        # Shift everything from index onward one slot right
        # This is the O(n) cost of mid-array insertion
        for i in range(self._size, index, -1):
            self._data[i] = self._data[i - 1]
        self._data[index] = value
        self._size += 1

    def delete(self, index):
        if index < 0 or index >= self._size:
            raise IndexError("index out of range")
        # Shift everything after index one slot left
        for i in range(index, self._size - 1):
            self._data[i] = self._data[i + 1]
        self._data[self._size - 1] = None  # Help garbage collector
        self._size -= 1
        # Optional: shrink if usage drops below 25% of capacity
        if self._size > 0 and self._size <= self._capacity // 4:
            self._resize(self._capacity // 2)

    def _resize(self, new_capacity):
        # Allocate a bigger block and copy everything over
        # This is the expensive part, but it happens O(log n) times
        new_data = [None] * new_capacity
        for i in range(self._size):
            new_data[i] = self._data[i]
        self._data = new_data
        self._capacity = new_capacity

Dynamic Array in JavaScript

JavaScript

class DynamicArray {
  constructor() {
    this._capacity = 4;
    this._size = 0;
    // TypedArrays give us real fixed-size storage
    // (JS regular arrays are already dynamic, so this is educational)
    this._data = new Array(4).fill(undefined);
  }

  get length() { return this._size; }

  get(index) {
    if (index < 0 || index >= this._size) throw new RangeError("Index out of bounds");
    return this._data[index];
  }

  push(value) {
    if (this._size === this._capacity) this._resize(this._capacity * 2);
    this._data[this._size] = value;
    this._size++;
  }

  insertAt(index, value) {
    if (index < 0 || index > this._size) throw new RangeError("Index out of bounds");
    if (this._size === this._capacity) this._resize(this._capacity * 2);
    // Shift right — same O(n) cost as Python version
    for (let i = this._size; i > index; i--) {
      this._data[i] = this._data[i - 1];
    }
    this._data[index] = value;
    this._size++;
  }

  deleteAt(index) {
    if (index < 0 || index >= this._size) throw new RangeError("Index out of bounds");
    // Shift left to fill the gap
    for (let i = index; i < this._size - 1; i++) {
      this._data[i] = this._data[i + 1];
    }
    this._data[this._size - 1] = undefined;
    this._size--;
  }

  _resize(newCapacity) {
    const newData = new Array(newCapacity).fill(undefined);
    for (let i = 0; i < this._size; i++) newData[i] = this._data[i];
    this._data = newData;
    this._capacity = newCapacity;
  }
}

Common String Tricks

Python

# BAD — O(n²) because strings are immutable
# Each += creates a brand new string object
result = ""
for char in some_list:
    result += char  # copies entire string each time

# GOOD — O(n) total, one allocation at the end
parts = []
for char in some_list:
    parts.append(char)
result = "".join(parts)

# Even better — use a generator expression
result = "".join(char for char in some_list)

Variation: Sliding Window Maximum

The sliding window technique maintains a "window" (a contiguous subarray defined by two pointers) and moves it across the array. Here's a concrete example — finding the maximum sum of any subarray of size k:

Python

def max_subarray_sum(arr, k):
    """Find the maximum sum of any contiguous subarray of size k.

    Uses the sliding window technique:
    1. Compute the sum of the first window (indices 0..k-1)
    2. Slide the window right by one position at a time
    3. Subtract the element leaving the window, add the one entering
    4. Track the maximum sum seen

    Time: O(n) — each element is added and removed exactly once
    Space: O(1) — just a few variables
    """
    if len(arr) < k:
        return None

    # Step 1: sum of the first window
    window_sum = sum(arr[:k])
    best = window_sum

    # Step 2-4: slide the window
    for i in range(k, len(arr)):
        window_sum += arr[i] - arr[i - k]  # add new, remove old
        best = max(best, window_sum)

    return best

# Example: daily temperatures, find hottest 3-day streak
temps = [72, 68, 75, 80, 78, 73, 71, 77]
print(max_subarray_sum(temps, 3))  # 233 (75 + 80 + 78)

Variation: Two-Pointer Technique

Two pointers converging from both ends of a sorted array — the foundation of dozens of interview problems:

Python

def two_sum_sorted(arr, target):
    """Find two numbers in a SORTED array that add to target.

    Left pointer starts at beginning, right pointer at end.
    If the sum is too small, move left pointer right (increase sum).
    If the sum is too big, move right pointer left (decrease sum).

    Time: O(n) — each pointer moves at most n times total
    Space: O(1) — no extra data structures
    """
    left, right = 0, len(arr) - 1

    while left < right:
        current_sum = arr[left] + arr[right]
        if current_sum == target:
            return [left, right]
        elif current_sum < target:
            left += 1    # need a bigger sum
        else:
            right -= 1   # need a smaller sum

    return []  # no pair found

# Example
print(two_sum_sorted([1, 3, 5, 7, 11, 15], 16))  # [2, 4] → 5 + 11

Common Mistakes

⚠️ Mistakes That Trip Up Beginners

Off-by-one errors with indices. Arrays are 0-indexed in most languages. An array of length 5 has valid indices 0 through 4, not 1 through 5. Accessing index 5 is an out-of-bounds error. This is the single most common bug in array code.
Building strings with += in a loop. In Python and JavaScript, strings are immutable. Each s += char allocates a new string and copies everything. On a string of length 10,000 you're doing 50 million character copies. Use "".join(list) or array.join("") instead.
Modifying an array while iterating over it. Deleting elements from an array while looping forward skips elements (because indices shift). Either iterate backwards, use a separate output array, or collect indices to delete and process them after the loop.
Forgetting that list.pop(0) in Python is O(n). It shifts every remaining element left. If you need frequent removal from the front, use collections.deque instead, which gives O(1) popleft.
Using the wrong equality check for arrays. In Python, == compares element-by-element (correct). In JavaScript, === compares object references, not contents — [1,2] === [1,2] is false. Use JSON.stringify() or a manual loop.

Real-World Example: Rate Limiter

Here's a practical problem you'd face in production: build a rate limiter that allows at most N requests per second from a given user. The sliding window approach uses an array (or deque) of timestamps.

Python

from collections import deque
import time

class RateLimiter:
    """Sliding window rate limiter using a deque of timestamps.

    For each user, we store the timestamps of their recent requests.
    When a new request comes in, we:
    1. Remove timestamps older than the window
    2. Check if the count is under the limit
    3. If yes, record the timestamp and allow
    """
    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window = window_seconds
        self.user_requests = {}  # user_id -> deque of timestamps

    def allow_request(self, user_id):
        now = time.time()
        cutoff = now - self.window

        if user_id not in self.user_requests:
            self.user_requests[user_id] = deque()

        timestamps = self.user_requests[user_id]

        # Remove expired timestamps from the front
        while timestamps and timestamps[0] < cutoff:
            timestamps.popleft()

        if len(timestamps) < self.max_requests:
            timestamps.append(now)
            return True
        return False

# Allow 5 requests per 10 seconds
limiter = RateLimiter(max_requests=5, window_seconds=10)

# Simulate requests
for i in range(7):
    allowed = limiter.allow_request("user_123")
    print(f"Request {i+1}: {'✓ allowed' if allowed else '✗ blocked'}")
# First 5 allowed, last 2 blocked

This is the same logic behind API rate limiting in services like Stripe, GitHub, and AWS. The deque acts as a sliding window over recent request timestamps. Old entries fall off the left side as time passes.

Interview Patterns

Arrays and strings appear in more interview problems than any other topic. Here are the patterns that show up over and over.

💡 Two Pointers

Use two index variables that move toward each other (or in the same direction) to solve problems in O(n) instead of O(n²). Classic examples: checking palindromes, removing duplicates from a sorted array, container with most water.

The key insight: if the array is sorted (or has some order property), two pointers let you skip checking pairs that can't possibly be the answer.

💡 Sliding Window

Maintain a "window" (a contiguous subarray) and slide it across the array, expanding or shrinking as needed. Perfect for substring/subarray problems with constraints like "longest substring without repeating characters" or "minimum window substring."

The pattern: expand the right pointer until you have a valid window, then shrink from the left to minimize it. Track the best answer as you go.

💡 Hash Map for Frequency

Many array/string problems boil down to counting. "Are these two strings anagrams?" → compare character frequency maps. "Find the first unique character?" → count occurrences, then scan again.

Time complexity drops from O(n²) brute force to O(n) with a hash map. Space trade-off: O(k) where k is the number of distinct elements.

💡 Prefix Sum

Pre-compute cumulative sums so you can answer "what's the sum of elements from index i to j?" in O(1). Build it once in O(n), query any subarray sum instantly: prefix[j+1] - prefix[i].

Shows up in: subarray sum equals k, range sum queries, product of array except self (using prefix and suffix products).

💡 In-Place Modification

When an interviewer says "do it in O(1) extra space," they want you to rearrange the array without allocating a new one. Common trick: use the array itself as a hash map by marking visited indices (e.g., negating values). Examples: find the duplicate, find all disappeared numbers.

💡 Sorting as Preprocessing

Sometimes the brute-force approach is O(n²) because you're searching for matches. Sorting the array first (O(n log n)) can reduce the inner search to O(log n) with binary search or O(1) with two pointers. Three Sum is the classic: sort, fix one element, then two-pointer the rest.

When the interviewer doesn't say "in-place" or "preserve order," sorting is almost always worth considering.

💡 Kadane's Algorithm (Maximum Subarray)

Track the maximum sum ending at each position. At each element, decide: extend the current subarray or start fresh from this element. The recurrence: max_ending_here = max(arr[i], max_ending_here + arr[i]). Update the global max at each step.

This runs in O(n) time and O(1) space. It shows up directly (LC #53) and as a building block inside harder problems like maximum product subarray and maximum circular subarray.

Practice Problems

Start from the top and work down. Each problem builds on techniques from the ones above it.

#	Problem	Difficulty	Key Pattern
1	Two Sum	Easy	Hash map lookup
121	Best Time to Buy and Sell Stock	Easy	Track min so far
238	Product of Array Except Self	Medium	Prefix/suffix products
3	Longest Substring Without Repeating Characters	Medium	Sliding window + hash set
15	3Sum	Medium	Sort + two pointers
53	Maximum Subarray	Medium	Kadane's algorithm
76	Minimum Window Substring	Hard	Sliding window + frequency map
42	Trapping Rain Water	Hard	Two pointers or prefix max
41	First Missing Positive	Hard	In-place cyclic sort

Arrays & Strings

On This Page

What Is an Array?

A Brief History

Arrays vs. Strings

Static vs. Dynamic Arrays

Real-World Uses

How It Works

Memory Layout

Insertion and Deletion

String Immutability

Multi-Dimensional Arrays

Interactive Visualization

Operations & Complexity

Implementation

Dynamic Array in Python

Dynamic Array in JavaScript

Common String Tricks

Variation: Sliding Window Maximum

Variation: Two-Pointer Technique

Common Mistakes

Real-World Example: Rate Limiter

Interview Patterns

Practice Problems