Skip List – probabilistyczna struktura danych O(log n), implementacja w PHP

by Henryk Tews / piątek, 22 maja 2026 / Opublikowano w Algorytmy

Skip List to probabilistyczna struktura danych wynaleziona przez Williama Pugha w 1990 roku. Łączy prostotę listy jednokierunkowej z wydajnością drzewa binarnego – operacje wyszukiwania, wstawiania i usuwania działają w O(log n) średnio, bez potrzeby balansowania jak w AVL czy Red-Black Tree. W PHP jest rzadko używana, ale świetnie ilustruje jak losowość może zastąpić złożone algorytmy balansowania.

Idea – wiele poziomów wskaźników

Zwykła lista jednokierunkowa to O(n) dla wyszukiwania – musisz przejść każdy element. Skip List dodaje „szybkie pasy” – dodatkowe wskaźniki które przeskakują nad wieloma elementami. Im wyższy poziom, tym rzadziej elementy się pojawiają (losowo, z prawdopodobieństwem 0.5).

Poziom 3: 1 --------> 50 -----------------> 100
Poziom 2: 1 --> 20 -> 50 --> 70 ----------> 100
Poziom 1: 1 --> 20 -> 50 --> 70 --> 85 --> 100
Poziom 0: 1 -> 10 -> 20 -> 30 -> 50 -> 70 -> 85 -> 90 -> 100

Szukamy 70:
- Start poziom 3: 1, następny 50 < 70, idź dalej. Następny 100 > 70, zejdź poziom.
- Poziom 2: 50, następny 70 = znaleziono!
- Zamiast 8 kroków (lista) - 4 kroki.

Implementacja w PHP

<?php

declare(strict_types=1);

class SkipListNode
{
    /** @var SkipListNode[] */
    public array $next;

    public function __construct(
        public readonly int|float|null $key,
        public mixed $value,
        int $levels
    ) {
        $this->next = array_fill(0, $levels, null);
    }
}

class SkipList
{
    private SkipListNode $head;
    private int $currentMaxLevel = 1;
    private int $size = 0;

    private const MAX_LEVELS   = 16;
    private const PROBABILITY  = 0.5;

    public function __construct()
    {
        // Sentinel node - klucz null, najwyższy poziom
        $this->head = new SkipListNode(null, null, self::MAX_LEVELS);
    }

    // Losuj poziom dla nowego węzła
    private function randomLevel(): int
    {
        $level = 1;
        while ($level < self::MAX_LEVELS && (mt_rand() / mt_getrandmax()) < self::PROBABILITY) {
            $level++;
        }
        return $level;
    }

    // Wstaw klucz-wartość - O(log n) średnio
    public function insert(int|float $key, mixed $value): void
    {
        // Śledź poprzedników na każdym poziomie
        $update = array_fill(0, self::MAX_LEVELS, null);
        $current = $this->head;

        for ($i = $this->currentMaxLevel - 1; $i >= 0; $i--) {
            while ($current->next[$i] !== null && $current->next[$i]->key < $key) {
                $current = $current->next[$i];
            }
            $update[$i] = $current;
        }

        // Sprawdź czy klucz już istnieje
        $current = $current->next[0];
        if ($current !== null && $current->key === $key) {
            $current->value = $value; // aktualizuj
            return;
        }

        // Nowy węzeł z losowym poziomem
        $newLevel = $this->randomLevel();
        if ($newLevel > $this->currentMaxLevel) {
            for ($i = $this->currentMaxLevel; $i < $newLevel; $i++) {
                $update[$i] = $this->head;
            }
            $this->currentMaxLevel = $newLevel;
        }

        $newNode = new SkipListNode($key, $value, $newLevel);
        for ($i = 0; $i < $newLevel; $i++) {
            $newNode->next[$i]    = $update[$i]->next[$i];
            $update[$i]->next[$i] = $newNode;
        }

        $this->size++;
    }

    // Wyszukaj - O(log n) średnio
    public function search(int|float $key): mixed
    {
        $current = $this->head;

        for ($i = $this->currentMaxLevel - 1; $i >= 0; $i--) {
            while ($current->next[$i] !== null && $current->next[$i]->key < $key) {
                $current = $current->next[$i];
            }
        }

        $current = $current->next[0];
        return ($current !== null && $current->key === $key) ? $current->value : null;
    }

    // Usuń - O(log n) średnio
    public function delete(int|float $key): bool
    {
        $update  = array_fill(0, self::MAX_LEVELS, null);
        $current = $this->head;

        for ($i = $this->currentMaxLevel - 1; $i >= 0; $i--) {
            while ($current->next[$i] !== null && $current->next[$i]->key < $key) {
                $current = $current->next[$i];
            }
            $update[$i] = $current;
        }

        $current = $current->next[0];
        if ($current === null || $current->key !== $key) {
            return false;
        }

        for ($i = 0; $i < $this->currentMaxLevel; $i++) {
            if ($update[$i]->next[$i] !== $current) break;
            $update[$i]->next[$i] = $current->next[$i];
        }

        $this->size--;
        return true;
    }

    // Zakres - wszystkie elementy między min a max
    public function range(int|float $min, int|float $max): array
    {
        $result  = [];
        $current = $this->head;

        // Znajdź start zakresu
        for ($i = $this->currentMaxLevel - 1; $i >= 0; $i--) {
            while ($current->next[$i] !== null && $current->next[$i]->key < $min) {
                $current = $current->next[$i];
            }
        }

        $current = $current->next[0];
        while ($current !== null && $current->key <= $max) {
            $result[$current->key] = $current->value;
            $current = $current->next[0];
        }

        return $result;
    }

    public function count(): int { return $this->size; }
}

Zastosowanie praktyczne – indeks cenowy produktów

<?php

// Znajdź wszystkie produkty w przedziale cenowym bez skanowania tabeli
$priceIndex = new SkipList();

// Indeksuj produkty po cenie
$products = [
    ['id' => 1, 'name' => 'Widget S',  'price' => 9.99],
    ['id' => 2, 'name' => 'Widget M',  'price' => 29.99],
    ['id' => 3, 'name' => 'Widget L',  'price' => 49.99],
    ['id' => 4, 'name' => 'Gadget Pro','price' => 99.99],
    ['id' => 5, 'name' => 'Gadget Max','price' => 149.99],
];

foreach ($products as $product) {
    $priceIndex->insert($product['price'], $product);
}

// Zapytanie zakresowe O(log n + k) gdzie k = liczba wyników
$inRange = $priceIndex->range(20.0, 100.0);
// Zwraca: Widget M (29.99), Widget L (49.99), Gadget Pro (99.99)

foreach ($inRange as $price => $product) {
    echo "{$product['name']}: {$price} PLN\n";
}

Benchmark: Skip List vs array_filter

<?php

$n = 100_000;
$skipList = new SkipList();
$array    = [];

// Wypełnij obie struktury
for ($i = 0; $i < $n; $i++) {
    $price = mt_rand(1, 100000) / 100;
    $skipList->insert($price, "product_{$i}");
    $array[] = ['price' => $price, 'id' => $i];
}

// Zapytanie zakresowe
$min = 50.0; $max = 100.0;

$start = microtime(true);
for ($q = 0; $q < 1000; $q++) {
    $result = $skipList->range($min, $max);
}
$skipTime = (microtime(true) - $start) * 1000;

$start = microtime(true);
for ($q = 0; $q < 1000; $q++) {
    $result = array_filter($array, fn($p) => $p['price'] >= $min && $p['price'] <= $max);
}
$arrayTime = (microtime(true) - $start) * 1000;

echo "Skip List range: {$skipTime}ms\n";  // ~45ms
echo "array_filter:    {$arrayTime}ms\n"; // ~890ms - ~20x wolniej

Porównanie ze strukturami drzewiastymi

Struktura	Wyszukiwanie	Wstawianie	Usuwanie	Implementacja
Skip List	O(log n) avg	O(log n) avg	O(log n) avg	Prosta
BST (niezbalansowane)	O(n) worst	O(n) worst	O(n) worst	Prosta
AVL Tree	O(log n)	O(log n)	O(log n)	Złożona
Red-Black Tree	O(log n)	O(log n)	O(log n)	Bardzo złożona
Hash Map	O(1) avg	O(1) avg	O(1) avg	Średnia

Skip List vs Hash Map: Hash Map jest szybszy dla dokładnych trafień, ale nie obsługuje zapytań zakresowych. Skip List wypełnia tę niszę - O(log n) dla zakresu zamiast O(n) dla iteracji po hash mapie.

Podsumowanie

Skip List to elegancki przykład jak losowość może zastąpić złożone algorytmy balansowania. Implementacja jest wielokrotnie prostsza niż AVL czy Red-Black Tree przy podobnej złożoności obliczeniowej. W PHP najbardziej przydaje się dla zapytań zakresowych na dużych zbiorach danych w pamięci - filtrowanie produktów po cenie, wyszukiwanie po dacie, indeksowanie po dowolnym kluczu numerycznym. Redis używa Skip List wewnętrznie w Sorted Sets - co potwierdza praktyczną wartość tej struktury. Następny wpis: wzorzec Flyweight - współdzielenie obiektów i cache instancji.

Skip List – probabilistyczna struktura danych O(log n), implementacja w PHP

Idea – wiele poziomów wskaźników

Implementacja w PHP

Zastosowanie praktyczne – indeks cenowy produktów

Benchmark: Skip List vs array_filter

Porównanie ze strukturami drzewiastymi

Podsumowanie

About Henryk Tews

Co możesz przeczytać następne

Wyszukiwanie binarne – O(log n) vs O(n) z tabelą porównawczą

Quicksort – warianty, Introsort, benchmark vs wbudowany sort()

Lista dwukierunkowa – implementacja od zera, SplDoublyLinkedList, tabela porównawcza