Weak AVL Tree
Weak AVL trees are a type of balanced binary search tree that combine the properties of AVL and red-black trees. They use rank differences to maintain balance, allowing for efficient insertion and deletion with minimal rotations. The paper introduces a framework for defining balanced trees and demonstrates how weak AVL trees can replace traditional AVL and red-black trees in applications like the FreeBSD sys/tree.h implementation. 2025-12-14 08:0:0 Author: maskray.me(查看原文) 阅读量:1 收藏

tl;dr: Weak AVL trees are replacements for AVL trees and red-black trees.

The 2014 paper Rank-Balanced Trees (Haeupler, Sen, Tarjan) presents a framework using ranks and rank differences to define binary search trees.

  • Each node has a non-negative integer rank r(x). Null nodes have rank -1.
  • The rank difference of a node x with parent p(x) is r(p(x)) − r(x).
  • A node is i,j if its children have rank differences i and j (unordered), e.g., a 1,2 node has children with rank differences 1 and 2.
  • A node is called 1-node if its rank difference is 1.

Several balanced trees fit this framework:

  • AVL tree: Ranks are defined as heights. Every node is 1,1 or 1,2 (rank differences of children)
  • Red-Black tree: All rank differences are 0 or 1, and no parent of a 0-child is a 0-child. (red: 0-child; black: 1-child; null nodes are black)
  • Weak AVL tree (new tree described by this paper): All rank differences are 1 or 2, and every leaf has rank 0.
    • A weak AVL tree without 2,2 nodes is an AVL tree.
1
AVL trees ⫋ weak AVL trees ⫋ red-black trees

Weak AVL trees are replacements for AVL trees and red-black trees. A single insertion or deletion operation requires at most two rotations (forming a double rotation when two are needed). In contrast, AVL deletion requires O(log n) rotations, and red-black deletion requires up to three.

Without deletions, a weak AVL tree is exactly an AVL tree. With deletions, its height remains at most that of an AVL tree with the same number of insertions but no deletions.

The rank rules imply:

  • Null nodes have rank -1, leaves have rank 0, unary nodes have rank 1.

Insertion

The new node x has a rank of 0, changed from the null node of rank -1. There are three cases.

  • If the tree was previously empty, the new node becomes the root.
  • If the parent of the new node was previously a unary node (1,2 node), it is now a 1,1 binary node.
  • If the parent of the new node was previously a leaf (1,1 node), it is now a 0,1 binary node, leading to a rank violation.

When the tree was previously non-empty, x has a parent node. We call the following subroutine with x indicating the new node to handle the second and third cases.

The following subroutine handles the rank increase of x. We call break if there is no more rank violation, i.e. we are done.

The 2014 paper isn't very clear about the conditions.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41


p = x->parent;
if (rank_diff(x) == 1) {


} else {
for (;;) {



Promote p.



x = p;
p = p->parent;

if (!p) break;
d = p->ch[1] == x;

if (rank_diff(x) == 1) { break; }


auto sib = p->ch[!d];
if (rank_diff(sib) == 2) {
auto y = x->ch[d^1];
if (y && rank_diff(y) == 1) {

Perform a double rotation involving `p`, `x`, and `y`.
} else {

Perform a single rotation involving `p` and `x`.
x is now a 1,1 node and there is no more violation.
}
break;
}


}
}

Insertion never introduces a 2,2 node, so insertion-only sequences produce AVL trees.

Deletion

TODO: Describe deletion

Implementation

In 2020, FreeBSD has changed its sys/tree.h to use weak AVL trees instead of red-black trees. https://reviews.freebsd.org/D25480 The rb_ prefix remains as it can also indicate Rank-Balanced:)

Here is a C++ implementation with the following operations

  • insert: insert a node
  • remove: remove a node
  • rank: count elements less than a key
  • select: find the k-th smallest element (0-indexed)
  • prev: find the largest element less than a key
  • next: find the smallest element greater than a key

The insertion and deletion operations are inspired by the FreeBSD implementation, with the insertion further optimized.

We encode rank differences instead of the absolute rank values. Since valid rank differences can only be 1 or 2, a single bit suffices to encode each. We take 2 bits from the parent pointer to store the two child rank differences.

The paper suggests that each node can use one bit to encode its own rank difference. However, this representation is not ideal. A null node can be either a 1-child (parent is binary) or a 2-child (parent is unary). Retrieving the child rank difference requires probing the sibling node:

int rank_diff(Node *p, int d) { return p->ch[d] ? p->ch[d]->par_and_flg & 1 : p->ch[!d] ? 2 : 1; }

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <numeric>
#include <random>
#include <vector>
using namespace std;

struct Node {
Node *ch[2]{};
uintptr_t par_and_flg{};
int i{}, sum{}, size{};

Node *parent() const { return reinterpret_cast<Node*>(par_and_flg & ~3UL); }
void set_parent(Node *p) { par_and_flg = (par_and_flg & 3) | reinterpret_cast<uintptr_t>(p); }
uintptr_t flags() const { return par_and_flg & 3; }
bool rd2(int d) const { return par_and_flg & (1 << d); }
void flip(int d) { par_and_flg ^= (1 << d); }
void clr_flags() { par_and_flg &= ~3UL; }

void mconcat() {
sum = i;
size = 1;
if (ch[0]) sum += ch[0]->sum, size += ch[0]->size;
if (ch[1]) sum += ch[1]->sum, size += ch[1]->size;
}

bool operator<(const Node &o) const { return i < o.i; }
};

struct WAVL {
Node *root{};

~WAVL() {
auto destroy = [](auto &self, Node *n) -> void {
if (!n) return;
self(self, n->ch[0]);
self(self, n->ch[1]);
delete n;
};
destroy(destroy, root);
}

Node *rotate(Node *x, int d) {
auto pivot = x->ch[d];
if ((x->ch[d] = pivot->ch[d^1])) x->ch[d]->set_parent(x);
pivot->set_parent(x->parent());
if (!x->parent()) root = pivot;
else x->parent()->ch[x != x->parent()->ch[0]] = pivot;
pivot->ch[d^1] = x;
x->set_parent(pivot);
x->mconcat();
return pivot;
}

void insert(Node *x) {
Node *p = nullptr;
int d = 0;
for (auto tmp = root; tmp; ) {
p = tmp;
d = *p < *x;
tmp = tmp->ch[d];
}
x->par_and_flg = reinterpret_cast<uintptr_t>(p);
x->ch[0] = x->ch[1] = nullptr;
if (!p) return root = x, x->mconcat();
p->ch[d] = x;
auto *x2 = x;
if (p->rd2(d)) {
p->flip(d);
} else {
assert(p->rd2(d^1) == 0);
p->flip(d^1);
int d1 = d;
for (x = p, p = x->parent(); p; x = p, p = x->parent()) {
d = (p->ch[1] == x);
if (p->rd2(d)) {
p->flip(d);
break;
}
p->flip(d^1);
if (!p->rd2(d ^ 1)) {
if ((d^1) == d1) {
assert(!x->rd2(d1) && (x->ch[d1] == x2 || x->ch[d1]->flags() == 1 || x->ch[d1]->flags() == 2));
x->flip(d);
auto y = rotate(x, d^1);
if (y->rd2(d))
x->flip(d^1);
else if (y->rd2(d^1))
p->flip(d);
x = y;
}
x = rotate(p, d);
x->clr_flags();
break;
}
d1 = d;
}
}
for (; x2; x2 = x2->parent()) x2->mconcat();
}

void remove(Node *x) {
auto old = x;
auto p = x->parent();
auto right = x->ch[1];
Node *child;
if (!x->ch[0]) x = child = right;
else if (!right) x = child = x->ch[0];
else {
if (!(child = right->ch[0])) {
child = right->ch[1];
p = x = right;
} else {
do x = child; while ((child = x->ch[0]));
child = x->ch[1];
p = x->parent();
p->ch[0] = child;
old->ch[1]->set_parent(x);
x->ch[1] = old->ch[1];
}
old->ch[0]->set_parent(x);
x->ch[0] = old->ch[0];
x->par_and_flg = old->par_and_flg;
}
if (!old->parent()) root = x;
else old->parent()->ch[old != old->parent()->ch[0]] = x;
if (child) child->set_parent(p);

Node *x2 = p;
if (p) {
x = child;
if (p->ch[0] == x && p->ch[1] == x) {
p->clr_flags();
x = p;
p = x->parent();
}
while (p) {
int d2 = (p->ch[1] == x);
if (!p->rd2(d2)) {
p->flip(d2);
break;
}
if (p->rd2(d2 ^ 1)) {
p->flip(d2 ^ 1);
x = p;
p = x->parent();
continue;
}
auto sib = p->ch[d2^1];
if (sib->flags() == 3) {
sib->clr_flags();
x = p;
p = x->parent();
continue;
}
sib->flip(d2^1);
if (sib->rd2(d2))
p->flip(d2);
else if (!sib->rd2(d2^1)) {
p->flip(d2);
x = rotate(sib, d2);
if (x->rd2(d2^1)) sib->flip(d2);
if (x->rd2(d2)) p->flip(d2^1);
x->par_and_flg |= 3;
}
rotate(p, d2^1);
break;
}
}
for (; x2; x2 = x2->parent()) x2->mconcat();
}

Node *find(int key) const {
auto tmp = root;
while (tmp) {
if (key < tmp->i) tmp = tmp->ch[0];
else if (key > tmp->i) tmp = tmp->ch[1];
else return tmp;
}
return nullptr;
}

Node *min() const {
Node *p = nullptr;
for (auto n = root; n; n = n->ch[0]) p = n;
return p;
}

int rank(int key) const {
int r = 0;
for (auto n = root; n; ) {
if (key <= n->i) n = n->ch[0];
else {
r += 1 + (n->ch[0] ? n->ch[0]->size : 0);
n = n->ch[1];
}
}
return r;
}

int select(int k) const {
auto x = root;
while (x) {
int lsz = x->ch[0] ? x->ch[0]->size : 0;
if (k < lsz) x = x->ch[0];
else if (k == lsz) return x->i;
else k -= lsz + 1, x = x->ch[1];
}
return -1;
}

int prev(int key) const {
int res = -1;
for (auto x = root; x; )
if (key <= x->i) x = x->ch[0];
else { res = x->i; x = x->ch[1]; }
return res;
}

int next(int key) const {
int res = -1;
for (auto x = root; x; )
if (key >= x->i) x = x->ch[1];
else { res = x->i; x = x->ch[0]; }
return res;
}

static Node *next(Node *x) {
if (x->ch[1]) {
x = x->ch[1];
while (x->ch[0]) x = x->ch[0];
} else {
while (x->parent() && x == x->parent()->ch[1]) x = x->parent();
x = x->parent();
}
return x;
}
};

void print_tree(Node *n, int d = 0) {
if (!n) { printf("%*snil\n", 2*d, ""); return; }
print_tree(n->ch[0], d + 1);
printf("%*s%d (%d,%d)\n", 2*d, "", n->i, n->rd2(0) ? 2 : 1, n->rd2(1) ? 2 : 1);
print_tree(n->ch[1], d + 1);
}

int compute_rank(Node *n, bool debug = false) {
if (!n) return -1;
int lr = compute_rank(n->ch[0], debug), rr = compute_rank(n->ch[1], debug);
if (lr < -1 || rr < -1) return -2;
int rank_l = lr + (n->rd2(0) ? 2 : 1);
int rank_r = rr + (n->rd2(1) ? 2 : 1);
if (rank_l != rank_r) {
if (debug) printf("node %d: rank mismatch left=%d right=%d\n", n->i, rank_l, rank_r);
return -2;
}
if (!n->ch[0] && !n->ch[1] && n->flags() != 0) {
if (debug) printf("node %d: leaf must be 1,1 but flags=%lu\n", n->i, n->flags());
return -2;
}
int expected_sum = n->i + (n->ch[0] ? n->ch[0]->sum : 0) + (n->ch[1] ? n->ch[1]->sum : 0);
if (n->sum != expected_sum) {
if (debug) printf("node %d: sum mismatch got=%d expected=%d\n", n->i, n->sum, expected_sum);
return -2;
}
int expected_size = 1 + (n->ch[0] ? n->ch[0]->size : 0) + (n->ch[1] ? n->ch[1]->size : 0);
if (n->size != expected_size) {
if (debug) printf("node %d: size mismatch got=%d expected=%d\n", n->i, n->size, expected_size);
return -2;
}
return rank_l;
}

bool verify_tree(const WAVL &tree, bool verbose = false) {
int rank = compute_rank(tree.root);
if (rank < -1) {
printf("INVALID TREE\n");
compute_rank(tree.root, true);
return false;
}
if (verbose) printf("Tree verified, rank = %d\n", rank);
return true;
}

int main() {
srand(42);
WAVL tree;
int i = 0;
std::vector<int> a(20);
std::iota(a.begin(), a.end(), 1);
std::shuffle(a.begin(), a.end(), std::default_random_engine(42));
for (int val : a) {
auto n = new Node;
n->i = val;
tree.insert(n);
if (i++ < 6) {
printf("-- %d After insertion of %d\n", i, val);
print_tree(tree.root);
}
}
printf("\nSum\tof values = %d\n", tree.root->sum);
verify_tree(tree, true);

for (int val : {5, 10, 15}) {
if (auto found = tree.find(val)) {
tree.remove(found);
delete found;
}
}
printf("After removing 5, 10, 15:\n");
printf("\nSum\tof values = %d\n", tree.root->sum);
verify_tree(tree, true);

std::vector<Node*> ref;
for (auto n = tree.min(); n; n = WAVL::next(n)) ref.push_back(n);

for (int i = 0; i < 100000; i++) {
if (ref.size() < 5 || (ref.size() < 1000 && rand() % 2 == 0)) {
auto n = new Node;
n->i = rand() % 100000;
tree.insert(n);
ref.push_back(n);
} else {
int idx = rand() % ref.size();
tree.remove(ref[idx]);
delete ref[idx];
ref[idx] = ref.back();
ref.pop_back();
}
if (i%100 == 0 && !verify_tree(tree)) {
printf("FAILED at iteration %d\n", i);
return 1;
}
}

while (!ref.empty()) {
tree.remove(ref.back());
delete ref.back();
ref.pop_back();
if (tree.root && !verify_tree(tree)) {
printf("FAILED during final cleanup\n");
return 1;
}
}
printf("Stress test passed\n");


printf("\nTesting rank/select/prev/next...\n");
std::vector<int> vals = {10, 20, 30, 40, 50};
for (int v : vals) {
auto n = new Node;
n->i = v;
tree.insert(n);
}


assert(tree.rank(5) == 0);
assert(tree.rank(10) == 0);
assert(tree.rank(15) == 1);
assert(tree.rank(20) == 1);
assert(tree.rank(25) == 2);
assert(tree.rank(50) == 4);
assert(tree.rank(55) == 5);


assert(tree.select(0) == 10);
assert(tree.select(1) == 20);
assert(tree.select(2) == 30);
assert(tree.select(3) == 40);
assert(tree.select(4) == 50);
assert(tree.select(5) == -1);


assert(tree.prev(10) == -1);
assert(tree.prev(11) == 10);
assert(tree.prev(20) == 10);
assert(tree.prev(21) == 20);
assert(tree.prev(50) == 40);
assert(tree.prev(55) == 50);


assert(tree.next(5) == 10);
assert(tree.next(10) == 20);
assert(tree.next(15) == 20);
assert(tree.next(40) == 50);
assert(tree.next(50) == -1);
assert(tree.next(55) == -1);

printf("rank/select/prev/next tests passed\n");
}

Misc

Visualization: https://tjkendev.github.io/bst-visualization/avl-tree/bu-weak.html


文章来源: https://maskray.me/blog/2025-12-14-weak-avl-tree
如有侵权请联系:admin#unsafe.sh