-
Notifications
You must be signed in to change notification settings - Fork 6
/
Copy pathrandomfunc.html
executable file
·3038 lines (2548 loc) · 270 KB
/
randomfunc.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
<!DOCTYPE html><html xmlns:dc="http://purl.org/dc/terms/" itemscope itemtype="http://schema.org/Article"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name="citation_pdf_url" content="https://peteroupc.github.io/randomfunc.pdf"><meta name="citation_url" content="https://peteroupc.github.io/randomfunc.html"><meta name="citation_date" content="2025/01/27"><meta name="citation_online_date" content="2025/01/27"><meta name="og:description" content="This page discusses many ways applications can sample randomized content by transforming the numbers produced by an underlying source of random numbers, such as numbers produced by a pseudorandom number generator, and offers pseudocode and Python sample code for many of these methods."><meta name="description" content="This page discusses many ways applications can sample randomized content by transforming the numbers produced by an underlying source of random numbers, such as numbers produced by a pseudorandom number generator, and offers pseudocode and Python sample code for many of these methods."><meta name="twitter:description" content="This page discusses many ways applications can sample randomized content by transforming the numbers produced by an underlying source of random numbers, such as numbers produced by a pseudorandom number generator, and offers pseudocode and Python sample code for many of these ..."><meta name="og:type" content="article"><meta name="og:url" content="https://peteroupc.github.io/randomfunc.html"><meta name="og:site_name" content="peteroupc.github.io"><meta name="author" content="Peter Occil"/><meta name="citation_author" content="Peter Occil"/><meta name="viewport" content="width=device-width"><link rel=stylesheet type="text/css" href="/style.css">
<script type="text/x-mathjax-config"> MathJax.Hub.Config({"HTML-CSS": { availableFonts: ["STIX","TeX"], linebreaks: { automatic:true }, preferredFont: "TeX" },
tex2jax: { displayMath: [ ["$$","$$"], ["\\[", "\\]"] ], inlineMath: [ ["$", "$"], ["\\\\(","\\\\)"] ], processEscapes: true } });
</script><script type="text/javascript" async src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-AMS_HTML-full"></script></head><body> <div class="header">
<nav><p><a href="#navigation">Menu</a> - <a href="#top">Top</a> - <a href="/">Home</a></nav></div>
<div class="mainarea" id="top">
<h1 id="randomization-and-sampling-methods">Randomization and Sampling Methods</h1>
<p><a href="mailto:poccil14@gmail.com"><strong>Peter Occil</strong></a></p>
<p><strong>Abstract:</strong> This page discusses many ways applications can sample randomized content by transforming the numbers produced by an underlying source of random numbers, such as numbers produced by a pseudorandom number generator, and offers pseudocode and Python sample code for many of these methods.</p>
<p><strong>2020 Mathematics Subject Classification:</strong> 68W20.</p>
<p><a id="Introduction"></a></p>
<h2 id="introduction">Introduction</h2>
<p>This page catalogs <em>randomization methods</em> and <em>sampling methods</em>. A randomization or sampling method is driven by a “source of random numbers” and produces numbers or other values called <strong><em>random variates</em></strong>. These variates are the result of the randomization. (The “source of random numbers” is often simulated in practice by so-called pseudorandom number generators, or PRNGs.) This document covers many methods, including—</p>
<ul>
<li>ways to sample integers or real numbers from a uniform distribution (such as the <a href="#RNDINT_Random_Integers_in_0_N"><strong>core method, <code>RNDINT(N)</code></strong></a>),</li>
<li>ways to generate randomized content and conditions, such as <a href="#Boolean_True_False_Conditions"><strong>true/false conditions</strong></a>, <a href="#Shuffling"><strong>shuffling</strong></a>, and <a href="#Sampling_Without_Replacement_Choosing_Several_Unique_Items"><strong>sampling unique items from a list</strong></a>, and</li>
<li>nonuniform distributions, including <a href="#Weighted_Choice"><strong>weighted choice</strong></a>, the <a href="#Poisson_Distribution"><strong>Poisson distribution</strong></a>, and <a href="#Index_of_Non_Uniform_Distributions"><strong>other probability distributions</strong></a>.</li>
</ul>
<p>This page is devoted to randomization and sampling methods that <em>exactly</em> sample from the distribution described, without introducing additional errors beyond those already present in the inputs (and assuming that an ideal “source of random numbers” is available). This will be the case if there is a finite number of values to choose from. But for the normal distribution and other distributions that take on infinitely many values, there will always be some level of approximation involved; in this case, the focus of this page is on methods that <em>minimize</em> the error they introduce.</p>
<p>This document shows pseudocode for many of the methods, and <a href="https://peteroupc.github.io/randomgen.zip"><strong>sample Python code</strong></a> that implements many of the methods in this document is available, together with <a href="https://peteroupc.github.io/randomgendoc.html"><strong>documentation for the code</strong></a>.</p>
<p>The randomization methods presented on this page assume we have an endless source of numbers chosen independently at random and with a uniform distribution. For more information, see “<a href="#Sources_of_Random_Numbers"><strong>Sources of Random Numbers</strong></a>” in the appendix.</p>
<p><strong>In general, the following are outside the scope of this document:</strong></p>
<ul>
<li>This document does not cover how to choose an underlying PRNG (or device or program that simulates a “source of random numbers”) for a particular application, including in terms of security, performance, and quality. I have written more on recommendations in <a href="https://peteroupc.github.io/random.html"><strong>another document</strong></a>.</li>
<li>This document does not include algorithms for specific PRNGs, such as Mersenne Twister, PCG, xorshift, linear congruential generators, or generators based on hash functions.</li>
<li>This document does not cover how to test PRNGs for correctness or adequacy, and the same applies to other devices and programs that simulate a “source of random numbers”. Testing is covered, for example, in “<a href="https://peteroupc.github.io/randomtest.html"><strong>Testing PRNGs for High-Quality Randomness</strong></a>”.</li>
<li>This document does not explain how to specify or generate “seeds” for use in PRNGs. This is <a href="https://peteroupc.github.io/random.html#Nondeterministic_Sources_and_Seed_Generation"><strong>covered in detail</strong></a> elsewhere.</li>
<li>This document does not show how to generate random security parameters such as encryption keys.</li>
<li>This document does not cover randomness extraction (also known as <em>unbiasing</em>, <em>deskewing</em>, or <em>whitening</em>). See my <a href="https://peteroupc.github.io/randextract.html"><strong>Note on Randomness Extraction</strong></a>.</li>
<li>“Variance reduction” methods, such as importance sampling or common random numbers, are outside the scope of this document.</li>
</ul>
<p>In addition, this page is not devoted to sampling methods used for computer graphics rendering (such as Poisson disk sampling, multiple importance sampling, blue noise, and gradient noise), because this application tends to give performance and visual acceptability a greater importance than accuracy and exact sampling.</p>
<p><a id="About_This_Document"></a></p>
<h3 id="about-this-document">About This Document</h3>
<p><strong>This is an open-source document; for an updated version, see the</strong> <a href="https://github.com/peteroupc/peteroupc.github.io/raw/master/randomfunc.md"><strong>source code</strong></a> <strong>or its</strong> <a href="https://github.com/peteroupc/peteroupc.github.io/blob/master/randomfunc.md"><strong>rendering on GitHub</strong></a><strong>. You can send comments on this document either on</strong> <a href="https://www.codeproject.com/Articles/1190459/Random-Number-Generation-and-Sampling-Methods"><strong>CodeProject</strong></a> <strong>or on the</strong> <a href="https://github.com/peteroupc/peteroupc.github.io/issues"><strong>GitHub issues page</strong></a><strong>.</strong></p>
<p>My audience for this article is <strong>computer programmers with mathematics knowledge, but little or no familiarity with calculus</strong>.</p>
<p>I encourage readers to implement any of the algorithms given in this page, and report their implementation experiences. In particular, <a href="https://github.com/peteroupc/peteroupc.github.io/issues/18"><strong>I seek comments on the following aspects</strong></a>:</p>
<ul>
<li>Are the algorithms in this article easy to implement? Is each algorithm written so that someone could write code for that algorithm after reading the article?</li>
<li>Does this article have errors that should be corrected?</li>
<li>Are there ways to make this article more useful to the target audience?</li>
</ul>
<p>Comments on other aspects of this document are welcome.</p>
<p><a id="Contents"></a></p>
<h2 id="contents">Contents</h2>
<ul>
<li><a href="#Introduction"><strong>Introduction</strong></a>
<ul>
<li><a href="#About_This_Document"><strong>About This Document</strong></a></li>
</ul>
</li>
<li><a href="#Contents"><strong>Contents</strong></a></li>
<li><a href="#Notation"><strong>Notation</strong></a></li>
<li><a href="#Uniform_Random_Integers"><strong>Uniform Random Integers</strong></a>
<ul>
<li><a href="#RNDINT_Random_Integers_in_0_N"><code>RNDINT</code>: Random Integers in [0, N]</a></li>
<li><a href="#RNDINTRANGE_Random_Integers_in_N_M"><code>RNDINTRANGE</code>: Random Integers in [N, M]</a></li>
<li><a href="#RNDINTEXC_Random_Integers_in_0_N"><strong><code>RNDINTEXC</code>: Random Integers in [0, N)</strong></a></li>
<li><a href="#RNDINTEXCRANGE_Random_Integers_in_N_M"><strong><code>RNDINTEXCRANGE</code>: Random Integers in [N, M)</strong></a></li>
<li><a href="#Uniform_Random_Bits"><strong>Uniform Random Bits</strong></a></li>
<li><a href="#Examples_of_Using_the_RNDINT_Family"><strong>Examples of Using the <code>RNDINT</code> Family</strong></a></li>
</ul>
</li>
<li><a href="#Randomization_Techniques"><strong>Randomization Techniques</strong></a>
<ul>
<li><a href="#Boolean_True_False_Conditions"><strong>Boolean (True/False) Conditions</strong></a></li>
<li><a href="#Random_Sampling"><strong>Random Sampling</strong></a>
<ul>
<li><a href="#Sampling_With_Replacement_Choosing_a_Random_Item_from_a_List"><strong>Sampling With Replacement: Choosing a Random Item from a List</strong></a></li>
<li><a href="#Sampling_Without_Replacement_Choosing_Several_Unique_Items"><strong>Sampling Without Replacement: Choosing Several Unique Items</strong></a></li>
<li><a href="#Shuffling"><strong>Shuffling</strong></a></li>
<li><a href="#Random_Character_Strings"><strong>Random Character Strings</strong></a></li>
<li><a href="#Pseudocode_for_Random_Sampling"><strong>Pseudocode for Random Sampling</strong></a></li>
</ul>
</li>
<li><a href="#Rejection_Sampling"><strong>Rejection Sampling</strong></a></li>
<li><a href="#Random_Walks"><strong>Random Walks</strong></a></li>
<li><a href="#Random_Dates_and_Times"><strong>Random Dates and Times</strong></a></li>
<li><a href="#Randomization_in_Statistical_Testing"><strong>Randomization in Statistical Testing</strong></a></li>
<li><a href="#Markov_Chains"><strong>Markov Chains</strong></a></li>
<li><a href="#Random_Graphs"><strong>Random Graphs</strong></a></li>
<li><a href="#A_Note_on_Sorting_Random_Variates"><strong>A Note on Sorting Random Variates</strong></a></li>
</ul>
</li>
<li><a href="#General_Non_Uniform_Distributions"><strong>General Non-Uniform Distributions</strong></a>
<ul>
<li><a href="#Weighted_Choice"><strong>Weighted Choice</strong></a>
<ul>
<li><a href="#Weighted_Choice_With_Replacement"><strong>Weighted Choice With Replacement</strong></a></li>
<li><a href="#Weighted_Choice_Without_Replacement"><strong>Weighted Choice Without Replacement</strong></a></li>
<li><a href="#Unequal_Probability_Sampling"><strong>Unequal Probability Sampling</strong></a></li>
</ul>
</li>
<li><a href="#Mixtures_of_Distributions"><strong>Mixtures of Distributions</strong></a></li>
<li><a href="#Transformations_of_Random_Variates"><strong>Transformations of Random Variates</strong></a></li>
</ul>
</li>
<li><a href="#Specific_Non_Uniform_Distributions"><strong>Specific Non-Uniform Distributions</strong></a>
<ul>
<li><a href="#Dice"><strong>Dice</strong></a></li>
<li><a href="#Binomial_Distribution"><strong>Binomial Distribution</strong></a></li>
<li><a href="#Negative_Binomial_Distribution"><strong>Negative Binomial Distribution</strong></a></li>
<li><a href="#Geometric_Distribution"><strong>Geometric Distribution</strong></a></li>
<li><a href="#Exponential_Distribution"><strong>Exponential Distribution</strong></a></li>
<li><a href="#Poisson_Distribution"><strong>Poisson Distribution</strong></a></li>
<li><a href="#P_lya_ndash_Eggenberger_Distribution"><strong>Pólya–Eggenberger Distribution</strong></a></li>
<li><a href="#Random_Integers_with_a_Given_Positive_Sum"><strong>Random Integers with a Given Positive Sum</strong></a></li>
<li><a href="#Multinomial_Distribution"><strong>Multinomial Distribution</strong></a></li>
</ul>
</li>
<li><a href="#Randomization_with_Real_Numbers"><strong>Randomization with Real Numbers</strong></a>
<ul>
<li><a href="#Uniform_Random_Real_Numbers"><strong>Uniform Random Real Numbers</strong></a>
<ul>
<li><a href="#For_Fixed_Point_Number_Formats"><strong>For Fixed-Point Number Formats</strong></a></li>
<li><a href="#For_Rational_Number_Formats"><strong>For Rational Number Formats</strong></a></li>
<li><a href="#For_Floating_Point_Number_Formats"><strong>For Floating-Point Number Formats</strong></a></li>
</ul>
</li>
<li><a href="#Monte_Carlo_Sampling_Expected_Values_Integration_and_Optimization"><strong>Monte Carlo Sampling: Expected Values, Integration, and Optimization</strong></a></li>
<li><a href="#Point_Sample_Selection"><strong>Point Sample Selection</strong></a></li>
<li><a href="#Notes_on_Randomization_Involving_Real_Numbers"><strong>Notes on Randomization Involving Real Numbers</strong></a>
<ul>
<li><a href="#Random_Walks_Additional_Examples"><strong>Random Walks: Additional Examples</strong></a></li>
<li><a href="#Transformations_Additional_Examples"><strong>Transformations: Additional Examples</strong></a></li>
</ul>
</li>
<li><a href="#Sampling_from_a_Distribution_of_Data_Points"><strong>Sampling from a Distribution of Data Points</strong></a></li>
<li><a href="#Sampling_from_an_Arbitrary_Distribution"><strong>Sampling from an Arbitrary Distribution</strong></a>
<ul>
<li><a href="#Sampling_for_Discrete_Distributions"><strong>Sampling for Discrete Distributions</strong></a></li>
<li><a href="#Inverse_Transform_Sampling"><strong>Inverse Transform Sampling</strong></a></li>
<li><a href="#Rejection_Sampling_with_a_PDF_Like_Function"><strong>Rejection Sampling with a PDF-Like Function</strong></a></li>
<li><a href="#Alternating_Series"><strong>Alternating Series</strong></a></li>
<li><a href="#Markov_Chain_Monte_Carlo"><strong>Markov-Chain Monte Carlo</strong></a></li>
</ul>
</li>
<li><a href="#Piecewise_Linear_Distribution"><strong>Piecewise Linear Distribution</strong></a></li>
<li><a href="#Specific_Distributions"><strong>Specific Distributions</strong></a></li>
<li><a href="#Index_of_Non_Uniform_Distributions"><strong>Index of Non-Uniform Distributions</strong></a></li>
<li><a href="#Geometric_Sampling"><strong>Geometric Sampling</strong></a>
<ul>
<li><a href="#Random_Points_Inside_a_Simplex"><strong>Random Points Inside a Simplex</strong></a></li>
<li><a href="#Random_Points_on_a_Sphere"><strong>Random Points on a Sphere</strong></a></li>
<li><a href="#Random_Points_Inside_a_Box_Ball_Shell_or_Cone"><strong>Random Points Inside a Box, Ball, Shell, or Cone</strong></a></li>
<li><a href="#Random_Latitude_and_Longitude"><strong>Random Latitude and Longitude</strong></a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#Acknowledgments"><strong>Acknowledgments</strong></a></li>
<li><a href="#Other_Documents"><strong>Other Documents</strong></a></li>
<li><a href="#Notes"><strong>Notes</strong></a></li>
<li><a href="#Appendix"><strong>Appendix</strong></a>
<ul>
<li><a href="#Sources_of_Random_Numbers"><strong>Sources of Random Numbers</strong></a></li>
<li><a href="#Implementation_Considerations"><strong>Implementation Considerations</strong></a></li>
<li><a href="#Security_Considerations"><strong>Security Considerations</strong></a></li>
</ul>
</li>
<li><a href="#License"><strong>License</strong></a></li>
</ul>
<p><a id="Notation"></a></p>
<h2 id="notation">Notation</h2>
<p>In this document:</p>
<ul>
<li>The <a href="https://peteroupc.github.io/pseudocode.html"><strong>pseudocode conventions</strong></a> apply to this document.</li>
<li>The following notation for intervals is used: [<code>a</code>, <code>b</code>) means “<code>a</code> or greater, but less than <code>b</code>”. (<code>a</code>, <code>b</code>) means “greater than <code>a</code>, but less than <code>b</code>”. (<code>a</code>, <code>b</code>] means “greater than <code>a</code> and less than or equal to <code>b</code>”. [<code>a</code>, <code>b</code>] means “<code>a</code> or greater and <code>b</code> or less”.</li>
<li><code>log1p(x)</code> is equivalent to <code>ln(1 + x)</code> and is a “robust” alternative to <code>ln(1 + x)</code> where <code>x</code> is a floating-point number (Pedersen 2018)<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>.</li>
<li><code>MakeRatio(n, d)</code> creates a rational number with the specified numerator <code>n</code> and denominator <code>d</code>.</li>
<li><code>Sum(list)</code> calculates the sum of the numbers in the specified list of integers or rational numbers. (Summing floating-point numbers naïvely can introduce round-off errors.)</li>
</ul>
<p><a id="Uniform_Random_Integers"></a></p>
<h2 id="uniform-random-integers">Uniform Random Integers</h2>
<p>This section shows how to derive independent uniform random integers with the help of a “source of random numbers” (or a device or program that simulates that source).</p>
<p>This section describes four methods: <code>RNDINT</code>, <code>RNDINTEXC</code>, <code>RNDINTRANGE</code>, <code>RNDINTEXCRANGE</code>. Of these, <code>RNDINT</code>, described next, can serve as the basis for the remaining methods.</p>
<p><a id="RNDINT_Random_Integers_in_0_N"></a></p>
<h3 id="rndint-random-integers-in-0-n"><code>RNDINT</code>: Random Integers in [0, N]</h3>
<p>In this document, <strong><code>RNDINT(maxInclusive)</code></strong> is the core method in this document; it derives independent uniform integers <strong>in the interval [0, <code>maxInclusive</code>]</strong> from a “source of random numbers”<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>. In the pseudocode below, which implements <code>RNDINT</code>:</p>
<ul>
<li><code>NEXTRAND()</code> reads the next number generated by a “source of (uniform) random numbers” as defined in the appendix (an endless source of numbers, each of which is chosen independently of any other choice and with a uniform distribution). As noted in the appendix, a pseudorandom number generator can simulate this source in practice. For this method, the source can have a nonuniform instead of uniform distribution.</li>
<li><code>MODULUS</code> is the number of different outcomes possible with that source.</li>
</ul>
<p>Specifically:</p>
<table>
<thead>
<tr>
<th>If the underlying source produces:</th>
<th>Then <code>NEXTRAND()</code> is:</th>
<th>And <code>MODULUS</code> is:</th>
</tr>
</thead>
<tbody>
<tr>
<td>Nonuniform numbers<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup>.</td>
<td>The next bit from a new source formed by taking the underlying source’s outputs as input to a <a href="https://peteroupc.github.io/randextract.html"><strong><em>randomness extraction</em></strong></a> technique to produce independent random integers that equal 1 or 0 with equal probability.</td>
<td>2.</td>
</tr>
<tr>
<td>Uniform numbers not described below.</td>
<td>Same as earlier.</td>
<td>2<sup><em>n</em></sup>.</td>
</tr>
<tr>
<td>Uniform 32-bit nonnegative integers.</td>
<td>The next number from the source.</td>
<td>2<sup>32</sup>.</td>
</tr>
<tr>
<td>Uniform 64-bit nonnegative integers.</td>
<td>The next number from the source.</td>
<td>2<sup>64</sup>.</td>
</tr>
<tr>
<td>Uniform integers in the interval [0, <em>n</em>).</td>
<td>The next number from the source.</td>
<td><em>n</em>.</td>
</tr>
<tr>
<td>Uniform numbers in the interval [0, 1) known to be evenly spaced by a number <em>p</em> (for example, dSFMT).</td>
<td>The next number from the source, multiplied by <em>p</em>.</td>
<td>1/<em>p</em>.</td>
</tr>
<tr>
<td>Uniform numbers in the interval [0, 1), where numbers in [0, 0.5) and those in [0.5, 1) are known to occur with equal probability (for example, Java’s <code>Math.random()</code>).</td>
<td>0 if the source gives out a number less than 0.5, or 1 otherwise.</td>
<td>2.</td>
</tr>
</tbody>
</table>
<pre>METHOD RndIntHelperNonPowerOfTwo(maxInclusive)
if maxInclusive <= MODULUS - 1:
// NOTE: If the programming language implements
// division with two integers by discarding the result's
// fractional part, the division can be used as is without
// using a "floor" function.
nPlusOne = maxInclusive + 1
maxexc = floor((MODULUS - 1) / nPlusOne) * nPlusOne
while true // until a value is returned
ret = NEXTRAND()
if ret < nPlusOne: return ret
if ret < maxexc: return rem(ret, nPlusOne)
end
else
cx = floor(maxInclusive / MODULUS) + 1
while true // until a value is returned
ret = cx * NEXTRAND()
// NOTE: The addition operation below should
// check for integer overflow and should reject the
// number if overflow would result.
ret = ret + RNDINT(cx - 1)
if ret <= maxInclusive: return ret
end
end
END METHOD
METHOD RndIntHelperPowerOfTwo(maxInclusive)
// NOTE: Finds the number of bits minus 1 needed
// to represent MODULUS (in other words, the number
// of random bits returned by NEXTRAND() ). In practice,
// this will be a constant and can be calculated in advance.
modBits = ln(MODULUS)/ln(2)
// Lumbroso's Fast Dice Roller.
x = 1
y = 0
nextBit = modBits
rngv = 0
maxIncMinus1 = maxInclusive - 1
while true // until a value is returned
if nextBit >= modBits
nextBit = 0
rngv = NEXTRAND()
end
nextBit = nextBit + 1
// if modBits=1, this can read "bit=rngv"
bit = rem(rngv, 2)
x = x * 2
y = y * 2 + bit
// if modBits=1, the following line
// can be left out
rngv = floor(rngv / 2)
if x > maxInclusive
x = x - maxIncMinus1 + 2
if y <= maxInclusive: return y
y = y - maxIncMinus1 + 2
end
end
END METHOD
METHOD RNDINT(maxInclusive)
// maxInclusive must be 0 or greater
if maxInclusive < 0: return error
if maxInclusive == 0: return 0
if maxInclusive == MODULUS - 1: return NEXTRAND()
// NOTE: Finds the number of bits minus 1 needed
// to represent MODULUS (if it's a power of 2).
// This will be a constant here, though.
modBits=ln(MODULUS)/ln(2)
if floor(modBits) == modBits // Is an integer
return RndIntHelperPowerOfTwo(maxInclusive)
else
return RndIntHelperNonPowerOfTwo(maxInclusive)
end
END METHOD
</pre>
<p>The table below shows algorithms that have been proposed for choosing an integer uniformly at random. Some are exact and others are not, but in general, the algorithm can be exact only if it runs forever in the worst case. The algorithms listed take <code>n</code> as a parameter, where <code>n = maxInclusive + 1</code>, and thus sample from the interval [0, <code>n</code>). (The column “Exact?” means whether the algorithm generates random integers with the correct probability in theory, even if <code>n</code> is not a power of 2.)</p>
<table>
<thead>
<tr>
<th>Algorithm</th>
<th>Optimal?</th>
<th>Exact?</th>
<th>Time Complexity</th>
</tr>
</thead>
<tbody>
<tr>
<td><em>Rejection sampling</em>: Sample in a bigger range until a sampled number fits the smaller range.</td>
<td>Not always</td>
<td>Yes</td>
<td>Runs forever in worst case</td>
</tr>
<tr>
<td><em>Multiply-and-shift reduction</em>: Generate <code>bignumber</code>, an integer made of <code>k</code> bits, where <code>k</code> is much greater than <code>n</code> and each bit is set or clear with equal probability, then find <code>(bignumber * n) >> k</code> (see (Lemire 2016)<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">4</a></sup>, (Lemire 2018)<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">5</a></sup>, and the “Integer Multiplication” algorithm surveyed by M. O’Neill).</td>
<td>No</td>
<td>No</td>
<td>Constant</td>
</tr>
<tr>
<td><em>Modulo reduction</em>: Generate <code>bignumber</code> as above, then find <code>rem(bignumber, n)</code>.</td>
<td>No</td>
<td>No</td>
<td>Constant</td>
</tr>
<tr>
<td><em>Fast Dice Roller</em> (Lumbroso 2013)<sup id="fnref:6" role="doc-noteref"><a href="#fn:6" class="footnote" rel="footnote">6</a></sup> (see pseudocode above).</td>
<td>Yes</td>
<td>Yes</td>
<td>Runs forever in worst case</td>
</tr>
<tr>
<td>Algorithm FYKY (Bacher et al. 2017)<sup id="fnref:7" role="doc-noteref"><a href="#fn:7" class="footnote" rel="footnote">7</a></sup>. Effectively the same as replacing the lines “<code>if y <= maxInclusive: return y; y = y - maxIncMinus1</code>” in the pseudocode above with “<code>if y >= x: return y-x</code>”.</td>
<td>Yes</td>
<td>Yes</td>
<td>Runs forever in worst case</td>
</tr>
<tr>
<td>Math Forum (2004)<sup id="fnref:8" role="doc-noteref"><a href="#fn:8" class="footnote" rel="footnote">8</a></sup> or (Mennucci 2018)<sup id="fnref:9" role="doc-noteref"><a href="#fn:9" class="footnote" rel="footnote">9</a></sup> (batching/recycling random bits).</td>
<td>Yes</td>
<td>Yes</td>
<td>Runs forever in worst case</td>
</tr>
<tr>
<td>“FP Multiply” surveyed by <a href="https://www.pcg-random.org/posts/bounded-rands.html"><strong>M. O’Neill</strong></a>.</td>
<td>No</td>
<td>No</td>
<td>Constant</td>
</tr>
<tr>
<td>Algorithm in “Conclusion” section by O’Neill.</td>
<td>No</td>
<td>Yes</td>
<td>Runs forever in worst case</td>
</tr>
<tr>
<td>“Debiased” and “Bitmask with Rejection” surveyed by M. O’Neill.</td>
<td>No</td>
<td>Yes</td>
<td>Runs forever in worst case</td>
</tr>
</tbody>
</table>
<blockquote>
<p><strong>Notes:</strong></p>
<ol>
<li><strong><code>RNDINT</code> as a binary tree walker.</strong> Donald E. Knuth and Andrew C. Yao (1976)<sup id="fnref:10" role="doc-noteref"><a href="#fn:10" class="footnote" rel="footnote">10</a></sup> showed that any algorithm that generates random integers using “fair coin flips”, or random integers that each equal 1 or 0 with equal probability, can be described as a <em>binary tree</em> (also known as a <em>DDG tree</em> or <em>discrete distribution generating tree</em>). (This also applies to <code>RNDINT</code> algorithms.) “Fair coin flips” trace a path in this tree, and each leaf (terminal node) in the tree represents an outcome. In the case of <code>RNDINT(maxInclusive)</code>, there are <code>n = maxInclusive + 1</code> outcomes that each occur with probability <code>1/n</code>.<br />Knuth and Yao showed that any <em>optimal</em> DDG tree algorithm needs at least <code>log2(n)</code> and at most <code>log2(n) + 2</code> bits on average (where <code>log2(x) = ln(x)/ln(2)</code>).<sup id="fnref:11" role="doc-noteref"><a href="#fn:11" class="footnote" rel="footnote">11</a></sup> But as they also showed, for the algorithm to be exact, it must run forever in the worst case, even if it uses few random bits on average (that is, there is no way in general to “fix” this worst case while remaining exact). This is because <code>1/n</code> will have an infinite run of base-2 digits except when <code>n</code> is a power of 2, so that the resulting DDG tree will have to either be infinitely deep, or include “rejection leaves” at the end of the tree.<br />For instance, the <em>modulo reduction</em> method can be represented by a DDG tree in which rejection leaves are replaced with labeled outcomes, but the method is inexact because only some outcomes can replace rejection leaves this way. For the same reason, stopping the <em>rejection sampler</em> after a fixed number of tries likewise leads to some outcomes being slightly more probable than others. However, which outcomes are slightly more probable this way depends on the algorithm.</li>
<li><strong>Reducing “bit waste”.</strong> Any implementation of <code>RNDINT</code> needs at least <code>log2(n)</code> bits per chosen integer on average, as noted above, but most of them use many more. There are various ways to bring an algorithm closer to <code>log2(n)</code>. They include batching, bit recycling, and randomness extraction, and they are discussed, for example, in the Math Forum page and the Lumbroso and Mennucci papers referenced above, and in Devroye and Gravel (2020, section 2.3)<sup id="fnref:12" role="doc-noteref"><a href="#fn:12" class="footnote" rel="footnote">12</a></sup>. <em>Batching example</em>: To generate three digits from 0 through 9, we can call <code>RNDINT(999)</code> to generate an integer in [0, 999], then break the number it returns into three digits.</li>
<li><strong>Simulating dice.</strong> If we have a (virtual) fair <em>p</em>-sided die, how can we use it to simulate rolls of a <em>k</em>-sided die? This can’t be done without “wasting” randomness, unless “every prime number dividing <em>k</em> also divides <em>p</em>” (see “<a href="https://perso.math.u-pem.fr/kloeckner.benoit/papiers/DiceSimulation.pdf"><strong>Simulating a dice with a dice</strong></a>” by B. Kloeckner, 2008). However, <em>randomness extraction</em> (see my <a href="https://peteroupc.github.io/randextract.html"><strong>Note on Randomness Extraction</strong></a>) can turn die rolls into “fair coin flips” (see note 1 above), so that the discussion earlier in this section applies.</li>
</ol>
</blockquote>
<p><a id="RNDINTRANGE_Random_Integers_in_N_M"></a></p>
<h3 id="rndintrange-random-integers-in-n-m"><code>RNDINTRANGE</code>: Random Integers in [N, M]</h3>
<p>The naïve way of generating a <strong>random integer in the interval [<code>minInclusive</code>, <code>maxInclusive</code>]</strong>, shown next, works well for nonnegative integers and arbitrary-precision integers.</p>
<pre> METHOD RNDINTRANGE(minInclusive, maxInclusive)
// minInclusive must not be greater than maxInclusive
if minInclusive > maxInclusive: return error
return minInclusive + RNDINT(maxInclusive - minInclusive)
END METHOD
</pre>
<p>The naïve approach won’t work as well, though, if the integer format can express negative and nonnegative integers and the difference between <code>maxInclusive</code> and <code>minInclusive</code> exceeds the highest possible integer for the format. For integer formats that can express—</p>
<ol>
<li>every integer in the interval [-1 - <code>MAXINT</code>, <code>MAXINT</code>] (for example, Java <code>int</code>, <code>short</code>, or <code>long</code>), or</li>
<li>every integer in the interval [-<code>MAXINT</code>, <code>MAXINT</code>] (for example, Java <code>float</code> and <code>double</code> and .NET’s implementation of <code>System.Decimal</code>),</li>
</ol>
<p>where <code>MAXINT</code> is an integer greater than 0, the following pseudocode for <code>RNDINTRANGE</code> can be used.</p>
<pre>METHOD RNDINTRANGE(minInclusive, maxInclusive)
// minInclusive must not be greater than maxInclusive
if minInclusive > maxInclusive: return error
if minInclusive == maxInclusive: return minInclusive
if minInclusive==0: return RNDINT(maxInclusive)
// Difference does not exceed maxInclusive
if minInclusive > 0 or minInclusive + MAXINT >= maxInclusive
return minInclusive + RNDINT(maxInclusive - minInclusive)
end
while true // until a value is returned
ret = RNDINT(MAXINT)
// NOTE: For case 1, use the following line:
if RNDINT(1) == 0: ret = -1 - ret
// NOTE: For case 2, use the following three lines
// instead of the preceding line; these lines
// avoid negative zero
// negative = RNDINT(1) == 0
// if negative: ret = 0 - ret
// if negative and ret == 0: continue
if ret >= minInclusive and ret <= maxInclusive: return ret
end
END METHOD
</pre>
<p><a id="RNDINTEXC_Random_Integers_in_0_N"></a></p>
<h3 id="rndintexc-random-integers-in-0-n"><code>RNDINTEXC</code>: Random Integers in [0, N)</h3>
<p><code>RNDINTEXC(maxExclusive)</code>, which generates a <strong>random integer in the interval</strong> <strong>[0, <code>maxExclusive</code>)</strong>, can be implemented as follows<sup id="fnref:13" role="doc-noteref"><a href="#fn:13" class="footnote" rel="footnote">13</a></sup>:</p>
<pre> METHOD RNDINTEXC(maxExclusive)
if maxExclusive <= 0: return error
return RNDINT(maxExclusive - 1)
END METHOD
</pre>
<blockquote>
<p><strong>Note:</strong> <code>RNDINTEXC</code> is not given as the core random generation method because it’s harder to fill integers in popular integer formats with random bits with this method.</p>
</blockquote>
<p><a id="RNDINTEXCRANGE_Random_Integers_in_N_M"></a></p>
<h3 id="rndintexcrange-random-integers-in-n-m"><code>RNDINTEXCRANGE</code>: Random Integers in [N, M)</h3>
<p><strong><code>RNDINTEXCRANGE</code></strong> returns a <strong>random integer in the interval</strong> <strong>[<code>minInclusive</code>, <code>maxExclusive</code>)</strong>. It can be implemented using <a href="#RNDINTRANGE_Random_Integers_in_N_M"><strong><code>RNDINTRANGE</code></strong></a>, as the following pseudocode demonstrates.</p>
<pre>METHOD RNDINTEXCRANGE(minInclusive, maxExclusive)
if minInclusive >= maxExclusive: return error
if minInclusive >=0: return RNDINTRANGE(
minInclusive, maxExclusive - 1)
while true // until a value is returned
ret = RNDINTRANGE(minInclusive, maxExclusive)
if ret < maxExclusive: return ret
end
END METHOD
</pre>
<p><a id="Uniform_Random_Bits"></a></p>
<h3 id="uniform-random-bits">Uniform Random Bits</h3>
<p>The idiom <code>RNDINT((1 << b) - 1)</code> is a naïve way of generating a <strong>uniform random <code>b</code>-bit integer</strong> (with maximum 2<sup><code>b</code></sup> - 1).</p>
<p>In practice, memory is usually divided into <em>bytes</em>, or 8-bit integers in the interval [0, 255]. In this case, a block of memory can be filled with random bits—</p>
<ul>
<li>by setting each byte in the block to <code>RNDINT(255)</code>, or</li>
<li>via a PRNG (or another device or program that simulates a “source of random numbers”), if it outputs one or more 8-bit chunks at a time.</li>
</ul>
<p><a id="Examples_of_Using_the_RNDINT_Family"></a></p>
<h3 id="examples-of-using-the-rndint-family">Examples of Using the <code>RNDINT</code> Family</h3>
<ol>
<li>To choose either −1 or 1 with equal probability (the <em>Rademacher distribution</em>), one of the following idioms can be used: <code>(RNDINT(1) * 2 - 1)</code> or <code>(RNDINTEXC(2) * 2 - 1)</code>.</li>
<li>To generate a random integer that’s divisible by a positive integer (<code>DIV</code>), generate the integer with any method (such as <code>RNDINT</code>), let <code>X</code> be that integer, then generate <code>X - rem(X, DIV)</code> if <code>X >= 0</code>, or <code>X - (DIV - rem(abs(X), DIV))</code> otherwise. (Depending on the method, the resulting integer may be out of range, in which case this procedure is to be repeated.)</li>
<li>A random 2-dimensional point on an N×M grid can be expressed as a single integer as follows:
<ul>
<li>To generate a random N×M point <code>P</code>, generate <code>P = RNDINT(N * M - 1)</code> (<code>P</code> is thus in the interval [0, <code>N * M</code>)).</li>
<li>To convert a point <code>P</code> to its 2D coordinates, generate <code>[rem(P, N), floor(P / N)]</code>. (Each coordinate starts at 0.)</li>
<li>To convert 2D coordinates <code>coord</code> to an N×M point, generate <code>P = coord[1] * N + coord[0]</code>.</li>
</ul>
</li>
<li>To simulate rolling an N-sided die (N greater than 1): <code>RNDINTRANGE(1, N)</code>, which chooses a number in the interval [1, N] with equal probability.</li>
<li>To generate a random integer with one base-10 digit: <code>RNDINTRANGE(0, 9)</code>.</li>
<li>To generate a random integer with N base-10 digits (where N is 2 or greater), where the first digit can’t be 0: <code>RNDINTRANGE(pow(10, N-1), pow(10, N) - 1)</code>.</li>
<li>To choose a number in the interval [<code>mn</code>, <code>mx</code>), with equal probability, in increments equal to <code>step</code>: <code>mn+step*RNDINTEXC(ceil((mx-mn)/(1.0*step)))</code>.</li>
<li>To choose an integer in the interval [0, <code>X</code>) at random:
<ul>
<li>And favor numbers in the middle: <code>floor((RNDINTEXC(X) + RNDINTEXC(X)) / 2)</code>.</li>
<li>And favor high numbers: <code>max(RNDINTEXC(X), RNDINTEXC(X))</code>.</li>
<li>And favor low numbers: <code>min(RNDINTEXC(X), RNDINTEXC(X))</code>.</li>
<li>And strongly favor high numbers: <code>max(RNDINTEXC(X), RNDINTEXC(X), RNDINTEXC(X))</code>.</li>
<li>And strongly favor low numbers: <code>min(RNDINTEXC(X), RNDINTEXC(X), RNDINTEXC(X))</code>.</li>
</ul>
</li>
</ol>
<p><a id="Randomization_Techniques"></a></p>
<h2 id="randomization-techniques">Randomization Techniques</h2>
<p>This section describes commonly used randomization techniques, such as shuffling, selection of several unique items, and creating random strings of text.</p>
<p><a id="Boolean_True_False_Conditions"></a></p>
<h3 id="boolean-truefalse-conditions">Boolean (True/False) Conditions</h3>
<p>To generate a condition that is true at the specified probabilities, use
the following idioms in an <code>if</code> condition:</p>
<ul>
<li>True or false with equal probability: <code>RNDINT(1) == 0</code>.</li>
<li>True with X percent probability: <code>RNDINTEXC(100) < X</code>.</li>
<li>True with probability X/Y (a <em>Bernoulli trial</em>): <code>RNDINTEXC(Y) < X</code>.</li>
<li>True with odds of X to Y: <code>RNDINTEXC(X + Y) < X</code>.</li>
</ul>
<p>The following helper method generates 1 with probability <code>x</code>/<code>y</code> and 0 otherwise:</p>
<pre>METHOD ZeroOrOne(x,y)
if RNDINTEXC(y)<x: return 1
return 0
END METHOD
</pre>
<p>The method can also be implemented in the following way (as pointed out by Lumbroso (2013, Appendix B)<sup id="fnref:6:1" role="doc-noteref"><a href="#fn:6" class="footnote" rel="footnote">6</a></sup>):</p>
<pre>// NOTE: Modified from Lumbroso
// Appendix B to add 'z==0' and error checks
METHOD ZeroOrOne(x,y)
if y <= 0: return error
if x==y: return 1
z = x
while true // until a value is returned
z = z * 2
if z >= y
if RNDINT(1) == 0: return 1
z = z - y
else if z == 0 or RNDINT(1) == 0: return 0
end
END METHOD
</pre>
<blockquote>
<p><strong>Note:</strong> Probabilities can be rational or irrational numbers. Rational probabilities are of the form <code>n</code>/<code>d</code> and can be simulated with <code>ZeroOrOne</code> above. Irrational probabilities (such as <code>exp(-x/y)</code> or <code>ln(2)</code>) have an infinite digit expansion (<code>0.ddddd....</code>), and they require special algorithms to simulate; see “<a href="https://peteroupc.github.io/bernoulli.html#Algorithms_for_General_Irrational_Constants"><strong>Algorithms for General Irrational Constants</strong></a>” and “<a href="https://peteroupc.github.io/bernoulli.html#Algorithms_for_Specific_Constants"><strong>Algorithms for Specific Constants</strong></a>” in my page on Bernoulli Factory algorithms.</p>
<p><strong>Examples</strong>:</p>
<ul>
<li>True with probability 3/8: <code>RNDINTEXC(8) < 3</code>.</li>
<li>True with odds of 100 to 1: <code>RNDINTEXC(101) < 1</code>.</li>
<li>True with 20% probability: <code>RNDINTEXC(100) < 20</code>.</li>
<li>To generate a random integer in [0, <code>y</code>), or -1 instead if that number would be less than <code>x</code>, using fewer random bits than the naïve approach: <code>if ZeroOrOne(x, y) == 1: return -1; else: return RNDINTEXCRANGE(x, y)</code>.</li>
</ul>
</blockquote>
<p><a id="Random_Sampling"></a></p>
<h3 id="random-sampling">Random Sampling</h3>
<p>This section contains ways to choose one or more items from among a collection of them, where each item in the collection has the same chance to be chosen as any other. This is called <em>random sampling</em> and can be done <em>with replacement</em> or <em>without replacement</em>.</p>
<p><a id="Sampling_With_Replacement_Choosing_a_Random_Item_from_a_List"></a></p>
<h4 id="sampling-with-replacement-choosing-a-random-item-from-a-list">Sampling With Replacement: Choosing a Random Item from a List</h4>
<p><em>Sampling with replacement</em> essentially means taking a random item and putting it back. To choose a random item from a list—</p>
<ul>
<li>whose size is known in advance, use the idiom <code>list[RNDINTEXC(size(list))]</code>; or</li>
<li>whose size is not known in advance, generate <code>RandomKItemsFromFile(file, 1)</code>, in <a href="#Pseudocode_for_Random_Sampling"><strong>pseudocode given later</strong></a> (the result will be a 1-item list or be an empty list if there are no items).</li>
</ul>
<p><a id="Sampling_Without_Replacement_Choosing_Several_Unique_Items"></a></p>
<h4 id="sampling-without-replacement-choosing-several-unique-items">Sampling Without Replacement: Choosing Several Unique Items</h4>
<p><em>Sampling without replacement</em> essentially means taking a random item <em>without</em> putting it back. There are several approaches for doing a uniform random choice of <code>k</code> unique items or values from among <code>n</code> available items or values, depending on such things as whether <code>n</code> is known and how big <code>n</code> and <code>k</code> are.</p>
<ol>
<li><strong>If <code>n</code> is not known in advance:</strong> Use the <em>reservoir sampling</em> method; see the <code>RandomKItemsFromFile</code> method, in <a href="#Pseudocode_for_Random_Sampling"><strong>pseudocode given later</strong></a>.</li>
<li><strong>If <code>n</code> is relatively small (for example, if there are 200 available items, or there is a range of numbers from 0 through 200 to choose from):</strong>
<ul>
<li>If items have to be chosen from a list <strong>in relative (index) order</strong>, or if <code>n</code> is 1, then use <code>RandomKItemsInOrder</code> (given later).</li>
<li>Otherwise, if the order of the sampled items is unimportant, and each item can be derived from its <em>index</em> (the item’s position as an integer starting at 0) without looking it up in a list: Use the <code>RandomKItemsFromFile</code> method.<sup id="fnref:14" role="doc-noteref"><a href="#fn:14" class="footnote" rel="footnote">14</a></sup></li>
<li>Otherwise, if <code>k</code> is much smaller than <code>n</code>, proceed as in item 3 instead.</li>
<li>Otherwise, any of the following will choose <code>k</code> items in random order:
<ul>
<li>Store all the items in a list, <a href="#Shuffling"><strong>shuffle</strong></a> that list, then choose the first <code>k</code> items from that list.</li>
<li>If the items are already stored in a list and the list’s order can be changed, then shuffle that list and choose the first <code>k</code> items from the shuffled list.</li>
<li>If the items are already stored in a list and the list’s order can’t be changed, then store the indices to those items in another list, shuffle the latter list, then choose the first <code>k</code> indices (or the items corresponding to those indices) from the latter list.</li>
</ul>
</li>
</ul>
</li>
<li><strong>If <code>k</code> is much smaller than <code>n</code> and the order of the items must be random or is unimportant:</strong>
<ol>
<li><strong>If the items are stored in a list whose order can be changed:</strong> Do a <em>partial shuffle</em> of that list, then choose the <em>last</em> <code>k</code> items from that list. A <em>partial shuffle</em> proceeds as given in the section “<a href="#Shuffling"><strong>Shuffling</strong></a>”, except the partial shuffle stops after <code>k</code> swaps have been made (where swapping one item with itself counts as a swap).</li>
<li><strong>Otherwise</strong>: Create another empty list <code>newlist</code>, and create a key/value data structure such as a hash table. Then, for each integer <code>i</code> in the interval [0, <em>k</em>−1], do <code>j = RNDINTEXC(n-i); AddItem(newlist, HGET(j,j)); HSET(j,HGET(n-i-1,n-i-1))</code>, where <code>HSET(k,v)</code> sets the item with key <code>k</code> in the hash table to <code>v</code>, and <code>HGET(k,v)</code> gets the item with key <code>k</code> in that table, or returns <code>v</code> if there is no such item (Ting 2021)<sup id="fnref:15" role="doc-noteref"><a href="#fn:15" class="footnote" rel="footnote">15</a></sup>. The new list stores the indices to the chosen items, in random order.</li>
</ol>
</li>
<li><strong>If <code>n - k</code> is much smaller than <code>n</code>, the items are stored in a list, and the order of the sampled items is unimportant:</strong>
<ol>
<li><strong>If the list’s order can be changed:</strong> Do a <em>partial shuffle</em> of that list, except that <code>n-k</code> rather than <code>k</code> swaps are done, then choose the <em>first</em> <code>k</code> items from that list. (Note 5 in “Shuffling” can’t be used.)</li>
<li>Otherwise, <strong>if <code>n</code> is not very large (for example, less than 5000):</strong> Store the indices to those items in another list, do a <em>partial shuffle</em> of the latter list, except that <code>n-k</code> rather than <code>k</code> swaps are done, then choose the <em>first</em> <code>k</code> indices (or the items corresponding to those indices) from the latter list. (Note 5 in “Shuffling” can’t be used.)</li>
<li><strong>Otherwise</strong>: Proceed as in item 5 instead.</li>
</ol>
</li>
<li><strong>Otherwise (for example, if 32-bit or larger integers will be chosen so that <code>n</code> is 2<sup>32</sup>, or if <code>n</code> is otherwise very large):</strong>
<ul>
<li>If the items have to be chosen <strong>in relative (index) order</strong>: Let <code>n2 = floor(n/2)</code>. Generate <code>h = PolyaEggenberger(k, n2, n, -1)</code>. Sample <code>h</code> integers in relative order from the list <code>[0, 1, ..., n2 - 1]</code> by doing a recursive run of this algorithm (items 1 to 5), then sample <code>k - h</code> integers in relative order from the list <code>[n2, n2 + 1, ..., n - 1]</code> by running this algorithm recursively. The integers chosen this way are the indices to the desired items in relative (index) order (Sanders et al. 2019)<sup id="fnref:16" role="doc-noteref"><a href="#fn:16" class="footnote" rel="footnote">16</a></sup>.</li>
<li>Otherwise, create a data structure to store the indices to items already chosen. When a new index to an item is randomly chosen, add it to the data structure if it’s not already there, or if it is, choose a new random index. Repeat this process until <code>k</code> indices were added to the data structure this way. Examples of suitable data structures are—
<ul>
<li>a <a href="https://en.wikipedia.org/wiki/Hash_table"><strong>hash table</strong></a>,</li>
<li>a compressed bit set (e.g, “roaring bitmap”, EWAH), and</li>
<li>a self-sorting data structure such as a <a href="https://en.wikipedia.org/wiki/Red%E2%80%93black_tree"><strong>red–black tree</strong></a>, if the random items are to be retrieved in sorted order or in index order.</li>
</ul>
<p>Many applications require generating “unique random” values to identify database records or other shared resources, among other reasons. For ways to generate such values, see my <a href="https://peteroupc.github.io/random.html#Unique_Random_Identifiers"><strong>recommendation document</strong></a>.</p>
</li>
</ul>
</li>
</ol>
<p><a id="Shuffling"></a></p>
<h4 id="shuffling">Shuffling</h4>
<p>The <a href="https://en.wikipedia.org/wiki/Fisher-Yates_shuffle"><strong>Fisher–Yates shuffle method</strong></a> shuffles a list (puts its items in a random order) such that all permutations (arrangements) of that list occur with the same probability. However, that method is also easy to write incorrectly — see also (Atwood 2007)<sup id="fnref:17" role="doc-noteref"><a href="#fn:17" class="footnote" rel="footnote">17</a></sup>. The following pseudocode is designed to shuffle a list’s contents.</p>
<pre>METHOD Shuffle(list)
// NOTE: Check size of the list early to prevent
// `i` from being less than 0 if the list's size is 0 and
// `i` is implemented using a nonnegative integer
// type available in certain programming languages.
if size(list) >= 2
// Set i to the last item's index
i = size(list) - 1
while i > 0
// Choose an item ranging from the first item
// up to the item given in `i`. Observe that the item
// at i+1 is excluded.
j = RNDINTEXC(i + 1)
// Swap item at index i with item at index j;
// in this case, i and j may be the same
tmp = list[i]
list[i] = list[k]
list[k] = tmp
// Move i so it points to the previous item
i = i - 1
end
end
// NOTE: An implementation can return the
// shuffled list, as is done here, but this is not required.
return list
END METHOD
</pre>
<blockquote>
<p><strong>Notes:</strong></p>
<ol>
<li><code>j = RNDINTEXC(i + 1)</code> can’t be replaced with <code>j = RNDINTEXC(size(list))</code> since it can cause certain permutations to be slightly more probable than others. If that line is replaced with <code>j = RNDINTEXC(i)</code>, the result is Sattolo’s algorithm (which chooses from among permutations with cycles), rather than a Fisher–Yates shuffle.</li>
<li>When it comes to shuffling, the choice of pseudorandom number generator (or whatever is simulating a “source of random numbers”) is important; see my <a href="https://peteroupc.github.io/random.html#Shuffling"><strong>recommendation document on shuffling</strong></a>.</li>
<li>A shuffling algorithm that can be carried out in parallel is described in (Bacher et al., 2015)<sup id="fnref:18" role="doc-noteref"><a href="#fn:18" class="footnote" rel="footnote">18</a></sup>.</li>
<li>A <em>derangement</em> is a permutation where every item moves to a different position. A random derangement can be generated as follows (Merlini et al. 2007)<sup id="fnref:19" role="doc-noteref"><a href="#fn:19" class="footnote" rel="footnote">19</a></sup>: (1) modify <code>Shuffle</code> by adding the following line after <code>j = RNDINTEXC(i + 1)</code>: <code>if i == list[j]: return nothing</code>, and changing <code>while i > 0</code> to <code>while i >= 0</code>; (2) use the following pseudocode with the modified <code>Shuffle</code> method: <code>while True; list = []; for i in 0...n: AddItem(list, i); s=Shuffle(list); if s!=nothing: return s; end</code>.</li>
<li>Ting (2021)<sup id="fnref:15:1" role="doc-noteref"><a href="#fn:15" class="footnote" rel="footnote">15</a></sup> showed how to reduce the space complexity of shuffling via a <em>hash table</em>: Modify <code>Shuffle</code> as follows: (1) Create a hash table at the start of the method; (2) instead of the swap <code>tmp = list[i]; list[i] = list[j]; list[j] = tmp</code>, use the following: <code>list[i] = HGET(j,j); HSET(k,HGET(i,i)); if k==i: HDEL(i)</code>, where <code>HSET(k,v)</code> sets the item with key <code>k</code> in the hash table to <code>v</code>; <code>HGET(k,v)</code> gets the item with key <code>k</code> in that table, or returns <code>v</code> if there is no such item; and <code>HDEL(k)</code> deletes the item with key <code>k</code> from that table. (The hash table can instead be any key/value map structure, including a red–black tree. This can be combined with note 4 to generate derangements, except replace <code>list[j]</code> in note 4 with <code>HGET(j,j)</code>.)</li>
</ol>
</blockquote>
<p><a id="Random_Character_Strings"></a></p>
<h4 id="random-character-strings">Random Character Strings</h4>
<p>To generate a random string of characters:</p>
<ol>
<li>Prepare a list of the characters (including letters or digits) the string can have. Examples are given later in this section.</li>
<li>Build a new string whose characters are chosen at random from that character list. The method, shown in the pseudocode below, demonstrates this. The method samples characters at random with replacement, and returns a list of the sampled characters. (How to convert this list to a text string depends on the programming language and is outside the scope of this page.) The method takes two parameters: <code>characterList</code> is the list from step 1, and <code>stringSize</code> is the number of random characters.</li>
</ol>
<p> </p>
<pre>METHOD RandomString(characterList, stringSize)
i = 0
newString = NewList()
while i < stringSize
// Choose a character from the list
randomChar = characterList[RNDINTEXC(size(characterList))]
// Add the character to the string
AddItem(newString, randomChar)
i = i + 1
end
return newString
END METHOD
</pre>
<p>The following are examples of character lists:</p>
<ol>
<li>For an <em>alphanumeric string</em>, or string of letters and digits, the characters can be the basic digits “0” to “9” (U+0030-U+0039, nos. 48-57), the basic uppercase letters “A” to “Z” (U+0041-U+005A, nos. 65-90), and the basic lowercase letters “a” to “z” (U+0061-U+007A, nos. 96-122), as given in the Unicode Standard.</li>
<li>For a base-10 digit string, the characters can be the basic digits only.</li>
<li>For a base-16 digit (hexadecimal) string, the characters can be the basic digits as well as the basic letters “A” to “F” or “a” to “f” (not both).</li>
</ol>
<blockquote>
<p><strong>Notes:</strong></p>
<ol>
<li>If the list of characters is fixed, the list can be created in advance at runtime or compile time, or (if every character takes up the same number of code units) a string type as provided in the programming language can be used to store the list as a string.</li>
<li><strong>Unique random strings:</strong> Generating character strings that are not only random, but also unique, can be done by storing a list (such as a hash table) of strings already generated and checking newly generated strings against that list. However, if the unique values will identify something, such as database records or user accounts, an application may care about the choice of PRNG (or other device or program that simulates a “source of random numbers”), so that using <em>random</em> unique values might not be best; see my <a href="https://peteroupc.github.io/random.html#Unique_Random_Identifiers"><strong>recommendation document</strong></a>.</li>
<li><strong>Word generation:</strong> This technique could also be used to generate “pronounceable” words, but this is less flexible than other approaches; see also “<a href="#Markov_Chains"><strong>Markov Chains</strong></a>”.</li>
</ol>
</blockquote>
<p><a id="Pseudocode_for_Random_Sampling"></a></p>
<h4 id="pseudocode-for-random-sampling">Pseudocode for Random Sampling</h4>
<p>The following pseudocode implements two methods:</p>
<ol>
<li><code>RandomKItemsFromFile</code> implements <a href="https://en.wikipedia.org/wiki/Reservoir_sampling"><strong><em>reservoir sampling</em></strong></a>; it chooses up to <code>k</code> random items from a file of indefinite size (<code>file</code>). Although the pseudocode refers to files and lines, the technique works whenever items are retrieved one at a time from a data set or list whose size is not known in advance. In the pseudocode, <code>ITEM_OUTPUT(item, thisIndex)</code> is a placeholder for code that returns the item to store in the list; this can include the item’s value, its index starting at 0, or both.</li>
<li><code>RandomKItemsInOrder</code> returns a list of up to <code>k</code> random items from the specified list (<code>list</code>), in the order in which they appeared in the list. It is based on a technique presented in (Devroye 1986)<sup id="fnref:20" role="doc-noteref"><a href="#fn:20" class="footnote" rel="footnote">20</a></sup>, p. 620.</li>
</ol>
<p> </p>
<pre>METHOD RandomKItemsFromFile(file, k)
list = NewList()
j = 0
index = 0
while true // until a value is returned
// Get the next line from the file
item = GetNextLine(file)
thisIndex = index
index = index + 1
// If the end of the file was reached, break
if item == nothing: break
// NOTE: The following line is OPTIONAL
// and can be used to choose only random lines
// in the file that meet certain criteria,
// expressed as MEETS_CRITERIA below.
// ------
// if not MEETS_CRITERIA(item): continue
// ------
if j < k // phase 1 (fewer than k items)
AddItem(list, ITEM_OUTPUT(item, thisIndex))
j = j + 1
else // phase 2
j = RNDINT(thisIndex)
if j < k: list[j] = ITEM_OUTPUT(item, thisIndex)
end
end
// NOTE: Shuffle at the end in case k or
// fewer lines were in the file, since in that
// case the items would appear in the same
// order as they appeared in the file
// if the list weren't shuffled. This line
// can be removed, however, if the order of
// the items in the list is unimportant.
if size(list)>=2: Shuffle(list)
return list
end
METHOD RandomKItemsInOrder(list, k)
n = size(list)
// Special case if k is 1
if k==1: return [list[RNDINTEXC(n)]]
i = 0
kk = k
ret = NewList()
while i < n and size(ret) < k
u = RNDINTEXC(n - i)
if u <= kk
AddItem(ret, list[i])
kk = kk - 1
end
i = i + 1
end
return ret
END METHOD
</pre>
<blockquote>
<p><strong>Examples:</strong></p>
<ol>
<li>Removing <code>k</code> random items from a list of <code>n</code> items (<code>list</code>) is equivalent to generating a new
list by <code>RandomKItemsInOrder(list, n - k)</code>.</li>
<li><strong>Filtering:</strong> If an application needs to sample the same list (with or without replacement) repeatedly, but only from among a selection of that list’s items, it can create a list of items it wants to sample from (or a list of indices to those items), and sample from the new list instead.<sup id="fnref:21" role="doc-noteref"><a href="#fn:21" class="footnote" rel="footnote">21</a></sup> This won’t work well, though, for lists of indefinite or very large size.</li>
</ol>
</blockquote>
<p><a id="Rejection_Sampling"></a></p>
<h3 id="rejection-sampling">Rejection Sampling</h3>
<p><em>Rejection sampling</em> is a simple and flexible approach for generating random content that meets certain requirements. To implement rejection sampling:</p>
<ol>
<li>Generate the random content (such as a number or text string) by any method and with any distribution and range.</li>
<li>If the content doesn’t meet predetermined criteria, go to step 1.</li>
</ol>
<p>Example criteria include checking—
- whether a number generated this way—
- is not less than a minimum threshold (<em>left-truncation</em>),
- is not greater than a maximum threshold (<em>right-truncation</em>),
- is prime,
- is divisible or not by certain numbers,
- is not among numbers chosen recently by the sampling method,
- was not already chosen (with the aid of a hash table, red–black tree, or similar structure),
- was not chosen more often in a row than desired, or
- is not included in a “blacklist” of numbers,
- whether a point generated this way is sufficiently distant from previous random points (with the aid of a KD-tree or similar structure),
- whether a point generated this way lies in a simple or complex shape,
- whether a text string generated this way matches a regular expression, or
- two or more of the foregoing criteria.</p>
<p>(KD-trees, hash tables, red–black trees, prime-number testing algorithms, and regular expressions are outside the scope of this document.)</p>
<blockquote>
<p><strong>Notes:</strong></p>
<ol>
<li>The running time for rejection sampling depends on the acceptance rate, that is, how often the sampler accepts a sampled outcome rather than rejecting it. In general, this rate is the number of acceptable outcomes divided by the total number of outcomes.</li>
<li>All rejection sampling strategies have a chance to reject data, so they all have a <em>variable running time</em> (in fact, they could run indefinitely). But graphics processing units (GPUs) and other devices that run multiple tasks at once work better if all the tasks finish their work at the same time. This is not possible if they all implement a rejection sampling technique because of its variable running time. If each iteration of the rejection sampler has a low rejection rate, one solution is to have each task run one iteration of the sampler, with its own “source of random numbers” (such as numbers generated from its own PRNG), then to take the first sampled number that hasn’t been rejected this way by a task (which can fail at a very low rate).<sup id="fnref:22" role="doc-noteref"><a href="#fn:22" class="footnote" rel="footnote">22</a></sup></li>
</ol>
</blockquote>
<p><a id="Random_Walks"></a></p>
<h3 id="random-walks">Random Walks</h3>
<p>A <em>random walk</em> is a process with random behavior over time. A simple form of random walk involves choosing, at random, a number that changes the state of the walk. The pseudocode below generates a random walk with <em>n</em> steps, where <code>STATEJUMP()</code> is the next number to add to the current state (see examples later in this section).</p>
<pre>METHOD RandomWalk(n)
// Create a new list with an initial state
list=[0]
// Add 'n' new numbers to the list.
for i in 0...n: AddItem(list, list[i] + STATEJUMP())
return list
END METHOD
</pre>
<blockquote>
<p><strong>Notes:</strong></p>
<ol>
<li>A <strong>white noise process</strong> is simulated by creating a list of independent random variates generated in the same way. Such a process generally models behavior over time that does not depend on the time or the current state. One example is <code>ZeroOrOne(px,py)</code> (for modeling a <em>Bernoulli process</em>, where each number is either 1 with probability <code>px</code>/<code>py</code> or 0 otherwise).</li>
<li>A useful reference here is De Bruyne et al. (2021)<sup id="fnref:23" role="doc-noteref"><a href="#fn:23" class="footnote" rel="footnote">23</a></sup>.</li>
</ol>
</blockquote>
<blockquote>
<p><strong>Examples:</strong></p>
<ol>
<li>If <code>STATEJUMP()</code> is <code>RNDINT(1) * 2 - 1</code>, the random walk generates numbers that each differ from the last by -1 or 1, chosen at random.</li>
<li>If <code>STATEJUMP()</code> is <code>ZeroOrOne(px,py) * 2 - 1</code>, the random walk generates numbers that each differ from the last by either 1 with probability <code>px</code>/<code>py</code> or −1 otherwise.</li>
<li><strong>Binomial process:</strong> If <code>STATEJUMP()</code> is <code>ZeroOrOne(px,py)</code>, the random walk advances the state with probability <code>px</code>/<code>py</code>.</li>
</ol>
</blockquote>
<p><a id="Random_Dates_and_Times"></a></p>
<h3 id="random-dates-and-times">Random Dates and Times</h3>
<p>Pseudocode like the following can be used to choose a <strong>random date-and-time</strong> bounded by two dates-and-times (<code>date1</code>, <code>date2</code>): <code>dtnum1 = DATETIME_TO_NUMBER(date1); dtnum2 = DATETIME_TO_NUMBER(date2); num = RNDINTRANGE(date1, date2); result = NUMBER_TO_DATETIME(num)</code>.</p>
<p>In that pseudocode, <code>DATETIME_TO_NUMBER</code> and <code>NUMBER_TO_DATETIME</code> convert a date-and-time to or from a number, respectively, at the required granularity, for instance, month, day, or hour granularity (the details of such conversion depend on the date-and-time format and are outside the scope of this document). Instead of <code>RNDINTRANGE(date1, date2)</code>, any other random selection strategy can be used.</p>
<p><a id="Randomization_in_Statistical_Testing"></a></p>
<h3 id="randomization-in-statistical-testing">Randomization in Statistical Testing</h3>
<p>Statistical testing uses shuffling and <em>bootstrapping</em> to help draw conclusions on data through randomization.</p>
<ul>
<li><a href="#Shuffling"><strong>Shuffling</strong></a> is used when each item in a data set belongs to one of several mutually exclusive groups. Here, one or more <strong>simulated data sets</strong> are generated by shuffling the original data set and regrouping each item in the shuffled data set in order, such that the number of items in each group for the simulated data set is the same as for the original data set.</li>
<li><a href="https://en.wikipedia.org/wiki/Bootstrapping_%28statistics%29"><strong><em>Bootstrapping</em></strong></a> is a method of creating one or more random samples (simulated data sets) of an existing data set, where the items in each sample are chosen <a href="#Sampling_With_Replacement_Choosing_a_Random_Item_from_a_List"><strong>at random with replacement</strong></a>. (Each random sample can contain duplicates this way.) See also (Brownlee 2018)<sup id="fnref:24" role="doc-noteref"><a href="#fn:24" class="footnote" rel="footnote">24</a></sup>.</li>
</ul>
<p>After creating the simulated data sets, one or more statistics, such as the mean, are calculated for each simulated data set as well as the original data set, then the statistics for the simulated data sets are compared with those of the original (such comparisons are outside the scope of this document).</p>
<p><a id="Markov_Chains"></a></p>
<h3 id="markov-chains">Markov Chains</h3>
<p>A <a href="https://en.wikipedia.org/wiki/Markov_chain"><strong>Markov chain</strong></a> models one or more <em>states</em> (for example, individual letters or syllables), and stores the probabilities to transition from one state to another (for example, “b” to “e” with a chance of 20 percent, or “b” to “b” with a chance of 1 percent). Thus, each state can be seen as having its own list of <em>weights</em> for each relevant state transition (see “<a href="#Weighted_Choice_With_Replacement"><strong>Weighted Choice With Replacement</strong></a>). For example, a Markov chain for generating <strong>“pronounceable” words</strong>, or words similar to natural-language words, can include “start” and “stop” states for the start and end of the word, respectively.</p>
<p>An algorithm called <em>coupling from the past</em> (Propp and Wilson 1996)<sup id="fnref:25" role="doc-noteref"><a href="#fn:25" class="footnote" rel="footnote">25</a></sup> can sample a state from a Markov chain’s <em>stationary distribution</em>, that is, the chain’s steady state, by starting multiple chains at different states and running them until they all reach the same state at the same time. However, stopping the algorithm early can lead to an inexact algorithm unless precautions are taken (Fill 1998)<sup id="fnref:26" role="doc-noteref"><a href="#fn:26" class="footnote" rel="footnote">26</a></sup>. The algorithm works correctly if the chain has a finite number of states and is not periodic, and each state is reachable from every other.</p>
<p>The following pseudocode implements coupling from the past. In the method, <code>StateCount</code> is the number of states in the Markov chain, <code>UPDATE(chain, state, random)</code> transitions the Markov chain to the next state given the current state and random variates, and <code>RANDOM()</code> generates one or more random variates needed by <code>UPDATE</code>.</p>
<pre>METHOD CFTP(chain)
states=[]
numstates=StateCount(chain)
done=false
randoms=[]
while not done
// Start multiple chains at different states. NOTE:
// If the chain is monotonic (meaning the states
// are ordered and, whenever state A is less
// than state B, A's next state is never higher than
// B's next state), then just two chains can be
// created instead, starting
// at the first and last state, respectively.
for i in 0...numstates: AddItem(states, i)
// Update each chain with the same randomness
AddItem(randoms, RANDOM())
for k in 0...size(randoms):
for i in 0...numstates: states[i]=
UPDATE(chain, states[i], randoms[size(randoms)-1-k])
end
// Stop when all states are the same
fs=states[0]
done=true
for i in 1...numstates: done=(done and states[i]==fs)
end
return states[0] // Return the steady state
END METHOD
</pre>
<blockquote>
<p><strong>Note:</strong> A <strong>discrete phase-type distribution</strong> consists of a Markov chain, a start state, and an end state. It models the (random) number of steps, minus 1, needed for the Markov chain to move from the start state to the end state.</p>
</blockquote>
<p><a id="Random_Graphs"></a></p>
<h3 id="random-graphs">Random Graphs</h3>
<p>A <em>graph</em> is a listing of points and the connections between them. The points are called <em>vertices</em> and the connections, <em>edges</em>.</p>
<p>A convenient way to represent a graph is an <em>adjacency matrix</em>. This is an <em>n</em>×<em>n</em> matrix with <em>n</em> rows and <em>n</em> columns (signifying <em>n</em> vertices in the graph). For simple graphs, an adjacency matrix contains only 1s and 0s — for the cell at row <em>r</em> and column <em>c</em>, a 1 means there is an edge pointing from vertex <em>r</em> to vertex <em>c</em>, and a 0 means there are none.</p>
<p>In this section, <code>Zeros(n)</code> creates an <code>n</code>×<code>n</code> zero matrix (such as a list consisting of <code>n</code> lists, each of which contains <code>n</code> zeros).</p>
<p>The following method generates a random <code>n</code>-vertex graph that follows the model G(<em>n</em>, <em>p</em>) (also known as the <em>Gilbert model</em> (Gilbert 1959)<sup id="fnref:27" role="doc-noteref"><a href="#fn:27" class="footnote" rel="footnote">27</a></sup>), where each edge is drawn with probability <code>px</code>/<code>py</code> (Batagelj and Brandes 2005)<sup id="fnref:28" role="doc-noteref"><a href="#fn:28" class="footnote" rel="footnote">28</a></sup>.</p>
<pre>METHOD GNPGraph(n, px, py)
graph=Zeros(n)
for i in 2...n
j = i
while j > 0
j = j - 1 - min(NegativeBinomialInt(1, px, py), j - 1)
if j > 0
// Build an edge
graph[i][j]=1
graph[j][i]=1
end
end