Commit | Line | Data |
---|---|---|
080ea4c0 SBS |
1 | <TeXmacs|2.1.1> |
2 | ||
3 | <style|generic> | |
4 | ||
5 | <\body> | |
6 | <doc-data|<doc-title|bkshuf: A Shuf-Like Utility with Pre-Image Resistance | |
7 | and Relative Order Preservation for Random Sampling of Long | |
8 | Lists>|<doc-author|<author-data|<author-name|Steven Baltakatei | |
b9e8b771 | 9 | Sandoval>>>|<doc-date|2023-02-14T13:56+00>|<doc-misc|CC BY-SA 4.0>> |
080ea4c0 SBS |
10 | |
11 | <section|Summary> | |
12 | ||
13 | <shell|bkshuf> is a <shell|shuf>-like utility designed to output randomly | |
14 | sized groups of lines with a group size distribution centered around some | |
15 | characteristic value. | |
16 | ||
17 | <section|Objective> | |
18 | ||
19 | The author desires to create a <shell|shuf>-like utility named | |
20 | <shell|bkshuf> to mix line lists in order to produce output line lists with | |
21 | the following somewhat conflicting properties: | |
22 | ||
23 | <\description> | |
24 | <item*|Pre-image resistance (PIR)>An output line's position should not | |
25 | contain information about its input line position. | |
26 | ||
27 | <item*|Relative order preservation (ROP)>Two neighboring lines in the | |
28 | input stream should have a high probability of remaining neighbors in the | |
29 | output stream. | |
30 | </description> | |
31 | ||
32 | The objective is to improve the value of a short random scan of a small | |
33 | fraction of a potentially large input list; output that demonstrates ROP as | |
34 | well as some degree of PIR may achieve this objective. In contrast, the | |
35 | <shell|shuf> utility provides PIR but no ROP: a line's neighbor in the | |
36 | output of <shell|shuf> is equally likely to be any other random line from | |
37 | the input.\ | |
38 | ||
39 | In other words, output produced by <shell|bkshuf> should group together | |
40 | sequential segments of the input lines in order to partially preserve | |
41 | relationships that may exist between sequential files. For example, this | |
42 | could be done by jumping to a random position in the input lines, consuming | |
43 | (i.e. reading, outputting, and marking a line not to be read again) some | |
44 | number of sequential lines, then repeating the process until every line is | |
45 | consumed. The number of sequential lines to read between jumps affects how | |
46 | well the above desired properties are satisfied. | |
47 | ||
48 | The objective of <shell|bkshuf> is not to completely prevent the | |
49 | possibility of reassembling the input given the output. Additionally, a | |
50 | valuable property desired of <shell|bkshuf> is output which demonstrates | |
51 | sufficiently high PIR compared to ROP such that only a short (compared to | |
52 | the logarithm of the input list size) sequential scan of the output list | |
53 | from a random starting position is required before a jump to a new group | |
54 | is encountered; this would permit the overall contents of very large input | |
55 | line lists to be sampled. | |
56 | ||
57 | <section|Design> | |
58 | ||
59 | <subsection|Definitions> | |
60 | ||
61 | <\eqnarray*> | |
62 | <tformat|<table|<row|<cell|l>|<cell|:>|<cell|<text|number of | |
63 | lines>>>|<row|<cell|l<rsub|<text|in>>>|<cell|:>|<cell|<text|input line | |
64 | count>>>|<row|<cell|l<rsub|<text|out>>>|<cell|:>|<cell|<text|output line | |
65 | count>>>|<row|<cell|c>|<cell|:>|<cell|<text|target group | |
66 | count>>>|<row|<cell|s>|<cell|:>|<cell|<text|target group | |
67 | size>>>|<row|<cell|p<rsub|<text|seq>><rsub|<text|>>>|<cell|:>|<cell|<text|probability | |
68 | to include next sequential line>>>|<row|<cell|s<around*|(|l<rsub|<text|in>,0>|)>>|<cell|:>|<cell|<text|target | |
69 | group size parameter>>>|<row|<cell|l<rsub|<text|in>,0>>|<cell|:>|<cell|<text|input | |
70 | line count parameter>>>>> | |
71 | </eqnarray*> | |
72 | ||
73 | <subsection|Process> | |
74 | ||
75 | <\enumerate-numeric> | |
76 | <item>Acquire and count input lines (via <shell|/dev/stdin> or positional | |
77 | arguments). | |
78 | ||
79 | <item>Calculate line count <math|l<rsub|<text|in>>>. | |
80 | ||
81 | <item>Calculate target group size <math|s>. | |
82 | ||
83 | <item>Select random unconsumed input line and consume it to | |
84 | output.<label|jump-to-random> | |
85 | ||
86 | <item>Consume the next sequential line with probability | |
87 | <math|p<rsub|<text|seq>>> and repeat this step. Otherwise, if some input lines remain | |
88 | unconsumed, go to step <reference|jump-to-random>. Otherwise, exit. | |
89 | </enumerate-numeric> | |
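The steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's Bash implementation: the function name `bkshuf_sketch` and its parameters are hypothetical, `p_seq` is passed in directly rather than derived from the line count, and the choice to force a new jump when a run reaches the end of the input is an assumption. Skipping over already-consumed lines follows the behavior described for the output structure.

```python
import random

def bkshuf_sketch(lines, p_seq, rng=None):
    """Return a reordering of `lines` built from randomly placed sequential runs."""
    rng = rng or random.Random()
    remaining = set(range(len(lines)))      # indices of unconsumed input lines
    output = []
    while remaining:
        # Step 4: jump to a random unconsumed line and consume it.
        pos = rng.choice(sorted(remaining))
        remaining.discard(pos)
        output.append(lines[pos])
        # Step 5: keep consuming the next sequential line with probability p_seq.
        while remaining and rng.random() < p_seq:
            nxt = pos + 1
            # Skip lines that were already consumed by an earlier group.
            while nxt < len(lines) and nxt not in remaining:
                nxt += 1
            if nxt >= len(lines):
                break  # assumption: reaching the end of the input forces a new jump
            remaining.discard(nxt)
            output.append(lines[nxt])
            pos = nxt
    return output

if __name__ == "__main__":
    demo = [f"line {i}" for i in range(1, 21)]
    print("\n".join(bkshuf_sketch(demo, p_seq=0.7, rng=random.Random(1))))
```

Every input line appears exactly once in the result, so the output is a permutation of the input in which runs of neighboring lines tend to stay together.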
90 | ||
91 | <subsection|Parameter analysis> | |
92 | ||
93 | <subsubsection|Target group size calculation> | |
94 | ||
95 | The simultaneous presence of ROP and PIR properties in the output depends | |
96 | upon the amount of sequential lines that are read before <shell|bkshuf> | |
97 | jumps to a new random position in the input list. This amount is the | |
98 | <em|target group size>, <math|s>; it is the \Ptarget\Q since <math|s> | |
99 | represents the average of a distribution of group sizes that may be | |
100 | selected, not a single group size. In this analysis, the total number of | |
101 | lines in the input list is <math|l<rsub|<text|in>>>. For small input line | |
102 | counts (e.g. <math|l<rsub|<text|in>>\<cong\>10>), the target group size | |
103 | should be nearly one (e.g. <math|s\<cong\>1>) since group sizes any larger | |
104 | than this would have almost no PIR (e.g. a group size of <math|s=8> for | |
105 | <math|l<rsub|<text|in>>=10> would be 80% identical to the input). For | |
106 | modest line input counts (e.g. <math|l<rsub|<text|in>>\<cong\>100>), the | |
107 | target group size may be allowed to be larger, such as a tenth of the input | |
108 | line count (e.g. <math|s\<cong\>10>); this would provide some PIR | |
109 | (approximately <math|10!> permutations between the approximately | |
110 | <math|<frac|l<rsub|<text|in>>|s>\<cong\><frac|100|10>\<cong\>10> groups) | |
111 | while each line in groups around size <math|10> would have a low | |
112 | probability of not being next to its neighbor (<math|8> of the 10 lines | |
113 | would retain the same two neighbors while the two ends would retain one | |
114 | each). For very large input line counts (e.g. | |
115 | <math|l<rsub|<text|in>>\<cong\>1<separating-space|0.2em>000<separating-space|0.2em>000>), | |
116 | however, breaking up and randomizing the input into ten groups of | |
117 | <math|100<separating-space|0.2em>000> offers very little PIR; the benefit | |
118 | of the very high ROP is also lost since sequential scanning of tens of | |
119 | thousands of lines is required before a random jump to a new group may be | |
120 | encountered; therefore, the target group size should be a much smaller | |
121 | fraction of <math|l<rsub|<text|in>>> (e.g. <math|s\<cong\>20>) while still | |
122 | increasing with it. The relationship between a desirable target group size | |
123 | <math|s> and the input line count <math|l<rsub|<text|in>>> is non-linear. | |
124 | The author believes a reasonable approach is to scale the group size to the | |
125 | logarithm of input line count. | |
126 | ||
127 | <hgroup|Figure <reference|fig ex-plot-s>> shows an example plot of | |
128 | <math|s<around*|(|l<rsub|<text|in>>|)>> that is tuned to achieve a target | |
129 | group size of <math|s<around*|(|l<rsub|<text|in>>=10<rsup|6>|)>=25> for an | |
130 | input list length of <math|l<rsub|<text|in>>=10<rsup|6>> lines.\ | |
131 | ||
132 | <\big-figure|<with|gr-mode|<tuple|edit|point>|gr-frame|<tuple|scale|1.18922cm|<tuple|0.299593gw|0.120812gh>>|gr-geometry|<tuple|geometry|1par|0.6par>|gr-grid|<tuple|cartesian|<point|0|0>|1>|gr-grid-old|<tuple|cartesian|<point|0|0>|1>|gr-edit-grid-aspect|<tuple|<tuple|axes|none>|<tuple|1|none>|<tuple|10|none>>|gr-edit-grid|<tuple|cartesian|<point|0|0>|1>|gr-edit-grid-old|<tuple|cartesian|<point|0|0>|1>|gr-grid-aspect-props|<tuple|<tuple|axes|#808080>|<tuple|1|#c0c0c0>|<tuple|10|pastel | |
133 | blue>>|gr-grid-aspect|<tuple|<tuple|axes|#808080>|<tuple|1|#c0c0c0>|<tuple|10|pastel | |
134 | blue>>|magnify|1.18920711463847|gr-auto-crop|false|<graphics||<math-at|5|<point|-0.221848749616356|1.0>>|<math-at|10|<point|-0.397940008672038|2.0>>|<math-at|15|<point|-0.397940008672038|3.0>>|<math-at|20|<point|-0.397940008672038|4.0>>|<math-at|25|<point|-0.397940008672038|5.0>>|<math-at|30|<point|-0.397940008672038|6.0>>|<math-at|s|<point|0.0719460474170896|6.34360008343183>>|<point|0|0.2>|<point|6|5>|<math-at|10<rsup|1>|<point|1.0|-0.4>>|<math-at|10<rsup|2>|<point|2.0|-0.4>>|<math-at|10<rsup|3>|<point|3.0|-0.4>>|<math-at|10<rsup|4>|<point|4.0|-0.4>>|<math-at|10<rsup|5>|<point|5.0|-0.4>>|<math-at|10<rsup|6>|<point|6.0|-0.4>>|<math-at|1|<point|1.0|-0.8>>|<math-at|2|<point|2.0|-0.8>>|<math-at|3|<point|3.0|-0.8>>|<math-at|4|<point|4.0|-0.8>>|<math-at|5|<point|5.0|-0.8>>|<math-at|6|<point|6.0|-0.8>>|<math-at|x|<point|7.0|-0.8>>|<math-at|l<rsub|<text|in>>|<point|7.0|-0.4>>|<point|5.0|3.5>|<point|4.0|2.3>|<point|3.0|1.4>|<point|2.0|0.7>|<point|1.0|0.3>>>> | |
135 | <label|fig ex-plot-s>A plot of a possible function that relates target | |
136 | group size <math|s> and input lines <math|l<rsub|<text|in>>> that provide | |
137 | some ROP and PIR. The function is tuned to achieve | |
138 | <math|s<around*|(|l<rsub|<text|in>>=10<rsup|6>|)>=25>.\ | |
139 | </big-figure> | |
140 | ||
141 | The following is a set of equations that are used to derive a definition | |
142 | for <math|s<around*|(|l<rsub|<text|in>>|)>> that satisfies the plot in | |
143 | <hgroup|Figure <reference|fig ex-plot-s>>.\ | |
144 | ||
145 | <\eqnarray*> | |
146 | <tformat|<table|<row|<cell|x>|<cell|=>|<cell|<text|log> | |
147 | <around*|(|l<rsub|<text|in>>|)>=<frac|<text|ln> | |
148 | <around*|(|l<rsub|<text|in>>|)>|<text|ln> | |
149 | <around*|(|10|)>><eq-number><label|eq | |
150 | rel-x-lin>>>|<row|<cell|10<rsup|x>>|<cell|=>|<cell|l<rsub|<text|in>>>>|<row|<cell|x<rsub|0>>|<cell|=>|<cell|<text|log> | |
151 | <around*|(|l<rsub|<text|in>,0>|)>=<frac|<text|ln> | |
152 | <around*|(|l<rsub|<text|in>,0>|)>|<text|ln> | |
153 | <around*|(|10|)>><eq-number><label|eq | |
154 | rel-x0-lin0>>>|<row|<cell|10<rsup|x<rsub|0>>>|<cell|=>|<cell|l<rsub|<text|in>,0>>>|<row|<cell|>|<cell|>|<cell|>>|<row|<cell|s<around*|(|x|)>>|<cell|=>|<cell|<around*|(|k**x|)><rsup|2>+1<eq-number><label|eq | |
155 | gsize-model>>>|<row|<cell|s<around*|(|x=6|)>=25>|<cell|=>|<cell|k<rsup|2>\<cdot\><around*|(|6|)><rsup|2>+1>>|<row|<cell|25>|<cell|=>|<cell|k<rsup|2>\<cdot\><around*|(|36|)>+1>>|<row|<cell|s<around*|(|x<rsub|0>|)>>|<cell|=>|<cell|<around*|(|k*x<rsub|0>|)><rsup|2>+1<eq-number><label|eq | |
156 | gsize-param-rel>>>|<row|<cell|<frac|s<around*|(|x<rsub|0>|)>-1|x<rsub|0><rsup|2>>>|<cell|=>|<cell|k<rsup|2>>>|<row|<cell|k>|<cell|=>|<cell|<sqrt|<frac|s<around*|(|x<rsub|0>|)>-1|x<rsub|0><rsup|2>>>>>|<row|<cell|k>|<cell|=>|<cell|<frac|<sqrt|s<around*|(|x<rsub|0>|)>-1>|x<rsub|0>>>>|<row|<cell|k<rsup|2>>|<cell|=>|<cell|<frac|s<around*|(|x<rsub|0>|)>-1|x<rsub|0><rsup|2>><eq-number><label|eq | |
157 | gsize-const-ksq>>>|<row|<cell|s<around*|(|l<rsub|<text|in>>|)>>|<cell|=>|<cell|<around*|(|<frac|s<around*|(|x<rsub|0>|)>-1|x<rsub|0><rsup|2>>|)>\<cdot\><around*|(|<frac|<text|ln><around*|(|l<rsub|<text|in>>|)>|<text|ln> | |
158 | <around*|(|10|)>>|)><rsup|2>+1>>|<row|<cell|>|<cell|=>|<cell|<around*|(|<frac|<text|ln> | |
159 | <around*|(|10|)>|ln <around*|(|l<rsub|<text|in>,0>|)>>|)><rsup|2>\<cdot\><around*|(|s<around*|(|x<rsub|0>|)>-1|)>\<cdot\><around*|(|<frac|<text|ln><around*|(|l<rsub|<text|in>>|)>|<text|ln> | |
160 | <around*|(|10|)>>|)><rsup|2>+1>>|<row|<cell|s<around*|(|l<rsub|<text|in>>|)>>|<cell|=>|<cell|<around*|(|s<around*|(|l<rsub|<text|in>,0>|)>-1|)>\<cdot\><around*|(|<frac|<text|ln><around*|(|l<rsub|<text|in>>|)>|ln | |
161 | <around*|(|l<rsub|<text|in>,0>|)>>|)><rsup|2>+1>>|<row|<cell|s<around*|(|l<rsub|<text|in>>|)>>|<cell|=>|<cell|<around*|(|<frac|s<around*|(|l<rsub|<text|in>,0>|)>-1|<around*|[|ln | |
162 | <around*|(|l<rsub|<text|in>,0>|)>|]><rsup|2>>|)>\<cdot\><around*|[|<text|ln><around*|(|l<rsub|<text|in>>|)>|]><rsup|2>+1<eq-number><label|eq | |
163 | gsize-lin>>>>> | |
164 | </eqnarray*> | |
165 | ||
166 | <hgroup|Equation <reference|eq gsize-model>> defines the target group size | |
167 | <math|s> as a quadratic function of the logarithmic variable <math|x>. <math|x> is | |
168 | defined in terms of <math|l<rsub|<text|in>>> via the base-10 logarithm | |
169 | given in <hgroup|Equation <reference|eq rel-x-lin>>. The result is | |
170 | <hgroup|Equation <reference|eq gsize-lin>> which defines | |
171 | <math|s<around*|(|l<rsub|<text|in>>|)>> as a function of | |
172 | <math|l<rsub|<text|in>>> and parameters | |
173 | <math|s<around*|(|l<rsub|<text|in>,0>|)>> and <math|l<rsub|<text|in>,0>>. | |
174 | The parameters define how quickly or slowly the quadratic equation grows. | |
175 | In other words, if a user wishes for a <math|1<separating-space|0.2em>000<separating-space|0.2em>000> | |
176 | line input to be split into groups each containing, on average, <math|25> | |
177 | lines, then they should plug in <math|l<rsub|<text|in>,0>=1<separating-space|0.2em>000<separating-space|0.2em>000> | |
178 | and <math|s<around*|(|l<rsub|<text|in>,0>|)>=25> into <hgroup|Equation | |
179 | <reference|eq gsize-lin>> as is done in <hgroup|Equation <reference|eq | |
180 | gsize-ex-1>>. This equation can then be used to calculate target group | |
181 | sizes <math|s> as a function of other input line counts | |
182 | <math|l<rsub|<text|in>>> besides <math|l<rsub|<text|in>>=1<separating-space|0.2em>000<separating-space|0.2em>000>. | |
183 | For example, plugging <math|l<rsub|<text|in>>=500> into <hgroup|Equation | |
184 | <reference|eq gsize-ex-1>> yields <hgroup|Equation <reference|eq | |
185 | gsize-ex-1-lin500>> which specifies a target group size of | |
186 | <math|5.85629\<cong\>6>. | |
187 | ||
188 | <\eqnarray*> | |
189 | <tformat|<table|<row|<cell|s<around*|(|l<rsub|<text|in>>|)>>|<cell|=>|<cell|<around*|(|<frac|s<around*|(|l<rsub|<text|in>,0>|)>-1|<around*|[|ln | |
190 | <around*|(|l<rsub|<text|in>,0>|)>|]><rsup|2>>|)>\<cdot\><around*|[|<text|ln><around*|(|l<rsub|<text|in>>|)>|]><rsup|2>+1>>|<row|<cell|s<around*|(|l<rsub|<text|in>>|)>>|<cell|=>|<cell|<around*|(|<frac|25-1|<around*|[|<text|ln | |
191 | ><around*|(|1<separating-space|0.2em>000<separating-space|0.2em>000|)>|]><rsup|2>>|)>\<cdot\><around*|[|<text|ln><around*|(|l<rsub|<text|in>>|)>|]><rsup|2>+1<eq-number><label|eq | |
192 | gsize-ex-1>>>|<row|<cell|>|<cell|>|<cell|>>|<row|<cell|s<around*|(|l<rsub|<text|in>>=500|)>>|<cell|=>|<cell|<around*|(|<frac|25-1|<around*|[|<text|ln | |
193 | ><around*|(|1<separating-space|0.2em>000<separating-space|0.2em>000|)>|]><rsup|2>>|)>\<cdot\><around*|[|<text|ln><around*|(|500|)>|]><rsup|2>+1<eq-number><label|eq | |
194 | gsize-ex-1-lin500>>>|<row|<cell|s<around*|(|l<rsub|<text|in>>=500|)>>|<cell|=>|<cell|5.85629>>>> | |
195 | </eqnarray*> | |
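The target group size formula can be checked numerically with a short Python sketch. The function name `target_group_size` and the argument names `l_in_0` and `s_0` (standing for the input line count parameter and the target group size parameter) are hypothetical labels chosen for this example; the default values are the ones used in the worked example above.

```python
import math

def target_group_size(l_in, l_in_0=1_000_000, s_0=25.0):
    """Target group size s(l_in), tuned so that s(l_in_0) == s_0."""
    if l_in < 1:
        raise ValueError("l_in must be at least 1")
    if l_in_0 <= 1:
        raise ValueError("l_in_0 must be greater than 1 so that ln(l_in_0) > 0")
    # Coefficient of the squared logarithm, fixed by the tuning point (l_in_0, s_0).
    coeff = (s_0 - 1.0) / math.log(l_in_0) ** 2
    return coeff * math.log(l_in) ** 2 + 1.0

if __name__ == "__main__":
    for n in (10, 100, 500, 10_000, 1_000_000):
        print(n, round(target_group_size(n), 5))
```

With the default parameters, `target_group_size(500)` evaluates to approximately 5.856, in agreement with the worked value, and `target_group_size(1)` is exactly 1 because the logarithm vanishes.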
196 | ||
197 | \; | |
198 | ||
199 | <subsubsection|Jump from expected value> | |
200 | ||
201 | One method <shell|bkshuf> may employ to decide whether to read the next | |
202 | sequential unconsumed input line is to do so with probability | |
203 | <math|p<rsub|<text|seq>>>, chosen such that the expected group size equals | |
204 | <math|s>. | |
205 | ||
206 | <\eqnarray*> | |
b9e8b771 SBS |
207 | <tformat|<table|<row|<cell|p<rsub|<text|seq>>>|<cell|=>|<cell|<around*|(|1-p<rsub|<text|jump>>|)>>>|<row|<cell|p<rsub|<text|jump>>>|<cell|=>|<cell|1-p<rsub|<text|seq>>>>|<row|<cell|s>|<cell|=>|<cell|<frac|1|p<rsub|<text|jump>>>=<frac|1|1-p<rsub|<text|seq>>><eq-number>>>|<row|<cell|s>|<cell|=>|<cell|<frac|1|1-p<rsub|<text|seq>>>>>|<row|<cell|1-p<rsub|<text|seq>>>|<cell|=>|<cell|<frac|1|s>>>|<row|<cell|p<rsub|<text|seq>>-1>|<cell|=>|<cell|<frac|-1|s>>>|<row|<cell|p<rsub|<text|seq>>>|<cell|=>|<cell|1-<frac|1|s<around*|(|l<rsub|<text|in>>|)>><eq-number><label|eq |
208 | pseq-from-s-lin>>>|<row|<cell|p<rsub|<text|jump>>>|<cell|=>|<cell|<frac|1|s<around*|(|l<rsub|<text|in>>|)>><eq-number><label|eq | |
209 | pjump-from-s-lin>>>|<row|<cell|>|<cell|>|<cell|>>|<row|<cell|p<rsub|<text|seq>><around*|(|l<rsub|<text|in>>|)>>|<cell|=>|<cell|1-<around*|[|<around*|(|<frac|s<around*|(|l<rsub|<text|in>,0>|)>-1|<around*|[|ln | |
210 | <around*|(|l<rsub|<text|in>,0>|)>|]><rsup|2>>|)>\<cdot\><around*|[|<text|ln><around*|(|l<rsub|<text|in>>|)>|]><rsup|2>+1|]><rsup|-1><eq-number><label|eq | |
211 | pseq-from-s-lin-exp>>>>> | |
080ea4c0 SBS |
212 | </eqnarray*> |
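The relation between the sequential-read probability and the target group size can be illustrated with a small simulation. This is a sketch under simplifying assumptions: the names `p_seq_from_s` and `simulated_mean_group_size` are hypothetical, and the simulation ignores collisions with already-consumed lines and the end of the input, so each group is simply one line plus a run of further lines that each continue with probability p_seq.

```python
import random

def p_seq_from_s(s):
    """Probability of consuming the next sequential line for target group size s."""
    if s < 1:
        raise ValueError("target group size must be at least 1")
    return 1.0 - 1.0 / s

def simulated_mean_group_size(s, groups=20000, seed=0):
    """Average size of `groups` simulated groups grown with p_seq = 1 - 1/s."""
    rng = random.Random(seed)
    p_seq = p_seq_from_s(s)
    total = 0
    for _ in range(groups):
        size = 1                       # the line at the jump target is always consumed
        while rng.random() < p_seq:    # each further line is added with probability p_seq
            size += 1
        total += size
    return total / groups

if __name__ == "__main__":
    for s in (2, 6, 25):
        print(s, p_seq_from_s(s), simulated_mean_group_size(s))
```

The simulated averages should land close to the requested group sizes, which is the sense in which the expected group size equals s.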
213 | ||
214 | <subsubsection|Jump from random variate of inverse gaussian distribution> | |
215 | ||
216 | Another method <shell|bkshuf> may employ to decide when to read the next | |
217 | sequential unconsumed input line is to use an inverse gaussian | |
218 | distribution. Whenever a new random position in the input list is | |
219 | selected, a float is sampled from an inverse gaussian distribution, which | |
220 | has support from 0 to infinity and mean <math|\<mu\>>; | |
221 | the float is rounded to the nearest integer.<\footnote> | |
222 | See <name|Michael, John R.> "Generating Random Variates Using | |
223 | Transformations with Multiple Roots". 1976. | |
224 | <hlink|https://doi.org/10.2307/2683801|https://doi.org/10.2307/2683801> . | |
225 | </footnote> Then, after consuming an input line, this integer is | |
226 | decremented by one and another sequential line is consumed provided the | |
227 | integer does not become less than or equal to zero. The inverse gaussian | |
228 | distribution requires specifying the mean <math|\<mu\>> and the shape | |
229 | parameter <math|\<lambda\>>; a higher <math|\<lambda\>> results in a | |
230 | narrower spread, since the variance is <math|\<mu\><rsup|3>/\<lambda\>>. An upper bound may also be specified since the distribution | |
231 | has none except for that imposed by its programming implementation. | |
232 | ||
233 | The result of using an inverse gaussian distribution is an output with | |
234 | potentially much more regular group sizes than using the previously | |
235 | mentioned expected value method. However, the implementation of the inverse | |
236 | gaussian sampling operation described by (Michael, 1976) requires several | |
237 | exponent calculations and a square root calculation in addition to various | |
238 | multiplication and division operations. If sufficient processing power is | |
239 | available, this may not necessarily be an issue. | |
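A minimal Python sketch of this sampling step is given below, using the transformation-with-multiple-roots method from the reference cited in the footnote. It is not taken from the Bash implementation. The function names, the optional `upper` argument, and the choice to clamp the rounded value to at least 1 (so that every group consumes at least one line) are assumptions made for illustration.

```python
import math
import random

def inverse_gaussian_variate(mu, lam, rng):
    """Draw one inverse gaussian variate with mean mu and shape parameter lam."""
    nu = rng.gauss(0.0, 1.0)
    y = nu * nu                                    # chi-square variate, 1 degree of freedom
    # Smaller root of the quadratic that relates y to the inverse gaussian variate.
    x = (mu + (mu * mu * y) / (2.0 * lam)
         - (mu / (2.0 * lam)) * math.sqrt(4.0 * mu * lam * y + mu * mu * y * y))
    # Choose between the two roots with the appropriate probability.
    if rng.random() <= mu / (mu + x):
        return x
    return mu * mu / x

def group_size(mu, lam, rng, upper=None):
    """Round a variate to an integer group size, clamped to [1, upper]."""
    size = max(1, round(inverse_gaussian_variate(mu, lam, rng)))
    if upper is not None:
        size = min(size, upper)
    return size

if __name__ == "__main__":
    rng = random.Random(0)
    print([group_size(25.0, 250.0, rng, upper=100) for _ in range(10)])
```

The sketch shows where the square root and the multiplications and divisions mentioned above arise: one normal variate, one uniform variate, and one square root are needed per group.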
240 | ||
241 | <subsubsection|Output structure> | |
242 | ||
243 | Regardless of whether group sizes are determined by the expected value | |
244 | method or using variates of an inverse gaussian distribution, mimicking the | |
245 | <shell|shuf> property of all input lines being present in the output, | |
246 | albeit rearranged, results in a side effect: the first output lines are | |
247 | more likely to contain groups with uninterrupted sequence runs (high ROP) | |
248 | while groups in the last output lines are almost certain to contain | |
249 | sequence jumps within a group (less ROP). The reason for this is that | |
250 | <shell|bkshuf>, when it encounters an input line that has already been | |
251 | consumed, will skip to the next available input line. The decision could be | |
252 | made to skip to a new random line, but it is simpler to read the | |
253 | next available input line. The author's original intention of sampling only | |
254 | a short initial portion of the output is compatible with the behavior that | |
255 | ROP is preserved mostly at the beginning of the output. | |
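One rough way to observe this effect is to simulate the consumption order and compare how often output neighbors were also input neighbors near the start and near the end of the output. The sketch below is illustrative only: the helper names are hypothetical, the values n = 5000 and p_seq = 0.8 are arbitrary, and forcing a new jump at the end of the input is an assumption.

```python
import random

def consume_order(n, p_seq, rng):
    """Return input indices 0..n-1 in the order a bkshuf-like pass consumes them."""
    unconsumed = set(range(n))
    order = []
    while unconsumed:
        pos = rng.choice(sorted(unconsumed))       # jump to a random unconsumed line
        unconsumed.discard(pos)
        order.append(pos)
        while unconsumed and rng.random() < p_seq:
            nxt = pos + 1
            while nxt < n and nxt not in unconsumed:
                nxt += 1                           # skip lines consumed earlier
            if nxt >= n:
                break                              # assumption: end of input forces a jump
            unconsumed.discard(nxt)
            order.append(nxt)
            pos = nxt
    return order

def adjacency_fraction(order):
    """Fraction of neighboring output pairs that were also neighbors in the input."""
    pairs = list(zip(order, order[1:]))
    return sum(1 for a, b in pairs if b == a + 1) / len(pairs)

rng = random.Random(2023)
order = consume_order(5000, p_seq=0.8, rng=rng)
head = adjacency_fraction(order[:500])    # first tenth of the output
tail = adjacency_fraction(order[-500:])   # last tenth of the output
print(f"head ROP fraction: {head:.3f}, tail ROP fraction: {tail:.3f}")
```

The fraction for the first tenth of the output is expected to be higher than for the last tenth, because late groups must skip over lines that earlier groups already consumed.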
256 | ||
257 | <section|Version History> | |
258 | ||
259 | <\big-table|<tabular|<tformat|<table|<row|<cell|Version | |
260 | No.>|<cell|Date>|<cell|Path>|<cell|Description>>|<row|<cell|<verbatim|0.0.1>>|<cell|2023-02-14>|<cell|<verbatim|unitproc/bkshuf>>|<cell|Initial | |
261 | draft implemented in <name|Bash>.>>>>>> | |
262 | A table listing versions of <shell|bkshuf>. | |
263 | </big-table> | |
264 | ||
265 | <\description> | |
266 | <item*|v0.0.1>Initial implementation in <shell|bash> <verbatim|5.1.16> | |
267 | with <shell|bc> <verbatim|1.07.1> and <name|GNU Coreutils> | |
268 | <verbatim|8.32> and tested on <name|Pop!_OS> <verbatim|22.04 LTS>. Saved | |
269 | to the author's <name|BK-2020-03> repository<\footnote> | |
8d35dc96 SBS |
270 | See commit <hlink|<verbatim|080ea4c>|https://gitlab.com/baltakatei/baltakatei-exdev/-/blob/080ea4c0ff0d4e6b5ce86f664fa6645c1cb02bf0/unitproc/bkshuf> |
271 | at <hlink|https://gitlab.com/baltakatei/baltakatei-exdev|https://gitlab.com/baltakatei/baltakatei-exdev> | |
080ea4c0 SBS |
272 | . |
273 | </footnote>. | |
274 | </description> | |
275 | </body> | |
276 | ||
277 | <initial|<\collection> | |
278 | </collection>> | |
279 | ||
280 | <\references> | |
281 | <\collection> | |
8d35dc96 | 282 | <associate|auto-1|<tuple|1|1|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> |
b9e8b771 | 283 | <associate|auto-10|<tuple|3.3.3|5|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> |
8d35dc96 SBS |
284 | <associate|auto-11|<tuple|3.3.4|5|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> |
285 | <associate|auto-12|<tuple|4|5|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
286 | <associate|auto-13|<tuple|1|5|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
080ea4c0 | 287 | <associate|auto-14|<tuple|1|?|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> |
8d35dc96 | 288 | <associate|auto-2|<tuple|2|1|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> |
b9e8b771 SBS |
289 | <associate|auto-3|<tuple|3|2|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> |
290 | <associate|auto-4|<tuple|3.1|2|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
8d35dc96 SBS |
291 | <associate|auto-5|<tuple|3.2|2|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> |
292 | <associate|auto-6|<tuple|3.3|2|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
293 | <associate|auto-7|<tuple|3.3.1|2|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
294 | <associate|auto-8|<tuple|1|3|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
295 | <associate|auto-9|<tuple|3.3.2|4|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
296 | <associate|eq gsize-const-ksq|<tuple|5|3|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
297 | <associate|eq gsize-ex-1|<tuple|7|4|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
298 | <associate|eq gsize-ex-1-lin500|<tuple|8|4|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
299 | <associate|eq gsize-lin|<tuple|6|4|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
300 | <associate|eq gsize-model|<tuple|3|3|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
301 | <associate|eq gsize-param-rel|<tuple|4|3|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
b9e8b771 SBS |
302 | <associate|eq pjump-from-s-lin|<tuple|11|?|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> |
303 | <associate|eq pseq-from-s-lin|<tuple|10|?|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
304 | <associate|eq pseq-from-s-lin-exp|<tuple|12|?|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
8d35dc96 SBS |
305 | <associate|eq rel-x-lin|<tuple|1|3|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> |
306 | <associate|eq rel-x0-lin0|<tuple|2|3|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
307 | <associate|fig ex-plot-s|<tuple|1|3|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
308 | <associate|footnote-1|<tuple|1|5|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
309 | <associate|footnote-2|<tuple|2|5|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
310 | <associate|footnr-1|<tuple|1|5|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
311 | <associate|footnr-2|<tuple|2|5|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
312 | <associate|jump-to-random|<tuple|4|2|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>> | |
080ea4c0 SBS |
313 | </collection> |
314 | </references> | |
315 | ||
316 | <\auxiliary> | |
317 | <\collection> | |
318 | <\associate|figure> | |
319 | <tuple|normal|<\surround|<hidden-binding|<tuple>|1>|> | |
320 | A plot of a possible function that relates target group size | |
321 | <with|mode|<quote|math>|s> and input lines | |
322 | <with|mode|<quote|math>|l<rsub|<with|mode|<quote|text>|in>>> that | |
323 | provide some ROP and PIR. The function is tuned to achieve | |
324 | <with|mode|<quote|math>|s<around*|(|l<rsub|<with|mode|<quote|text>|in>>=10<rsup|6>|)>=25>. | |
325 | </surround>|<pageref|auto-8>> | |
326 | </associate> | |
327 | <\associate|table> | |
328 | <tuple|normal|<\surround|<hidden-binding|<tuple>|1>|> | |
8d35dc96 | 329 | A table listing versions of <with|mode|<quote|prog>|prog-language|<quote|shell>|font-family|<quote|rm>|bkshuf>. |
080ea4c0 SBS |
330 | </surround>|<pageref|auto-13>> |
331 | </associate> | |
332 | <\associate|toc> | |
333 | <vspace*|1fn><with|font-series|<quote|bold>|math-font-series|<quote|bold>|1<space|2spc>Summary> | |
334 | <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>> | |
335 | <no-break><pageref|auto-1><vspace|0.5fn> | |
336 | ||
337 | <vspace*|1fn><with|font-series|<quote|bold>|math-font-series|<quote|bold>|2<space|2spc>Objective> | |
338 | <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>> | |
339 | <no-break><pageref|auto-2><vspace|0.5fn> | |
340 | ||
341 | <vspace*|1fn><with|font-series|<quote|bold>|math-font-series|<quote|bold>|3<space|2spc>Design> | |
342 | <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>> | |
343 | <no-break><pageref|auto-3><vspace|0.5fn> | |
344 | ||
345 | <with|par-left|<quote|1tab>|3.1<space|2spc>Definitions | |
346 | <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>> | |
347 | <no-break><pageref|auto-4>> | |
348 | ||
349 | <with|par-left|<quote|1tab>|3.2<space|2spc>Process | |
350 | <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>> | |
351 | <no-break><pageref|auto-5>> | |
352 | ||
353 | <with|par-left|<quote|1tab>|3.3<space|2spc>Parameter analysis | |
354 | <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>> | |
355 | <no-break><pageref|auto-6>> | |
356 | ||
357 | <with|par-left|<quote|2tab>|3.3.1<space|2spc>Target group size | |
358 | calculation <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>> | |
359 | <no-break><pageref|auto-7>> | |
360 | ||
361 | <with|par-left|<quote|2tab>|3.3.2<space|2spc>Jump from expected value | |
362 | <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>> | |
363 | <no-break><pageref|auto-9>> | |
364 | ||
365 | <with|par-left|<quote|2tab>|3.3.3<space|2spc>Jump from random variate | |
366 | of inverse gaussian distribution <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>> | |
367 | <no-break><pageref|auto-10>> | |
368 | ||
369 | <with|par-left|<quote|2tab>|3.3.4<space|2spc>Output structure | |
370 | <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>> | |
371 | <no-break><pageref|auto-11>> | |
372 | ||
373 | <vspace*|1fn><with|font-series|<quote|bold>|math-font-series|<quote|bold>|4<space|2spc>Version | |
374 | History> <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>> | |
375 | <no-break><pageref|auto-12><vspace|0.5fn> | |
376 | </associate> | |
377 | </collection> | |
378 | </auxiliary> |