1<TeXmacs|2.1.1>
2
3<style|generic>
4
5<\body>
6 <doc-data|<doc-title|bkshuf: A Shuf-Like Utility with Pre-Image Resistance
7 and Relative Order Preservation for Random Sampling of Long
8 Lists>|<doc-author|<author-data|<author-name|Steven Baltakatei
9 Sandoval>>>|<doc-date|2023-02-14T13:56+00>|<doc-misc|CC BY-SA 4.0>>
10
11 <section|Summary>
12
13 <shell|bkshuf> is a <shell|shuf>-like utility designed to output randomly
14 sized groups of lines with a group size distribution centered around some
15 characteristic value.
16
17 <section|Objective>
18
19 The author desires to create a <shell|shuf>-like utility named
20 <shell|bkshuf> to mix line lists in order to produce output line lists with
21 the following somewhat conflicting properties:
22
23 <\description>
24 <item*|Pre-image resistance (PIR)>An output line's position should not
25 contain information about its input line position.
26
27 <item*|Relative order preservation (ROP)>Two neighboring lines in the
28 input stream should have a high probability of remaining neighbors in the
29 output stream.
30 </description>
31
32 The objective is to improve the value of a short random scan of a small
33 fraction of a potentially large input list; output that demonstrates ROP as
34 well as some degree of PIR may achieve this objective. In contrast, the
35 <shell|shuf> utility provides PIR but no ROP: a line's neighbor in the
36 output of <shell|shuf> is equally likely to be any other random line from
37 the input.\
38
39 In other words, output produced by <shell|bkshuf> should group together
40 sequential segments of the input lines in order to partially preserve
41 relationships that may exist between sequential files. For example, this
42 could be done by jumping to a random position in the input lines, consuming
43 (i.e. reading, outputting, and marking a line not to be read again) some
44 number of sequential lines, then repeating the process until every line is
45 consumed. The number of sequential lines to read between jumps affects how
46 well the above desired properties are satisfied.
47
48 The objective of <shell|bkshuf> is not to completely prevent the
49 possibility of reassembling the input given the output. Additionally, a
50 valuable property desired of <shell|bkshuf> is output which demonstrates
51 sufficiently high PIR compared to ROP such that only a short (compared to
52 the logarithm of the input list size) sequential scan of the output list
53 from a random starting position is required before a jump to a new group is
54 encountered; this would permit the overall contents of very large input
55 line lists to be sampled.
56
57 <section|Design>
58
59 <subsection|Definitions>
60
61 <\eqnarray*>
62 <tformat|<table|<row|<cell|l>|<cell|:>|<cell|<text|number of
63 lines>>>|<row|<cell|l<rsub|<text|in>>>|<cell|:>|<cell|<text|input line
64 count>>>|<row|<cell|l<rsub|<text|out>>>|<cell|:>|<cell|<text|output line
65 count>>>|<row|<cell|c>|<cell|:>|<cell|<text|target group
66 count>>>|<row|<cell|s>|<cell|:>|<cell|<text|target group
67 size>>>|<row|<cell|p<rsub|<text|seq>>>|<cell|:>|<cell|<text|probability
68 to include next sequential line>>>|<row|<cell|s<around*|(|l<rsub|<text|in>,0>|)>>|<cell|:>|<cell|<text|target
69 group size parameter>>>|<row|<cell|l<rsub|<text|in>,0>>|<cell|:>|<cell|<text|input
70 line count parameter>>>>>
71 </eqnarray*>
72
73 <subsection|Process>
74
75 <\enumerate-numeric>
76 <item>Acquire and count input lines (via <shell|/dev/stdin> or positional
77 arguments).
78
79 <item>Calculate line count <math|l<rsub|<text|in>>>.
80
81 <item>Calculate target group size <math|s>.
82
83 <item>Select a random unconsumed input line and consume it to
84 output.<label|jump-to-random>
85
86 <item>With probability <math|p<rsub|<text|seq>>>, consume the next
87 sequential line and repeat this step. Otherwise, if some input lines remain
88 unconsumed, go to step <reference|jump-to-random>. Otherwise, exit.
89 </enumerate-numeric>
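
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's Bash implementation: the function name `bkshuf_sketch`, the `rng` argument, and the choice to force a new jump when a run reaches the end of the input are hypothetical. Skipping over lines that were already consumed follows the behavior described in the Output structure subsection.

```python
import random


def bkshuf_sketch(lines, p_seq, rng):
    """Reorder lines in randomly sized runs of sequential input lines.

    Hypothetical helper for illustration only; not the author's Bash code.
    """
    n = len(lines)
    consumed = [False] * n
    remaining = n
    out = []
    while remaining > 0:
        # Step 4: jump to a random unconsumed line (simple, not efficient).
        pos = rng.choice([i for i in range(n) if not consumed[i]])
        while True:
            consumed[pos] = True
            remaining -= 1
            out.append(lines[pos])
            # Step 5: continue sequentially with probability p_seq.
            if rng.random() >= p_seq:
                break
            # Skip lines already consumed by earlier groups.
            pos += 1
            while pos < n and consumed[pos]:
                pos += 1
            if pos >= n:
                break  # Assumption: running off the end forces a new jump.
    return out
```

Calling `bkshuf_sketch(list_of_lines, 0.96, random.Random())` returns every input line exactly once. Rebuilding the list of unconsumed indices at each jump keeps the sketch short but costs quadratic time on long inputs.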
90
91 <subsection|Parameter analysis>
92
93 <subsubsection|Target group size calculation>
94
95 The simultaneous presence of ROP and PIR properties in the output depends
96 upon the number of sequential lines that are read before <shell|bkshuf>
97 jumps to a new random position in the input list. This number is the
98 <em|target group size>, <math|s>; it is a \Ptarget\Q since <math|s>
99 represents the average of a distribution of group sizes that may be
100 selected, not a single group size. In this analysis, the total number of
101 lines in the input list is <math|l<rsub|<text|in>>>. For small input line
102 counts (e.g. <math|l<rsub|<text|in>>\<cong\>10>), the target group size
103 should be nearly one (e.g. <math|s\<cong\>1>) since group sizes any larger
104 than this would have almost no PIR (e.g. a group size of <math|s=8> for
105 <math|l<rsub|<text|in>>=10> would be 80% identical to the input). For
106 modest input line counts (e.g. <math|l<rsub|<text|in>>\<cong\>100>), the
107 target group size may be allowed to be larger, such as a tenth of the input
108 line count (e.g. <math|s\<cong\>10>); this would provide some PIR
109 (approximately <math|10!> permutations between the approximately
110 <math|<frac|l<rsub|<text|in>>|s>\<cong\><frac|100|10>\<cong\>10> groups)
111 while each line in groups around size <math|10> would have a low
112 probability of not being next to its neighbor (<math|8> of the 10 lines
113 would retain the same two neighbors while the two ends would retain one
114 each). For very large input line counts (e.g.
115 <math|l<rsub|<text|in>>\<cong\>1<separating-space|0.2em>000<separating-space|0.2em>000>),
116 however, breaking up and randomizing the input into ten groups of
117 <math|100<separating-space|0.2em>000> offers very little PIR; the benefit
118 of the very high ROP is also lost since sequential scanning of tens of
119 thousands of lines is required before a random jump to a new group may be
120 encountered; therefore, the target group size should be a much smaller
121 fraction of <math|l<rsub|<text|in>>> (e.g. <math|s\<cong\>20>) while still
122 increasing with it. The relationship between a desirable target group size
123 <math|s> and the input line count <math|l<rsub|<text|in>>> is non-linear.
124 The author believes a reasonable approach is to scale the group size with
125 the logarithm of the input line count.
126
127 <hgroup|Figure <reference|fig ex-plot-s>> shows an example plot of
128 <math|s<around*|(|l<rsub|<text|in>>|)>> that is tuned to achieve a target
129 group size of <math|s<around*|(|l<rsub|<text|in>>=10<rsup|6>|)>=25> for an
130 input list length of <math|l<rsub|<text|in>>=10<rsup|6>> lines.\
131
132 <\big-figure|<with|gr-mode|<tuple|edit|point>|gr-frame|<tuple|scale|1.18922cm|<tuple|0.299593gw|0.120812gh>>|gr-geometry|<tuple|geometry|1par|0.6par>|gr-grid|<tuple|cartesian|<point|0|0>|1>|gr-grid-old|<tuple|cartesian|<point|0|0>|1>|gr-edit-grid-aspect|<tuple|<tuple|axes|none>|<tuple|1|none>|<tuple|10|none>>|gr-edit-grid|<tuple|cartesian|<point|0|0>|1>|gr-edit-grid-old|<tuple|cartesian|<point|0|0>|1>|gr-grid-aspect-props|<tuple|<tuple|axes|#808080>|<tuple|1|#c0c0c0>|<tuple|10|pastel
133 blue>>|gr-grid-aspect|<tuple|<tuple|axes|#808080>|<tuple|1|#c0c0c0>|<tuple|10|pastel
134 blue>>|magnify|1.18920711463847|gr-auto-crop|false|<graphics||<math-at|5|<point|-0.221848749616356|1.0>>|<math-at|10|<point|-0.397940008672038|2.0>>|<math-at|15|<point|-0.397940008672038|3.0>>|<math-at|20|<point|-0.397940008672038|4.0>>|<math-at|25|<point|-0.397940008672038|5.0>>|<math-at|30|<point|-0.397940008672038|6.0>>|<math-at|s|<point|0.0719460474170896|6.34360008343183>>|<point|0|0.2>|<point|6|5>|<math-at|10<rsup|1>|<point|1.0|-0.4>>|<math-at|10<rsup|2>|<point|2.0|-0.4>>|<math-at|10<rsup|3>|<point|3.0|-0.4>>|<math-at|10<rsup|4>|<point|4.0|-0.4>>|<math-at|10<rsup|5>|<point|5.0|-0.4>>|<math-at|10<rsup|6>|<point|6.0|-0.4>>|<math-at|1|<point|1.0|-0.8>>|<math-at|2|<point|2.0|-0.8>>|<math-at|3|<point|3.0|-0.8>>|<math-at|4|<point|4.0|-0.8>>|<math-at|5|<point|5.0|-0.8>>|<math-at|6|<point|6.0|-0.8>>|<math-at|x|<point|7.0|-0.8>>|<math-at|l<rsub|<text|in>>|<point|7.0|-0.4>>|<point|5.0|3.5>|<point|4.0|2.3>|<point|3.0|1.4>|<point|2.0|0.7>|<point|1.0|0.3>>>>
135 <label|fig ex-plot-s>A plot of a possible function that relates target
136 group size <math|s> and input lines <math|l<rsub|<text|in>>> that provide
137 some ROP and PIR. The function is tuned to achieve
138 <math|s<around*|(|l<rsub|<text|in>>=10<rsup|6>|)>=25>.\
139 </big-figure>
140
141 The following is a set of equations that are used to derive a definition
142 for <math|s<around*|(|l<rsub|<text|in>>|)>> that satisfies the plot in
143 <hgroup|Figure <reference|fig ex-plot-s>>.\
144
145 <\eqnarray*>
146 <tformat|<table|<row|<cell|x>|<cell|=>|<cell|<text|log>
147 <around*|(|l<rsub|<text|in>>|)>=<frac|ln
148 <around*|(|l<rsub|<text|in>>|)>|<text|ln>
149 <around*|(|10|)>><eq-number><label|eq
150 rel-x-lin>>>|<row|<cell|10<rsup|x>>|<cell|=>|<cell|l<rsub|<text|in>>>>|<row|<cell|x<rsub|0>>|<cell|=>|<cell|<text|log>
151 <around*|(|l<rsub|<text|in>,0>|)>=<frac|ln
152 <around*|(|l<rsub|<text|in>,0>|)>|<text|ln>
153 <around*|(|10|)>><eq-number><label|eq
154 rel-x0-lin0>>>|<row|<cell|10<rsup|x<rsub|0>>>|<cell|=>|<cell|l<rsub|<text|in>,0>>>|<row|<cell|>|<cell|>|<cell|>>|<row|<cell|s<around*|(|x|)>>|<cell|=>|<cell|<around*|(|k*x|)><rsup|2>+1<eq-number><label|eq
155 gsize-model>>>|<row|<cell|s<around*|(|x=6|)>=25>|<cell|=>|<cell|k<rsup|2>\<cdot\><around*|(|6|)><rsup|2>+1>>|<row|<cell|25>|<cell|=>|<cell|k<rsup|2>\<cdot\><around*|(|36|)>+1>>|<row|<cell|s<around*|(|x<rsub|0>|)>>|<cell|=>|<cell|<around*|(|k*x<rsub|0>|)><rsup|2>+1<eq-number><label|eq
156 gsize-param-rel>>>|<row|<cell|<frac|s<around*|(|x<rsub|0>|)>-1|x<rsub|0><rsup|2>>>|<cell|=>|<cell|k<rsup|2>>>|<row|<cell|k>|<cell|=>|<cell|<sqrt|<frac|s<around*|(|x<rsub|0>|)>-1|x<rsub|0><rsup|2>>>>>|<row|<cell|k>|<cell|=>|<cell|<frac|<sqrt|s<around*|(|x<rsub|0>|)>-1>|x<rsub|0>>>>|<row|<cell|k<rsup|2>>|<cell|=>|<cell|<frac|s<around*|(|x<rsub|0>|)>-1|x<rsub|0><rsup|2>><eq-number><label|eq
157 gsize-const-ksq>>>|<row|<cell|s<around*|(|l<rsub|<text|in>>|)>>|<cell|=>|<cell|<around*|(|<frac|s<around*|(|x<rsub|0>|)>-1|x<rsub|0><rsup|2>>|)>\<cdot\><around*|(|<frac|<text|ln><around*|(|l<rsub|<text|in>>|)>|<text|ln>
158 <around*|(|10|)>>|)><rsup|2>+1>>|<row|<cell|>|<cell|=>|<cell|<around*|(|<frac|<text|ln>
159 <around*|(|10|)>|ln <around*|(|l<rsub|<text|in>,0>|)>>|)><rsup|2>\<cdot\><around*|(|s<around*|(|x<rsub|0>|)>-1|)>\<cdot\><around*|(|<frac|<text|ln><around*|(|l<rsub|<text|in>>|)>|<text|ln>
160 <around*|(|10|)>>|)><rsup|2>+1>>|<row|<cell|s<around*|(|l<rsub|<text|in>>|)>>|<cell|=>|<cell|<around*|(|s<around*|(|l<rsub|<text|in>,0>|)>-1|)>\<cdot\><around*|(|<frac|<text|ln><around*|(|l<rsub|<text|in>>|)>|ln
161 <around*|(|l<rsub|<text|in>,0>|)>>|)><rsup|2>+1>>|<row|<cell|s<around*|(|l<rsub|<text|in>>|)>>|<cell|=>|<cell|<around*|(|<frac|s<around*|(|l<rsub|<text|in>,0>|)>-1|<around*|[|ln
162 <around*|(|l<rsub|<text|in>,0>|)>|]><rsup|2>>|)>\<cdot\><around*|[|<text|ln><around*|(|l<rsub|<text|in>>|)>|]><rsup|2>+1<eq-number><label|eq
163 gsize-lin>>>>>
164 </eqnarray*>
165
166 <hgroup|Equation <reference|eq gsize-model>> defines <math|s> as a
167 quadratic function of the logarithmic variable <math|x>. <math|x> is
168 defined in terms of <math|l<rsub|<text|in>>> via a domain transformation
169 defined by <hgroup|Equation <reference|eq rel-x-lin>>. The result is
170 <hgroup|Equation <reference|eq gsize-lin>> which defines
171 <math|s<around*|(|l<rsub|<text|in>>|)>> as a function of
172 <math|l<rsub|<text|in>>> and parameters
173 <math|s<around*|(|l<rsub|<text|in>,0>|)>> and <math|l<rsub|<text|in>,0>>.
174 The parameters define how quickly or slowly the quadratic equation grows.
175 In other words, if a user wishes for a <math|1<separating-space|0.2em>000<separating-space|0.2em>000>
176 line input to be split into groups each containing, on average, <math|25>
177 lines, then they should plug in <math|l<rsub|<text|in>,0>=1<separating-space|0.2em>000<separating-space|0.2em>000>
178 and <math|s<around*|(|l<rsub|<text|in>,0>|)>=25> into <hgroup|Equation
179 <reference|eq gsize-lin>> as is done in <hgroup|Equation <reference|eq
180 gsize-ex-1>>. This equation can then be used to calculate target group
181 sizes <math|s> as a function of other input line counts
182 <math|l<rsub|<text|in>>> besides <math|l<rsub|<text|in>>=1<separating-space|0.2em>000<separating-space|0.2em>000>.
183 For example, plugging <math|l<rsub|<text|in>>=500> into <hgroup|Equation
184 <reference|eq gsize-ex-1>> yields <hgroup|Equation <reference|eq
185 gsize-ex-1-lin500>> which specifies a target group size of
186 <math|5.85629\<cong\>6>.
187
188 <\eqnarray*>
189 <tformat|<table|<row|<cell|s<around*|(|l<rsub|<text|in>>|)>>|<cell|=>|<cell|<around*|(|<frac|s<around*|(|l<rsub|<text|in>,0>|)>-1|<around*|[|ln
190 <around*|(|l<rsub|<text|in>,0>|)>|]><rsup|2>>|)>\<cdot\><around*|[|<text|ln><around*|(|l<rsub|<text|in>>|)>|]><rsup|2>+1>>|<row|<cell|s<around*|(|l<rsub|<text|in>>|)>>|<cell|=>|<cell|<around*|(|<frac|25-1|<around*|[|<text|ln
191 ><around*|(|1<separating-space|0.2em>000<separating-space|0.2em>000|)>|]><rsup|2>>|)>\<cdot\><around*|[|<text|ln><around*|(|l<rsub|<text|in>>|)>|]><rsup|2>+1<eq-number><label|eq
192 gsize-ex-1>>>|<row|<cell|>|<cell|>|<cell|>>|<row|<cell|s<around*|(|l<rsub|<text|in>>=500|)>>|<cell|=>|<cell|<around*|(|<frac|25-1|<around*|[|<text|ln
193 ><around*|(|1<separating-space|0.2em>000<separating-space|0.2em>000|)>|]><rsup|2>>|)>\<cdot\><around*|[|<text|ln><around*|(|500|)>|]><rsup|2>+1<eq-number><label|eq
194 gsize-ex-1-lin500>>>|<row|<cell|s<around*|(|l<rsub|<text|in>>=500|)>>|<cell|=>|<cell|5.85629>>>>
195 </eqnarray*>
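
The worked example above can be checked numerically. The following Python sketch evaluates the target group size formula; the function name `target_group_size` and its default parameter values (taken from the example of 25 lines per group at 1 000 000 input lines) are illustrative assumptions, not part of the author's implementation.

```python
import math


def target_group_size(l_in, l_in0=1_000_000, s0=25.0):
    """Target group size s(l_in), tuned so that s(l_in0) == s0.

    Hypothetical helper; l_in0 and s0 are the two tuning parameters.
    """
    if l_in < 1 or l_in0 <= 1 or s0 < 1:
        raise ValueError("expected l_in >= 1, l_in0 > 1 and s0 >= 1")
    # Coefficient (s0 - 1) / [ln(l_in0)]^2 from the final form of the formula.
    coeff = (s0 - 1.0) / math.log(l_in0) ** 2
    return coeff * math.log(l_in) ** 2 + 1.0
```

With the default parameters, `target_group_size(500)` evaluates to approximately 5.856, in agreement with the example, and `target_group_size(1)` is exactly 1 because the logarithm term vanishes.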
196
197 \;
198
199 <subsubsection|Jump from expected value>
200
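201 A method <shell|bkshuf> may employ to decide when to read the next
202 sequential unconsumed input line is to simply do so with probability
203 <math|p<rsub|<text|seq>>>. Each group then ends after any consumed line with probability <math|p<rsub|<text|jump>>=1-p<rsub|<text|seq>>>,
204 so group sizes follow a geometric distribution with mean <math|1/p<rsub|<text|jump>>>, which is set equal to <math|s>.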
205
206 <\eqnarray*>
207 <tformat|<table|<row|<cell|p<rsub|<text|seq>>>|<cell|=>|<cell|<around*|(|1-p<rsub|<text|jump>>|)>>>|<row|<cell|p<rsub|<text|jump>>>|<cell|=>|<cell|1-p<rsub|<text|seq>>>>|<row|<cell|s>|<cell|=>|<cell|<frac|1|p<rsub|<text|jump>>>=<frac|1|1-p<rsub|<text|seq>>><eq-number>>>|<row|<cell|s>|<cell|=>|<cell|<frac|1|1-p<rsub|<text|seq>>>>>|<row|<cell|1-p<rsub|<text|seq>>>|<cell|=>|<cell|<frac|1|s>>>|<row|<cell|p<rsub|<text|seq>>-1>|<cell|=>|<cell|<frac|-1|s>>>|<row|<cell|p<rsub|<text|seq>>>|<cell|=>|<cell|1-<frac|1|s<around*|(|l<rsub|<text|in>>|)>><eq-number><label|eq
208 pseq-from-s-lin>>>|<row|<cell|p<rsub|<text|jump>>>|<cell|=>|<cell|<frac|1|s<around*|(|l<rsub|<text|in>>|)>><eq-number><label|eq
209 pjump-from-s-lin>>>|<row|<cell|>|<cell|>|<cell|>>|<row|<cell|p<rsub|<text|seq>><around*|(|l<rsub|<text|in>>|)>>|<cell|=>|<cell|1-<around*|[|<around*|(|<frac|s<around*|(|l<rsub|<text|in>,0>|)>-1|<around*|[|ln
210 <around*|(|l<rsub|<text|in>,0>|)>|]><rsup|2>>|)>\<cdot\><around*|[|<text|ln><around*|(|l<rsub|<text|in>>|)>|]><rsup|2>+1|]><rsup|-1><eq-number><label|eq
211 pseq-from-s-lin-exp>>>>>
212 </eqnarray*>
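
The relation between the target group size and the continuation probability can be illustrated with a short simulation. This Python sketch is idealized: it assumes an unbounded input in which the next sequential line is always available, so it ignores the skipping of already consumed lines and the end of the list. The function names are hypothetical.

```python
import random


def p_seq_from_group_size(s):
    """Probability of consuming the next sequential line, p_seq = 1 - 1/s."""
    if s < 1:
        raise ValueError("target group size must be at least 1")
    return 1.0 - 1.0 / s


def simulated_mean_group_size(s, groups, rng):
    """Average size of `groups` simulated groups (idealized, unbounded input)."""
    p_seq = p_seq_from_group_size(s)
    total = 0
    for _ in range(groups):
        size = 1  # The line selected by the jump is always consumed.
        while rng.random() < p_seq:
            size += 1
        total += size
    return total / groups
```

For a target group size of 25, `p_seq_from_group_size(25)` gives 0.96, and the simulated mean approaches 25 as the number of groups grows. Individual group sizes still vary widely, since the standard deviation of this geometric distribution is close to its mean.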
213
214 <subsubsection|Jump from random variate of inverse Gaussian distribution>
215
216 Another method <shell|bkshuf> may employ to decide when to read the next
217 sequential unconsumed input line is to use an inverse Gaussian
218 distribution. This may be done by sampling a float from an inverse
219 Gaussian distribution with support from 0 to infinity and mean
220 <math|\<mu\>> whenever a new random position in the input list is selected;
221 the float is rounded to the nearest integer.<\footnote>
222 See <name|Michael, John R.> "Generating Random Variates Using
223 Transformations with Multiple Roots". 1976.
224 <hlink|https://doi.org/10.2307/2683801|https://doi.org/10.2307/2683801> .
225 </footnote> Then, after consuming an input line, this integer is
226 decremented by one and another sequential line is consumed provided the
227 integer remains greater than zero. The inverse Gaussian
228 distribution requires specifying the mean <math|\<mu\>> and the shape
229 parameter <math|\<lambda\>>; a higher <math|\<lambda\>> results in a
230 narrower spread, since the variance is <math|\<mu\><rsup|3>/\<lambda\>>. An upper bound may also be specified since the distribution
231 has none except for that imposed by its programming implementation.
232
233 The result of using an inverse Gaussian distribution is an output with
234 potentially much more regular group sizes than using the previously
235 mentioned expected value method. However, the implementation of the inverse
236 Gaussian sampling operation described by (Michael, 1976) requires several
237 exponent calculations and a square root calculation in addition to various
238 multiplication and division operations. If sufficient processing power is
239 available, this may not necessarily be an issue.
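
To make the cost of the sampling step concrete, the following Python sketch draws inverse Gaussian variates using the commonly stated form of the transformation-with-multiple-roots method from the cited paper. The function names, the use of Python's `random` module, and the clamping of each group size to at least one line are assumptions made for illustration; the author's Bash implementation may differ.

```python
import math
import random


def sample_inverse_gaussian(mu, lam, rng):
    """Draw one variate from an inverse Gaussian with mean mu and shape lam."""
    nu = rng.gauss(0.0, 1.0)
    y = nu * nu
    # Smaller root of the quadratic obtained from the chi-square transformation.
    x = (mu
         + (mu * mu * y) / (2.0 * lam)
         - (mu / (2.0 * lam)) * math.sqrt(4.0 * mu * lam * y + mu * mu * y * y))
    # Choose between the two roots with the probability given by the method.
    if rng.random() <= mu / (mu + x):
        return x
    return mu * mu / x


def group_size_from_inverse_gaussian(mu, lam, rng):
    """Round a variate to an integer group size (assumption: at least 1)."""
    return max(1, round(sample_inverse_gaussian(mu, lam, rng)))
```

Each draw costs one normal variate, one uniform variate, and one square root. Because the variance of the distribution is the cube of the mean divided by the shape parameter, raising `lam` while holding `mu` fixed yields more regular group sizes.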
240
241 <subsubsection|Output structure>
242
243 Regardless of whether group sizes are determined by the expected value
244 method or using variates of an inverse Gaussian distribution, mimicking the
245 <shell|shuf> property of all input lines being present in the output,
246 albeit rearranged, results in a side effect: the first output lines are
247 more likely to contain groups with uninterrupted sequence runs (high ROP)
248 while groups in the last output lines are almost certain to contain
249 sequence jumps within a group (less ROP). The reason for this is that
250 <shell|bkshuf>, when it encounters an input line that has already been
251 consumed, will skip to the next available input line. The decision could be
252 made to jump to a new random line instead, but it is simpler to read the
253 next available input line. The author's original intention of sampling only
254 a short initial portion of the output is compatible with the behavior that
255 ROP is preserved mostly at the beginning of the output.
256
257 <section|Version History>
258
259 <\big-table|<tabular|<tformat|<table|<row|<cell|Version
260 No.>|<cell|Date>|<cell|Path>|<cell|Description>>|<row|<cell|<verbatim|0.0.1>>|<cell|2023-02-14>|<cell|<verbatim|unitproc/bkshuf>>|<cell|Initial
261 draft implemented in <name|Bash>.>>>>>>
262 A table listing versions of <shell|bkshuf>.
263 </big-table>
264
265 <\description>
266 <item*|v0.0.1>Initial implementation in <shell|bash> <verbatim|5.1.16>
267 with <shell|bc> <verbatim|1.07.1> and <name|GNU Coreutils>
268 <verbatim|8.32> and tested on <name|Pop!_OS> <verbatim|22.04 LTS>. Saved
269 to the author's <name|BK-2020-03> repository<\footnote>
270 See commit <hlink|<verbatim|080ea4c>|https://gitlab.com/baltakatei/baltakatei-exdev/-/blob/080ea4c0ff0d4e6b5ce86f664fa6645c1cb02bf0/unitproc/bkshuf>
271 at <hlink|https://gitlab.com/baltakatei/baltakatei-exdev|https://gitlab.com/baltakatei/baltakatei-exdev>
272 .
273 </footnote>.
274 </description>
275</body>
276
277<initial|<\collection>
278</collection>>
279
280<\references>
281 <\collection>
282 <associate|auto-1|<tuple|1|1|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
283 <associate|auto-10|<tuple|3.3.3|5|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
284 <associate|auto-11|<tuple|3.3.4|5|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
285 <associate|auto-12|<tuple|4|5|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
286 <associate|auto-13|<tuple|1|5|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
287 <associate|auto-14|<tuple|1|?|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
288 <associate|auto-2|<tuple|2|1|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
289 <associate|auto-3|<tuple|3|2|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
290 <associate|auto-4|<tuple|3.1|2|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
291 <associate|auto-5|<tuple|3.2|2|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
292 <associate|auto-6|<tuple|3.3|2|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
293 <associate|auto-7|<tuple|3.3.1|2|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
294 <associate|auto-8|<tuple|1|3|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
295 <associate|auto-9|<tuple|3.3.2|4|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
296 <associate|eq gsize-const-ksq|<tuple|5|3|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
297 <associate|eq gsize-ex-1|<tuple|7|4|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
298 <associate|eq gsize-ex-1-lin500|<tuple|8|4|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
299 <associate|eq gsize-lin|<tuple|6|4|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
300 <associate|eq gsize-model|<tuple|3|3|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
301 <associate|eq gsize-param-rel|<tuple|4|3|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
302 <associate|eq pjump-from-s-lin|<tuple|11|?|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
303 <associate|eq pseq-from-s-lin|<tuple|10|?|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
304 <associate|eq pseq-from-s-lin-exp|<tuple|12|?|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
305 <associate|eq rel-x-lin|<tuple|1|3|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
306 <associate|eq rel-x0-lin0|<tuple|2|3|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
307 <associate|fig ex-plot-s|<tuple|1|3|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
308 <associate|footnote-1|<tuple|1|5|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
309 <associate|footnote-2|<tuple|2|5|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
310 <associate|footnr-1|<tuple|1|5|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
311 <associate|footnr-2|<tuple|2|5|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
312 <associate|jump-to-random|<tuple|4|2|../../../../../wr/20230213..bkshuf_draft/src/doc.tm>>
313 </collection>
314</references>