Commit | Line | Data |
---|---|---|
e879cdc3 SBS |
1 | 2020-07-12T21:16Z; bktei> Note: This file is now retired since ~bklog~ |
2 | has replaced ~bkgpslog~. | |
3 | ||
872c737e SBS |
4 | * bkgpslog task list |
5 | ** DONE Add job control for short buffer length | |
6 | CLOSED: [2020-07-02 Thu 16:04] | |
7 | 2020-07-02T14:56Z; bktei> File write operations were bundled into a | |
8 | magicWriteBuffer function that is called then detached from the script | |
9 | shell (job control), but the detached job is not tracked by the main | |
10 | script. A problem may arise if two instances of magicWriteBuffer | |
11 | attempt to write to the same tar simultaneously. Two instances of | |
12 | magicWriteBuffer may exist if the buffer length is low (ex: 1 second); | |
13 | the default buffer length of 60 seconds should reduce the probability | |
14 | of a collision, but it should be possible for the main script to record | |
15 | the process ID of a magicWriteBuffer() job as soon as it detaches (via | |
16 | ~$!~ as described [[https://bashitout.com/2013/05/18/Ampersands-on-the-command-line.html][here]]) and then check that the process is still alive. | |
17 | 2020-07-02T15:23Z; bktei> I found that the Bash ~wait~ built-in can be | |
18 | used to delay processing until a specified job completes. Called | |
19 | without arguments, ~wait~ pauses script execution until all | |
20 | backgrounded processes complete. | |
21 | 2020-07-02T16:03Z; bktei> Added ~wait~. | |
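The job-control idea in this entry can be sketched as follows. This is a minimal illustration, not the actual ~bkgpslog~ code: ~write_buffer~, ~start_writer~, and ~writer_pid~ are hypothetical names, and ~write_buffer~ only stands in for magicWriteBuffer(). The sketch records ~$!~ right after backgrounding the writer and calls ~wait~ on that PID before launching the next writer, so two writers never overlap.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for magicWriteBuffer(): appends one line to a file.
write_buffer() {
    local out="$1" payload="$2"
    sleep 0.1                        # simulate slow compression/encryption
    printf '%s\n' "$payload" >> "$out"
}

# PID of the most recently started background writer (empty if none yet).
writer_pid=""

# Start write_buffer in the background, but first wait for the previous
# writer so that two writers never touch the same file at once.
start_writer() {
    if [[ -n "$writer_pid" ]]; then
        wait "$writer_pid"           # blocks until the previous job exits
    fi
    write_buffer "$@" &
    writer_pid=$!                    # PID of the job just backgrounded
}
```

With a short buffer period, each call to ~start_writer~ simply blocks until the previous write finishes. A bare ~wait~ at the end of the script covers the last writer.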
f6fb18bd SBS |
22 | ** DONE Rewrite tar initialization function |
23 | CLOSED: [2020-07-02 Thu 17:23] | |
24 | 2020-07-02T17:23Z; bktei> Simplify tar initialization function so | |
25 | VERSION file is used to test appendability of tar as well as to mark | |
26 | when a new session is started. | |
27 | ** DONE Consolidate tar checking/creation into function | |
28 | CLOSED: [2020-07-02 Thu 18:33] | |
29 | 2020-07-02T18:33Z; bktei> Simplify how the output tar file's existence | |
30 | is checked and its status as a valid tar file is validated. This was | |
31 | done using a new function ~checkMakeTar~. | |
3df184eb | 32 | ** DONE Add VERSION if output tar deleted between writes |
f75428fe | 33 | |
3df184eb SBS |
34 | CLOSED: [2020-07-02 Thu 20:22] |
35 | 2020-07-02T20:21Z; bktei> Added bkgpslog-specific function | |
36 | magicWriteVersion() to be called whenever a new time-stamped ~VERSION~ | |
37 | file needs to be generated and appended to the output tar file | |
38 | ~PATHOUT_TAR~. | |
3592a7e9 | 39 | ** DONE Rewrite buffer loop to reduce lag between gpspipe runs |
9ae33467 | 40 | |
3592a7e9 | 41 | CLOSED: [2020-07-03 Fri 20:57] |
f75428fe SBS |
42 | 2020-07-03T17:10Z; bktei> As is, there is still a 5-6 second lag |
43 | between when ~gpspipe~ times out at the end of a buffer round and when | |
44 | ~gpspipe~ is called by the subsequent buffer round. I believe this can | |
45 | be reduced by moving variable manipulations inside the | |
46 | asynchronously-executed magicWriteBuffer() function. Ideally, the | |
47 | while loop should look like: | |
48 | ||
49 | #+BEGIN_EXAMPLE | |
50 | while (( SECONDS < SCRIPT_TTL )); do | |
51 | gpspipe -r > "$DIR_TMP"/buffer.nmea | |
52 | writeBuffer & | |
53 | done | |
54 | #+END_EXAMPLE | |
3592a7e9 SBS |
55 | 2020-07-03T20:56Z; bktei> I simplified it further to something like |
56 | this: | |
57 | #+BEGIN_EXAMPLE | |
58 | while (( SECONDS < SCRIPT_TTL )); do | |
59 | writeBuffer & | |
60 | sleep $SCRIPT_TTL | |
61 | done | |
62 | #+END_EXAMPLE | |
9ae33467 | 63 | |
3592a7e9 SBS |
64 | Raspberry Pi Zero W shows approximately 71ms of drift per buffer round |
65 | with 10s buffer. | |
e879cdc3 SBS |
66 | ** DONE Feature: Recipient watch folder |
67 | CLOSED: [2020-07-12 Sun 21:08] | |
9ae33467 SBS |
68 | 2020-07-03T21:28Z; bktei> This feature would be to scan the contents |
69 | of a specified directory at the start of every buffer round in order | |
70 | to determine encryption (age) recipients. This would allow a device to | |
71 | dynamically encrypt location data in response to automated changes | |
72 | made by other tools. For example, if such a directory were | |
73 | synchronized via Syncthing and changes to such a directory were | |
74 | managed by a trusted remote server, then that server could respond to | |
75 | human requests to secure location data. | |
76 | ||
77 | Two specific privacy subfeatures come to mind: | |
78 | ||
79 | 1. Parallel encryption: Given a set of ~n~ public keys, encrypt data | |
80 | with a single ~age~ command with options causing all ~n~ pubkeys to | |
81 | be recipients. In order to decrypt the data, any individual private | |
82 | key could be used. No coordination between key owners would be | |
83 | required to decrypt. | |
84 | ||
85 | 2. Sequential encryption: Given a set of ~n~ public keys, encrypt data | |
86 | with ~n~ sequential ~age~ commands all piped in series with each | |
87 | ~age~ command utilizing only one of the ~n~ public keys. In order | |
88 | to decrypt the data, all ~n~ private keys would be required to | |
89 | decrypt the data. Since coordination is required, it is less | |
90 | convenient than parallel encryption. | |
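The two subfeatures above can be sketched in Bash as follows. The function names and the ~AGE_ARGS~ array are hypothetical. The only ~age~ behaviour relied on is that ~-r~ names a recipient, that it may be repeated, and that ~age~ reads stdin and writes stdout when no file arguments are given.

```shell
#!/usr/bin/env bash
# Build the argument list for parallel encryption: one -r per recipient.
# The result is stored in the global array AGE_ARGS (hypothetical name).
age_parallel_args() {
    local key
    AGE_ARGS=()
    for key in "$@"; do
        AGE_ARGS+=( -r "$key" )
    done
}

# Parallel: a single age call; any one private key can decrypt.
# Reads plaintext on stdin, writes ciphertext to stdout.
encrypt_parallel() {
    age_parallel_args "$@"
    age "${AGE_ARGS[@]}"
}

# Sequential: one age call per recipient, piped in series; all private
# keys are needed (applied in reverse order) to decrypt.
encrypt_sequential() {
    if (( $# == 0 )); then
        cat                          # no recipients left: pass data through
    else
        local first="$1"
        shift
        age -r "$first" | encrypt_sequential "$@"
    fi
}
```

For example, ~gpspipe -r | encrypt_parallel "$keyA" "$keyB" > chunk.age~ would produce a file that either key owner can decrypt alone.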
91 | ||
92 | In either case, a directory would be useful for holding configuration | |
93 | files specifying which of these features, or which combination of them, | |
94 | to execute at the start of every buffer round. | |
95 | ||
96 | I don't yet know how to program the rules, although I think it'd be | |
97 | easier to simply add an option providing ~bkgpslog~ with a directory | |
98 | to watch. When examining the directory, check for a file with the | |
99 | appropriate file extension (ex: .pubkey) and then read the first line | |
100 | into the script's pubKey array. | |
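The directory scan described here might look like the sketch below. It assumes the ~.pubkey~ extension mentioned above. The function name ~read_pubkeys~ and the array name ~pubKeys~ are hypothetical, and no validation of the key text is attempted.

```shell
#!/usr/bin/env bash
# Fill the global array pubKeys with the first line of every *.pubkey
# file found directly inside the directory given as $1.
read_pubkeys() {
    local dir="$1" file line
    pubKeys=()
    for file in "$dir"/*.pubkey; do
        [[ -f "$file" ]] || continue   # glob matched nothing, or not a file
        # Read only the first line; accept a final line without a newline.
        IFS= read -r line < "$file" || [[ -n "$line" ]] || continue
        pubKeys+=( "$line" )
    done
}
```

Calling ~read_pubkeys "$watchDir"~ at the start of each buffer round would refresh the recipient list; files with other extensions are ignored.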
101 | ||
e879cdc3 SBS |
102 | 2020-07-12T21:08Z; bktei> ~-R~ watch directory option added in ~bkgpslog~ ver |
103 | ~0.4.0~. | |
104 | ||
105 | ** DONE Feature: Simplify option to reduce output size | |
106 | CLOSED: [2020-07-12 Sun 21:15] | |
9ae33467 SBS |
107 | |
108 | ~gpsbabel~ [[https://www.gpsbabel.org/htmldoc-development/filter_simplify.html][features]] a ~simplify~ option to trim data points from GPS | |
109 | data. There are several methods for prioritizing which points to keep | |
110 | and which to trim, although the following seems useful given some | |
111 | sample data I've recorded in a test run of ninfacyzga-01: | |
112 | ||
113 | #+BEGIN_EXAMPLE | |
114 | gpsbabel -i nmea -f all.nmea -x simplify,error=10,relative -o gpx \ | |
115 | -F all-simp-rel-10.gpx | |
116 | #+END_EXAMPLE | |
117 | ||
118 | An error level of "10" with the "relative" option seems to retain all | |
119 | desirable features for GPS data while reducing the number of points | |
120 | along straightaways. File size is reduced by a factor of | |
121 | about 11. Noise from local stay-in-place drift isn't removed; a | |
122 | relative error of about 1000 is required to remove stay-in-place drift | |
123 | noise but this also trims all but 100m-size features of the recorded | |
124 | path. A relative error of 1000 reduces file size by a factor of | |
125 | about 450. | |
126 | ||
127 | #+BEGIN_EXAMPLE | |
128 | 67M relerror-0.001.kml | |
129 | 66M relerror-0.01.kml | |
130 | 58M relerror-0.1.kml | |
131 | 21M relerror-1.kml | |
132 | 5.8M relerror-10.kml | |
133 | 797K relerror-100.kml | |
134 | 152K relerror-1000.kml | |
135 | #+END_EXAMPLE | |
136 | ||
e879cdc3 SBS |
137 | 2020-07-12T21:13Z; bktei> Instead of programming data simplification |
138 | in ~bkgpslog~, the data simplification step should be performed via | |
139 | ~bklog~'s ~-p~ option which specifies a processing command string to | |
140 | be ~eval~'d before data is compressed, encrypted, and written to | |
141 | disk. In other words, handling the simplification of data beyond | |
142 | allowing for a general command string specified by ~-p~ is outside the | |
143 | scope of ~bkgpslog~ or its successor ~bklog~. | |
144 | ||
145 | ** DONE Feature: Generalize bkgpslog to bklog function | |
146 | CLOSED: [2020-07-12 Sun 21:11] | |
320ac29c SBS |
147 | 2020-07-05T02:42Z; bktei> Transform ~bkgpslog~ into a modular |
148 | component called ~bklog~ such that it processes a stdout stream of any | |
149 | external command, not just ~gpspipe -r~. This would permit reuse of | |
150 | the ~bkgpslog~ code for logging not just GPS data but things like | |
151 | pressure, temperature, system statistics, etc. | |
152 | 2020-07-05T16:35Z; bktei> | |
153 | : bklog -r age1asdf -o log.tar # encrypt/compress stdin to log.tar | |
154 | : bklog -x -f log.tar -i age.key -O /tmp # extract and decrypt | |
155 | ||
156 | Making ~bklog~ follow the [[https://en.wikipedia.org/wiki/Unix_philosophy][Unix philosophy]] means that it shouldn't care | |
157 | what kind of text is fed to it. | |
158 | ||
159 | *** ~bklog~ Design vs. Unix Philosophy | |
160 | **** Pubkey dir watching | |
161 | The feature of periodically checking a directory for changes in the | |
162 | pubkeys it contains should be justified by its usefulness; if the | |
163 | complexity cannot be justified then the feature should be removed. | |
164 | **** Defaults vs options | |
165 | Many options can cause the tool to become complex in unjustifiable | |
166 | ways. Currently I am adding options because I want the ability to | |
167 | modify the script's behavior without having to modify the source code | |
168 | on the machine in which the code is running. I should consider | |
169 | removing features at some point and having the program force defaults | |
170 | on the user. For example, allowing the specification of a temporary | |
171 | directory, while useful for me, is probably not useful for most people | |
172 | who don't know or care about the difference between ~/tmp~ and | |
173 | ~/dev/shm~. | |
174 | **** Script time to live (TTL) | |
175 | I initially implemented a script time-to-live feature because I was | |
176 | unsure of my ability to write a script that could run for long periods | |
177 | of time without causing runaway memory usage. I still think it's | |
178 | a good idea to offer a script TTL option to the user but I think the | |
179 | default should be to simply run forever. | |
e879cdc3 SBS |
180 | |
181 | 2020-07-12T21:11Z; bktei> ~bklog~ script created and tested as of | |
182 | commit ~aedd19f~. | |
183 | ||
184 | ** DONE TODO: Evaluate ~rsyslog~ as stand-in for this work | |
185 | CLOSED: [2020-07-12 Sun 21:09] | |
320ac29c SBS |
186 | 2020-07-05T02:57Z; bktei> I searched for "debian iot logging" ("iot" |
187 | as in "Internet of Things", the current buzzword for small low-power | |
188 | computers being used to provide microservices for owners in their own | |
189 | home) and came across several search results mentioning ~syslog~ and | |
190 | ~rsyslog~. | |
191 | ||
192 | https://www.thissmarthouse.net/consolidating-iot-logs-into-mysql-using-rsyslog/ | |
193 | https://rsyslog.readthedocs.io/en/latest/tutorials/tls.html | |
194 | https://serverfault.com/questions/20840/how-would-you-send-syslog-securely-over-the-public-internet | |
195 | https://www.rsyslog.com/ | |
196 | ||
197 | My impression is that ~rsyslog~ is a complex software package designed | |
198 | to offer many features, some of which might satisfy my | |
199 | needs. | |
200 | ||
201 | However, as stated in the repository README, the objective of the | |
202 | ~ninfacyzga-01~ project is "Observing facts of the new". This means | |
203 | that the goal is not only to record location data but any data that | |
204 | can be captured by a sensor. This means the capture of the following | |
205 | environmental phenomena is within the scope of this device: | |
206 | ||
207 | *** Sounds (microphone) | |
208 | *** Light (camera) | |
209 | *** Temperature (thermocouple) | |
210 | *** Air Pressure (barometer) | |
211 | *** Acceleration Vector (accelerometer / gyroscope) | |
212 | *** Magnetic Field Vector (magnetometer) | |
213 | ||
214 | This brings up the issue of respecting privacy of others in shared | |
215 | spaces through which ~ninfacyzga-01~ may pass. ~ninfacyzga-01~ | |
216 | should encrypt data it records according to rules set by its | |
217 | owner. | |
218 | ||
219 | One permissive rule could be that if ~ninfacyzga-01~ detects that a | |
220 | person (let's call her Alice) enters a room, it should add Alice's | |
221 | encryption public key to the list of recipients against which it | |
222 | encrypts data without Alice having to know how ~ninfacyzga-01~ is | |
223 | programmed (she might have a ~calkuptcana~ agent on her person that | |
224 | broadcasts her privacy preferences). Meanwhile, ~ninfacyzga-01~ may | |
225 | publish its observations to a repository that Alice and other members | |
226 | of the shared communal space have access to (ex: a read-only shared | |
227 | directory on a local network WiFi). Alice could download all the files | |
228 | in the shared repository but she would only be able to decrypt files | |
229 | generated when she was physically near enough to ~ninfacyzga-01~ for | |
230 | it to detect that her presence was within some spatial boundary. | |
231 | ||
232 | A more restrictive rule could resemble the permissive rule in that | |
233 | ~ninfacyzga-01~ uses Alice's encryption public key only when she is | |
234 | physically near by, except that it encrypts logged files against | |
235 | public keys in a sequential manner. This would mean that all people | |
236 | who were near ~ninfacyzga-01~ would have to pass around each log file | |
237 | to each other so that they could decrypt the content. | |
238 | ||
239 | That said, according to [[https://www.rsyslog.com/doc/master/tutorials/database.html][this ~rsyslog~ page]], ~rsyslog~ is more a data | |
240 | wrangling system for collecting data from disparate sources of | |
241 | different types and outputting data to text files on disk than a | |
242 | system committed to the server-client model of database storage. So, I | |
243 | think converting ~bkgpslog~ into a ~bklog~ script that appends | |
244 | encrypted and compressed data to a tar file for later extraction | |
245 | (possibly the same script with future features) would be best. | |
246 | ||
e879cdc3 SBS |
247 | 2020-07-12T21:10Z; bktei> rsyslog is outside the scope of what |
248 | ~bkgpslog~ does (record location observations). A different tool | |
249 | should be used to retrieve and synchronize data. The dumb storage | |
250 | method of "tar files in a syncthing folder" works for now. | |
1a1738c4 SBS |
251 | ** TODO: Place persistent recip. updates in asynchronous coproc |
252 | 2020-07-06T19:37Z; bktei> In order to update the recipient list, the | |
253 | magicParseRecipientDir() function needs to be run each buffer period | |
254 | in order to scan for changes in the recipient list. However, such a | |
255 | scan takes time; if the magicGatherWriteBuffer() function must pause | |
256 | until magicParseRecipientDir() completes, then a significant pause | |
257 | between buffer sessions may occur, causing detectable gaps in location | |
258 | data between buffer rounds. | |
259 | ||
260 | I looked for ways in which I might start magicParseRecipientDir() | |
261 | asynchronously immediately before running the data collection command | |
262 | and then collect its output at the start of the next buffer round. One | |
263 | way using the ~coproc~ Bash built-in is described [[https://stackoverflow.com/a/20018504/10850071][here]]. I'd have to | |
264 | make the asynchronous function output the recipient list to stdout | |
265 | which would then be ~read~ into the ~recPubKeysValid~ array in the | |
266 | main loop. However, I'm putting magicParseRecipientDir() | |
267 | as-is in the main loop and accepting the delay for now. | |
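One way to get the asynchronous behaviour described in this entry is sketched below. Note that it swaps in a simpler technique than ~coproc~: the scan runs as an ordinary background job writing to a temporary file, and the main loop collects the result with ~wait~ and ~mapfile~ at the start of the next round. All function names are hypothetical, ~parse_recipient_dir~ is only a stand-in for magicParseRecipientDir(), and the ~.pubkey~ extension is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for magicParseRecipientDir(): prints the first
# line of each *.pubkey file in directory $1, one key per line.
parse_recipient_dir() {
    local file line
    for file in "$1"/*.pubkey; do
        [[ -f "$file" ]] || continue
        if IFS= read -r line < "$file"; then
            printf '%s\n' "$line"
        fi
    done
}

scan_pid=""   # PID of the running background scan
scan_out=""   # temporary file receiving the scan output

# Start the scan in the background and return immediately.
start_scan() {
    scan_out="$(mktemp)"
    parse_recipient_dir "$1" > "$scan_out" &
    scan_pid=$!
}

# Wait for the scan (normally already finished) and load its output
# into the recPubKeysValid array, one element per line.
collect_scan() {
    wait "$scan_pid"
    mapfile -t recPubKeysValid < "$scan_out"
    rm -f "$scan_out"
}
```

In the main loop, ~start_scan~ would be called just before the data collection command and ~collect_scan~ at the top of the following round, so the scan overlaps with the buffer period instead of delaying it.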
6c30388f SBS |
268 | * bkgpslog narrative |
269 | ** Initialize environment | |
270 | *** Init variables | |
271 | **** Save timeStart (YYYYmmddTHHMMSS±zz) | |
272 | *** Define Functions | |
273 | **** Define Debugging functions | |
274 | **** Define Argument Processing function | |
275 | **** Define Main function | |
276 | ** Run Main Function | |
277 | *** Process Arguments | |
278 | *** Set output encryption and compression option strings | |
279 | *** Check that critical apps and dirs are available, display missing ones. | |
280 | *** Set lifespans of script and buffer | |
281 | *** Init temp working dir ~DIR_TMP~ | |
282 | Make temporary dir in tmpfs dir: ~/dev/shm/$(nonce)..bkgpslog/~ (~DIR_TMP~) | |
283 | *** Initialize ~tar~ archive | |
284 | **** Write ~bkgpslog~ version to ~$DIR_TMP/VERSION~ | |
285 | **** Create empty ~tar~ archive in ~DIR_OUT~ at ~PATHOUT_TAR~ | |
286 | ||
287 | Set output file name to: | |
288 | : PATHOUT_TAR="$DIR_OUT/YYYYmmdd..hostname_location.gz.age.tar" | |
289 | Usage: ~iso8601Period $timeStart $timeEnd~ | |
290 | ||
291 | **** Append ~VERSION~ file to ~PATHOUT_TAR~ | |
292 | ||
293 | Append ~$DIR_TMP/VERSION~ to ~PATHOUT_TAR~ via ~tar --append~ | |
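The archive initialization steps above could be sketched as below, assuming GNU ~tar~ (the empty-archive trick via ~--files-from=/dev/null~ may behave differently with other implementations). The function name ~init_tar~ and its argument order are hypothetical.

```shell
#!/usr/bin/env bash
# init_tar TAR_PATH TMP_DIR VERSION_STRING
# Writes VERSION_STRING to TMP_DIR/VERSION, creates an empty archive at
# TAR_PATH if none exists yet, then appends the VERSION file to it.
init_tar() {
    local tar_path="$1" tmp_dir="$2" version="$3"
    printf '%s\n' "$version" > "$tmp_dir/VERSION"
    if [[ ! -e "$tar_path" ]]; then
        # An empty file list yields a valid, empty archive (GNU tar).
        tar --create --file="$tar_path" --files-from=/dev/null
    fi
    # --directory keeps the temp dir path out of the stored member name.
    tar --append --directory="$tmp_dir" --file="$tar_path" VERSION
}
```

Because the append step fails on a file that is not a valid archive, its exit status also doubles as the appendability test mentioned in the task list.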
294 | ||
295 | *** Read/Write Loop (Record gps data until script lifespan ends) | |
296 | **** Determine output file paths | |
297 | **** Define GPS conversion commands | |
298 | **** Fill Bash variable buffer from ~gpspipe~ | |
299 | **** Process bufferBash, save secured chunk set to ~DIR_TMP~ | |
300 | **** Append each secured chunk to ~PATHOUT_TAR~ | |
301 | : tar --append --directory=DIR_TMP --file=PATHOUT_TAR $(basename PATHOUT_{NMEA,GPX,KML} ) | |
302 | **** Remove secured chunk from ~DIR_TMP~ |