Commit | Line | Data |
---|---|---|
872c737e SBS |
1 | * bkgpslog task list |
2 | ** DONE Add job control for short buffer length | |
3 | CLOSED: [2020-07-02 Thu 16:04] | |
4 | 2020-07-02T14:56Z; bktei> File write operations were bundled into a | |
5 | magicWriteBuffer function that is called then detached from the script | |
6 | shell (job control), but the detached job is not tracked by the main | |
7 | script. A problem may arise if two instances of magicWriteBuffer | |
8 | attempt to write to the same tar simultaneously. Two instances of | |
9 | magicWriteBuffer may exist if the buffer length is low (ex: 1 second); | |
10 | the default buffer length of 60 seconds should reduce the probability | |
11 | of a collision but it should be possible for the main script to track | |
12 | the process ID of a magicWriteBuffer() job as soon as it detaches | |
13 | (via ~$!~ as described [[https://bashitout.com/2013/05/18/Ampersands-on-the-command-line.html][here]]) and then check that the process is still alive. | |
14 | 2020-07-02T15:23Z; bktei> I found that the Bash ~wait~ built-in can be | |
15 | used to delay processing until a specified job completes. Called | |
16 | without arguments, ~wait~ pauses script execution until all | |
17 | backgrounded processes complete. | |
18 | 2020-07-02T16:03Z; bktei> Added ~wait~. | |
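The pattern discussed in this entry can be sketched as follows. This is a minimal illustration, not the code in ~bkgpslog~: ~writeBufferDemo~ is a hypothetical stand-in for magicWriteBuffer(), and the variable names are assumptions. It records the PID of the detached job via ~$!~, probes it with ~kill -0~, and then blocks on that one job with ~wait~.

```shell
#!/usr/bin/env bash
# Sketch: track a detached job by PID and block until it finishes.
# writeBufferDemo is a hypothetical stand-in for magicWriteBuffer().

writeBufferDemo() {
    # Simulate a slow write, then record completion in the file given as $1.
    sleep 1
    echo "done" >> "$1"
}

outFile="$(mktemp)"

writeBufferDemo "$outFile" &   # detach the job
pidWrite=$!                    # $! holds the PID of the most recent background job

# kill -0 sends no signal; it only tests whether the process still exists.
if kill -0 "$pidWrite" 2>/dev/null; then
    stillRunning="yes"
else
    stillRunning="no"
fi
echo "job $pidWrite still running: $stillRunning"

wait "$pidWrite"               # block until that specific job completes
waitStatus=$?                  # exit status of the awaited job
echo "job exit status: $waitStatus"
```

Waiting on a specific PID before launching the next write is one way to keep two writers from touching the same tar at once, at the cost of stalling the main loop when a write runs long.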
f6fb18bd SBS |
19 | ** DONE Rewrite tar initialization function |
20 | CLOSED: [2020-07-02 Thu 17:23] | |
21 | 2020-07-02T17:23Z; bktei> Simplify tar initialization function so | |
22 | VERSION file is used to test appendability of tar as well as to mark | |
23 | when a new session is started. | |
24 | ** DONE Consolidate tar checking/creation into function | |
25 | CLOSED: [2020-07-02 Thu 18:33] | |
26 | 2020-07-02T18:33Z; bktei> Simplify how the output tar file's existence | |
27 | is checked and its status as a valid tar file is validated. This was | |
28 | done using a new function ~checkMakeTar~. | |
3df184eb | 29 | ** DONE Add VERSION if output tar deleted between writes |
f75428fe | 30 | |
3df184eb SBS |
31 | CLOSED: [2020-07-02 Thu 20:22] |
32 | 2020-07-02T20:21Z; bktei> Added bkgpslog-specified function | |
33 | magicWriteVersion() to be called whenever a new time-stamped ~VERSION~ | |
34 | file needs to be generated and appended to the output tar file | |
35 | ~PATHOUT_TAR~. | |
3592a7e9 | 36 | ** DONE Rewrite buffer loop to reduce lag between gpspipe runs |
9ae33467 | 37 | |
3592a7e9 | 38 | CLOSED: [2020-07-03 Fri 20:57] |
f75428fe SBS |
39 | 2020-07-03T17:10Z; bktei> As is, there is still a 5-6 second lag |
40 | between when ~gpspipe~ times out at the end of a buffer round and when | |
41 | ~gpspipe~ is called by the subsequent buffer round. I believe this can | |
42 | be reduced by moving variable manipulations inside the | |
43 | asynchronously-executed magicWriteBuffer() function. Ideally, the | |
44 | while loop should look like: | |
45 | ||
46 | #+BEGIN_EXAMPLE | |
47 | while (( SECONDS < SCRIPT_TTL )); do | |
48 |   gpspipe -r > "$DIR_TMP"/buffer.nmea | |
49 |   writeBuffer & | |
50 | done | |
51 | #+END_EXAMPLE | |
3592a7e9 SBS |
52 | 2020-07-03T20:56Z; bktei> I simplified it further to something like |
53 | this: | |
54 | #+BEGIN_EXAMPLE | |
55 | while (( SECONDS < SCRIPT_TTL )); do | |
56 |   writeBuffer & | |
57 |   sleep "$BUFFER_TTL"  # one buffer period; variable name assumed | |
58 | done | |
59 | #+END_EXAMPLE | |
9ae33467 | 60 | |
3592a7e9 SBS |
61 | Raspberry Pi Zero W shows approximately 71ms of drift per buffer round |
62 | with 10s buffer. | |
9ae33467 SBS |
63 | ** TODO Feature: Recipient watch folder |
64 | 2020-07-03T21:28Z; bktei> This feature would be to scan the contents | |
65 | of a specified directory at the start of every buffer round in order | |
66 | to determine encryption (age) recipients. This would allow a device to | |
67 | dynamically encrypt location data in response to automated changes | |
68 | made by other tools. For example, if such a directory were | |
69 | synchronized via Syncthing and changes to such a directory were | |
70 | managed by a trusted remote server, then that server could respond to | |
71 | human requests to secure location data. | |
72 | ||
73 | Two specific privacy subfeatures come to mind: | |
74 | ||
75 | 1. Parallel encryption: Given a set of ~n~ public keys, encrypt data | |
76 | with a single ~age~ command with options causing all ~n~ pubkeys to | |
77 | be recipients. In order to decrypt the data, any individual private | |
78 | key could be used. No coordination between key owners would be | |
79 | required to decrypt. | |
80 | ||
81 | 2. Sequential encryption: Given a set of ~n~ public keys, encrypt data | |
82 | with ~n~ sequential ~age~ commands all piped in series with each | |
83 | ~age~ command utilizing only one of the ~n~ public keys. Decrypting | |
84 | the data would require all ~n~ private keys. Since coordination | |
85 | among key owners is required, it is less convenient than parallel | |
86 | encryption. | |
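The two schemes above can be sketched as command lines. This is only an illustration under stated assumptions: the recipient strings are placeholders rather than real ~age~ public keys, and the script only builds and prints the commands instead of running ~age~. The ~-r~ (recipient) option is the one ~age~ documents for public-key recipients.

```shell
#!/usr/bin/env bash
# Sketch: build age command lines for the two schemes above.
# The recipient strings are placeholders, not real age public keys.

recipients=("age1exampleA" "age1exampleB" "age1exampleC")

# Parallel: a single age call with one -r option per recipient.
parallelArgs=()
for key in "${recipients[@]}"; do
    parallelArgs+=(-r "$key")
done
parallelCmd="age ${parallelArgs[*]}"

# Sequential: one age call per recipient, chained with pipes.
sequentialCmd=""
for key in "${recipients[@]}"; do
    sequentialCmd+="${sequentialCmd:+ | }age -r $key"
done

echo "$parallelCmd"
echo "$sequentialCmd"
```

With the sequential form, decryption would have to peel the layers in reverse order, the last recipient's private key first.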
87 | ||
88 | In either case, a directory would be useful for holding configuration | |
89 | files specifying which feature, or combination of features, to run | |
90 | at the start of every buffer round. | |
91 | ||
92 | I don't yet know how to program the rules, although I think it'd be | |
93 | easier to simply add an option providing ~bkgpslog~ with a directory | |
94 | to watch. When examining the directory, check for a file with the | |
95 | appropriate file extension (ex: .pubkey) and then read the first line | |
96 | into the script's pubKey array. | |
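A minimal sketch of the directory scan described above, assuming a hypothetical function name (~readPubKeysFromDir~) and the ~.pubkey~ extension used as an example in the text. It reads only the first line of each matching file into the ~pubKey~ array; the demo keys are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: read the first line of every *.pubkey file in a watch directory
# into an array. The function name is hypothetical.

readPubKeysFromDir() {
    local dirWatch="$1" file line
    pubKey=()                              # global array, reset on every scan
    for file in "$dirWatch"/*.pubkey; do
        [[ -f "$file" ]] || continue       # skip unmatched glob and non-files
        # Accept a first line even without a trailing newline; skip empty files.
        IFS= read -r line < "$file" || [[ -n "$line" ]] || continue
        pubKey+=("$line")
    done
}

# Demo with a throwaway directory and placeholder key strings.
dirDemo="$(mktemp -d)"
printf 'age1exampleA\nsecond line ignored\n' > "$dirDemo/alice.pubkey"
printf 'age1exampleB\n' > "$dirDemo/bob.pubkey"
printf 'not a key\n' > "$dirDemo/notes.txt"

readPubKeysFromDir "$dirDemo"
printf '%s\n' "${pubKey[@]}"
```

A real implementation would probably also validate each line as a well-formed public key before trusting it.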
97 | ||
98 | ** TODO Feature: Simplify option to reduce output size | |
99 | ||
100 | ~gpsbabel~ [[https://www.gpsbabel.org/htmldoc-development/filter_simplify.html][features]] a ~simplify~ option to trim data points from GPS | |
101 | data. There are several methods for prioritizing which points to keep | |
102 | and which to trim, although the following seems useful given some | |
103 | sample data I've recorded in a test run of ninfacyzga-01: | |
104 | ||
105 | #+BEGIN_EXAMPLE | |
106 | gpsbabel -i nmea -f all.nmea -x simplify,error=10,relative -o gpx \ | |
107 | -F all-simp-rel-10.gpx | |
108 | #+END_EXAMPLE | |
109 | ||
110 | An error level of "10" with the "relative" option seems to retain all | |
111 | desirable features for GPS data while reducing the number of points | |
112 | along straightaways. File size is reduced by a factor of | |
113 | about 11. Noise from local stay-in-place drift isn't removed; a | |
114 | relative error of about 1000 is required to remove stay-in-place drift | |
115 | noise but this also trims all but 100m-size features of the recorded | |
116 | path. A relative error of 1000 reduces file size by a factor of | |
117 | about 450. | |
118 | ||
119 | #+BEGIN_EXAMPLE | |
120 | 67M relerror-0.001.kml | |
121 | 66M relerror-0.01.kml | |
122 | 58M relerror-0.1.kml | |
123 | 21M relerror-1.kml | |
124 | 5.8M relerror-10.kml | |
125 | 797K relerror-100.kml | |
126 | 152K relerror-1000.kml | |
127 | #+END_EXAMPLE | |
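The quoted reduction factors can be checked against the listing with a little arithmetic. This sketch assumes the 67M file (relative error 0.001) is close to the unsimplified size, which the notes do not state outright, and that ~M~ and ~K~ are 1024-based units.

```shell
#!/usr/bin/env bash
# Sketch: size-reduction factors relative to the 67M file (relerror-0.001),
# which is assumed here to approximate the unsimplified size.
factor10="$(awk 'BEGIN { printf "%.1f", 67 / 5.8 }')"            # 67M vs 5.8M
factor1000="$(awk 'BEGIN { printf "%.0f", 67 * 1024 / 152 }')"   # 67M vs 152K
echo "relative error 10:   about ${factor10}x smaller"
echo "relative error 1000: about ${factor1000}x smaller"
```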
128 | ||
320ac29c SBS |
129 | ** TODO Feature: Generalize bkgpslog to bklog function |
130 | 2020-07-05T02:42Z; bktei> Transform ~bkgpslog~ into a modular | |
131 | component called ~bklog~ such that it processes a stdout stream of any | |
132 | external command, not just ~gpspipe -r~. This would permit reuse of | |
133 | the ~bkgpslog~ code for logging not just GPS data but things like | |
134 | pressure, temperature, system statistics, etc. | |
135 | 2020-07-05T16:35Z; bktei> | |
136 | : bklog -r age1asdf -o log.tar # encrypt/compress stdin to log.tar | |
137 | : bklog -x -f log.tar -i age.key -O /tmp # extract and decrypt | |
138 | ||
139 | Making ~bklog~ follow the [[https://en.wikipedia.org/wiki/Unix_philosophy][Unix philosophy]] means that it shouldn't care | |
140 | what kind of text is fed to it. | |
141 | ||
142 | *** ~bklog~ Design vs. Unix Philosophy | |
143 | **** Pubkey dir watching | |
144 | The feature of periodically checking a directory for changes in the | |
145 | pubkeys it contains should be justified by its usefulness; if the | |
146 | complexity cannot be justified then the feature should be removed. | |
147 | **** Defaults vs options | |
148 | Many options can cause the tool to become complex in unjustifiable | |
149 | ways. Currently I am adding options because I want the ability to | |
150 | modify the script's behavior without having to modify the source code | |
151 | on the machine in which the code is running. I should consider | |
152 | removing features at some point and having the program force defaults | |
153 | on the user. For example, allowing the specification of a temporary | |
154 | directory, while useful for me, is probably not useful for most people | |
155 | who don't know or care about the difference between ~/tmp~ and | |
156 | ~/dev/shm~. | |
157 | **** Script time to live (TTL) | |
158 | I initially implemented a script time-to-live feature because I was | |
159 | unsure of my ability to program a script that could run for long | |
160 | periods of time without causing runaway memory usage. I still think | |
161 | it's a good idea to offer a script TTL option to the user but I think | |
162 | the default should be to simply run forever. | |
163 | ** TODO: Evaluate ~rsyslog~ as stand-in for this work | |
164 | 2020-07-05T02:57Z; bktei> I searched for "debian iot logging" ("iot" | |
165 | as in "Internet of Things", the current buzzword for small low-power | |
166 | computers being used to provide microservices for owners in their own | |
167 | home) and came across several search results mentioning ~syslog~ and | |
168 | ~rsyslog~. | |
169 | ||
170 | https://www.thissmarthouse.net/consolidating-iot-logs-into-mysql-using-rsyslog/ | |
171 | https://rsyslog.readthedocs.io/en/latest/tutorials/tls.html | |
172 | https://serverfault.com/questions/20840/how-would-you-send-syslog-securely-over-the-public-internet | |
173 | https://www.rsyslog.com/ | |
174 | ||
175 | My impression is that ~rsyslog~ is a complex software package designed | |
176 | to offer many features, some of which might satisfy my | |
177 | needs. | |
178 | ||
179 | However, as stated in the repository README, the objective of the | |
180 | ~ninfacyzga-01~ project is "Observing facts of the new". This means | |
181 | that the goal is not only to record location data but any data that | |
182 | can be captured by a sensor. This means the capture of the following | |
183 | environmental phenomena is within the scope of this device: | |
184 | ||
185 | *** Sounds (microphone) | |
186 | *** Light (camera) | |
187 | *** Temperature (thermocouple) | |
188 | *** Air Pressure (barometer) | |
189 | *** Acceleration Vector (accelerometer / gyroscope) | |
190 | *** Magnetic Field Vector (magnetometer) | |
191 | ||
192 | This brings up the issue of respecting the privacy of others in shared | |
193 | spaces through which ~ninfacyzga-01~ may pass. ~ninfacyzga-01~ | |
194 | should encrypt data it records according to rules set by its | |
195 | owner. | |
196 | ||
197 | One permissive rule could be that if ~ninfacyzga-01~ detects that a | |
198 | person (let's call her Alice) enters a room, it should add Alice's | |
199 | encryption public key to the list of recipients against which it | |
200 | encrypts data without Alice having to know how ~ninfacyzga-01~ is | |
201 | programmed (she might have a ~calkuptcana~ agent on her person that | |
202 | broadcasts her privacy preferences). Meanwhile, ~ninfacyzga-01~ may | |
203 | publish its observations to a repository that Alice and other members | |
204 | of the shared communal space have access to (ex: a read-only shared | |
205 | directory on a local network WiFi). Alice could download all the files | |
206 | in the shared repository but she would only be able to decrypt files | |
207 | generated when she was physically near enough to ~ninfacyzga-01~ for | |
208 | it to detect that her presence was within some spatial boundary. | |
209 | ||
210 | A more restrictive rule could resemble the permissive rule in that | |
211 | ~ninfacyzga-01~ uses Alice's encryption public key only when she is | |
212 | physically nearby, except that it encrypts logged files against | |
213 | public keys in a sequential manner. This would mean that all people | |
214 | who were near ~ninfacyzga-01~ would have to pass each log file around | |
215 | to one another so that they could decrypt the content. | |
216 | ||
217 | That said, according to [[https://www.rsyslog.com/doc/master/tutorials/database.html][this ~rsyslog~ page]], ~rsyslog~ is more a data | |
218 | wrangling system for collecting data from disparate sources of | |
219 | different types and outputting data to text files on disk than a | |
220 | system committed to the server-client model of database storage. So, I | |
221 | think converting ~bkgpslog~ into a ~bklog~ script that appends | |
222 | encrypted and compressed data to a tar file for later extraction | |
223 | (possibly the same script with future features) would be best. | |
224 | ||
1a1738c4 SBS |
225 | ** TODO: Place persistent recip. updates in asynchronous coproc |
226 | 2020-07-06T19:37Z; bktei> In order to update the recipient list, the | |
227 | magicParseRecipientDir() function needs to be run each buffer period | |
228 | in order to scan for changes in the recipient list. However, such a | |
229 | scan takes time; if the magicGatherWriteBuffer() function must pause | |
230 | until magicParseRecipientDir() completes, then a significant pause | |
231 | between buffer sessions may occur, causing detectable gaps in location | |
232 | data between buffer rounds. | |
233 | ||
234 | I looked for ways in which I might start magicParseRecipientDir() | |
235 | asynchronously immediately before running the data collection command | |
236 | and then collect its output at the start of the next buffer round. One | |
237 | way using the ~coproc~ Bash built-in is described [[https://stackoverflow.com/a/20018504/10850071][here]]. I'd have to | |
238 | make the asynchronous function output the recipient list to stdout | |
239 | which would then be ~read~ into the ~recPubKeysValid~ array in the | |
240 | main loop. However, I'm putting the magicParseRecipientDir() call | |
241 | as-is in the main loop and accepting the delay for now. | |
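One possible shape for the asynchronous scan, sketched with a swapped-in technique: a background job writing to a temp file, collected with ~wait~ and ~mapfile~, rather than ~coproc~. ~scanRecipientsDemo~ is a hypothetical stand-in for magicParseRecipientDir(), the keys are placeholders, and the ~sleep~ stands in for the data collection command.

```shell
#!/usr/bin/env bash
# Sketch: run the recipient scan in the background during one buffer round
# and collect its result at the start of the next. Uses a temp file and wait
# instead of coproc. scanRecipientsDemo is a hypothetical stand-in for
# magicParseRecipientDir().

scanRecipientsDemo() {
    # A real scan would take noticeable time here.
    printf '%s\n' "age1exampleA" "age1exampleB"
}

fileScan="$(mktemp)"
recPubKeysValid=()
pidScan=""

for round in 1 2; do
    if [[ -n "$pidScan" ]]; then
        wait "$pidScan"                       # normally already finished
        mapfile -t recPubKeysValid < "$fileScan"
    fi
    scanRecipientsDemo > "$fileScan" &        # start the scan for the next round
    pidScan=$!
    sleep 1                                   # stand-in for data collection
    echo "round $round used ${#recPubKeysValid[@]} recipient(s)"
done

wait "$pidScan"                               # reap the final scan
rm -f "$fileScan"
```

The trade-off is that each round encrypts against the recipient list found during the previous round, so a change in the directory takes one extra buffer period to take effect.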
6c30388f SBS |
242 | * bkgpslog narrative |
243 | ** Initialize environment | |
244 | *** Init variables | |
245 | **** Save timeStart (YYYYmmddTHHMMSS±zz) | |
246 | *** Define Functions | |
247 | **** Define Debugging functions | |
248 | **** Define Argument Processing function | |
249 | **** Define Main function | |
250 | ** Run Main Function | |
251 | *** Process Arguments | |
252 | *** Set output encryption and compression option strings | |
253 | *** Check that critical apps and dirs are available, display missing ones. | |
254 | *** Set lifespans of script and buffer | |
255 | *** Init temp working dir ~DIR_TMP~ | |
256 | Make temporary dir in tmpfs dir: ~/dev/shm/$(nonce)..bkgpslog/~ (~DIR_TMP~) | |
257 | *** Initialize ~tar~ archive | |
258 | **** Write ~bkgpslog~ version to ~$DIR_TMP/VERSION~ | |
259 | **** Create empty ~tar~ archive in ~DIR_OUT~ at ~PATHOUT_TAR~ | |
260 | ||
261 | Set output file name to: | |
262 | : PATHOUT_TAR="$DIR_OUT/YYYYmmdd..hostname_location.gz.age.tar" | |
263 | Usage: ~iso8601Period $timeStart $timeEnd~ | |
264 | ||
265 | **** Append ~VERSION~ file to ~PATHOUT_TAR~ | |
266 | ||
267 | Append ~$DIR_TMP/VERSION~ to ~PATHOUT_TAR~ via ~tar --append~ | |
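The archive initialization steps above might look like the following sketch. The paths and the version string are placeholders, the output name does not follow the ~PATHOUT_TAR~ naming scheme, and creating an empty archive from an empty file list is assumed to behave as it does in GNU ~tar~.

```shell
#!/usr/bin/env bash
# Sketch: create an empty tar archive, then append a VERSION file to it.
# Paths and the version string are placeholders.

DIR_TMP="$(mktemp -d)"
DIR_OUT="$(mktemp -d)"
PATHOUT_TAR="$DIR_OUT/demo.tar"

echo "bkgpslog demo-version" > "$DIR_TMP/VERSION"

# An empty file list yields an empty but valid archive (GNU tar).
tar --create --file="$PATHOUT_TAR" --files-from=/dev/null

# --directory makes tar store the member as "VERSION" without its tmp path.
tar --append --directory="$DIR_TMP" --file="$PATHOUT_TAR" VERSION

tarMembers="$(tar --list --file="$PATHOUT_TAR")"
echo "$tarMembers"
```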
268 | ||
269 | *** Read/Write Loop (Record gps data until script lifespan ends) | |
270 | **** Determine output file paths | |
271 | **** Define GPS conversion commands | |
272 | **** Fill Bash variable buffer from ~gpspipe~ | |
273 | **** Process bufferBash, save secured chunk set to ~DIR_TMP~ | |
274 | **** Append each secured chunk to ~PATHOUT_TAR~ | |
275 | : tar --append --directory="$DIR_TMP" --file="$PATHOUT_TAR" $(basename -a "$PATHOUT_NMEA" "$PATHOUT_GPX" "$PATHOUT_KML") | |
276 | **** Remove secured chunk from ~DIR_TMP~ |