| 1 | * bkgpslog task list |
| 2 | ** DONE Add job control for short buffer length |
| 3 | CLOSED: [2020-07-02 Thu 16:04] |
| 4 | 2020-07-02T14:56Z; bktei> File write operations were bundled into a |
| 5 | magicWriteBuffer function that is called then detached from the script |
| 6 | shell (job control), but the detached job is not tracked by the main |
| 7 | script. A problem may arise if two instances of magicWriteBuffer |
| 8 | attempt to write to the same tar simultaneously. Two instances of |
| 9 | magicWriteBuffer may exist if the buffer length is low (ex: 1 second). |
| 10 | The default buffer length of 60 seconds should reduce the probability |
| 11 | of a collision, but it should be possible for the main script to record |
| 12 | the process ID of a magicWriteBuffer() as soon as it detaches (via ~$!~ |
| 13 | as described [[https://bashitout.com/2013/05/18/Ampersands-on-the-command-line.html][here]]) and then check that the process is still alive. |
| 14 | 2020-07-02T15:23Z; bktei> I found that the Bash ~wait~ built-in can be |
| 15 | used to delay processing until a specified job completes. With no |
| 16 | arguments, ~wait~ pauses script execution until all backgrounded |
| 17 | processes complete. |
| 18 | 2020-07-02T16:03Z; bktei> Added ~wait~. |
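A minimal sketch of the idea, not the actual ~bkgpslog~ code: the main loop records the PID of each background writer from ~$!~ and waits on it before starting the next one, so two writers never overlap. ~writeBufferDemo~ and ~pathLog~ are hypothetical stand-ins for magicWriteBuffer() and the output tar.

```shell
#!/usr/bin/env bash
# Sketch: never let two background writers overlap.
# writeBufferDemo is a hypothetical stand-in for magicWriteBuffer().

pathLog="$(mktemp)"          # stand-in for the shared output file
pidWriter=""                 # PID of the most recent background writer

writeBufferDemo() {
    # Simulate a slow write so that overlap would be possible.
    sleep 0.2
    printf 'round %s\n' "$1" >> "$pathLog"
}

for round in 1 2 3; do
    # If a previous writer exists, block until it has finished.
    if [ -n "$pidWriter" ]; then
        wait "$pidWriter"
    fi
    writeBufferDemo "$round" &
    pidWriter=$!             # $! holds the PID of the last background job
done
wait                         # with no arguments: wait for all background jobs
```

If blocking is not wanted, ~kill -0 "$pidWriter" 2>/dev/null~ tests whether the process is still alive without waiting for it.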
| 19 | ** DONE Rewrite tar initialization function |
| 20 | CLOSED: [2020-07-02 Thu 17:23] |
| 21 | 2020-07-02T17:23Z; bktei> Simplify tar initialization function so |
| 22 | VERSION file is used to test appendability of tar as well as to mark |
| 23 | when a new session is started. |
| 24 | ** DONE Consolidate tar checking/creation into function |
| 25 | CLOSED: [2020-07-02 Thu 18:33] |
| 26 | 2020-07-02T18:33Z; bktei> Simplify how the output tar file's existence |
| 27 | is checked and its status as a valid tar file is validated. This was |
| 28 | done using a new function ~checkMakeTar~. |
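The check-or-create logic could look roughly like the following. This is a sketch under assumptions, not the source of ~checkMakeTar~: the name ~checkMakeTarSketch~, its return codes, and the choice to overwrite an unreadable file are all illustrative, and GNU ~tar~ is assumed.

```shell
#!/usr/bin/env bash
# Sketch of a check-or-create helper; not the actual checkMakeTar source.
# Assumes GNU tar.

checkMakeTarSketch() {
    # Usage: checkMakeTarSketch PATH_TAR
    # Returns 0 if a valid tar already existed, 1 if one was (re)created.
    local pathTar="$1"
    if [ -f "$pathTar" ] && tar --list --file="$pathTar" > /dev/null 2>&1; then
        return 0
    fi
    # Missing or unreadable: start over with an empty archive.
    # Note that this overwrites a file that tar cannot list.
    tar --create --file="$pathTar" --files-from=/dev/null
    return 1
}
```

The non-zero return code lets the caller decide whether a fresh ~VERSION~ file needs to be appended.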
| 29 | ** DONE Add VERSION if output tar deleted between writes |
| 30 | |
| 31 | CLOSED: [2020-07-02 Thu 20:22] |
| 32 | 2020-07-02T20:21Z; bktei> Added bkgpslog-specific function |
| 33 | magicWriteVersion() to be called whenever a new time-stamped ~VERSION~ |
| 34 | file needs to be generated and appended to the output tar file |
| 35 | ~PATHOUT_TAR~. |
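A sketch of what such a function might do, assuming GNU ~tar~. This is not the source of magicWriteVersion(); the name ~writeVersionSketch~, the member name pattern, and the file contents are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: write a time-stamped VERSION file and append it to a tar archive.
# Not the actual magicWriteVersion source; the file-name pattern and
# file contents below are assumptions. Assumes GNU tar.

writeVersionSketch() {
    # Usage: writeVersionSketch PATH_TAR DIR_TMP VERSION_STRING
    local pathTar="$1" dirTmp="$2" version="$3" timeNow fileName
    timeNow="$(date -u +%Y%m%dT%H%M%SZ)"
    fileName="${timeNow}..VERSION"
    printf 'version=%s\ndate=%s\n' "$version" "$timeNow" > "$dirTmp/$fileName"
    # --directory keeps the temp path out of the member name.
    tar --append --directory="$dirTmp" --file="$pathTar" "$fileName"
    rm "$dirTmp/$fileName"
}
```

Because the append only succeeds on a usable archive, the same call doubles as an appendability test.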
| 36 | ** DONE Rewrite buffer loop to reduce lag between gpspipe runs |
| 37 | |
| 38 | CLOSED: [2020-07-03 Fri 20:57] |
| 39 | 2020-07-03T17:10Z; bktei> As is, there is still a 5-6 second lag |
| 40 | between when ~gpspipe~ times out at the end of a buffer round and when |
| 41 | ~gpspipe~ is called by the subsequent buffer round. I believe this can |
| 42 | be reduced by moving variable manipulations inside the |
| 43 | asynchronously-executed magicWriteBuffer() function. Ideally, the |
| 44 | while loop should look like: |
| 45 | |
| 46 | #+BEGIN_EXAMPLE |
| 47 | while (( SECONDS < SCRIPT_TTL )); do |
| 48 |     gpspipe -r > "$DIR_TMP"/buffer.nmea |
| 49 |     writeBuffer & |
| 50 | done |
| 51 | #+END_EXAMPLE |
| 52 | 2020-07-03T20:56Z; bktei> I simplified it further to something like |
| 53 | this: |
| 54 | #+BEGIN_EXAMPLE |
| 55 | while (( SECONDS < SCRIPT_TTL )); do |
| 56 |     writeBuffer & |
| 57 |     sleep "$BUFFER_TTL"  # buffer length in seconds (variable name assumed) |
| 58 | done |
| 59 | #+END_EXAMPLE |
| 60 | |
| 61 | A Raspberry Pi Zero W shows approximately 71 ms of drift per buffer |
| 62 | round with a 10 s buffer. |
| 63 | ** TODO Feature: Recipient watch folder |
| 64 | 2020-07-03T21:28Z; bktei> This feature would scan the contents |
| 65 | of a specified directory at the start of every buffer round in order |
| 66 | to determine encryption (age) recipients. This would allow a device to |
| 67 | dynamically encrypt location data in response to automated changes |
| 68 | made by other tools. For example, if such a directory were |
| 69 | synchronized via Syncthing and changes to such a directory were |
| 70 | managed by a trusted remote server, then that server could respond to |
| 71 | human requests to secure location data. |
| 72 | |
| 73 | Two specific privacy subfeatures come to mind: |
| 74 | |
| 75 | 1. Parallel encryption: Given a set of ~n~ public keys, encrypt data |
| 76 | with a single ~age~ command with options causing all ~n~ pubkeys to |
| 77 | be recipients. In order to decrypt the data, any individual private |
| 78 | key could be used. No coordination between key owners would be |
| 79 | required to decrypt. |
| 80 | |
| 81 | 2. Sequential encryption: Given a set of ~n~ public keys, encrypt data |
| 82 | with ~n~ sequential ~age~ commands all piped in series with each |
| 83 | ~age~ command utilizing only one of the ~n~ public keys. In order |
| 84 | to decrypt the data, all ~n~ private keys would be required to |
| 85 | decrypt the data. Since coordination is required, it is less |
| 86 | convenient than parallel encryption. |
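The two schemes differ only in how the ~age~ command line is assembled. The sketch below builds the commands as strings without running them, so it needs no ~age~ installation; the function names and the recipient strings are placeholders, and only the ~-r~ (recipient) flag of ~age~ is assumed.

```shell
#!/usr/bin/env bash
# Sketch: build (but do not run) age command lines for both schemes.
# The recipient strings used below are placeholders, not real age keys.

buildParallelCmd() {
    # One age invocation with every key as a recipient:
    # any single private key can decrypt.
    local cmd="age" key
    for key in "$@"; do
        cmd="$cmd -r $key"
    done
    printf '%s\n' "$cmd"
}

buildSequentialCmd() {
    # One age invocation per key, piped in series:
    # all private keys are needed, applied in reverse order.
    local cmd="" key
    for key in "$@"; do
        cmd="${cmd}${cmd:+ | }age -r $key"
    done
    printf '%s\n' "$cmd"
}

buildParallelCmd   age1aaa age1bbb   # prints: age -r age1aaa -r age1bbb
buildSequentialCmd age1aaa age1bbb   # prints: age -r age1aaa | age -r age1bbb
```

In the sequential case the last layer applied is the first that must be removed, which is why the key owners have to coordinate.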
| 87 | |
| 88 | In either case, a directory would be useful for holding configuration |
| 89 | files specifying which of these features, or which combination of |
| 90 | them, to apply at the start of every buffer round. |
| 91 | |
| 92 | I don't yet know how to program the rules, although I think it'd be |
| 93 | easier to simply add an option providing ~bkgpslog~ with a directory |
| 94 | to watch. When examining the directory, check for a file with the |
| 95 | appropriate file extension (ex: .pubkey) and then read the first line |
| 96 | into the script's pubKey array. |
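A rough sketch of the directory scan described above, using a Bash array. The ~.pubkey~ extension comes from the note; the function name ~readPubKeys~ and the array name ~pubKeys~ are illustrative, and no validation of the key format is attempted.

```shell
#!/usr/bin/env bash
# Sketch: collect recipients from a watch directory.
# The .pubkey extension follows the note above; the function and array
# names are illustrative.

pubKeys=()

readPubKeys() {
    # Usage: readPubKeys DIR_WATCH
    # Reads the first line of every *.pubkey file into the pubKeys array.
    local dirWatch="$1" file line
    pubKeys=()
    for file in "$dirWatch"/*.pubkey; do
        [ -f "$file" ] || continue            # also skips an unmatched glob
        line=""
        IFS= read -r line < "$file" || true   # tolerate a missing final newline
        if [ -n "$line" ]; then
            pubKeys+=("$line")
        fi
    done
}
```

Calling ~readPubKeys~ at the start of every buffer round would rebuild the recipient list from scratch, so a key file removed from the directory stops being used on the next round.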
| 97 | |
| 98 | ** TODO Feature: Simplify option to reduce output size |
| 99 | |
| 100 | ~gpsbabel~ [[https://www.gpsbabel.org/htmldoc-development/filter_simplify.html][features]] a ~simplify~ filter to trim data points from GPS |
| 101 | data. There are several methods for prioritizing which points to keep |
| 102 | and which to trim, although the following seems useful given some |
| 103 | sample data I've recorded in a test run of ninfacyzga-01: |
| 104 | |
| 105 | #+BEGIN_EXAMPLE |
| 106 | gpsbabel -i nmea -f all.nmea -x simplify,error=10,relative -o gpx \ |
| 107 | -F all-simp-rel-10.gpx |
| 108 | #+END_EXAMPLE |
| 109 | |
| 110 | An error level of "10" with the "relative" option seems to retain all |
| 111 | desirable features for GPS data while reducing the number of points |
| 112 | along straightaways. File size is reduced by a factor of |
| 113 | about 11. Noise from local stay-in-place drift isn't removed; a |
| 114 | relative error of about 1000 is required to remove stay-in-place drift |
| 115 | noise but this also trims all but 100m-size features of the recorded |
| 116 | path. A relative error of 1000 reduces file size by a factor of |
| 117 | about 450. |
| 118 | |
| 119 | #+BEGIN_EXAMPLE |
| 120 | 67M relerror-0.001.kml |
| 121 | 66M relerror-0.01.kml |
| 122 | 58M relerror-0.1.kml |
| 123 | 21M relerror-1.kml |
| 124 | 5.8M relerror-10.kml |
| 125 | 797K relerror-100.kml |
| 126 | 152K relerror-1000.kml |
| 127 | #+END_EXAMPLE |
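The reduction factors quoted above can be checked against this listing. The sketch below assumes that 1M = 1024K (as in human-readable ~ls~ or ~du~ output) and treats the least-simplified file, ~relerror-0.001.kml~ at 67M, as the baseline, since the size of the unsimplified file is not listed. ~reductionFactor~ is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: reduction factor relative to the least-simplified file.
# Assumes 1M = 1024K, as in human-readable ls/du output, and treats
# relerror-0.001.kml (67M) as the baseline.

reductionFactor() {
    # Usage: reductionFactor BASELINE_KIB SIMPLIFIED_KIB
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

baselineKiB=$(( 67 * 1024 ))
reductionFactor "$baselineKiB" 5939   # relerror-10.kml: 5.8M is about 5939K
reductionFactor "$baselineKiB" 152    # relerror-1000.kml
```

Under these assumptions the factors come out near 11.6 and 451, in line with the rough figures given in the text.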
| 128 | |
| 129 | ** TODO Feature: Generalize bkgpslog to bklog function |
| 130 | 2020-07-05T02:42Z; bktei> Transform ~bkgpslog~ into a modular |
| 131 | component called ~bklog~ such that it processes a stdout stream of any |
| 132 | external command, not just ~gpspipe -r~. This would permit reuse of |
| 133 | the ~bkgpslog~ code for logging not just GPS data but things like |
| 134 | pressure, temperature, system statistics, etc. |
| 135 | 2020-07-05T16:35Z; bktei> Proposed usage: |
| 136 | : bklog -r age1asdf -o log.tar # encrypt/compress stdin to log.tar |
| 137 | : bklog -x -f log.tar -i age.key -O /tmp # extract and decrypt |
| 138 | |
| 139 | Making ~bklog~ follow the [[https://en.wikipedia.org/wiki/Unix_philosophy][Unix philosophy]] means that it shouldn't care |
| 140 | what kind of text is fed to it. |
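As a sketch of the core of such a tool, the function below compresses whatever arrives on stdin and appends it to a tar archive as one chunk. It is not ~bklog~ itself: the name ~appendStdinChunk~ and the chunk naming are assumptions, encryption with ~age~ is left out so the example has no external dependency, and GNU ~tar~ is assumed.

```shell
#!/usr/bin/env bash
# Sketch: compress stdin and append it to a tar archive as one chunk.
# Encryption with age is omitted so the example has no external dependency.
# Assumes GNU tar.

appendStdinChunk() {
    # Usage: some-command | appendStdinChunk PATH_TAR
    local pathTar="$1" dirTmp chunkName
    dirTmp="$(mktemp -d)" || return 1
    chunkName="$(date -u +%Y%m%dT%H%M%SZ)..chunk.gz"
    gzip -c > "$dirTmp/$chunkName"      # gzip reads the function's stdin
    # Create the archive on first use, append otherwise.
    if [ -f "$pathTar" ]; then
        tar --append --directory="$dirTmp" --file="$pathTar" "$chunkName"
    else
        tar --create --directory="$dirTmp" --file="$pathTar" "$chunkName"
    fi
    rm -r "$dirTmp"
}
```

Nothing in the function depends on the input being NMEA text, so ~gpspipe -r~ or any other command could sit on the left of the pipe. Two calls within the same second would produce members with the same name, so a real implementation would need a finer-grained name.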
| 141 | |
| 142 | *** ~bklog~ Design vs. Unix Philosophy |
| 143 | **** Pubkey dir watching |
| 144 | The feature of periodically checking a directory for changes in the |
| 145 | pubkeys it contains should be justified by its usefulness; if the |
| 146 | complexity cannot be justified then the feature should be removed. |
| 147 | **** Defaults vs options |
| 148 | Many options can cause the tool to become complex in unjustifiable |
| 149 | ways. Currently I am adding options because I want the ability to |
| 150 | modify the script's behavior without having to modify the source code |
| 151 | on the machine in which the code is running. I should consider |
| 152 | removing features at some point and having the program force defaults |
| 153 | on the user. For example, allowing the specification of a temporary |
| 154 | directory, while useful for me, is probably not useful for most people |
| 155 | who don't know or care about the difference between ~/tmp~ and |
| 156 | ~/dev/shm~. |
| 157 | **** Script time to live (TTL) |
| 158 | I initially implemented a script time-to-live feature because I was |
| 159 | unsure of my ability to program a script that could run for long periods |
| 160 | of time without causing a runaway usage of memory. I still think it's |
| 161 | a good idea to offer a script TTL option to the user but I think the |
| 162 | default should be to simply run forever. |
| 163 | ** TODO Evaluate ~rsyslog~ as stand-in for this work |
| 164 | 2020-07-05T02:57Z; bktei> I searched for "debian iot logging" ("iot" |
| 165 | as in "Internet of Things", the current buzzword for small low-power |
| 166 | computers being used to provide microservices for owners in their own |
| 167 | home) and came across several search results mentioning ~syslog~ and |
| 168 | ~rsyslog~. |
| 169 | |
| 170 | https://www.thissmarthouse.net/consolidating-iot-logs-into-mysql-using-rsyslog/ |
| 171 | https://rsyslog.readthedocs.io/en/latest/tutorials/tls.html |
| 172 | https://serverfault.com/questions/20840/how-would-you-send-syslog-securely-over-the-public-internet |
| 173 | https://www.rsyslog.com/ |
| 174 | |
| 175 | My impression is that ~rsyslog~ is a complex software package designed |
| 176 | to offer many features, some of which might satisfy my |
| 177 | needs. |
| 178 | |
| 179 | However, as stated in the repository README, the objective of the |
| 180 | ~ninfacyzga-01~ project is "Observing facts of the new". This means |
| 181 | that the goal is to record not only location data but any data that |
| 182 | can be captured by a sensor. The capture of the following |
| 183 | environmental phenomena is therefore within the scope of this device: |
| 184 | |
| 185 | *** Sounds (microphone) |
| 186 | *** Light (camera) |
| 187 | *** Temperature (thermocouple) |
| 188 | *** Air Pressure (barometer) |
| 189 | *** Acceleration Vector (accelerometer / gyroscope) |
| 190 | *** Magnetic Field Vector (magnetometer) |
| 191 | |
| 192 | This brings up the issue of respecting the privacy of others in shared |
| 193 | spaces through which ~ninfacyzga-01~ may pass. ~ninfacyzga-01~ |
| 194 | should encrypt data it records according to rules set by its |
| 195 | owner. |
| 196 | |
| 197 | One permissive rule could be that if ~ninfacyzga-01~ detects that a |
| 198 | person (let's call her Alice) enters a room, it should add Alice's |
| 199 | encryption public key to the list of recipients against which it |
| 200 | encrypts data without Alice having to know how ~ninfacyzga-01~ is |
| 201 | programmed (she might have a ~calkuptcana~ agent on her person that |
| 202 | broadcasts her privacy preferences). Meanwhile, ~ninfacyzga-01~ may |
| 203 | publish its observations to a repository that Alice and other members |
| 204 | of the shared communal space have access to (ex: a read-only shared |
| 205 | directory on a local network WiFi). Alice could download all the files |
| 206 | in the shared repository but she would only be able to decrypt files |
| 207 | generated when she was physically near enough to ~ninfacyzga-01~ for |
| 208 | it to detect that her presence was within some spatial boundary. |
| 209 | |
| 210 | A more restrictive rule could resemble the permissive rule in that |
| 211 | ~ninfacyzga-01~ uses Alice's encryption public key only when she is |
| 212 | physically nearby, except that it encrypts logged files against |
| 213 | public keys in a sequential manner. This would mean that all people |
| 214 | who were near ~ninfacyzga-01~ would have to pass around each log file |
| 215 | to each other so that they could decrypt the content. |
| 216 | |
| 217 | That said, according to [[https://www.rsyslog.com/doc/master/tutorials/database.html][this ~rsyslog~ page]], ~rsyslog~ is more a data |
| 218 | wrangling system for collecting data from disparate sources of |
| 219 | different types and outputting data to text files on disk than a |
| 220 | system committed to the server-client model of database storage. So, I |
| 221 | think converting ~bkgpslog~ into a ~bklog~ script that appends |
| 222 | encrypted and compressed data to a tar file for later extraction |
| 223 | (possibly the same script with future features) would be best. |
| 224 | |
| 225 | * bkgpslog narrative |
| 226 | ** Initialize environment |
| 227 | *** Init variables |
| 228 | **** Save timeStart (YYYYmmddTHHMMSS±zz) |
| 229 | *** Define Functions |
| 230 | **** Define Debugging functions |
| 231 | **** Define Argument Processing function |
| 232 | **** Define Main function |
| 233 | ** Run Main Function |
| 234 | *** Process Arguments |
| 235 | *** Set output encryption and compression option strings |
| 236 | *** Check that critical apps and dirs are available, display missing ones. |
| 237 | *** Set lifespans of script and buffer |
| 238 | *** Init temp working dir ~DIR_TMP~ |
| 239 | Make temporary dir in tmpfs dir: ~/dev/shm/$(nonce)..bkgpslog/~ (~DIR_TMP~) |
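One way this step could be sketched is with ~mktemp~. The fallback to ~/tmp~, the function name ~makeTmpDirSketch~, and the use of the random ~mktemp~ suffix in place of the nonce are assumptions; the random part is placed at the end of the name (unlike the pattern above) because that form is the most portable across ~mktemp~ implementations.

```shell
#!/usr/bin/env bash
# Sketch: create a private working directory, preferring tmpfs.
# The fallback to /tmp is an assumption; mktemp's random XXXXXXXX suffix
# stands in for the nonce.

makeTmpDirSketch() {
    local dirBase="/dev/shm"
    # /dev/shm is RAM-backed on most Linux systems; fall back if unusable.
    if [ ! -d "$dirBase" ] || [ ! -w "$dirBase" ]; then
        dirBase="/tmp"
    fi
    mktemp -d "$dirBase/bkgpslog.XXXXXXXX"
}

DIR_TMP="$(makeTmpDirSketch)" || exit 1
# Remove the working directory when the script exits.
trap 'rm -rf "$DIR_TMP"' EXIT
```

Keeping the working files in tmpfs avoids writing unencrypted buffer contents to persistent storage.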
| 240 | *** Initialize ~tar~ archive |
| 241 | **** Write ~bkgpslog~ version to ~$DIR_TMP/VERSION~ |
| 242 | **** Create empty ~tar~ archive in ~DIR_OUT~ at ~PATHOUT_TAR~ |
| 243 | |
| 244 | Set output file name to: |
| 245 | : PATHOUT_TAR="$DIR_OUT/YYYYmmdd..hostname_location.gz.age.tar" |
| 246 | Usage: ~iso8601Period $timeStart $timeEnd~ |
| 247 | |
| 248 | **** Append ~VERSION~ file to ~PATHOUT_TAR~ |
| 249 | |
| 250 | Append ~$DIR_TMP/VERSION~ to ~PATHOUT_TAR~ via ~tar --append~ |
| 251 | |
| 252 | *** Read/Write Loop (Record gps data until script lifespan ends) |
| 253 | **** Determine output file paths |
| 254 | **** Define GPS conversion commands |
| 255 | **** Fill Bash variable buffer from ~gpspipe~ |
| 256 | **** Process bufferBash, save secured chunk set to ~DIR_TMP~ |
| 257 | **** Append each secured chunk to ~PATHOUT_TAR~ |
| 258 | : tar --append --directory="$DIR_TMP" --file="$PATHOUT_TAR" $(basename -a "$PATHOUT_NMEA" "$PATHOUT_GPX" "$PATHOUT_KML") |
| 259 | **** Remove secured chunk from ~DIR_TMP~ |