X-Git-Url: https://zdv2.bktei.com/gitweb/EVA-2020-02.git/blobdiff_plain/1a1738c494ed3fad751cf395aa4d4b493e61f7eb..7103c8b0857a3538a014dacfd0d99c209890dad8:/exec/bkgpslog-plan.org

diff --git a/exec/bkgpslog-plan.org b/exec/bkgpslog-plan.org
deleted file mode 100644
index e100962..0000000
--- a/exec/bkgpslog-plan.org
+++ /dev/null
@@ -1,276 +0,0 @@
-* bkgpslog task list
-** DONE Add job control for short buffer length
-   CLOSED: [2020-07-02 Thu 16:04]
-2020-07-02T14:56Z; bktei> File write operations were bundled into a
-magicWriteBuffer function that is called and then detached from the
-script shell (job control), but the detached job is not tracked by the
-main script. A problem may arise if two instances of magicWriteBuffer
-attempt to write to the same tar simultaneously. Two instances of
-magicWriteBuffer may exist if the buffer length is low (ex: 1 second);
-the default buffer length of 60 seconds should reduce the probability
-of a collision but it should be possible for the main script to track
-the process ID of a magicWriteBuffer() as soon as it detaches and then
-check (via ~$!~ as described [[https://bashitout.com/2013/05/18/Ampersands-on-the-command-line.html][here]]) that the process is still alive.
-2020-07-02T15:23Z; bktei> I found that the Bash ~wait~ built-in can be
-used to delay processing until a specified job completes. Called
-without arguments, ~wait~ pauses script execution until all
-backgrounded processes complete.
-2020-07-02T16:03Z; bktei> Added ~wait~.
-** DONE Rewrite tar initialization function
-   CLOSED: [2020-07-02 Thu 17:23]
-2020-07-02T17:23Z; bktei> Simplify tar initialization function so the
-VERSION file is used to test appendability of the tar as well as to
-mark when a new session is started.
-** DONE Consolidate tar checking/creation into function
-   CLOSED: [2020-07-02 Thu 18:33]
-2020-07-02T18:33Z; bktei> Simplify how the output tar file's existence
-is checked and its status as a valid tar file is validated.
-This was done using a new function ~checkMakeTar~.
-** DONE Add VERSION if output tar deleted between writes
-
-   CLOSED: [2020-07-02 Thu 20:22]
-2020-07-02T20:21Z; bktei> Added bkgpslog-specific function
-magicWriteVersion() to be called whenever a new time-stamped ~VERSION~
-file needs to be generated and appended to the output tar file
-~PATHOUT_TAR~.
-** DONE Rewrite buffer loop to reduce lag between gpspipe runs
-
-   CLOSED: [2020-07-03 Fri 20:57]
-2020-07-03T17:10Z; bktei> As is, there is still a 5-6 second lag
-between when ~gpspipe~ times out at the end of a buffer round and when
-~gpspipe~ is called by the subsequent buffer round. I believe this can
-be reduced by moving variable manipulations inside the
-asynchronously-executed magicWriteBuffer() function. Ideally, the
-while loop should look like:
-
-#+BEGIN_EXAMPLE
-while (( SECONDS < SCRIPT_TTL )); do
-  gpspipe -r > "$DIR_TMP"/buffer.nmea
-  writeBuffer &
-done
-#+END_EXAMPLE
-2020-07-03T20:56Z; bktei> I simplified it further to something like
-this:
-#+BEGIN_EXAMPLE
-while (( SECONDS < SCRIPT_TTL )); do
-  writeBuffer &
-  sleep "$SCRIPT_TTL"
-done
-#+END_EXAMPLE
-
-Raspberry Pi Zero W shows approximately 71ms of drift per buffer round
-with 10s buffer.
-** TODO Feature: Recipient watch folder
-2020-07-03T21:28Z; bktei> This feature would scan the contents
-of a specified directory at the start of every buffer round in order
-to determine encryption (age) recipients. This would allow a device to
-dynamically encrypt location data in response to automated changes
-made by other tools. For example, if such a directory were
-synchronized via Syncthing and changes to such a directory were
-managed by a trusted remote server, then that server could respond to
-human requests to secure location data.
-
-Two specific privacy subfeatures come to mind:
-
-1. Parallel encryption: Given a set of ~n~ public keys, encrypt data
-   with a single ~age~ command with options causing all ~n~ pubkeys to
-   be recipients.
-   In order to decrypt the data, any individual private
-   key could be used. No coordination between key owners would be
-   required to decrypt.
-
-2. Sequential encryption: Given a set of ~n~ public keys, encrypt data
-   with ~n~ sequential ~age~ commands all piped in series with each
-   ~age~ command utilizing only one of the ~n~ public keys. In order
-   to decrypt the data, all ~n~ private keys would be required.
-   Since coordination is required, it is less
-   convenient than parallel encryption.
-
-In either case, a directory would be useful for holding configuration
-files specifying which feature, or combination of features, to execute
-at the start of every buffer round.
-
-I don't yet know how to program the rules, although I think it'd be
-easier to simply add an option providing ~bkgpslog~ with a directory
-to watch. When examining the directory, check for a file with the
-appropriate file extension (ex: .pubkey) and then read the first line
-into the script's pubKey array.
-
-** TODO Feature: Simplify option to reduce output size
-
-~gpsbabel~ [[https://www.gpsbabel.org/htmldoc-development/filter_simplify.html][features]] a ~simplify~ option to trim data points from GPS
-data. There are several methods for prioritizing which points to keep
-and which to trim, although the following seems useful given some
-sample data I've recorded in a test run of ninfacyzga-01:
-
-#+BEGIN_EXAMPLE
-gpsbabel -i nmea -f all.nmea -x simplify,error=10,relative -o gpx \
--F all-simp-rel-10.gpx
-#+END_EXAMPLE
-
-An error level of "10" with the "relative" option seems to retain all
-desirable features of the GPS data while reducing the number of points
-along straightaways. File size is reduced by a factor of
-about 11. Noise from local stay-in-place drift isn't removed; a
-relative error of about 1000 is required to remove stay-in-place drift
-noise but this also trims all but 100m-size features of the recorded
-path.
-A relative error of 1000 reduces file size by a factor of
-about 450.
-
-#+BEGIN_EXAMPLE
- 67M relerror-0.001.kml
- 66M relerror-0.01.kml
- 58M relerror-0.1.kml
- 21M relerror-1.kml
-5.8M relerror-10.kml
-797K relerror-100.kml
-152K relerror-1000.kml
-#+END_EXAMPLE
-
-** TODO Feature: Generalize bkgpslog to bklog function
-2020-07-05T02:42Z; bktei> Transform ~bkgpslog~ into a modular
-component called ~bklog~ such that it processes a stdout stream of any
-external command, not just ~gpspipe -r~. This would permit reuse of
-the ~bkgpslog~ code for logging not just GPS data but things like
-pressure, temperature, system statistics, etc.
-2020-07-05T16:35Z; bktei>
-: bklog -r age1asdf -o log.tar # encrypt/compress stdin to log.tar
-: bklog -x -f log.tar -i age.key -O /tmp # extract and decrypt
-
-Making ~bklog~ follow the [[https://en.wikipedia.org/wiki/Unix_philosophy][Unix philosophy]] means that it shouldn't care
-what kind of text is fed to it.
-
-*** ~bklog~ Design vs. Unix Philosophy
-**** Pubkey dir watching
-The feature of periodically checking a directory for changes in the
-pubkeys it contains should be justified by its usefulness; if the
-complexity cannot be justified then the feature should be removed.
-**** Defaults vs options
-Many options can cause the tool to become complex in unjustifiable
-ways. Currently I am adding options because I want the ability to
-modify the script's behavior without having to modify the source code
-on the machine on which the code is running. I should consider
-removing features at some point and having the program force defaults
-on the user. For example, allowing the specification of a temporary
-directory, while useful for me, is probably not useful for most people
-who don't know or care about the difference between ~/tmp~ and
-~/dev/shm~.
-**** Script time to live (TTL)
-I initially implemented a script time-to-live feature because I was
-unsure of my ability to program a script that could run for long
-periods of time without causing runaway memory usage. I still think
-it's a good idea to offer a script TTL option to the user but I think
-the default should be to simply run forever.
-** TODO: Evaluate ~rsyslog~ as stand-in for this work
-2020-07-05T02:57Z; bktei> I searched for "debian iot logging" ("iot"
-as in "Internet of Things", the current buzzword for small low-power
-computers being used to provide microservices for owners in their own
-home) and came across several search results mentioning ~syslog~ and
-~rsyslog~.
-
-https://www.thissmarthouse.net/consolidating-iot-logs-into-mysql-using-rsyslog/
-https://rsyslog.readthedocs.io/en/latest/tutorials/tls.html
-https://serverfault.com/questions/20840/how-would-you-send-syslog-securely-over-the-public-internet
-https://www.rsyslog.com/
-
-My impression is that ~rsyslog~ is a complex software package designed
-to offer many features, some of which might satisfy my
-needs.
-
-However, as stated in the repository README, the objective of the
-~ninfacyzga-01~ project is "Observing facts of the new". This means
-that the goal is not only to record location data but any data that
-can be captured by a sensor. This means the capture of the following
-environmental phenomena is within the scope of this device:
-
-*** Sounds (microphone)
-*** Light (camera)
-*** Temperature (thermocouple)
-*** Air Pressure (barometer)
-*** Acceleration Vector (accelerometer / gyroscope)
-*** Magnetic Field Vector (magnetometer)
-
-This brings up the issue of respecting the privacy of others in shared
-spaces through which ~ninfacyzga-01~ may pass. ~ninfacyzga-01~
-should encrypt data it records according to rules set by its
-owner.
-
-One permissive rule could be that if ~ninfacyzga-01~ detects that a
-person (let's call her Alice) enters a room, it should add Alice's
-encryption public key to the list of recipients against which it
-encrypts data without Alice having to know how ~ninfacyzga-01~ is
-programmed (she might have a ~calkuptcana~ agent on her person that
-broadcasts her privacy preferences). Meanwhile, ~ninfacyzga-01~ may
-publish its observations to a repository that Alice and other members
-of the shared communal space have access to (ex: a read-only shared
-directory on a local WiFi network). Alice could download all the files
-in the shared repository but she would only be able to decrypt files
-generated when she was physically near enough to ~ninfacyzga-01~ for
-it to detect that her presence was within some spatial boundary.
-
-A more restrictive rule could resemble the permissive rule in that
-~ninfacyzga-01~ uses Alice's encryption public key only when she is
-physically nearby, except that it encrypts logged files against
-public keys in a sequential manner. This would mean that all people
-who were near ~ninfacyzga-01~ would have to pass around each log file
-to each other so that they could decrypt the content.
-
-That said, according to [[https://www.rsyslog.com/doc/master/tutorials/database.html][this ~rsyslog~ page]], ~rsyslog~ is more a data
-wrangling system for collecting data from disparate sources of
-different types and outputting data to text files on disk than a
-system committed to the server-client model of database storage. So, I
-think converting ~bkgpslog~ into a ~bklog~ script that appends
-encrypted and compressed data to a tar file for later extraction
-(possibly the same script with future features) would be best.
-
-** TODO: Place persistent recip.
updates in asynchronous coproc
-2020-07-06T19:37Z; bktei> In order to update the recipient list, the
-magicParseRecipientDir() function needs to be run each buffer period
-in order to scan for changes in the recipient list. However, such a
-scan takes time; if the magicGatherWriteBuffer() function must pause
-until magicParseRecipientDir() completes, then a significant pause
-between buffer sessions may occur, causing detectable gaps in location
-data between buffer rounds.
-
-I looked for ways in which I might start magicParseRecipientDir()
-asynchronously immediately before running the data collection command
-and then collect its output at the start of the next buffer round. One
-way using the ~coproc~ Bash keyword is described [[https://stackoverflow.com/a/20018504/10850071][here]]. I'd have to
-make the asynchronous function output the recipient list to stdout
-which would then be ~read~ into the ~recPubKeysValid~ array in the
-main loop. However, for now, I'm putting magicParseRecipientDir()
-as-is in the main loop and accepting the delay.
-* bkgpslog narrative
-** Initialize environment
-*** Init variables
-**** Save timeStart (YYYYmmddTHHMMSS±zz)
-*** Define Functions
-**** Define Debugging functions
-**** Define Argument Processing function
-**** Define Main function
-** Run Main Function
-*** Process Arguments
-*** Set output encryption and compression option strings
-*** Check that critical apps and dirs are available, display missing ones.
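
A minimal sketch of the "check that critical apps and dirs are available" step, under stated assumptions: the helper names ~checkApps~ and ~checkDirs~ and the message wording are hypothetical and do not come from ~bkgpslog~. Each helper reports every missing item on stderr and returns non-zero if anything is missing.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and messages are assumptions, not bkgpslog code.

# checkApps APP...: report each command that is not found in PATH on stderr.
# Returns 1 if at least one command is missing, 0 otherwise.
checkApps() {
  local app missing=0
  for app in "$@"; do
    if ! command -v "$app" >/dev/null 2>&1; then
      printf 'missing app: %s\n' "$app" >&2
      missing=1
    fi
  done
  return "$missing"
}

# checkDirs DIR...: report each path that is not an existing directory.
# Returns 1 if at least one directory is missing, 0 otherwise.
checkDirs() {
  local dir missing=0
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      printf 'missing dir: %s\n' "$dir" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A caller might run something like ~checkApps gpspipe tar age~ followed by ~checkDirs "$DIR_OUT"~ and exit if either returns non-zero; which apps and directories count as critical is left to the script.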
-*** Set lifespans of script and buffer
-*** Init temp working dir ~DIR_TMP~
-Make temporary dir in tmpfs dir: ~/dev/shm/$(nonce)..bkgpslog/~ (~DIR_TMP~)
-*** Initialize ~tar~ archive
-**** Write ~bkgpslog~ version to ~$DIR_TMP/VERSION~
-**** Create empty ~tar~ archive in ~DIR_OUT~ at ~PATHOUT_TAR~
-
-Set output file name to:
-: PATHOUT_TAR="$DIR_OUT/YYYYmmdd..hostname_location.gz.age.tar"
-Usage: ~iso8601Period $timeStart $timeEnd~
-
-**** Append ~VERSION~ file to ~PATHOUT_TAR~
-
-Append ~$DIR_TMP/VERSION~ to ~PATHOUT_TAR~ via ~tar --append~
-
-*** Read/Write Loop (Record GPS data until script lifespan ends)
-**** Determine output file paths
-**** Define GPS conversion commands
-**** Fill Bash variable buffer from ~gpspipe~
-**** Process bufferBash, save secured chunk set to ~DIR_TMP~
-**** Append each secured chunk to ~PATHOUT_TAR~
-: tar --append --directory=DIR_TMP --file=PATHOUT_TAR $(basename PATHOUT_{NMEA,GPX,KML} )
-**** Remove secured chunk from ~DIR_TMP~
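
The last two steps of the read/write loop (append each secured chunk to the tar, then remove it from the temp dir) could be sketched as below. The function name ~appendChunks~ and its argument order are assumptions made for illustration; only ~tar --append~, ~--directory~ and ~--file~ are taken from the outline above. Chunks are deleted only if the append succeeded, so a failed write leaves them in place for a later retry.

```shell
#!/usr/bin/env bash
# Hypothetical helper; name and argument order are assumptions.
# appendChunks TAR DIR FILE...
#   Appends each FILE (a name relative to DIR) to the archive TAR, then
#   removes the FILEs from DIR. Nothing is removed if tar fails.
appendChunks() {
  local pathTar="$1" dirTmp="$2"
  shift 2
  # --directory makes tar store bare file names instead of full temp paths.
  tar --append --directory="$dirTmp" --file="$pathTar" "$@" || return 1
  local f
  for f in "$@"; do
    rm -f -- "$dirTmp/$f"
  done
}
```

With the variables from the outline, a call might look like ~appendChunks "$PATHOUT_TAR" "$DIR_TMP" chunk.nmea chunk.gpx chunk.kml~, where the chunk file names are placeholders.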