Python tools for the BlackVue dashcam

TLDR: Out of the box, the options for getting data out of BlackVue dashcams in open formats are limited. I have created some tools to improve this. Run pip install blackvue_acc blackvue_gps blackvue_wifi to get them all.

Recently I bought a BlackVue dashcam, equipped with GPS, an accelerometer and Wi-Fi. That makes it a nice (and connected) source of data. The default way this data comes out of the camera is not great, though: the apps and desktop software are closed source, and some features even require a BlackVue cloud account.

In this post I write about the data in these dashcams, how to get it out, and how to parse it.

  • blackvue_acc: package for extracting and parsing the accelerometer data
  • blackvue_gps: package for extracting and parsing the GPS data
  • blackvue_wifi: package for interacting with the HTTP API over Wi-Fi

GPS data

All data the dashcam produces is embedded in the MP4 files as data streams. gandy92 created a script, blackclue, that parses these data streams and writes them out to files with a logical extension.

The resulting file with the GPS records has the extension .nmea. Each line contains a Unix timestamp in square brackets, followed by an NMEA ‘sentence’. The 1990s internet provides more information about NMEA records (1) (2). Fortunately, there is a Python package, pynmea2, that can parse these sentences. My script (blackvue_gps) first extracts the timestamp with a regex, then takes the parsed NMEA records that have a latitude and longitude, and writes the result to a CSV.

The code looks somewhat like the code block below (full source):

# imports and looping are omitted here
unix_ms = re.findall(r'(?!\[)[0-9]*(?=\])', line)[0]
parsed_nmea = pynmea2.parse(line.split(']')[-1])
if hasattr(parsed_nmea, 'latitude') and hasattr(parsed_nmea, 'longitude'):
    lat = parsed_nmea.latitude
    lon = parsed_nmea.longitude
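
For reference, a self-contained version of that loop could look roughly like the sketch below. The file handling and the CSV column names (unix_ms, lat, lon) are my own choices here, not necessarily what blackvue_gps ships with:

import csv
import re

import pynmea2


def nmea_to_csv(nmea_path, csv_path):
    """Convert a blackclue-extracted .nmea file into a simple CSV."""
    rows = []
    with open(nmea_path) as nmea_file:
        for line in nmea_file:
            line = line.strip()
            if not line or ']' not in line:
                continue
            # the unix timestamp in milliseconds sits between square brackets
            unix_ms = re.findall(r'(?!\[)[0-9]*(?=\])', line)[0]
            try:
                parsed_nmea = pynmea2.parse(line.split(']')[-1])
            except pynmea2.ParseError:
                continue  # skip sentences pynmea2 cannot handle
            if hasattr(parsed_nmea, 'latitude') and hasattr(parsed_nmea, 'longitude'):
                rows.append({'unix_ms': unix_ms,
                             'lat': parsed_nmea.latitude,
                             'lon': parsed_nmea.longitude})
    with open(csv_path, 'w', newline='') as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames=['unix_ms', 'lat', 'lon'])
        writer.writeheader()
        writer.writerows(rows)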

If you just want CSV files with your GPS records, run

pip install blackvue_gps

to install the tool and then run

blackvue_gps --to-csv /path/to/your/folder

or

blackvue_gps --to-csv /path/to/your/file.mp4

When visualising the data in this CSV file, the high resolution of the GPS data really stood out: the location is logged about once a second.
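
You can check that rate yourself by looking at the gaps between consecutive timestamps in the generated CSV. The column name unix_ms is an assumption here; adjust it to whatever your version of the tool writes:

import pandas

gps = pandas.read_csv('/path/to/your/file.csv')
# time between consecutive GPS fixes, in seconds
intervals = gps['unix_ms'].diff().dropna() / 1000
print(intervals.describe())  # the median should be close to one second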

Accelerometer data

The previously mentioned blackclue library already splits the binary .3gf blobs into fields and creates a .3gf.txt. This file is basically a CSV, but separated with a variable number of spaces instead of commas. Python’s built-in csv module has no support for regex separators, but pandas.read_csv does, so I used that, realising it might be a bit overkill for the job.

       0 00000000 0064 0002 0012      0    100      2     18
       1 00000064 0078 fffa 0028    100    120     -6     40
       2 000000dc 0054 0008 000a    220     84      8     10
       3 00000153 0082 fffa 000e    339    130     -6     14
       4 000001b7 009e fffe 0010    439    158     -2     16
       5 0000021b 007a 0004 0004    539    122      4      4
       6 00000286 0076 fffc 0010    646    118     -4     16
       7 000002ed 0088 0006 002e    749    136      6     46
       8 00000351 0078 fffe 0016    849    120     -2     22

The block above shows the first few lines of a .3gf.txt file. This, too, is high-resolution data, with measurements about 10 times per second.

Its columns are:

  • a record index,
  • the milliseconds since the start of the video (first in hex, then in decimal),
  • the forces in three directions as signed 2-byte integers (also in hex, then decimal)

By comparing the data with the associated video, it looks like 1 G of force corresponds to an integer value of 128 (gravity when the car is stationary). Please remember that this is not yet confirmed. The order of the force columns seems to be y, x, z, judging from footage with some action in each direction.

The code looks somewhat like the code block below (full source):

# imports and looping omitted here
columnnames = ['id', 'ms_since_start_hex', 'y_hex',
               'x_hex', 'z_hex', 'ms_since_start',
               'y_int', 'x_int', 'z_int']
df = pandas.read_csv(filename,
                     sep=r'\s+',
                     names=columnnames)
df['y'] = df['y_int'] / 128  # wild guess factor is high here
df['x'] = df['x_int'] / 128  # 1G is assumed to be 128 as integer
df['z'] = df['z_int'] / 128  # based on what the sensor for y registers
df.to_csv(filename.replace('.3gf.txt', '.acc.csv'))

If you just want CSV files with your accelerometer data, run

pip install blackvue_acc

to install the tool and then run

blackvue_acc --to-csv /path/to/your/folder

or

blackvue_acc --to-csv /path/to/your/file.mp4

Connecting over Wi-Fi

A file by johnhamelink that I found on GitHub had some good pointers about the HTTP API the dashcam exposes over Wi-Fi. With those, I wrote a script to list and download all recordings.
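
If you would rather roll your own, the sketch below shows the general idea using requests. The default address (10.99.77.1 when you are connected to the dashcam’s own access point), the blackvue_vod.cgi endpoint and the response format are assumptions based on that community documentation, and may differ per model or firmware:

import os

import requests

DASHCAM = 'http://10.99.77.1'  # assumed default address on the dashcam's own network


def list_recordings():
    """Ask the dashcam for its file list and return the recording paths."""
    response = requests.get(DASHCAM + '/blackvue_vod.cgi')
    response.raise_for_status()
    # each entry is assumed to look like 'n:/Record/<name>.mp4,s:<size>'
    return [line.split(',')[0][2:]
            for line in response.text.splitlines()
            if line.startswith('n:')]


def download_recording(path, target_folder='.'):
    """Download a single recording into the target folder."""
    local_name = os.path.join(target_folder, os.path.basename(path))
    with requests.get(DASHCAM + path, stream=True) as response:
        response.raise_for_status()
        with open(local_name, 'wb') as video_file:
            for chunk in response.iter_content(chunk_size=1024 * 1024):
                video_file.write(chunk)


for recording in list_recordings():
    download_recording(recording)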

Again, installation is as easy as

pip install blackvue_wifi

Downloading all data from a dashcam connected over Wi-Fi can then be done with:

blackvue_wifi --download-all-recordings ./

The dashcam also has upload functionality. That makes it possible to extract the GPS and accelerometer data and save it back onto the device itself. When the video files are later removed from the device to make space for new recordings, the extracted data can persist. You could do this with a permanently connected Raspberry Pi, for example.
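
A rough sketch of what a periodic job on such a Raspberry Pi could look like, simply chaining the command-line tools from this post (this variant keeps only the small CSV files on the Pi and discards the downloaded videos; the folder layout and scheduling are up to you):

# run this periodically, e.g. from cron, on the Raspberry Pi
import pathlib
import subprocess

ARCHIVE = pathlib.Path('/home/pi/dashcam')
ARCHIVE.mkdir(parents=True, exist_ok=True)

# fetch everything currently on the dashcam over Wi-Fi
subprocess.run(['blackvue_wifi', '--download-all-recordings', str(ARCHIVE)], check=True)

# extract GPS and accelerometer data to CSV files next to the videos
subprocess.run(['blackvue_gps', '--to-csv', str(ARCHIVE)], check=True)
subprocess.run(['blackvue_acc', '--to-csv', str(ARCHIVE)], check=True)

# the CSV files are tiny, so the bulky videos can go once they are processed
for video in ARCHIVE.glob('*.mp4'):
    video.unlink()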

Conclusion

BlackVue dashcams are a nice source of data, and with some scripting you can get all of it out in open formats. In a future post, I’ll show ways to combine the data from these sources in an interactive visualisation.

If you wish to contribute features or bug fixes to the blackvue Python packages, here’s a list of issues you could work on:

  • GeoJSON output for blackvue_gps
  • Changing settings of the dashcam with blackvue_wifi
  • Unit tests (and maybe tox configuration) for all packages

Using a Google Spreadsheet as an app's backend. What could possibly go wrong?

TLDR: You could use a Google Spreadsheet as the backend of an app, but you really shouldn’t. Using Python to programmatically update parts of a spreadsheet makes way more sense than using a spreadsheet to get data in Python.

Imagine making an app for someone who wants to distribute the schedule for an event, a schedule that can change often during the event itself. You can’t bundle the schedule in the app, because users might end up with outdated data. Most likely, you’ll set up a REST API with a few endpoints.

Now you need a way for the event organiser to update data. And what administrative interface would be more familiar to an event organiser than a spreadsheet?

What if I told you that you could use a Google Spreadsheet as the database backend for your app? With the Python package gsheets and a web framework like Flask, you can have your API endpoint online in a matter of minutes. You are not even limited to reading data: gsheets can write to the sheet as well.

In the example code below, we want to get a list of sessions, with each session hosting multiple pitches. We do this join by letting columns in the spreadsheet refer to id columns of rows on a different sheet.

from flask import Flask, Response
from gsheets import Sheets
import json
from datetime import datetime

sheets = Sheets.from_files('client_secret.json', 'storage.json')
spreadsheet = sheets['identifier-of-your-spreadsheet-here']
app = Flask(__name__)


def process_sheet(spreadsheet):
    sessions = spreadsheet.find('Sessions')
    pitches = spreadsheet.find('Pitches')
    sessions_merged = []
    for session in sessions.values()[1:]:
        included_pitches = []
        for pitch in pitches.values():
            if pitch[4] == session[0]:  # nested for loops are a very sad and 
                                        # expensive way to join data
                included_pitches.append({"title": pitch[3],
                                         "speaker": pitch[1],
                                         "summary": pitch[2]})
        sessions_merged.append(
            {"name": session[1],
             "start": datetime.strptime(session[3], 
                                        '%d-%m-%Y %H:%M').isoformat(),
             "end": datetime.strptime(session[4], 
                                      '%d-%m-%Y %H:%M').isoformat(),
             "location": session[5],
             "host": session[2],
             "summaries": included_pitches
             }
        )
    return sessions_merged


@app.route('/sessions')
def sessions():
    response = Response(json.dumps(process_sheet(spreadsheet)))
    response.headers['Access-Control-Allow-Origin'] = '*'
    return response
    
    
if __name__ == "__main__":
    app.run()
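
Assuming you save this as app.py (the file name is just an example), trying it out locally is a matter of starting the development server and requesting the endpoint (Flask listens on port 5000 by default):

python app.py
curl http://localhost:5000/sessions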

Obviously, this code is far too brittle to be anywhere near a production system. Any wrong input, unexpected data type, change of column order, empty id column, empty row, or other change could take down the whole API endpoint. The sweet spot for a script like this is rapid prototyping (e.g. at hackathons): the admin backend is just a Google Spreadsheet, saving you valuable time you would otherwise spend developing a boring backend.

Please consider that spreadsheets:

  • typically don’t set data types for columns, but just for cells, making it very unpredictable what you’ll get
  • are not known for their scalability
  • leave you to manage foreign keys by hand

Fair warning: Never YOLO a script like this into production. If you really want to use Google Spreadsheets as your production backend, you should add more checks and safeties (and maybe look into some of Google’s actual database services).

Nevertheless, it’s cool that Google Spreadsheets has an API. In my opinion, using Python to programmatically update parts of a spreadsheet makes way more sense than using a spreadsheet to get data in Python.

Edit: After posting this code snippet, I found an article describing the perfectly named npm module get-sheet-done. With this, you can remove the Flask API from the equation entirely, and interact with the Google Sheets API directly from JavaScript. In this case, your spreadsheet needs to be publicly visible though.