Reviewing GPS Watches for NZ Trail Running – the Why, How, When, Who, and What

Why

Because nobody is reviewing running GPS watches for trail runners in a comprehensive, candid, and objective way in NZ trail conditions. And conditions are everything with respect to GPS trail performance.

If not already obvious, we are reviewing for running the trails – not tri, not road, not cross-fit. This means we prioritise some things more highly than others, such as accuracy under canopy, endurance, and reliability (all day/all conditions). In other words, some features and style are readily conceded if it means better core trail performance.

How

Each review comprises six sections – a summary, field testing, functional tests, a trail-ready feature checklist, a general usability assessment, and long term usage verdicts. These are assessed against a baseline of results from other models reviewed to produce a summary and trail-runner type match.

The summary tries to fit the model to the trail runner, giving an overall synopsis, best runner fit, long term outlook, and the good and bad of the watch.

Field tests are the real-world testing of the GPS watches. A number of standardised tests have been designed to provide meaningful, objective data on positional accuracy, distance, elevation, and battery life in NZ trail conditions. These tests are undertaken and repeated in various terrain and satellite availability conditions, involving 100km+ of running per unit. There’s no substitute for doing the tests in situ, and obsessively repeating them till outcomes are either predictable, or knowingly unpredictable.

The functional tests put key features to work on common tasks like navigation, race pacing, and data exchange.

The feature checklist is made up of the core features we use in training and racing and expect in a trail-ready watch. These are categorised as trail standard, ultra running, and nice-to-have features.

The general usability assessment covers overall design and day-to-day functionality.

Long term usage verdicts are the consensus views of the MEC test lab members who have either logged 500km+ of running with the unit, or were no longer able to cope with the issues encountered and disposed of the unit.

When

The reviews are written in an ongoing, journal-type manner as time permits and as new firmware/web service updates come on board. Not surprisingly these tests take some time, with a couple of hundred kilometres needed for the field tests and 500km+ for the long term verdicts (so don’t hassle us – we have spouses, kids, jobs, and brewing to attend to).

The increasing dependence of features on paired web services, and a modern development cycle of a hardware release followed by multiple firmware updates, also limit the value of a static review. Updates to the review will be clearly identifiable, so as to be clear on any past issues encountered and resolved.

Who

The Maungakiekie Endurance Club (MEC) is made up of a bunch of people who share a love of trail running and other endurance sports. Experience ranges widely, from seasoned ultra-runners to those new to trail. The MEC test lab draws on this collective to provide the long term usage verdicts and match assessments.

What (can you expect)

These reviews are generally undertaken using a sample of one GPS unit per model. Intra-model variability is an unknown, and anecdotal reports without objective testing make interpretation difficult. Unless otherwise stated, all GPS units are sourced via standard retail channels.

A good field test outcome here means the model in question is capable of performing, but good performance for you is not guaranteed. Conversely, a bad outcome may mean either that the model is a poor design, or that performance varies between units of that model (both poor results for consumers).

The field tests were undertaken in a specific set of semi-controlled conditions to enable valid comparisons between models. While the results are generalisable to similar conditions (given a large enough sample), we have also made a reasoned assessment of likely performance in tougher/more fun conditions. Your mileage will definitely vary.

Testing Methodology

Two standard courses were surveyed for distance and absolute XYZ position. The courses were constructed using post-processed differential GPS data, validated and aligned against high resolution aerial photography. Elevation was determined using 0.5m LIDAR (laser-measured elevation) data. Course distances were then confirmed by running each course with a large-diameter calibrated measuring wheel on multiple occasions.

Each course has various sections representing easy and more taxing GPS conditions. Neither course has conditions approaching the full difficulty of typical NZ bush trails, but if performance suffers even in these conditions, then the unit presumably has little hope of accuracy in the real thing.

Lastly, the units are taken out on known but unsurveyed courses representing typically difficult NZ GPS conditions, including gorges, steep terrain, and heavy tree cover. Absolute accuracy is not measured on these outings, but relative performance is assessed over repeated laps and in the context of the other results.

On the surveyed courses, distances and absolute positional errors are measured (how far each recorded track is from the surveyed course). On the difficult unsurveyed courses, relative positional error is assessed over multiple laps (how far each recorded lap is from the others), along with the variation across GPS units.

Each surveyed course is run over multiple laps on multiple days with at least two GPS units in play (to check against truly anomalous GPS conditions). All activities are logged in terms of potential satellite reception (Geometric Dilution of Precision) to ensure results between models are comparable. Additionally, some activities are pre-planned to check performance under intentionally poor GPS coverage conditions.

Sampling over multiple days is important in testing accuracy, as within-session samples are not independent. If you’ve used a fitness GPS for a while, you’ll also know that they all have bad days. Testing needs to be sufficient to capture these occurrences.
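
To make that concrete, here’s a toy sketch in R (made-up numbers only) showing why the day, not the trackpoint, is the sampling unit – a bad day only shows up once sessions are summarised separately:

  set.seed(2)
  # Toy data: 200 per-point errors (metres) from each of three sessions.
  # Points within a session share satellite/atmospheric conditions, so the
  # session (day) is the real sampling unit, not the individual point.
  runs <- data.frame(
    day = rep(c("day1", "day2", "day3"), each = 200),
    err = abs(rnorm(600, mean = rep(c(3, 4, 9), each = 200)))  # day3 is the bad day
  )
  aggregate(err ~ day, data = runs, FUN = median)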

Positional accuracy is measured in a Geographic Information System using raw GPX track points (orthogonal distance from the surveyed course). For these positional tests only, data is pulled from the watch directly to the PC and converted to GPX.
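
For the curious, the calculation is roughly as follows – an R sketch using the sf package, where the file names are placeholders (GDAL’s GPX driver exposes the ‘track_points’ and ‘tracks’ layers):

  library(sf)

  pts    <- st_read("recorded_run.gpx",    layer = "track_points", quiet = TRUE)
  course <- st_read("surveyed_course.gpx", layer = "tracks",       quiet = TRUE)

  # Project to NZ Transverse Mercator (EPSG:2193) so distances come out in metres.
  pts_m    <- st_transform(pts, 2193)
  course_m <- st_transform(course, 2193)

  # Orthogonal (shortest) distance from each recorded point to the surveyed line.
  err <- as.numeric(st_distance(pts_m, course_m))

  quantile(err, c(0.5, 0.95))  # median and 95th percentile positional error (m)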

Distance and elevation are taken as recorded on the watch, since these are filtered/smoothed according to the watch’s smarts and give different results from the raw GPX. Filtering and smoothing can really improve pace and distance figures as the rubbish GPS data is discarded, but can only do so much with dodgy data. Distance accuracy is measured by comparing lap distances as reported on the watch against the surveyed course distance. Elevation is assessed by measuring watch-reported climb against the surveyed course climb.
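
The comparison itself is just percentage error. A worked R example with hypothetical numbers:

  surveyed_km    <- 7.42   # hypothetical wheel/DGPS-confirmed course distance
  surveyed_climb <- 286    # hypothetical LIDAR-derived course climb (m)

  watch_km    <- c(7.31, 7.38, 7.52, 7.29)  # lap distances as reported on-watch
  watch_climb <- c(305, 292, 310, 298)      # total ascent per lap as reported on-watch

  round(100 * (watch_km    - surveyed_km)    / surveyed_km,    1)  # distance error (%)
  round(100 * (watch_climb - surveyed_climb) / surveyed_climb, 1)  # climb error (%)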

Pace (where reported) is assessed by running with a custom measuring wheel; current pace is visually checked at steady pace and during changes of pace. The recorded pace is then compared across units using SportTracks to check for lag, spikes, and over-smoothing.
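
Lag between units can also be estimated numerically. A toy R sketch using cross-correlation, with synthetic data in which unit B is a deliberately 5 second late copy of unit A:

  set.seed(1)
  pace_a <- 5 + cumsum(rnorm(600, sd = 0.02))       # 10 minutes of 1Hz pace (min/km)
  pace_b <- c(rep(pace_a[1], 5), head(pace_a, -5))  # same pace, reported 5s late

  xc <- ccf(pace_a, pace_b, lag.max = 30, plot = FALSE)
  xc$lag[which.max(xc$acf)]  # ~ -5: unit B trails unit A by 5 seconds

  sum(pace_a < 2)  # spike check: count implausible sub-2min/km readings (0 here)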

As well as quantifiable results, the form of the GPX tracks is also described. This provides some insight into the kinds of errors affecting positional and distance accuracy. Tracks are described in terms of smoothing (corner cutting), random scatter, and shadowing (tracks running parallel to the actual position). Trackpoint density clouds also provide a visual summary of GPS performance.
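
A density cloud is simple to produce from the projected trackpoints. A rough R sketch (file name again a placeholder):

  library(sf)

  pts <- st_read("recorded_run.gpx", layer = "track_points", quiet = TRUE)
  xy  <- st_coordinates(st_transform(pts, 2193))  # NZTM metres

  # A tight band along the course means consistent fixes; a smear means random
  # scatter; a parallel band offset from the course suggests shadowing.
  plot(xy[, 1], xy[, 2], pch = ".", asp = 1,
       xlab = "easting (m)", ylab = "northing (m)")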

The battery run-down test is pretty much as it sounds. With the watch on its most accurate setting and an HR sensor paired, we run them to exhaustion in real trail conditions. If opportunities allow, the watch will also be tested with battery-saving settings on.
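
Recorded runtime can then be read straight off the resulting activity file. A small R sketch using xml2 (file name is a placeholder):

  library(xml2)

  gpx    <- read_xml("rundown_activity.gpx")
  stamps <- xml_text(xml_find_all(gpx, ".//d1:time", xml_ns(gpx)))  # d1 = GPX namespace
  t      <- as.POSIXct(stamps, format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")

  difftime(max(t), min(t), units = "hours")  # first to last recorded point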

Software Used

Windows desktop software used for the testing includes QGIS, R, SportTracks, GPS Track Editor, and Bipolar. Relevant manufacturer desktop apps may also be used (e.g. Garmin’s BaseCamp).

Functional Tests

Just cos there is a vague reference in the brochure doesn’t mean it does what you think it’s going to do. We put the watch to a number of set tasks involving data exchange, navigation, and pacing.

Data exchange is moving the data off the watch to native and non-native services and applications, via both mobile and desktop apps. Currently this includes Strava and SportTracks. The native web service is also checked to see whether data can be extracted individually or in bulk, and whether data from other GPS models can be imported to the service in full fidelity.

Navigation includes a set task of creating a trail route (one that auto-follows trails) and waypoints against a topo map. The course should have waypoint alert functionality cooked in, as on-watch breadcrumbs by themselves don’t help a whole lot when things get tough (sometimes we need all the assistance we can get). Additional functionality is also checked – a course elevation profile, elevation on waypoints, waypoint autolapping, and distance-to-waypoint and ETA figures that respect the planned course. A basic ‘get me back to the start’ navigation function is checked. Waypoints also need to be able to be effectively collected and managed (imported and exported) from the watch or web service.

Race pacing is simply a check of what function the watch has to run against a set race pace or time, or alternatively against a past recorded event (with changing pace).

Feature Checklist

Watches are assessed against a set of standard trail relevant features we consider core or pretty useful.

For us, a design consideration critical to lifelong athletes is adherence to open standards in both sensors and data formats. For one thing, you tend to collect sensors. More importantly though, you amass data, and having it locked up in a proprietary format or service doesn’t make for a good long term strategy. The ability to manage our data independently of a manufacturer’s software and services is fundamental.
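
That independence is easy to sanity check – if the exported file is standard GPX, a few lines of generic R (xml2, no vendor software) will read it. A rough sketch, assuming every trackpoint carries an elevation:

  library(xml2)

  gpx    <- read_xml("exported_activity.gpx")  # placeholder file name
  ns     <- xml_ns(gpx)                        # the GPX default namespace maps to d1
  trkpts <- xml_find_all(gpx, ".//d1:trkpt", ns)

  track <- data.frame(
    lat = as.numeric(xml_attr(trkpts, "lat")),
    lon = as.numeric(xml_attr(trkpts, "lon")),
    ele = as.numeric(xml_text(xml_find_all(trkpts, "./d1:ele", ns)))
  )
  head(track)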

General Trail Running Feature Set
  • GPS accuracy under canopy
  • Consistent GPS performance
  • Rapid GPS Acquisition
  • HRM
  • Cadence option
  • Battery 8hr (with HRM)
  • Barometer (for altitude)
  • Basic breadcrumb with waypoint navigation
  • Vibration alerts
  • Trail legible display
  • Open data access
  • Sensor Standards Compliant
Standard Ultra Feature Set (as per trail running, plus)
  • Battery 14hr+ with HRM and high accuracy recording
  • Battery 24hr+ with HRM and down-sampling
  • Electronic compass
Nice to Have Features
  • Mobile uploads (Android/Apple)
  • Cadence (without footpod)
  • HRV (R-R) recording with recovery estimate/test
  • Footpod GPS override option
  • Basic interval workout ability
  • Pacing function
  • Position/waypoint autolapping
  • Feed/drink or run/walk timing reminders
  • Everyday watch with
    • Activity Tracking
    • Mobile notifications