Google Smartphone Decimeter Challenge

Last year, Google hosted a Kaggle competition to see who could generate the most accurate solutions for a large number of raw observation data sets collected with Android phones on vehicles driven around the Bay Area. The phones were located on the front dash inside the vehicles, and no ground planes were used under the phones, making the data far more challenging than the previous static cell phone data sets I have explored.

I took a quick look at the data when they first posted it, but felt that, given the low quality of the collection environment, the data was outside the scope of what RTKLIB could reasonably handle, and so I did not pursue it. Now that I have gained some experience working with a few other cell phone data sets, I thought it might be time to take another look at this data. In the interest of full disclosure, this second look was also motivated by a very generous contribution by Google to support and maintain the demo5 RTKLIB code.

This turned out to be a very useful exercise. Most of my efforts with RTKLIB have been focused on improving the ambiguity resolution, but in this case the errors were too large for ambiguity resolution to be reliable, so I had to focus entirely on improving the float solution. The large and frequent errors severely stressed the RTKLIB solution code and exposed several weaknesses that do not normally show up with the cleaner raw data used for more typical precision solutions. So I am thankful to Google not just for their contribution but also for encouraging me to perform this exercise.

The input data for the competition is available here. The raw data files are in the Android GNSSLogger raw format and include IMU and other sensor data from the phones in addition to the GNSS data. As a starting point, Google has also included standard RINEX files converted from the GNSSLogger files, as well as a very good set of standard precision GNSS solutions, considering the difficulty of the data. The goal of the competition is to use any combination of the provided input data to generate a set of solutions with the minimum error when compared to the ground truths. The data is divided into two sets: the training data, for which the ground truths are included, and the test data, for which they are not. The intent is for competitors to develop and evaluate their algorithms using the training data, then submit their solutions for the test data. Submissions are scored using the mean of the 50th and 95th percentile distance errors for the test data set.
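The scoring metric is easy to reproduce if you want to check your own solutions against the training ground truths. Here is a small self-contained sketch (the helper names are mine, and the percentile uses linear interpolation; the contest may use a slightly different percentile method):

```python
def percentile(values, pct):
    """Linearly interpolated percentile of a list of numbers."""
    v = sorted(values)
    idx = (len(v) - 1) * pct / 100.0
    lo = int(idx)
    hi = min(lo + 1, len(v) - 1)
    return v[lo] + (idx - lo) * (v[hi] - v[lo])

def contest_score(errors_m):
    """Mean of the 50th and 95th percentile distance errors, in meters."""
    return 0.5 * (percentile(errors_m, 50) + percentile(errors_m, 95))
```

Here errors_m would be the list of horizontal distance errors, one per solution epoch, computed against the ground truths.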

For my exercise I will focus only on providing a set of PPK baseline solutions similar to the baseline solutions that Google provided and for the most part leave the post-processing filtering and other opportunities for others to explore. The one exception I will make is that I did find a few of the data sets were too poor to resolve with RTKLIB so I will merge the solutions from all phones for each ride into a single combined solution, ignoring the small distance between phones. More details on this later.

The baseline solutions have the advantage of being able to use satellites from all the constellations in the raw data (GPS, GLONASS, Galileo, Beidou, and QZSS) while the PPK solutions can only use satellites with matching observations in the local base data (GPS, GLONASS, and Galileo in this case). However, the PPK solutions have the advantage that by differencing the observations between the smartphones and a nearby base station, they can cancel out most of the atmospheric, orbital, and clock errors. In addition, the baseline solutions use only the pseudorange measurements while the PPK solutions can take advantage of the carrier phase measurements as well. In general, the advantages from differencing with the base data and using the carrier phases should outweigh the disadvantage of having fewer satellites, and I would expect the PPK solutions to be more accurate than the baseline solutions, but we’ll see.

So let’s get started.

Since I am going to generate PPK solutions, the first thing I need is some nearby base observations. For these, I downloaded raw observation data for the appropriate dates and times for a nearby CORS station from the NOAA National Geodetic Survey website. There were several different CORS stations I could have used but I chose the SLAC station because it was reasonably close to all of the data collection rides and contained Galileo observations in addition to GPS and GLONASS.

I also needed satellite navigation data for each data set so I downloaded the BRDM files from the International GNSS Service website. This is the easiest place I have found to get navigation files that contain data for all of the GNSS constellations. The CORS data I downloaded above included navigation data for GPS and GLONASS, but not for Galileo.

Next, let’s look at the raw observation data from the smartphones. One challenge when working with any raw receiver data with RTKLIB is that there is always more information in the raw data than can be translated directly into either the RINEX format or the RTKLIB internal variables. This means that to take advantage of this additional data, the conversion process needs to be more than just a simple translation; it also needs to include some interpretation and possible consolidation of the data. For this reason, I chose not to start with the provided RINEX files, which can be read directly by RTKLIB, but instead to translate the raw GNSSLogger files to RINEX using a python script. This gives us the opportunity to take advantage of the additional information in the raw file before it is discarded.

As a starting point, I used a python script from Rokubun that I found on GitHub to do the GNSSLogger-to-RINEX conversion. I modified it to use the same set of rules that Google described for the pre-processing of the raw data for their baseline solutions. Interestingly, the RINEX files they provided did not appear to follow this set of rules. In addition to a few small tweaks to these rules, I made two more significant changes. First, I ignored all of the cycle slip and half cycle ambiguity flags in the raw data. These appear to be too conservative and caused RTKLIB to throw out too much useful data.

I also embedded the receiver’s pseudorange and carrier phase uncertainty estimates into the older, mostly unused legacy SNR field, using the same format that the demo5 code uses to do the same thing for the u-blox raw data. I ended up not using the uncertainty estimates in my solution but they are available for future exploration. In the latest demo5 code, observation weighting can be based on any arbitrary combination of elevation, SNR, and, when available, receiver uncertainty, but for this solution I used the standard elevation-only weighting.
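I won't reproduce the exact demo5 bit layout here, but the general idea is simply to quantize the uncertainty estimate and carry it in otherwise-unused resolution of the SNR field. A deliberately hypothetical packing scheme (NOT the actual demo5 format) might look like:

```python
def pack_snr(snr_dbhz, phase_std_m):
    """Hypothetical packing, not the demo5 format: carry a 1 cm quantized
    uncertainty index (0-9) in the fractional digit of the SNR field."""
    idx = min(9, int(round(phase_std_m * 100)))  # 1 cm steps, capped at 9
    return float(int(snr_dbhz)) + idx / 10.0

def unpack_snr(field):
    """Recover the integer SNR and the uncertainty index."""
    snr = int(field)
    return snr, round((field - snr) * 10)
```

The trade-off of any scheme like this is that the SNR loses resolution, which is why it only makes sense in a field that the solution is not otherwise using.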

I have included the modified python script along with all the other files required to duplicate my results in a “smartphone” release package of the demo5 code available here.

This python script will convert each GNSSLogger raw data file into a RINEX file that can then be processed with RTKPOST or RNX2RTKP. I started with RTKPOST to develop a solution configuration file on a couple of datasets, then used RNX2RTKP to batch process all of the data sets. I used a modified version of the batch-processing python script that I described in my last post to do both the GNSSLogger-to-RINEX conversions and to run the RTKLIB PPK solutions.

I started with the b34d version of the demo5 RTKLIB code and the config file I used for the previous cell phone data analysis but ended up needing to make some changes to both the code and the config file. Some of the changes are in the b34e RTKLIB code but others are only in the “smartphone” release of the code since I haven’t yet confirmed that they don’t cause any issues with more typical PPK/RTK solutions.

The most significant change to the RTKLIB code was to enable cycle slip detection using the doppler raw measurements. This feature has been in the code for many years but has been commented out because it is unable to distinguish between clock jumps and cycle slips. By rewriting the function to process all satellites in a single call instead of just a single satellite I was able to remove the common-mode effect of the clock jumps. This feature compares the doppler measurement to the change in carrier phase measurement since the doppler shift of a given signal is the time derivative of the carrier phase. It flags a cycle slip if the difference between these two values exceeds a user-configurable threshold.

Because I removed the cycle slip flags from the RINEX files, it was very important to have a reasonably reliable alternative method to detect the cycle slips. The existing geometry-free detection method works quite well for satellites with dual-frequency measurements, but many of the satellites in these data sets have only single frequency measurements, so this test cannot be applied. By responding only to verified cycle slips instead of every flagged potential slip, the code is much better able to preserve the phase bias estimates of each satellite.
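The idea can be sketched in a few lines of Python. This is a simplified illustration of the approach, not the actual demo5 C implementation; here the common-mode clock jump is estimated with a simple median across satellites, and the function names are mine:

```python
def detect_doppler_slips(dphase, doppler, dt, thresh):
    """Flag cycle slips by comparing each satellite's carrier phase change
    (cycles) over an epoch against the change predicted from its doppler
    measurement. Processing all satellites in one call lets the common-mode
    receiver clock jump be estimated and removed before thresholding."""
    # RINEX sign convention: positive doppler means the phase is decreasing
    pred = [-d * dt for d in doppler]
    resid = [dp - p for dp, p in zip(dphase, pred)]
    # estimate the common clock jump as the median residual across satellites
    clk = sorted(resid)[len(resid) // 2]
    return [abs(r - clk) > thresh for r in resid]
```

With this structure, a clock jump shifts every satellite's residual by the same amount and cancels out, so only a satellite whose phase jumped relative to the others is flagged.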

In addition to this change to the RTKLIB code, I made a few other minor changes which allowed the code to degrade a little more gracefully in the presence of very low quality raw observations.

I started with the configuration file I used in my recent post for the Xiaomi Mi8 static cell phone data sets but made the following changes:

Positioning mode: static -> kinematic
GNSS constellations: disable Beidou
Integer Ambiguity Res: on -> off
Slip Thresh: Geom-Free: 0.05 -> 0.10
Slip Thresh: Doppler: N/A -> 5.0
Time Format: hms -> tow
Phase Error Ratio L1/L5: 1500/300 -> 300/100
Phase Error a+b: 0.006/0.006 -> 0.003/0.003
Carrier Phase Bias: 0.001 -> 0.01

Switching the position mode from static to kinematic is self-explanatory. I disabled Beidou because the base observations did not include it. I disabled ambiguity resolution because the errors in this data are too large to reliably resolve the ambiguities, and we would just end up with many false fixes. I also increased the threshold for geometry-free slips because of the larger errors and added a new parameter for the new doppler slip detection. I switched the time format in the solution file simply because it was more compatible with the format used in the Google baseline files. I have less solid justification for the last three changes, and am also less certain that they are optimal, but I was trying to increase the weighting of the pseudorange measurements while also accounting for the lower confidence that the carrier phase biases remain constant, due to undetected slips. There was also a bit of trial and error on a couple of the data sets with these parameters before applying them to the full data set. If you make a close comparison between these config files, you will notice that a few other parameters also changed, but these are all related to the ambiguity resolution, which is turned off, so they have no effect on the results. The config file with all of these changes is included in the release package.
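In RTKLIB config-file syntax, the first few of these changes correspond roughly to the lines below (key names as I recall them from demo5 config files; the threshold key for the new doppler slip detection is defined by the smartphone release, so check the config file included in the release package for the exact names and values):

```
pos1-posmode       =kinematic   # static -> kinematic
pos2-armode        =off         # integer ambiguity resolution off
pos2-slipthres     =0.10        # geometry-free slip threshold
out-timeform       =tow         # solution time format hms -> tow
```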

With these changes I was able to generate PPK solutions for all of the data in the training set. Comparing these to the provided ground truths showed that the majority of the solutions were reasonably good relative to the provided baseline solutions, but a few were very, very poor.

Plotting the raw observations for the poor quality solutions showed that they were distinctly worse than the others. The plot below shows excerpts of the raw observations for two phones from the same ride. I used the RINEX files provided by Google for these plots since they include the receiver flagged cycle slips (red ticks). The two phones were placed next to each other on the front dash but provided very different results. The bad data did not follow any particular model of phone, and every ride had at least one good data set, so I am not sure what caused these bad data sets. The baseline solutions provided by Google did not seem to be affected nearly as much by this, presumably because they relied primarily on the pseudorange measurements and not the carrier phase.

Pixel4 and Pixel4Modded raw observations for 1/4/21 RWC-1 drive

Since every ride had at least one usable set of raw observations I decided to replace the individual phone solutions for each ride with a single solution created with a weighted average of the individual solutions for that ride. This does introduce some error because I ignored the small distance between phones (typically ~ 0.2 meters) but these errors are still relatively small compared to the total errors. Each solution point was weighted by the inverse of the RTKLIB estimated variance for that point.
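A minimal sketch of this merge step for a single epoch, assuming each phone's solution has already been aligned to a common time base (the helper and its signature are mine, not from the release scripts):

```python
def merge_epoch(positions, variances):
    """Inverse-variance weighted average of per-phone position estimates
    at one epoch. positions is a list of coordinate lists (e.g. ECEF x/y/z),
    variances the corresponding RTKLIB variance estimates."""
    weights = [1.0 / v for v in variances]
    wsum = sum(weights)
    return [sum(w * p[i] for w, p in zip(weights, positions)) / wsum
            for i in range(len(positions[0]))]
```

A phone whose RTKLIB variance estimate is three times larger contributes a third of the weight, so the poor data sets are largely suppressed without being discarded outright.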

Using the rules of the competition to calculate phone-averaged errors relative to the ground truths, the resulting errors for the training set were 1.37 meters for the 50th percentile, 5.35 meters for the 95th percentile, and 3.36 meters for the average of the two.

This compares to a 50th percentile of 2.43 meters, a 95th percentile of 9.62 meters, and an average of 6.02 meters for the Google baseline solutions. This comparison isn’t quite fair since the baseline solutions did not include the phone merge. However, the baseline solutions don’t include accuracy estimates for each point, so there is no easy way to merge them.

I then ran the same process to generate merged PPK solutions for the test data set provided by Google. There are no ground truths included for this data so I cannot directly calculate the errors. However, I did submit the results to Kaggle to get a combined 50th/95th percentile score for both the Google baseline solutions and the RTKLIB PPK solutions. The contest ended over five months ago, so new entries are not recorded, but we can compare how our solutions would have done relative to the final results.

The Google baseline solutions returned a score of 5.42 meters, which would have put us in a 35-way tie for 692nd place out of 810 competitors. Apparently many competitors did not get past simply submitting the given input as their output.

The RTKLIB PPK phone-merged solutions returned a score of 2.15 meters which would have given us 5th place. Not too bad for a baseline with very little post-processing! Presumably this could be improved a fair bit with some of the many other techniques competitors used to improve the Google baselines.

I’ve given the “private leaderboard” scores from Kaggle here since that is what was used to determine the winners of the contest. The “public leaderboard” score is determined from a different slice of the test data set and was not ranked as high, probably because it included more urban data which benefits more from the post-processing techniques.

In most cases I would be very disappointed with PPK solution errors measured in meters, not centimeters, but in this case, given the extremely challenging data, I was just happy that RTKLIB was able to converge to any kind of reasonable answer.

Kaggle results for RTKLIB PPK baseline solutions

I’ve included the merged and unmerged baseline PPK solution files as “baseline_locations_merged_test_1230.csv” and “baseline_locations_test_1230.csv” in the release package. These files are in the same format as the Google-provided baseline file “baseline_locations_test.csv”, so for anyone who competed in the competition it should be straightforward to substitute this baseline file in place of the Google baseline file. If you do run this exercise, I would be interested in hearing your results, so please leave a comment.

For those of you who would like to duplicate my results, here is a brief summary of the steps required. All of the necessary files are included in the above-mentioned release package. Note that you should review and adjust the input parameters at the top of each python file to make sure they match your file structure. The parameters are set up to calculate the solutions for the test data but can be modified to specify the training data. The python scripts are located in the python folder, and the base station observation files, navigation files, configuration file, and solution files are in the Google folder.

  1. If not already done, download and unzip the Google datasets.
  2. Download the demo5 RTKLIB smartphone release package.
  3. Copy the base observation, navigation and configuration files from the RTKLIB package into the raw data file folders.
  4. Run “” to generate the RINEX files and solution files.
  5. Run “” to create the unmerged baseline solutions file.
  6. Run “” to merge the individual phone solutions into a baseline file with combined solutions.
  7. Run “” to generate a file for submission to Kaggle.

I have heard that Google will be running the competition again this year, so for those of you who missed it last year, you will have another chance to compete. I hope to be more involved this time. Although I don’t think I will compete myself, I would like to use this post as a starting point to put together some tools and information to make it easier for others to use RTKLIB in the competition.

Batch processing RTKLIB solutions with RNX2RTKP and Python

The RTKPOST GUI app in RTKLIB is a great tool for interactively exploring post-processed PPK or PPP solutions. However, once you have settled on a solution configuration and simply want to run that solution on many data sets, RTKPOST is not the right tool. The RNX2RTKP command line app is a better choice but will still only run a single solution at a time. If your goal is to run the same solution configuration on many data sets, then you will need to add some kind of wrapper to call the RNX2RTKP app multiple times.

I usually use a python script to do this. Below I have included a simple script that I used to process the large number of cellphone data sets in my previous post. It is configured to run in Windows but, with minor modifications, it should run in Linux as well. It is not meant to be used as is, but is a template you can use to write your own script to match the structure of your data.

"""
batch_rnx2rtkp - template to run multiple simultaneous solutions of rnx2rtkp in Windows.
    This example is configured to run the cellphone data sets available at
"""

import os
import subprocess
import psutil
import time

# set location of data and rnx2rtkp executable
datapath = r'C:\gps\data\cellphone\Mi8_Julien\0521_dataset'
binpath = r'C:\gps\rtklib\bins\demo5_b34d'

# Choose datasets to process
DATA_SET = 0  # 0=open-sky, 1=partial forest, 2=forest
USE_GROUND_PLANE = True  # True or False

# set input files
cfgs = ['ppk_ar_1027_snr24']  # list of config files to run (files should have .conf ext)
rovFile = 'GEOP*.19o'  # rover files with wild cards
baseFile = 'BBYS*.19o' # base files with wild cards
navFile = r'"..\brdm*.19p"' # navigation files with wild cards
outFile = 'ppk_ar_1027'

# set maximum number of simultaneous occurences of rnx2rtkp to run
max_windows = 10 

# get list of current threads running in Windows
current_process = psutil.Process()
num_start = len(current_process.children())

# get list of date folders in data path
dates = os.listdir(datapath)

# loop through date folders
for date in dates:
    datepath = datapath + '/' + date
    # Filter which folders to process
    if not os.path.isdir(datepath):  # skip if not a folder
        continue
    if USE_GROUND_PLANE:
        if date[-2:] != 'gp':  # skip folders without ground plane tag in name
            continue
    else: # no ground plane
        if date[-2:] == 'gp':  # skip folders with ground plane tag in name
            continue

    # Get list of datasets in this date folder
    datasets = os.listdir(datapath + '/' + date)
    # Select desired folder in data set 
    dataset = datepath + '/' + datasets[DATA_SET] # 0=open-sky, 1=partial forest, 2=forest 
    # Run a solution for each config file in list       
    for cfg in cfgs:
        # create command to run solution
        rtkcmd=r'%s\rnx2rtkp -x 0 -y 2 -k ..\..\%s.conf -o %s.pos %s %s %s' % \
            (binpath, cfg, outFile + '_' + cfg, rovFile, baseFile, navFile)    
        # launch command in a new process so multiple solutions run in parallel
        subprocess.Popen(rtkcmd)

    # if max windows open, wait for one to close
    while len(current_process.children())-num_start >= max_windows:
        time.sleep(1) #wait here for existing window to close

# Wait for all solutions to finish
print('Waiting for solutions to complete ...')  
while len(current_process.children())-num_start > 0:
    time.sleep(1) # wait here for remaining solutions to finish

RTKLIB is a single-threaded app, so it will not take advantage of multiple processors on a single computer. To get around this limitation and maximize the use of all processors, the script launches a separate process for each RTKLIB solution until a maximum number of simultaneous processes has been reached, and then waits for a prior process to complete before launching a new solution. For my typical laptop, I find that setting the maximum number of simultaneous windows to 10 works fairly well, but this number can be adjusted up or down based on the processing power of your computer.

In this case I have run only one solution for each dataset but the script is set up to run as many different solutions as desired. Just add an additional config file name to the list of config files for each desired solution. Be aware that the list of config files leaves off the file extensions. The actual config files should all have a “.conf” extension.

Note that the list of input files uses wildcards in the file names in many cases, since the file names for each data set will likely vary slightly from data set to data set.
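In the script above the wildcards are passed straight through to rnx2rtkp. If you ever need the matching file names inside python instead, for example to check that a data set is complete before launching a solution, the standard glob module does the same kind of expansion (this snippet builds a throwaway folder just to demonstrate; the file names are made-up examples):

```python
import glob
import os
import tempfile

# build a throwaway folder with a couple of RINEX-style file names
folder = tempfile.mkdtemp()
for name in ('GEOP085a.19o', 'GEOP086a.19o', 'readme.txt'):
    open(os.path.join(folder, name), 'w').close()

# expand the same kind of wildcard pattern the script uses for the rover files
rov_files = sorted(glob.glob(os.path.join(folder, 'GEOP*.19o')))
rov_names = [os.path.basename(f) for f in rov_files]  # only the GEOP files match
```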

In my example, the datasets contain three different types of environment (open sky, partial forest, and forest) and two different antenna configurations (ground plane or no ground plane). I have set the script up to only run the data for one type of environment and one antenna type as specified in the input parameters at the top of the script. However, this can be modified to filter the datasets in any way desired or to just run all of them.

There are probably faster and more elegant ways to do this, but if you are just looking for something simple to allow you to quickly run a given solution or set of solutions on many data sets, then you may find this useful.
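As one example of a more compact alternative, Python's standard library thread pool can cap the number of concurrent rnx2rtkp processes without the psutil window counting. This is a sketch, not a drop-in replacement for the script above:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_all(cmds, max_workers=10):
    """Run shell commands in parallel, at most max_workers at a time.
    subprocess.run blocks each worker thread, so the pool size itself
    limits how many processes run simultaneously."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(
            lambda cmd: subprocess.run(cmd, shell=True).returncode, cmds))
```

Each entry in cmds would be one complete rnx2rtkp command line, like the rtkcmd string built in the script above, and the returned list of exit codes makes it easy to spot data sets whose solution failed.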