Google Smartphone Decimeter Challenge

Last year, Google hosted a Kaggle competition to see who could generate the most accurate solutions for a large number of raw observation data sets collected with Android phones on vehicles driven around the Bay Area. The phones were located inside the vehicle on the front dash and they did not use ground planes under the phones, making the data far more challenging than the previous static cell phone data sets I have explored.

I took a quick look at the data when they first posted it, but felt that the quality was outside the scope of what RTKLIB could reasonably handle, given the low quality of the collection environment, and so I did not pursue it. Now that I have gained some experience working with a few other cell phone data sets, I thought it might be time to take another look at this data. In the interest of full disclosure, this second look was also motivated by a very generous contribution by Google to support and maintain the demo5 RTKLIB code.

This turned out to be a very useful exercise. Most of my efforts with RTKLIB have been focused on improving the ambiguity resolution but in this case the errors were too large for ambiguity resolution to be reliable so I had to focus entirely on improving the float solution. The large and frequent errors severely stressed the RTKLIB solution code and exposed several weaknesses that do not normally show up with the cleaner raw data used for more typical precision solutions. So I am thankful to Google not just for their contribution but also for encouraging me to perform this exercise.

The input data for the competition is available here. The raw data files are in the Android GNSSLogger raw format and include IMU and other sensor data from the phones in addition to the GNSS data. As a starting point, Google has also included standard RINEX files converted from the GNSSLogger files, as well as a set of standard precision GNSS solutions that is very good considering the difficulty of the data. The goal of the competition is to use any combination of the provided input data to generate a set of solutions with the minimum error when compared to the ground truths. The data is divided into two sets: the training data, for which the ground truths are included, and the test data, for which they are not. The intent is for competitors to develop and evaluate their algorithms using the training data, then submit their solutions for the test data. Submissions are scored using the mean of the 50th and 95th percentile distance errors for the test data set.
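As a rough illustration of the scoring rule, here is a minimal Python sketch. The function names, the linear-interpolation percentile, and the haversine distance with a spherical Earth radius are my own assumptions, not Kaggle's actual scoring code.

```python
import math

def percentile(values, pct):
    """Percentile of a list using linear interpolation between sorted samples."""
    data = sorted(values)
    if not data:
        raise ValueError("need at least one value")
    rank = (len(data) - 1) * pct / 100.0
    lo = math.floor(rank)
    hi = math.ceil(rank)
    return data[lo] + (data[hi] - data[lo]) * (rank - lo)

def horizontal_error_m(lat1, lon1, lat2, lon2):
    """Haversine distance in meters between two lat/lon points given in degrees."""
    radius = 6371000.0  # mean Earth radius (spherical approximation)
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))

def competition_score(errors_m):
    """Mean of the 50th and 95th percentile distance errors."""
    return 0.5 * (percentile(errors_m, 50) + percentile(errors_m, 95))
```

Each solution point would first be turned into a distance error against the matching ground truth point with `horizontal_error_m`, and the resulting list passed to `competition_score`.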

For my exercise I will focus only on providing a set of PPK baseline solutions similar to the baseline solutions that Google provided and for the most part leave the post-processing filtering and other opportunities for others to explore. The one exception I will make is that I did find a few of the data sets were too poor to resolve with RTKLIB so I will merge the solutions from all phones for each ride into a single combined solution, ignoring the small distance between phones. More details on this later.

The baseline solutions have the advantage of being able to use satellites from all the constellations in the raw data (GPS, GLONASS, Galileo, Beidou, and QZSS) while the PPK solutions can only use satellites with matching observations in the local base data (GPS, GLONASS, and Galileo in this case). However, the PPK solutions have the advantage that by differencing the observations between the smartphones and a nearby base station, they can cancel out most of the atmospheric, orbital, and clock errors. In addition, the baseline solutions use only the pseudorange measurements while the PPK solutions can take advantage of the carrier phase measurements as well. In general, the advantages from differencing with the base data and using the carrier phases should outweigh the disadvantage of having fewer satellites, so I would expect the PPK solutions to be more accurate than the baseline solutions, but we’ll see.

So let’s get started.

Since I am going to generate PPK solutions, the first thing I need is some nearby base observations. For these, I downloaded raw observation data for the appropriate dates and times for a nearby CORS station from the NOAA National Geodetic Survey website. There were several different CORS stations I could have used but I chose the SLAC station because it was reasonably close to all of the data collection rides and contained Galileo observations in addition to GPS and GLONASS.

I also needed satellite navigation data for each data set so I downloaded the BRDM files from the International GNSS Service website. This is the easiest place I have found to get navigation files that contain data for all of the GNSS constellations. The CORS data I downloaded above included navigation data for GPS and GLONASS, but not for Galileo.

Next, let’s look at the raw observation data from the smartphones. One challenge when working with any raw receiver data with RTKLIB is that there is always more information in the raw data than can be translated directly into either the RINEX format or the RTKLIB internal variables. This means that to take advantage of this additional data, the conversion process needs to be more than just a simple translation. It also needs to include some interpretation and possible consolidation of the data. For this reason, I chose not to start with the provided RINEX files, which can be read directly by RTKLIB, but instead to translate the raw GNSSLogger files to RINEX using a python script. This gives us the opportunity to take some advantage of the additional information in the raw file before it is discarded.

As a starting point, I used a python script from Rokubun that I found on Github to do the GNSSLogger to RINEX conversion. I modified it to use the same set of rules that Google described for the pre-processing of the raw data for their baseline solutions. Interestingly, the RINEX files they provided did not appear to follow this set of rules. In addition to a few small tweaks to these rules, I made two more significant changes. First, I ignored all of the cycle slip and half cycle ambiguity flags in the raw data. These appear to be too conservative and caused RTKLIB to throw out too much useful data.
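To make the flag handling concrete, here is a small Python sketch of how the Android `AccumulatedDeltaRangeState` bits might be mapped to the RINEX loss-of-lock indicator (LLI), with a switch to ignore the slip and half cycle flags. The bit values are the Android API constants. The function name, the exact mapping to LLI bits, and the choice to still require the valid bit are my assumptions and are not taken from the modified conversion script.

```python
# Bit values of Android's GnssMeasurement AccumulatedDeltaRangeState field
ADR_STATE_VALID = 0x01
ADR_STATE_RESET = 0x02
ADR_STATE_CYCLE_SLIP = 0x04
ADR_STATE_HALF_CYCLE_RESOLVED = 0x08
ADR_STATE_HALF_CYCLE_REPORTED = 0x10

def rinex_lli(adr_state, honor_flags):
    """Return the RINEX LLI value for a phase observation, or None to drop it.

    honor_flags=False mimics ignoring the receiver's slip and half cycle
    flags, leaving slip detection to the downstream solution code.
    """
    if not adr_state & ADR_STATE_VALID:
        return None  # assumption: phase without the valid bit is not written
    if not honor_flags:
        return 0
    lli = 0
    if adr_state & (ADR_STATE_RESET | ADR_STATE_CYCLE_SLIP):
        lli |= 1  # LLI bit 0: loss of lock / possible cycle slip
    reported = adr_state & ADR_STATE_HALF_CYCLE_REPORTED
    resolved = adr_state & ADR_STATE_HALF_CYCLE_RESOLVED
    if reported and not resolved:
        lli |= 2  # LLI bit 1: half cycle ambiguity possible
    return lli
```

With `honor_flags=False`, every valid phase measurement is written with a clean LLI, which is why a reliable alternative slip detector becomes important.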

Second, I embedded the receiver’s pseudorange and carrier phase uncertainty estimates into the older, mostly unused legacy SNR field using the same format that the demo5 code uses to do the same thing for the u-blox raw data. I ended up not using the uncertainty estimates in my solution but they are available for future exploration. In the latest demo5 code, observation weighting can be based on any arbitrary combination of elevation, SNR, and, when available, receiver uncertainty, but for this solution I used the standard elevation-only weighting.

I have included the modified python script along with all the other files required to duplicate my results in a “smartphone” release package of the demo5 code available here.

This python script will convert each GNSSLogger raw data file into a RINEX file that can then be processed with RTKPOST or RNX2RTKP. I started with RTKPOST to develop a solution configuration file on a couple of datasets, then used RNX2RTKP to batch process all of the data sets. I used a modified version of the batch-processing python script that I described in my last post to do both the GNSSLogger->Rinex conversions and to run the RTKLIB PPK solutions.

I started with the b34d version of the demo5 RTKLIB code and the config file I used for the previous cell phone data analysis but ended up needing to make some changes to both the code and the config file. Some of the changes are in the b34e RTKLIB code but others are only in the “smartphone” release of the code since I haven’t yet confirmed that they don’t cause any issues with more typical PPK/RTK solutions.

The most significant change to the RTKLIB code was to enable cycle slip detection using the doppler raw measurements. This feature has been in the code for many years but has been commented out because it is unable to distinguish between clock jumps and cycle slips. By rewriting the function to process all satellites in a single call instead of just a single satellite I was able to remove the common-mode effect of the clock jumps. This feature compares the doppler measurement to the change in carrier phase measurement since the doppler shift of a given signal is the time derivative of the carrier phase. It flags a cycle slip if the difference between these two values exceeds a user-configurable threshold.
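The idea can be sketched in a few lines of Python. This is not the code from rtkpos.c, just a simplified illustration under stated assumptions: phases are in cycles, dopplers are in Hz with the RINEX sign convention (phase rate is the negative of the doppler), thresholds are in cycles per second, and the clock jump is estimated as the mean of all differences smaller than a hypothetical `outlier_thresh`. The function and parameter names are my own.

```python
def detect_doppler_slips(prev_phase, curr_phase, doppler, dt,
                         slip_thresh, outlier_thresh):
    """Flag satellites whose phase change disagrees with their doppler.

    prev_phase, curr_phase: dict sat -> carrier phase (cycles)
    doppler: dict sat -> doppler (Hz, RINEX sign convention)
    dt: time between the two epochs (s)
    Returns the set of satellites flagged as having a cycle slip.
    """
    diffs = {}
    for sat, phase in curr_phase.items():
        if sat in prev_phase and sat in doppler:
            phase_rate = (phase - prev_phase[sat]) / dt
            # Expected phase rate is -doppler, so the mismatch is the sum
            diffs[sat] = phase_rate + doppler[sat]
    if not diffs:
        return set()
    # A receiver clock jump shifts every difference by the same amount,
    # so estimate it from the differences that are not obvious outliers.
    inliers = [d for d in diffs.values() if abs(d) <= outlier_thresh]
    clock_offset = sum(inliers) / len(inliers) if inliers else 0.0
    return {sat for sat, d in diffs.items()
            if abs(d - clock_offset) > slip_thresh}
```

Processing all satellites in one call is what makes the common offset observable. With a single satellite there is no way to tell a clock jump from a slip.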

Because I removed the cycle slip flags from the RINEX files, it was very important to have a reasonably reliable alternative method to detect the cycle slips. The existing geometry-free detection method works quite well for satellites with dual-frequency measurements, but many of the satellites in these data sets have only single-frequency measurements, so this test cannot be applied to them. By only responding to verified cycle slips instead of every flagged potential slip, the code is much better able to preserve the phase bias estimates of each satellite.
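For comparison, the geometry-free test can be sketched as below, which also shows why it needs two frequencies. This is a simplified illustration rather than the RTKLIB implementation. It assumes GPS L1 and L5 phases in cycles, and the function names are hypothetical.

```python
C_LIGHT = 299792458.0   # speed of light (m/s)
FREQ_L1 = 1575.42e6     # GPS L1 carrier frequency (Hz)
FREQ_L5 = 1176.45e6     # GPS L5 carrier frequency (Hz)

def geometry_free_m(phase_l1_cycles, phase_l5_cycles):
    """Geometry-free combination in meters: L1 phase minus L5 phase."""
    return C_LIGHT * (phase_l1_cycles / FREQ_L1 - phase_l5_cycles / FREQ_L5)

def gf_slip(prev, curr, thresh_m=0.10):
    """True if the geometry-free combination jumps by more than thresh_m.

    prev, curr: (L1 phase, L5 phase) tuples in cycles for one satellite.
    """
    jump = geometry_free_m(*curr) - geometry_free_m(*prev)
    return abs(jump) > thresh_m
```

Range and clock changes affect both frequencies equally in meters and cancel in the difference, so only slips (and slowly varying ionospheric delay) remain. A one-cycle slip on L1 shifts the combination by about 0.19 m, which is above the 0.10 m threshold used here.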

In addition to this change to the RTKLIB code, I made a few other minor changes which allowed the code to degrade a little more gracefully in the presence of very low quality raw observations.

I started with the configuration file I used in my recent post for the Xiaomi Mi8 static cell phone data sets but made the following changes:

Positioning mode: static -> kinematic
GNSS constellations: disable Beidou
Integer Ambiguity Res: on -> off
Slip Thresh: Geom-Free: 0.05 -> 0.10
Slip Thresh: Doppler: N/A -> 5.0
Time Format: hms -> tow
Phase Error Ratio L1/L5: 1500/300 -> 300/100
Phase Error a+b: 0.006/0.006 -> 0.003/0.003
Carrier Phase Bias: 0.001 -> 0.01

Switching the position mode from static to kinematic is self-explanatory. I disabled Beidou because the base observations did not include it. I disabled ambiguity resolution because the errors in this data are too large to reliably resolve the ambiguities and we would just end up with many false fixes. I also increased the threshold for geometry-free slips because of the larger errors and added a new parameter for the new doppler slip detection. I switched the time format in the solution file just because it was more compatible with the format used in the Google baseline files. I have less solid justification for the last three changes and am also less certain that these are optimal but was trying to increase the weighting of the pseudorange measurements while also accounting for the lower confidence in the carrier phase biases remaining constant due to undetected slips. There was also a bit of trial and error on a couple of the data sets with these parameters before applying them to the full data set. If you make a close comparison between these config files you will notice that a few other parameters also changed but these are all related to the ambiguity resolution which is turned off so they have no effect on the results. The config file with all of these changes is included in the release package.
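To see how the error-ratio and phase-error changes shift the balance toward the pseudoranges, here is a small arithmetic sketch. It assumes the elevation-weighted error model described in the RTKLIB manual, sigma² = ratio² · (a² + b²/sin²(el)), leaves out the other terms in the real code, and reads the setting above as L1 ratio 1500 -> 300 with a = b = 0.006 -> 0.003. The function name is hypothetical.

```python
import math

def obs_sigma_m(elev_deg, err_a, err_b, error_ratio=1.0):
    """Measurement standard deviation (m) from the elevation-weighted model.

    error_ratio is 1.0 for carrier phase and the code/phase error ratio
    for pseudorange.
    """
    sin_el = math.sin(math.radians(elev_deg))
    return error_ratio * math.sqrt(err_a ** 2 + (err_b / sin_el) ** 2)

# L1 values at 30 degrees elevation, before and after the config change
old_code = obs_sigma_m(30.0, 0.006, 0.006, error_ratio=1500.0)
new_code = obs_sigma_m(30.0, 0.003, 0.003, error_ratio=300.0)
old_phase = obs_sigma_m(30.0, 0.006, 0.006)
new_phase = obs_sigma_m(30.0, 0.003, 0.003)

print(f"pseudorange sigma: {old_code:.2f} m -> {new_code:.2f} m")
print(f"phase sigma:       {old_phase:.4f} m -> {new_phase:.4f} m")
```

Under these assumptions the pseudorange sigma shrinks by a factor of ten while the phase sigma shrinks only by a factor of two, so the pseudoranges carry relatively more weight in the filter than before.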

With these changes I was able to generate PPK solutions for all of the data in the training set. Comparing these to the provided ground truths showed that the majority of the solutions were reasonably good relative to the provided baseline solutions, but a few were very, very poor.

Plotting the raw observations for the poor quality solutions showed that the raw observations for these data sets were distinctly worse than the others. The plot below shows excerpts of the raw observations for two phones from the same ride. I used the RINEX files provided by Google for these plots since they include the receiver-flagged cycle slips (red ticks). The two phones were placed next to each other on the front dash but provided very different results. The bad data did not follow any particular model of phone, and every ride had at least one good data set, so I am not sure what caused these bad data sets. The baseline solutions provided by Google did not seem to be nearly as significantly affected by this, presumably because they relied primarily on the pseudorange measurements and not the carrier phase.

Pixel4 and Pixel4Modded raw observations for 1/4/21 RWC-1 drive

Since every ride had at least one usable set of raw observations, I decided to replace the individual phone solutions for each ride with a single solution created with a weighted average of the individual solutions for that ride. This does introduce some error because I ignored the small distance between phones (typically ~0.2 meters), but this is small compared to the total errors. Each solution point was weighted by the inverse of the RTKLIB estimated variance for that point.
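A minimal Python sketch of this kind of inverse-variance merge is shown below. The data layout, the function name, and the use of a single variance number per point (for example, the sum of the squared north and east standard deviations from the solution file) are assumptions for illustration. It is not the merge script from the release package.

```python
def merge_ride(phone_solutions):
    """Merge per-phone solutions for one ride with inverse-variance weights.

    phone_solutions: dict phone -> dict time -> (lat_deg, lon_deg, variance_m2)
    Returns dict time -> (lat_deg, lon_deg). Only phones that have a point
    at a given time contribute to that epoch. Variances must be positive.
    """
    epochs = set()
    for solution in phone_solutions.values():
        epochs.update(solution)

    merged = {}
    for t in sorted(epochs):
        points = [sol[t] for sol in phone_solutions.values() if t in sol]
        weights = [1.0 / var for _, _, var in points]
        total = sum(weights)
        lat = sum(w * p[0] for w, p in zip(weights, points)) / total
        lon = sum(w * p[1] for w, p in zip(weights, points)) / total
        merged[t] = (lat, lon)
    return merged
```

Averaging latitude and longitude directly is reasonable here because the phones are only a fraction of a meter apart. A phone with a poor solution reports a large variance and so contributes very little to the merged point.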

Using the rules of the competition to calculate phone-averaged errors relative to the ground truths, the resulting errors for the training set were 1.37 meters for the 50th percentile, 5.35 meters for the 95th percentile, and 3.36 meters for the average of the two.

This compares to 50th percentile = 2.43 meters, 95th percentile = 9.62 meters, and average = 6.02 meters for the Google baseline solutions. This comparison isn’t quite fair since the baseline solutions did not include the phone merge. However, the baseline solutions don’t include accuracy estimates for each point, so there is no easy way to merge these solutions.

I then ran the same process to generate merged PPK solutions for the test data set provided by Google. There are no ground truths included for this data, so I cannot directly calculate the errors. However, I did submit the results to Kaggle to get a combined 50th/95th percentile score for both the Google baseline solutions and the RTKLIB PPK solutions. The contest ended over five months ago, so new entries are not recorded, but we can compare how our solutions would have done relative to the final results.

The Google baseline solutions returned a score of 5.42 meters, which would have put us in a 35-way tie for 692nd place out of 810 competitors. Apparently many competitors did not get past simply submitting the given input as their output.

The RTKLIB PPK phone-merged solutions returned a score of 2.15 meters which would have given us 5th place. Not too bad for a baseline with very little post-processing! Presumably this could be improved a fair bit with some of the many other techniques competitors used to improve the Google baselines.

I’ve given the “private leaderboard” scores from Kaggle here since that is what was used to determine the winners of the contest. The “public leaderboard” score is determined from a different slice of the test data set and was not ranked as high, probably because it included more urban data which benefits more from the post-processing techniques.

In most cases I would be very disappointed with PPK solution errors measured in meters, not centimeters, but in this case, given the extremely challenging data, I was just happy that RTKLIB was able to converge to any kind of reasonable answer.

Kaggle results for RTKLIB PPK baseline solutions

I’ve included the merged and unmerged baseline PPK solution files as “baseline_locations_merged_test_1230.csv” and “baseline_locations_test_1230.csv” in the release package. These files are in the same format as the Google provided baseline file “baseline_locations_test.csv”, so for anyone who competed in the competition it should be straightforward to substitute this baseline file in place of the Google baseline file. If you do run this exercise, I would be interested in hearing your results, so please leave a comment.

For those of you who would like to duplicate my results, here is a brief summary of the steps required. All of the necessary files are included in the above-mentioned release package. Note that you should review and adjust the input parameters at the top of each python file to make sure they match your file structure. The parameters are set up to calculate the solutions for the test data but can be modified to specify the training data. The python scripts are located in the python folder, and the base station observation files, navigation files, configuration file, and solution files are in the Google folder.

  1. If not already done, download and unzip the Google datasets
  2. Download the demo5 RTKLIB smartphone release package.
  3. Copy the base observation, navigation and configuration files from the RTKLIB package into the raw data file folders.
  4. Run “” to generate the RINEX files and solution files
  5. Run “” to create the unmerged baseline solutions file.
  6. Run “” to merge the individual phone solutions into a baseline file with combined solutions.
  7. Run “” to generate a file for submission to Kaggle
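For readers who want to script the solution step themselves, the sketch below shows one way to build and run an RNX2RTKP command from Python. It is not one of the scripts in the release package. The function names and all file names are hypothetical, and it relies only on the standard `rnx2rtkp` options `-k` (configuration file) and `-o` (output file) followed by the rover, base, and navigation files.

```python
import subprocess

def rnx2rtkp_command(rover_obs, base_obs, nav_file, config_file, out_file,
                     exe="rnx2rtkp"):
    """Build the argument list for one PPK solution run."""
    return [exe, "-k", str(config_file), "-o", str(out_file),
            str(rover_obs), str(base_obs), str(nav_file)]

def run_solutions(jobs, exe="rnx2rtkp"):
    """Run one solution per job.

    jobs: iterable of (rover_obs, base_obs, nav_file, config_file, out_file)
    Raises subprocess.CalledProcessError if any run fails.
    """
    for job in jobs:
        subprocess.run(rnx2rtkp_command(*job, exe=exe), check=True)
```

For example, `rnx2rtkp_command("rover.obs", "base.obs", "brdm.nav", "ppk.conf", "rover.pos")` returns the argument list for a single phone, where every file name is a placeholder to be replaced with your own paths.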

I have heard that Google will be running the competition again this year, so for those of you who missed it last year, you will have another chance to compete. I hope to be more involved this time. Although I don’t think I will compete myself, I would like to use this post as a starting point to put together some tools and information to make it easier for others to use RTKLIB in the competition.

18 thoughts on “Google Smartphone Decimeter Challenge”

  1. Google have released this year’s Kaggle challenge. It looks like you have the top spot by a large margin 😉 I tried running an RTK solution (using nav data from TRAK for LAX and SLAC for everything else) with the demo5 release and the instructions you gave here (modified as required for the slightly different data formats) with very mixed results. Of the 170 training tracks, 35 didn’t produce a solution (incl. all the LAX ones), 76 were a bit better than the baseline, 42 were a bit worse and 17 were much worse. I strongly suspect I’m doing something wrong – especially since you have a great score. Are there any gotchas I should be aware of in this year’s data?


    1. Looks like you are not doing so badly yourself! I’d be happy to try and answer your questions but I am concerned about violating the rules for sharing information outside of the Kaggle forums during an active competition. Can you re-ask your questions on the Kaggle forum and I will answer there.


  2. Hi Tim,
    I find your idea about cycle slips very interesting. When I started to collect the current dataset with the Pixel 5, I found the data to be flooded with cycle slips. I didn’t notice this to such an extent with the Mi8. I tried to solve this with both Google support and the Geo++ guys, but with no effect. I am looking forward to finding out what effect your approach (deleting the cycle slip flags and detecting slips based on other metrics) will have on my data.


  3. good evening ! very interesting that you paid attention to this Google GNSS Analysis Tools!
    I tried to use it, but I NEVER got the result !! I even wrote 5 pieces of breakdown reports – there is no result at all! And yet, remember I wrote that Oneplus 7t does not give out phase measurements and you need to find an expert on low-level android programming? So, do you happen to know forums, or some sites where the authors of these drivers for smartphones communicate? I would try to contact them about the inclusion of the phase ….. eh dream dream …
    here is google bug tracker


  4. hi rtklibexplore, I checked the detslp_dop function(rtkpos.c) in ” demo5 RTKLIB smartphone release” version, but i didn’t find the code about ” process all satellites in a single call instead of just a single satellite I was able to remove the common-mode effect of the clock jumps.” Do I miss something? Could you please explain more on this? Thanks in advance.


    1. Hi Yanling. If you compare the current version of rtkpos.c on Github with the version of rtkpos.c in the src folder in the demo5 RTKLIB smartphone release, you will see the changes I made to remove the common mode effect of a clock jump. My assumption was that a clock jump will add an equal offset to all of the doppler-carrier phase differences. In this case, the mean of all non-outlier differences is a reasonable first order estimate of the clock-jump. It will not remove very large clock jumps, since in that case all the differences will appear to be outliers and excluded from the mean, but it appears to be reasonably effective at removing the more common length clock jumps, at least in the smartphone data. I plan to do more testing on this feature with other receivers before adding it to the main demo5 code.


      1. Thanks so much for your detailed explanation! I still didn’t find the corresponding code change you made in the “RTKLIB smartphone release”. I am not sure if it is because you didn’t commit it or because I missed something. But I understood your idea, it’s great! I am wondering: if we calculate the time difference between the previous and current epochs (tt) after removing the clock bias calculated from single point positioning, would this be another way to eliminate the clock jump?


        1. Speaking to how we did it on the older Parthus designs, we called it a “clock slew”, where the integration time in the correlators were multiples of “1 ms” in receiver time, when sufficient clock bias accumulated we’d expand or contract the next integration period by a millisecond. We’d normally do this when 700us had built up, and I think this was chosen so it didn’t oscillate, but could perhaps be smaller, as TCXO’s can drift relatively slowly. How often this occurred will depend on how sloppy the local clock is, but it is typically going to impact a single epoch, relatively infrequently, but periodically. But yes, would impact ALL measurements in a given epoch, and would look like a 999ms or 1001ms second from the measurement time-stamps.


        2. Hi Yanling. Sorry for the confusion. I did not create a new branch for the smartphone code release although now I realize I should have. The changes to the source code are all in the rtkpos.c file which is in the src folder in the release zip file. I think the source code link in the release page is just retrieving the code from the main branch. I’m still new to using Github releases.

