It’s been about four years now since I created the demo5 branch of RTKLIB. During that time I have added a number of features and enhancements to the code with a focus on low-cost receivers (primarily u-blox) and moving rovers, while at the same time keeping synced with the latest 2.4.3 code from the official RTKLIB code base. I thought it would be interesting to do a little benchmarking between the two official versions of RTKLIB (2.4.2 and 2.4.3) and the demo5 code to see to what extent this evolution of the code has affected the results.
First of all, though, it’s probably worth a little discussion about versions 2.4.2 and 2.4.3. Source and executables for both versions are available on Github at https://github.com/tomojitakasu. The source is in the RTKLIB repository and the executables are in the RTKLIB_BIN repository. Both of these repositories default to the “master” branch which is the 2.4.2 code. This is what you will get unless you specifically request the “2.4.3” branch of the code. For several years, almost all the development activity was on the 2.4.3 branch and only very minimal changes were being made to the master branch. In Jan 2018, there was a merge of the 2.4.3 changes back to the master branch although it appears that not all of the changes in 2.4.3 were merged back into the master branch. Since then development has continued on 2.4.3 without another merge back to the master branch.
For most of my data analysis posts, I focus on a single data set, spending a fair bit of time to make sure I am only analyzing the usable parts of the data, possibly tweaking the configuration file for that specific data, and digging into any issues that crop up. In this case I didn’t do that. I picked nine raw data sets, all with u-blox M8T receivers and a moving rover, made no effort to filter out bad data, and used a single generic configuration file for all nine data sets. Eight of the data sets are those that I have previously uploaded to my website at http://rtkexplorer.com/downloads/gps-data/ and the ninth was from my most recent drive around the neighborhood with a u-blox M8T and an antenna on top of the car. I ran solutions for each of the three RTKLIB versions on all nine data sets. For each solution, I converted the data from u-blox binary to RINEX using the same code version as I used for the solution, since the code version affects this conversion as well as the solution.
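As an illustration, each run boiled down to two steps using the command-line apps from whichever code version was being tested (the file names here are just placeholders, and the same steps can be done through the RTKCONV and RTKPOST GUIs):

```
# convert the raw u-blox binary logs to RINEX observation and navigation files
convbin rover.ubx -r ubx -od -os -o rover.obs -n rover.nav
convbin base.ubx  -r ubx -od -os -o base.obs  -n base.nav

# run the post-processed kinematic solution with the config file listed at the end of this post
rnx2rtkp -k kinematic.conf -o rover.pos rover.obs base.obs rover.nav
```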
I ran the post-processed solutions in “combined” mode, meaning that the solution is run both forward and backward and the results are then combined. Not only does this tend to produce better results, but the results also have higher confidence, since RTKLIB compares the forward and backward solutions, sample by sample, and downgrades to float any sample where the solutions in both directions are fixed but differ by more than four standard deviations. This tends to do a good job of detecting and rejecting false fixes in the results. However, it is not foolproof. If the solution is fixed in only one direction and float in the other, then there is no additional validation.
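To make the idea concrete, here is a simplified sketch of that validation step. The struct and function below are illustrative stand-ins, not the actual RTKLIB internals (the real combined-solution logic is in postpos.c), but they capture the four standard deviation check described above.

```c
/* Illustrative sketch of the combined-mode check: simplified stand-ins,
 * not the actual RTKLIB code or data structures. */
#include <math.h>

#define Q_FIX   1
#define Q_FLOAT 2

typedef struct {
    double pos[3];   /* position estimate (m) */
    double var[3];   /* variance of each position component (m^2) */
    int    stat;     /* solution status: Q_FIX or Q_FLOAT */
} sol_t;

/* Combine one forward and one backward epoch: if both are fixed but
 * disagree by more than 4 sigma in any axis, downgrade to float. */
int combine_epoch(const sol_t *fwd, const sol_t *bwd, sol_t *out)
{
    int i, stat = (fwd->stat == Q_FIX && bwd->stat == Q_FIX) ? Q_FIX : Q_FLOAT;

    for (i = 0; i < 3; i++) {
        double diff  = fwd->pos[i] - bwd->pos[i];
        double sigma = sqrt(fwd->var[i] + bwd->var[i]);

        /* inverse-variance weighted average of the two directions */
        out->pos[i] = (fwd->pos[i] / fwd->var[i] + bwd->pos[i] / bwd->var[i]) /
                      (1.0 / fwd->var[i] + 1.0 / bwd->var[i]);
        out->var[i] = 1.0 / (1.0 / fwd->var[i] + 1.0 / bwd->var[i]);

        if (stat == Q_FIX && fabs(diff) > 4.0 * sigma) stat = Q_FLOAT;
    }
    out->stat = stat;
    return stat;
}
```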
I used the same configuration settings for running each of the three RTKLIB versions on each of the data sets, with a few exceptions. Versions 2.4.2 and 2.4.3 do not have the “arfilter” feature, which automatically holds off new satellites until their phase bias estimates have converged enough not to break the ambiguity resolution, so I increased the fixed hold off (arlockcnt) from 0 to 10 samples for the 2.4.2 and 2.4.3 codes. The outlier detection scheme in the demo5 code is also different from the other two versions, making it necessary to increase the outlier threshold (rejionno) from 1.0 to 30.0 meters for versions 2.4.2 and 2.4.3. Lastly, the 2.4.2 code runs very slowly if dynamics is enabled, so I turned dynamics off for that code only.
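For reference, here is roughly what those exceptions look like in configuration-file form. These are only the settings changed relative to the demo5 configuration listed at the end of this post; everything else was left the same.

pos2-arlockcnt =10     # fixed hold off for adding new sats to AR (demo5 used 0 plus arfilter)
pos2-rejionno  =30.0   # phase bias outlier threshold (m) (demo5 used 1.0)
pos1-dynamics  =off    # 2.4.2 runs only; left on for 2.4.3 and demo5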
The specific code versions for the experiment were: 2.4.2 p13, 2.4.3 b33 and demo5 b33b2.
I used fix percentage as a metric for the test. This isn’t a perfect metric because it doesn’t take into account the accuracy of the non-fixed results and it can be affected by false fixes. Overall, though, it is probably the best single metric for this kind of test, especially since running the solutions in combined mode should reject many of the false fixes.
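For anyone who wants to compute the same metric from their own solution files, here is a minimal sketch in C. It assumes the default lat/lon/height .pos output format, in which the sixth column is the Q (quality) flag and Q=1 indicates an ambiguity-fixed epoch; the file name is just a placeholder.

```c
/* Minimal sketch: count fixed epochs in an RTKLIB .pos file.
 * Assumes the default lat/lon/height output format where the sixth
 * column is the Q flag (1 = fixed, 2 = float). */
#include <stdio.h>

int main(void)
{
    FILE *fp = fopen("rover.pos", "r");   /* placeholder file name */
    char line[1024];
    double lat, lon, hgt;
    int q, total = 0, fixed = 0;

    if (!fp) return 1;
    while (fgets(line, sizeof(line), fp)) {
        if (line[0] == '%') continue;     /* skip header lines */
        /* fields: date time lat lon height Q ... */
        if (sscanf(line, "%*s %*s %lf %lf %lf %d", &lat, &lon, &hgt, &q) == 4) {
            total++;
            if (q == 1) fixed++;
        }
    }
    fclose(fp);
    if (total > 0)
        printf("fix rate: %.1f%% (%d of %d epochs)\n",
               100.0 * fixed / total, fixed, total);
    return 0;
}
```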
The fix percentages for each test are listed below. The data set names correspond to the names of the sample data sets on my website.
|Test|V 2.4.2|V 2.4.3|V 2.4.3a|demo5|Data set|
|----|-------|-------|--------|-----|--------|
For the most part, the results match my expectations. Version 2.4.2 has the lowest fix percentage, 2.4.3 is in the middle, and demo5 has the highest. However, I ran into one very significant issue with the 2.4.3 code that I do not fully understand. In the table above, the “V2.4.3” column is the results using version 2.4.3 for both the conversion from raw binary to rinex as well as for the solution. As you can see, the fix percentages were very low for this test for all data sets except the first, significantly lower than even the 2.4.2 results. I did not fully debug this issue but the problem appears to be in the conversion from raw binary to rinex, not in the solution itself.
The V2.4.3a column shows the results from running the 2.4.2 code for the raw binary to RINEX conversion, and then the 2.4.3 code for the solution. This result is much more in line with my expectations for the 2.4.3 code. I suspect that the issue with the 2.4.3 RINEX conversion is that when it is filtering out low quality observations it is not preserving the cycle slips. RTKLIB can be very sensitive to unflagged cycle slips.
I am very curious if anyone who is a regular user of the 2.4.3 code can duplicate this result. As you can see from the first data set, it does not always occur, and is much less of a problem if the data does not have a large number of cycle slips, so it would need to be tested on a more challenging data set to see this issue.
Regarding the demo5 results, seven of the nine tests had over a 94% fix rate and the median fix percentage was 97.1%. I consider this quite reasonable since all of these were fairly challenging data sets and most of them included at least a small amount of unusable data. The two data sets that did not perform as well (80-86% fix percentage) were both older. One did not include Galileo and the other was from a drone that had one particularly poor quality section of data. However, both solutions had well over a 90% fix rate when run in the forward-only direction, which indicates that the fixes in the combined solution were downgraded because of mismatches between the two directions. In one of these cases (test 3), the 2.4.3 solution obtained a higher fix rate than the demo5 code, but it only got fixes in the forward direction, not in the backward direction, so it had no additional validation. Based on some discrete jumps in that solution, I suspect its fix percentage would also have been downgraded if it had achieved fixes in the backward direction.
Looking more closely at the cases where the demo5 solution points were downgraded to float for mismatches, it’s interesting because, at least at first glance, it appears that these were not false fixes, but discrepancies from using different combinations of satellites that were large enough to trigger the four standard deviation threshold. This is a little concerning and worthy of further investigation. Fortunately it only appears to occur when the data quality is fairly poor. However, it does emphasize the importance of ensuring the best quality measurements possible, and of not over-relying on RTKLIB to reject inaccurate solutions.
As always, I’d like to emphasize that these tests are intended only as one user’s snapshot of one fairly particular use case of RTKLIB and are not intended to be any kind of comprehensive analysis. Also, it’s important to understand that 99+% of the code in all versions of RTKLIB, including the demo5 code, is the result of many years of dedicated effort by Tomoji Takasu and his team at Tokyo University of Marine Science and Technology. My only contribution has been to add a few changes on top of this code to make it a little more focused on practical application for specific uses rather than a more generic academic tool.
Lastly, for reference, here’s a partial list of the most important configuration settings I used for this experiment for the demo5 code. The other two codes used the same settings with the exceptions I describe above.
pos1-posmode =kinematic # solution mode
pos1-soltype =combined # solution type (forward, backward, combined)
pos1-frequency =l1 # (l1, l1+l2, l1+l2+l5)
pos1-elmask =15 # min sat elevation to include in solution (deg)
pos1-snrmask_r =off # SNR mask rover (off, on)
pos1-snrmask_b =off # SNR mask base (off, on)
pos1-dynamics =on # add dynamic states to kalman filter (off, on)
pos1-navsys =15 # (1:gps+2:sbas+4:glo+8:gal+16:qzs+32:comp)
pos2-gloarmode =on # Glonass AR mode (off, on, fix-and-hold, autocal)
pos2-bdsarmode =off # BeiDou AR mode (off, on)
pos2-aroutcnt =50 # outage count to reset sat ambiguity (samples)
pos2-arminfix =50 # min # of fix samples to enable AR hold (samples)
pos2-rejionno =1.0 # phase bias outlier threshold (m)
pos2-maxage =30 # max age of differential (secs)
pos2-arthres =3.0 # minimum AR ratio to accept fix
pos2-arthres1 =0.1 # max variance of position states to attempt AR (m^2)
pos2-varholdamb =0.1 # variance of fix-and-hold tracking feedback (cyc^2)
pos2-arfilter =on # automatic hold off for adding new sats to AR
pos2-arlockcnt =0 # fixed hold off for adding new sats to AR (samples)
pos2-minfixsats =4 # min sats required for fix
pos2-minholdsats =5 # min sats required for AR hold
pos2-mindropsats =10 # min sats required to drop sats from AR
stats-eratio1 =300 # ratio of input stdev of code to phase observations (L1)
stats-eratio2 =300 # ratio of input stdev of code to phase observations (L2)