RTKLIB Benchmarking: versions 2.4.2, 2.4.3, and demo5

It’s been about four years since I created the demo5 branch of RTKLIB. During that time I have added a number of features and enhancements to the code, with a focus on low-cost receivers (primarily u-blox) and moving rovers, while keeping it synced with the latest 2.4.3 code from the official RTKLIB code base. I thought it would be interesting to do a little benchmarking between the two official versions of RTKLIB (2.4.2 and 2.4.3) and the demo5 code to see to what extent this evolution of the code has affected the results.

First of all, though, it’s probably worth a little discussion about versions 2.4.2 and 2.4.3. Source and executables for both versions are available on GitHub at https://github.com/tomojitakasu. The source is in the RTKLIB repository and the executables are in the RTKLIB_BIN repository. Both of these repositories default to the “master” branch, which is the 2.4.2 code. This is what you will get unless you specifically request the “2.4.3” branch of the code. For several years, almost all the development activity was on the 2.4.3 branch and only very minimal changes were made to the master branch. In January 2018, the 2.4.3 changes were merged back to the master branch, although it appears that not all of them were included. Since then, development has continued on 2.4.3 without another merge back to the master branch.

For most of my data analysis posts, I focus on a single data set, spending a fair bit of time to make sure I am only analyzing the usable parts of the data, possibly tweaking the configuration file for that specific data, and digging into any issues that crop up. In this case I didn’t do that. I picked nine raw data sets, all with u-blox M8T receivers and a moving rover, made no effort to filter out bad data, and used a single generic configuration file for all nine data sets. Eight of the data sets are ones that I have previously uploaded to my website at http://rtkexplorer.com/downloads/gps-data/ and the ninth is from my most recent drive around the neighborhood with a u-blox M8T and an antenna on top of the car. I ran solutions for each of the three RTKLIB versions on all nine data sets. For each solution, I converted the data from u-blox binary to RINEX using the same code version as I used for the solution, since differences between the codes affect this conversion as well as the solution.

I ran the post-processed solutions in “combined” mode, meaning that the solution is run both forward and backward and the results are then combined. Not only does this tend to produce better results, but the results also have higher confidence since RTKLIB compares the forward and backwards solutions, sample by sample, and downgrades any sample where the solutions in both directions are fixed and the results differ by more than four standard deviations. This tends to do a good job of detecting and rejecting any false fixes in the results. However, it is not foolproof. If the solution is fixed in only one direction and float in the other, then there is no additional validation.
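
The combined-mode check described above can be sketched roughly as follows. This is a simplified illustration in Python, not RTKLIB's actual code: the `Sample` class, the per-axis comparison against the summed forward and backward variances, and the rule that the combined flag otherwise takes the better of the two directions are all assumptions made for the sketch.

```python
from dataclasses import dataclass

FIX, FLOAT = 1, 2  # solution quality flags as used in RTKLIB output (1 = fix, 2 = float)


@dataclass
class Sample:
    pos: tuple  # (x, y, z) position in metres
    var: tuple  # per-axis position variance in m^2
    q: int      # quality flag


def consistent(fwd, bwd, n_sigma=4.0):
    """Return True if the two positions agree within n_sigma on every axis."""
    for pf, pb, vf, vb in zip(fwd.pos, bwd.pos, fwd.var, bwd.var):
        # Compare the squared difference with (n_sigma)^2 times the summed variance.
        if (pf - pb) ** 2 > n_sigma ** 2 * (vf + vb):
            return False
    return True


def combined_quality(fwd, bwd):
    """Quality flag for one combined sample (assumed rule, see lead-in)."""
    if fwd.q == FIX and bwd.q == FIX:
        # Both directions fixed: keep the fix only if they agree.
        return FIX if consistent(fwd, bwd) else FLOAT
    # Otherwise there is nothing to cross-check; take the better flag.
    return min(fwd.q, bwd.q)
```

Note the last branch: a sample that is fixed in only one direction keeps its fix status without any cross-check, which is the gap in validation mentioned above.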

I used the same configuration settings for running each of the three RTKLIB versions on each of the data sets, with a few exceptions. Versions 2.4.2 and 2.4.3 do not have the “arfilter” feature, which automatically holds off new satellites until their phase bias estimates have converged enough not to break the ambiguity resolution, so I increased the fixed hold-off (arlockcnt) from 0 to 10 for the 2.4.2 and 2.4.3 codes. The outlier detection scheme in the demo5 code also differs from the other two versions, which made it necessary to increase the outlier threshold (rejionno) from 1.0 to 30.0 for versions 2.4.2 and 2.4.3. Lastly, the 2.4.2 code runs very slowly if dynamics is enabled, so I turned dynamics off for this code.
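
One way to keep a single generic configuration file and still apply these per-version exceptions is to patch the settings programmatically. The sketch below is a hypothetical helper, not part of RTKLIB: `apply_overrides`, `BASE`, and `OVERRIDES` are names invented for illustration, and it changes only the three settings named above, leaving everything else (including `pos2-arfilter`) untouched.

```python
import re

# A few lines in the same "key =value # comment" layout as the settings list.
BASE = """\
pos1-dynamics =on # add dynamic states to kalman filter (off, on)
pos2-rejionno =1.0 # phase bias outlier threshold (m)
pos2-arlockcnt =0 # fixed hold off for adding new sats to AR (samples)
"""

# Per-version exceptions as described in the text.
OVERRIDES = {
    "2.4.2": {"pos2-arlockcnt": "10", "pos2-rejionno": "30.0", "pos1-dynamics": "off"},
    "2.4.3": {"pos2-arlockcnt": "10", "pos2-rejionno": "30.0"},
    "demo5": {},
}

# key, separator, value, and the rest of the line (trailing comment)
LINE_RE = re.compile(r"^([^\s=]+)(\s*=)([^#\s]*)(.*)$")


def apply_overrides(conf_text, overrides):
    """Return conf_text with the values of the overridden keys replaced."""
    out = []
    for line in conf_text.splitlines():
        m = LINE_RE.match(line)
        if m and m.group(1) in overrides:
            line = f"{m.group(1)}{m.group(2)}{overrides[m.group(1)]}{m.group(4)}"
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(apply_overrides(BASE, OVERRIDES["2.4.2"]))
```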

The specific code versions for the experiment were: 2.4.2 p13, 2.4.3 b33 and demo5 b33b2.

I used fix percentage as the metric for the test. This isn’t a perfect metric because it doesn’t take into account the accuracy of the non-fixed results and it can be affected by false fixes. Overall, though, it is probably the best single metric for this kind of test, especially since running the solutions in combined mode should reject many of the false fixes.
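
For reference, a fix percentage like this can be computed from a solution file with a few lines of Python. The sketch below assumes the common RTKLIB `.pos` text layout in which header lines start with `%` and each data row has date, time, and three position fields followed by the quality flag Q (1 = fix); if your output format is configured differently, the column index will need to change. The sample rows are synthetic placeholders, not real results.

```python
def fix_percentage(pos_lines, q_column=5):
    """Percentage of solution rows whose quality flag equals 1 (fix)."""
    total = fixed = 0
    for line in pos_lines:
        if not line.strip() or line.startswith("%"):
            continue  # skip blank lines and header/comment lines
        fields = line.split()
        total += 1
        if int(fields[q_column]) == 1:
            fixed += 1
    return 100.0 * fixed / total if total else 0.0


# Synthetic placeholder rows for illustration only.
SAMPLE = """\
% date time latitude(deg) longitude(deg) height(m) Q ns
2020/01/01 00:00:00.000 40.00000000 -105.00000000 1600.0000 1 9
2020/01/01 00:00:01.000 40.00000001 -105.00000001 1600.0010 2 8
2020/01/01 00:00:02.000 40.00000002 -105.00000002 1600.0020 1 9
2020/01/01 00:00:03.000 40.00000003 -105.00000003 1600.0030 1 9
""".splitlines()

if __name__ == "__main__":
    print(fix_percentage(SAMPLE))  # 3 of the 4 sample rows are fixes
```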

The fix percentages for each test are listed below. The data set names correspond to the names of the sample data sets on my website.

Test    V2.4.2  V2.4.3  V2.4.3a  demo5   Data set
1       37.7%   93.9%   96.3%    94.4%   union_0705
2       35.7%    9.8%   60.6%    85.5%   niwot2_car
3       52.4%   16.2%   89.0%    80.2%   drone_0414
4       85.8%   51.2%   93.8%    98.8%   m8t_niwot_0606
5       59.7%   21.4%   83.7%    97.1%   swift_m8t_road_0606
6       63.1%   10.0%   83.6%    99.0%   car_1114
7       54.6%   26.6%   73.3%    94.2%   car_0320
8       39.4%   11.2%   54.4%    98.1%   comnav_car
9       25.0%    9.2%   53.4%    98.1%   not uploaded
median  52.4%   16.2%   83.6%    97.1%

Experiment results in fix percentage
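
The median row can be reproduced directly from the nine per-test values in the table. The snippet below is just a worked check using Python's standard `statistics` module; the dictionary name `results` is arbitrary.

```python
from statistics import median

# Fix percentages for tests 1-9, copied from the table above.
results = {
    "2.4.2":  [37.7, 35.7, 52.4, 85.8, 59.7, 63.1, 54.6, 39.4, 25.0],
    "2.4.3":  [93.9, 9.8, 16.2, 51.2, 21.4, 10.0, 26.6, 11.2, 9.2],
    "2.4.3a": [96.3, 60.6, 89.0, 93.8, 83.7, 83.6, 73.3, 54.4, 53.4],
    "demo5":  [94.4, 85.5, 80.2, 98.8, 97.1, 99.0, 94.2, 98.1, 98.1],
}

medians = {version: median(values) for version, values in results.items()}

# Number of demo5 tests with a fix rate above 94%.
demo5_above_94 = sum(v > 94.0 for v in results["demo5"])

if __name__ == "__main__":
    for version, value in medians.items():
        print(f"{version:7s} median fix: {value:.1f}%")
    print("demo5 tests above 94%:", demo5_above_94)
```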

For the most part, the results match my expectations. Version 2.4.2 has the lowest fix percentage, 2.4.3 is in the middle, and demo5 has the highest. However, I ran into one very significant issue with the 2.4.3 code that I do not fully understand. In the table above, the “V2.4.3” column shows the results using version 2.4.3 for both the conversion from raw binary to RINEX and the solution. As you can see, the fix percentages for this test were very low for all data sets except the first, significantly lower than even the 2.4.2 results. I did not fully debug this issue, but the problem appears to be in the conversion from raw binary to RINEX, not in the solution itself.

The V2.4.3a column shows the results of running the 2.4.2 code for the raw binary to RINEX conversion and then the 2.4.3 code for the solution. This result is much closer to my expectations for the 2.4.3 code. I suspect that the issue with the 2.4.3 RINEX conversion is that it is not preserving the cycle-slip flags when it filters out low-quality observations. RTKLIB can be very sensitive to unflagged cycle slips.
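
For anyone wanting to repeat this mixed-version test from a script, the two steps could be driven roughly as shown below. This is a sketch under assumptions: the post does not say whether the GUI or command-line apps were used, the sketch assumes RTKLIB's command-line tools `convbin` (with `-r`, `-o`, `-n`) and `rnx2rtkp` (with `-k`, `-o`), and every path and file name is a hypothetical placeholder. The functions only build the argument lists; running them is left to `subprocess.run`.

```python
def conversion_cmd(convbin, ubx_file, obs_file, nav_file):
    """Arguments to convert a u-blox raw file to RINEX obs and nav files."""
    return [convbin, "-r", "ubx", "-o", obs_file, "-n", nav_file, ubx_file]


def solution_cmd(rnx2rtkp, conf_file, out_file, rover_obs, base_obs, nav_file):
    """Arguments to post-process a rover/base pair with a config file."""
    return [rnx2rtkp, "-k", conf_file, "-o", out_file, rover_obs, base_obs, nav_file]


# Hypothetical install locations for the two code versions.
CONVBIN_242 = "rtklib_2.4.2/convbin"
RNX2RTKP_243 = "rtklib_2.4.3/rnx2rtkp"

# "V2.4.3a" style run: convert with 2.4.2, then solve with 2.4.3.
steps = [
    conversion_cmd(CONVBIN_242, "rover.ubx", "rover.obs", "rover.nav"),
    conversion_cmd(CONVBIN_242, "base.ubx", "base.obs", "base.nav"),
    solution_cmd(RNX2RTKP_243, "generic.conf", "out.pos",
                 "rover.obs", "base.obs", "rover.nav"),
]

if __name__ == "__main__":
    for cmd in steps:
        print(" ".join(cmd))
        # To actually run a step: subprocess.run(cmd, check=True)
```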

I am very curious if anyone who is a regular user of the 2.4.3 code can duplicate this result. As you can see from the first data set, it does not always occur, and is much less of a problem if the data does not have a large number of cycle slips, so it would need to be tested on a more challenging data set to see this issue.

Regarding the demo5 results, seven of the nine tests had over a 94% fix rate and the median fix percentage was 97.1%. I consider this quite reasonable since all of these were fairly challenging data sets and most of them included at least a small amount of unusable data. The two data sets that did not perform as well (80-86% fix percentage) were both older. One did not include Galileo and the other was from a drone and had one particularly poor-quality section of data. However, both solutions had well over a 90% fix rate when run in the forward-only direction, which indicates that fixes in the combined solution were downgraded because of mismatches between the two directions. In one of these cases (test 3), the 2.4.3 solution obtained a higher fix rate than the demo5 code, but it only got fixes in the forward direction, not in the backward direction, so it had no additional validation. Based on some discrete jumps in that solution, I suspect its fix percentage would also have been downgraded if it had achieved fixes in the backward direction.

Looking more closely at the cases where the demo5 solution points were downgraded to float for mismatches, it is interesting that, at least at first glance, these do not appear to be false fixes but discrepancies from using different combinations of satellites that were large enough to trigger the four standard deviation threshold. This is a little concerning and worthy of further investigation. Fortunately, it only appears to occur when the data quality is fairly poor. However, it does emphasize the importance of ensuring the best quality measurements possible, and of not over-relying on RTKLIB to reject inaccurate solutions.

As always, I’d like to emphasize that these tests are intended only as one user’s snapshot of one fairly particular use case of RTKLIB and are not intended to be any kind of comprehensive analysis. Also, it’s important to understand that 99+% of the code in all versions of RTKLIB, including the demo5 code, is the result of many years of dedicated effort by Tomoji Takasu and his team at Tokyo University of Marine Science and Technology. My only contribution has been to add a few changes on top of this code to make it a little more focused on practical application for specific uses rather than a more generic academic tool.

Lastly, for reference, here’s a partial list of the most important configuration settings I used for this experiment for the demo5 code. The other two codes used the same settings with the exceptions I describe above.

pos1-posmode =kinematic # solution mode
pos1-soltype =combined # solution type (forward, backward, combined)
pos1-frequency =l1 # (l1, l1+l2, l1+l2+l5)
pos1-elmask =15 # min sat elevation to include in solution (deg)
pos1-snrmask_r =off # SNR mask rover (off, on)
pos1-snrmask_b =off # SNR mask base (off, on)
pos1-dynamics =on # add dynamic states to kalman filter (off, on)
pos1-navsys =15 # (1:gps+2:sbas+4:glo+8:gal+16:qzs+32:comp)
pos2-armode =fix-and-hold
pos2-gloarmode =on # Glonass AR mode (off, on, fix-and-hold, autocal)
pos2-bdsarmode =off # BeiDou AR mode (off, on)
pos2-aroutcnt =50 # outage count to reset sat ambiguity (samples)
pos2-arminfix =50 # min # of fix samples to enable AR hold (samples)
pos2-rejionno =1.0 # phase bias outlier threshold (m)
pos2-maxage =30 # max age of differential (secs)
pos2-arthres =3.0 # minimum AR ratio for fix
pos2-arthres1 =0.1 # max variance of position states to attempt AR (m)
pos2-varholdamb =0.1 # variance of fix-and-hold tracking feedback (cyc^2)

pos2-arfilter =on # automatic hold off for adding new sats to AR
pos2-arlockcnt =0 # fixed hold off for adding new sats to AR (samples)
pos2-minfixsats =4 # min sats required for fix
pos2-minholdsats =5 # min sats required for AR hold
pos2-mindropsats =10 # min sats required to drop sats from AR
stats-eratio1 =300 # ratio of input stdev of code to phase observations
stats-eratio2 =300 # ratio of input stdev of code to phase observations
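
The `pos1-navsys` value in the list above is a bitmask, so 15 = 1 + 2 + 4 + 8 selects GPS, SBAS, GLONASS, and Galileo. A small decoder makes this explicit; the bit assignments are taken from the comment on that line, and the function name `decode_navsys` is just an illustrative choice.

```python
# Bit values as listed in the pos1-navsys comment.
NAVSYS_BITS = {1: "gps", 2: "sbas", 4: "glo", 8: "gal", 16: "qzs", 32: "comp"}


def decode_navsys(mask):
    """Return the constellation names enabled by a pos1-navsys bitmask."""
    return [name for bit, name in NAVSYS_BITS.items() if mask & bit]


if __name__ == "__main__":
    print(decode_navsys(15))  # ['gps', 'sbas', 'glo', 'gal']
```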

6 thoughts on “RTKLIB Benchmarking: versions 2.4.2, 2.4.3, and demo5”

  1. Hi. I want to use RTKLIB’s STR2STR for my base station. The problem is that the GLONASS sats aren’t usable because of the missing MT1230. It doesn’t convert it from a u-blox raw or RTCM3 stream. Do you have any solution for it?

    1. Hi Richard. In most cases STR2STR is used just as a stream server to direct a stream of data from an input source to an output destination. In this case, the content of the stream will not be modified and if the input stream contains MT1230 messages, they will be relayed unchanged to the output stream. STR2STR does have a limited capability to generate RTCM messages from raw observation messages present in other formats in the input stream, but this does not include the ability to generate MT1230 messages. As long as your base station is generating MT1230 RTCM3 messages, there should be no issue streaming them with STR2STR.

      1. Hi sir!

        Thanks for your fast response.
        Well, let me explain my current setup:

        Ublox f9p configured as base station
        Sending out MT 1005,1077,1087,1097,1127,1230 on serial.
        RTKLib input stream serial Baudrate 115200 as listener
        RTKLib output stream as ntrip caster to local IP (127.0.0.1), in convert I set Rtcm3 -> rtcm3 and enter the following string:

        1004(1), 1005(10), 1012(1), 1074(1), 1084(1), 1094(1), 1124(1), 1230(5)

        I pick up the stream with SNIP Caster as a push-in stream to see the contents, but MT1230 doesn’t show up in the incoming stream data. If I go directly from serial to SNIP Caster, I get all the messages sent in the u-blox serial RTCM stream.

        Hopefully this makes sense.
        (I do this because I need legacy messages for an older type of receiver)

        1. Hi Richard. By checking the “Conversion From” box in the “Conv” menu, you are asking STR2STR to convert the incoming RTCM3 messages into its internal format and then back to RTCM3 before streaming them to the output port. It can do that for the other messages but not for the 1230 messages. It should not normally be necessary to do this. You should be able to relay them without converting them, which you can do by unchecking the “Conversion From” box. Have you tried this?

          1. Oh, for sure, this works, but then I don’t have MT1004 and MT1012. A u-blox F9P can’t produce these, so my thought was to use RTKLIB to create those types from the MSM messages. It works for 1004 and 1012 but not for 1230. It shouldn’t even be necessary to create the 1230 message because it’s already in the stream, but I understand it doesn’t relay that message type because of the convert checkmark.

          2. Hi Richard. OK, I understand now what you are trying to do but I don’t believe RTKLIB can currently do this. I do hope to add more general support for the 1230 message to the RTKLIB code at some point which would allow you to do this.
