RTKLIB Benchmarking: versions 2.4.2, 2.4.3, and demo5

It’s been about four years now since I created the demo5 branch of RTKLIB. During that time I have added a number of features and enhancements to the code, with a focus on low-cost receivers (primarily u-blox) and moving rovers, while keeping it synced with the latest 2.4.3 code from the official RTKLIB code base. I thought it would be interesting to do a little benchmarking between the two official versions of RTKLIB (2.4.2 and 2.4.3) and the demo5 code to see to what extent this evolution of the code has affected the results.

First of all, though, it’s probably worth a little discussion about versions 2.4.2 and 2.4.3. Source and executables for both versions are available on GitHub at https://github.com/tomojitakasu. The source is in the RTKLIB repository and the executables are in the RTKLIB_BIN repository. Both of these repositories default to the “master” branch, which is the 2.4.2 code. This is what you will get unless you specifically request the “2.4.3” branch of the code. For several years, almost all the development activity was on the 2.4.3 branch and only very minimal changes were being made to the master branch. In Jan 2018, some of the 2.4.3 changes were merged back into the master branch, although it appears that not all of them were. Since then, development has continued on 2.4.3 without another merge back to the master branch.

For most of my data analysis posts, I focus on a single data set, spending a fair bit of time to make sure I am only analyzing the usable parts of the data, possibly tweaking the configuration file for that specific data, and digging into any issues that crop up. In this case I didn’t do that. I picked nine raw data sets, all with u-blox M8T receivers and a moving rover, made no effort to filter out bad data, and used a single generic configuration file for all nine data sets. Eight of the data sets are those that I have previously uploaded to my website at http://rtkexplorer.com/downloads/gps-data/ and the ninth was from my most recent drive around the neighborhood with a u-blox M8T and an antenna on top of the car. I ran solutions for each of the three RTKLIB versions on all nine data sets. For each solution, I converted the data from u-blox binary to rinex using the same code version as I did for the solution, since the different codes will affect this conversion as well as the solution.

I ran the post-processed solutions in “combined” mode, meaning that the solution is run both forward and backward and the results are then combined. Not only does this tend to produce better results, but the results also have higher confidence since RTKLIB compares the forward and backwards solutions, sample by sample, and downgrades any sample where the solutions in both directions are fixed and the results differ by more than four standard deviations. This tends to do a good job of detecting and rejecting any false fixes in the results. However, it is not foolproof. If the solution is fixed in only one direction and float in the other, then there is no additional validation.
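
The forward/backward check described above can be sketched roughly as follows. This is a simplified illustration, not RTKLIB's actual code: it works on a single position axis, and the `Sample` class, the `combine` function, the inverse-variance weighting, and the way the two standard deviations are summed for the threshold are all assumptions made for the example.

```python
import math
from dataclasses import dataclass

FIX, FLOAT = 1, 2  # solution quality flags (1 = fix, 2 = float)


@dataclass
class Sample:
    pos: float    # position along one axis (m); real solutions are 3-D
    std: float    # standard deviation of that position (m)
    quality: int  # FIX or FLOAT


def combine(fwd: Sample, bwd: Sample, n_sigma: float = 4.0) -> Sample:
    """Merge one forward and one backward sample from the same epoch."""
    # Assumed merge rule: inverse-variance weighted mean of the two positions.
    w_f, w_b = 1.0 / fwd.std ** 2, 1.0 / bwd.std ** 2
    pos = (w_f * fwd.pos + w_b * bwd.pos) / (w_f + w_b)
    std = math.sqrt(1.0 / (w_f + w_b))

    # Start from the better (numerically lower) of the two quality flags.
    quality = min(fwd.quality, bwd.quality)

    # The cross-check only runs when BOTH directions are fixed.
    if fwd.quality == FIX and bwd.quality == FIX:
        limit = n_sigma * math.sqrt(fwd.std ** 2 + bwd.std ** 2)
        if abs(fwd.pos - bwd.pos) > limit:
            quality = FLOAT  # directions disagree: downgrade the fix
    return Sample(pos, std, quality)


agree = combine(Sample(10.00, 0.01, FIX), Sample(10.02, 0.01, FIX))
disagree = combine(Sample(10.00, 0.01, FIX), Sample(10.10, 0.01, FIX))
one_sided = combine(Sample(10.00, 0.01, FIX), Sample(10.10, 0.05, FLOAT))
```

In this sketch `agree` stays fixed, `disagree` is downgraded to float, and `one_sided` stays fixed without any check, which is the gap in validation mentioned above.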

I used the same configuration settings for running each of the three RTKLIB versions on each of the data sets, with a few exceptions. Versions 2.4.2 and 2.4.3 do not have the “arfilter” feature, which automatically holds off new satellites until their phase bias estimates have converged enough to not break the ambiguity resolution, so I increased the fixed hold off (arlockcnt) from 0 to 10 for the 2.4.2 and 2.4.3 codes. The outlier detection scheme is also different in the demo5 code from the other two versions, which made it necessary to increase the outlier threshold (rejionno) from 1.0 to 30.0 for versions 2.4.2 and 2.4.3. Lastly, the 2.4.2 code runs very slowly if dynamics is enabled, so I turned dynamics off for this code.
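
As a rough illustration, the per-version exceptions described above can be written as a small set of overrides on top of the shared settings. The dictionary names and the `settings_for` and `render` helpers are hypothetical, only the four options mentioned in this paragraph are included, and dropping `pos2-arfilter` for the two official versions is an assumption about how one might handle an option those versions do not have.

```python
# Shared (demo5) values for the four options discussed above.
BASE = {
    "pos1-dynamics": "on",
    "pos2-arfilter": "on",
    "pos2-arlockcnt": "0",
    "pos2-rejionno": "1.0",
}

# Per-version exceptions taken from the text.
OVERRIDES = {
    "2.4.2": {"pos2-arlockcnt": "10", "pos2-rejionno": "30.0", "pos1-dynamics": "off"},
    "2.4.3": {"pos2-arlockcnt": "10", "pos2-rejionno": "30.0"},
    "demo5": {},
}

# Options that only exist in the demo5 code (assumed to be omitted elsewhere).
DEMO5_ONLY = {"pos2-arfilter"}


def settings_for(version: str) -> dict:
    """Return the base settings with the exceptions for one version applied."""
    settings = {**BASE, **OVERRIDES[version]}
    if version != "demo5":
        settings = {k: v for k, v in settings.items() if k not in DEMO5_ONLY}
    return settings


def render(settings: dict) -> str:
    """Format settings as 'key =value' lines, one per option."""
    return "\n".join(f"{key:<18}={value}" for key, value in settings.items())


for version in OVERRIDES:
    print(f"# {version}")
    print(render(settings_for(version)))
```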

The specific code versions for the experiment were: 2.4.2 p13, 2.4.3 b33 and demo5 b33b2.

I used fix percentage as the metric for the test. This isn’t a perfect metric because it doesn’t take into account the accuracy of the non-fixed results and it can be affected by false fixes. Overall, though, it is probably the best single metric for this kind of test, especially since running the solutions in combined mode should reject many of the false fixes.
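
For reference, here is a minimal sketch of how a fix percentage can be computed from a solution file. It assumes the default lat/lon/height text output, where header lines start with `%` and the quality flag Q is the sixth whitespace-separated field (1 = fix, 2 = float). The `fix_percentage` helper and the sample lines are made up for the example and are not taken from any of the data sets.

```python
def fix_percentage(lines):
    """Return the percentage of solution samples whose quality flag is 1 (fix)."""
    total = fixed = 0
    for line in lines:
        # Skip blank lines and '%' header/comment lines.
        if not line.strip() or line.startswith("%"):
            continue
        fields = line.split()
        if len(fields) < 6:
            continue  # not a complete solution record
        total += 1
        if int(fields[5]) == 1:  # sixth field is Q in the assumed layout
            fixed += 1
    return 100.0 * fixed / total if total else 0.0


# Invented sample records: date, time, lat, lon, height, Q, ns
sample = [
    "%  GPST  latitude(deg) longitude(deg)  height(m)   Q  ns",
    "2019/03/20 18:00:00.000 40.000000001 -105.000000001 1600.001 1 9",
    "2019/03/20 18:00:00.200 40.000000002 -105.000000002 1600.002 1 9",
    "2019/03/20 18:00:00.400 40.000000003 -105.000000003 1600.150 2 8",
    "2019/03/20 18:00:00.600 40.000000004 -105.000000004 1600.003 1 9",
]
sample_pct = fix_percentage(sample)
```

With three of the four invented samples fixed, `sample_pct` is 75.0. Note that this counts every sample equally, so it shares the limitations of the metric described above.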

The fix percentages for each test are listed below. The data set names correspond to the names of the sample data sets on my website.

Test     V2.4.2   V2.4.3   V2.4.3a   demo5    Data set
1        37.7%    93.9%    96.3%     94.4%    union_0705
2        35.7%     9.8%    60.6%     85.5%    niwot2_car
3        52.4%    16.2%    89.0%     80.2%    drone_0414
4        85.8%    51.2%    93.8%     98.8%    m8t_niwot_0606
5        59.7%    21.4%    83.7%     97.1%    swift_m8t_road_0606
6        63.1%    10.0%    83.6%     99.0%    car_1114
7        54.6%    26.6%    73.3%     94.2%    car_0320
8        39.4%    11.2%    54.4%     98.1%    comnav_car
9        25.0%     9.2%    53.4%     98.1%    not uploaded
Median   52.4%    16.2%    83.6%     97.1%

Experiment results in fix percentage
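
The medians in the last row can be reproduced directly from the nine per-test values. The short check below uses only the numbers from the table above; the variable names are arbitrary.

```python
from statistics import median

# Fix percentages copied from the table, tests 1 to 9 in order.
fix_pct = {
    "V2.4.2":  [37.7, 35.7, 52.4, 85.8, 59.7, 63.1, 54.6, 39.4, 25.0],
    "V2.4.3":  [93.9, 9.8, 16.2, 51.2, 21.4, 10.0, 26.6, 11.2, 9.2],
    "V2.4.3a": [96.3, 60.6, 89.0, 93.8, 83.7, 83.6, 73.3, 54.4, 53.4],
    "demo5":   [94.4, 85.5, 80.2, 98.8, 97.1, 99.0, 94.2, 98.1, 98.1],
}

# With nine values per column, the median is simply the fifth-largest value.
medians = {version: median(values) for version, values in fix_pct.items()}

for version, value in medians.items():
    print(f"{version:8s} median fix = {value:.1f}%")
```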

For the most part, the results match my expectations. Version 2.4.2 has the lowest fix percentage, 2.4.3 is in the middle, and demo5 has the highest. However, I ran into one very significant issue with the 2.4.3 code that I do not fully understand. In the table above, the “V2.4.3” column shows the results of using version 2.4.3 for both the conversion from raw binary to rinex and the solution. As you can see, the fix percentages were very low for this test for all data sets except the first, significantly lower than even the 2.4.2 results. I did not fully debug this issue, but the problem appears to be in the conversion from raw binary to rinex, not in the solution itself.

The V2.4.3a column shows the results of running the 2.4.2 code for the raw binary to rinex conversion and then the 2.4.3 code for the solution. This result is much more in line with my expectations for the 2.4.3 code. I suspect that the issue with the 2.4.3 rinex conversion is that when it filters out low quality observations it does not preserve the cycle slips. RTKLIB can be very sensitive to unflagged cycle slips.

I am very curious if anyone who is a regular user of the 2.4.3 code can duplicate this result. As you can see from the first data set, it does not always occur, and is much less of a problem if the data does not have a large number of cycle slips, so it would need to be tested on a more challenging data set to see this issue.

Regarding the demo5 results, seven of the nine tests had over a 94% fix rate and the median fix percentage was 97.1%. I consider this quite reasonable since all of these were fairly challenging data sets and most of them included at least a small amount of unusable data. The two data sets that did not perform as well (80-86% fix percentage) were both older. One did not include Galileo and the other was from a drone that had one particularly poor quality section of data. However, both solutions had well over a 90% fix rate when run in the forward-only direction, which indicates that fixes in the combined solution were downgraded because of mismatches between the two directions. In one of these cases (test 3), the 2.4.3 solution obtained a higher fix rate than the demo5 code, but it only got fixes in the forward direction, not in the backward direction, so it had no additional validation. Based on some discrete jumps in that solution, I suspect its fix percentage would also have been downgraded if it had achieved a fix in the backward direction.

Looking more closely at the cases where the demo5 solution points were downgraded to float for mismatches, it’s interesting that, at least at first glance, these do not appear to be false fixes but discrepancies from using different combinations of satellites that were large enough to trigger the four standard deviation threshold. This is a little concerning and worthy of further investigation. Fortunately, it only appears to occur when the data quality is fairly poor. However, this does emphasize the importance of ensuring the best quality measurements possible and not over-relying on RTKLIB to reject inaccurate solutions.

As always, I’d like to emphasize that these tests are intended only as one user’s snapshot of one fairly particular use case of RTKLIB and are not intended to be any kind of comprehensive analysis. Also, it’s important to understand that 99+% of the code in all versions of RTKLIB, including the demo5 code, is the result of many years of dedicated effort by Tomoji Takasu and his team at Tokyo University of Marine Science and Technology. My only contribution has been to add a few changes on top of this code to make it a little more focused on practical application for specific uses rather than a more generic academic tool.

Lastly, for reference, here’s a partial list of the most important configuration settings I used for this experiment for the demo5 code. The other two codes used the same settings with the exceptions I describe above.

pos1-posmode =kinematic # solution mode
pos1-soltype =combined # solution type (forward, backward, combined)
pos1-frequency =l1 # (l1, l1+l2, l1+l2+l5)
pos1-elmask =15 # min sat elevation to include in solution (deg)
pos1-snrmask_r =off # SNR mask rover (off, on)
pos1-snrmask_b =off # SNR mask base (off, on)
pos1-dynamics =on # add dynamic states to kalman filter (off, on)
pos1-navsys =15 # (1:gps+2:sbas+4:glo+8:gal+16:qzs+32:comp)
pos2-armode =fix-and-hold
pos2-gloarmode =on # Glonass AR mode (off, on, fix-and-hold, autocal)
pos2-bdsarmode =off # BeiDou AR mode (off, on)
pos2-aroutcnt =50 # outage count to reset sat ambiguity (samples)
pos2-arminfix =50 # min # of fix samples to enable AR hold (samples)
pos2-rejionno =1.0 # phase bias outlier threshold (m)
pos2-maxage =30 # max age of differential (secs)
pos2-arthres =3.0 # minimum AR ratio for fix
pos2-arthres1 =0.1 # max variance of position states to attempt AR (m)
pos2-varholdamb =0.1 # variance of fix-and-hold tracking feedback (cyc^2)

pos2-arfilter =on # automatic hold off for adding new sats to AR
pos2-arlockcnt =0 # fixed hold off for adding new sats to AR (samples)
pos2-minfixsats =4 # min sats required for fix
pos2-minholdsats =5 # min sats required for AR hold
pos2-mindropsats =10 # min sats required to drop sats from AR
stats-eratio1 =300 # ratio of input stdev of code to phase observations
stats-eratio2 =300 # ratio of input stdev of code to phase observations
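
The `pos1-navsys` value in the list above is a bitmask, using the bit values given in its comment. A small sketch of how 15 decodes into constellations follows; the `decode_navsys` helper is hypothetical and only illustrates the arithmetic.

```python
# Bit values as listed in the pos1-navsys comment.
NAVSYS_BITS = {1: "gps", 2: "sbas", 4: "glo", 8: "gal", 16: "qzs", 32: "comp"}


def decode_navsys(mask: int) -> list:
    """Return the names of the constellations enabled in a navsys bitmask."""
    return [name for bit, name in NAVSYS_BITS.items() if mask & bit]


enabled = decode_navsys(15)  # 15 = 1 + 2 + 4 + 8
print(enabled)
```

So the setting used here enables GPS, SBAS, Glonass and Galileo, and leaves QZSS and BeiDou off.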

14 thoughts on “RTKLIB Benchmarking: versions 2.4.2, 2.4.3, and demo5”

  1. What does “downgrades” actually mean in this sentence: “downgrades any sample where the solutions in both directions are fixed”?

  2. Hello!
    First, thanks for sharing your knowledge on RTKlib with the community. It is very helpful!
    I would like to ask if there is any difference between the V2.4.2, V2.4.3, V2.4.3a and demo5 versions in regard to RINEX reading. Do all of them have the same methods to read RINEX v3 files? In particular, can all of them read all four global constellations, plus QZSS and IRNSS?

    Thanks in advance!
    German

    1. Hi German. In general, the demo5 code is kept synced with the 2.4.3 RTKLIB code, so these two codes should be similar with one big exception, specifically the way that Galileo E5b observations are handled. These are technically L7 observations based on their codes and this is how RTKLIB 2.4.2 and 2.4.3 handle them. Unfortunately, RTKLIB does not fully support L7 solutions. The demo5 code treats the E5b observations as “L2”, which makes handling them in the RTKLIB solutions simpler and more efficient. There may be other less significant differences as well that are not coming to me immediately. Regarding QZSS and IRNSS, RTKLIB is supposed to support both and should be identical between 2.4.3 and demo5, but I have not tested either of these constellations in either code. Version 2.4.2 is missing many of the more recent updates that are in the 2.4.3 and demo5 codes, so it may not have as much support for these constellations.

  3. Thank you, that was useful.
    I am processing some baselines with RTKLIB RTKPOST (rnx2rtkp), once with GPS only and once with GPS+Galileo. I use RTKLIB demo5 b33b2, and most of the time the process gives an unexpected result. From RTKPLOT I can see that GPS only gives more fixes and better accuracy, meaning that Galileo decreases the accuracy rather than improving it. In contrast, when I processed the same baseline and session with another commercial software package, Galileo improved the time to first fix and the accuracy in most cases.

    For these processes, I used the default settings of v.demo5 b33b2 except I changed:
    Processing mode: kinematic
    frequencies: L1+L2+L5
    Filter type: forward
    Elevation mask: 10

    note: the same configuration was used for both cases (GPS, GPS+Galileo), but ticking Galileo for the second case.

    I wonder what caused the problem, and what will be the right configuration for such work?

    Thank you,

    1. Hi Peshawa. In general, adding constellations should improve the solution. However I do see with dual frequency solutions that sometimes there are just too many observations for RTKLIB to handle well and it does not do a good job of rejecting all of the poor observations. This is particularly true for stationary rovers where the multipath will be worse since it varies very slowly unlike moving rovers where it is much more random. I would suggest tightening up some of your filter criteria (particularly elevation and SNR) to help remove some of the worst satellites. I have also made some improvements in this area in the demo5 b33c code so I would try that as well.

  4. Hello,
    Great work there with RTKLIB for F9P and your blog on how to set all this up! I am just getting started with demo5 and SimpleRTK and after having set up RTKNAVI in the way recommended by you (e.g. serial and ntrip input, solution output and logging) I am wondering whether it is possible to also send the RTCM corrections to the receiver via serial.
    After all, it is nice to get a RTK solution from RTKNAVI but the receiver is fixing much faster and there is no reason not to send the corrections to the receiver as well, I think.
    Do you have a solution for this?

    Thank you very much and your work is much appreciated.
    Best Regards

    1. Hi Klemens. Yes, it is possible to send the base observations to the F9P and simultaneously receive the raw observations and NMEA position messages with RTKLIB, but you have to work around the limitation that two applications can’t connect to a single serial port. One way to do this is to run STRSVR to send the base observations to the F9P and select “Output Received Stream to TCP Port” when setting up the serial port options. STRSVR will then stream incoming data from the receiver to the selected TCP port, which can be set up as an input to RTKNAVI. The other way is to use both serial ports on the F9P, one for incoming data and the other for outgoing data.

  5. Hi. I want to use RTKLIB’s STR2STR for my base station. The thing is that the Glonass sats aren’t usable because of the missing MT1230. It doesn’t convert it from a u-blox raw or RTCM3 stream. Do you have any solution for this?

    1. Hi Richard. In most cases STR2STR is used just as a stream server to direct a stream of data from an input source to an output destination. In this case, the content of the stream will not be modified and if the input stream contains MT1230 messages, they will be relayed unchanged to the output stream. STR2STR does have a limited capability to generate RTCM messages from raw observation messages present in other formats in the input stream, but this does not include the ability to generate MT1230 messages. As long as your base station is generating MT1230 RTCM3 messages, there should be no issue streaming them with STR2STR.

      1. Hi sir!

        Thanks for your fast response.
        Well, let me explain my current setup:

        Ublox f9p configured as base station
        Sending out MT 1005,1077,1087,1097,1127,1230 on serial.
        RTKLib input stream serial Baudrate 115200 as listener
        RTKLib output stream as ntrip caster to local IP (127.0.0.1), in convert I set Rtcm3 -> rtcm3 and enter the following string:

        1004(1), 1005(10), 1012(1), 1074(1), 1084(1), 1094(1), 1124(1), 1230(5)

        I pick up the stream with Snip Caster as a push-in stream to see the contents, but MT1230 doesn’t show up in the incoming stream data. If I go directly from serial to Snip Caster, I get all the messages sent in the u-blox serial RTCM stream.

        Hopefully this makes sense.
        (I do this because I need legacy messages for an older type of receiver)

        1. Hi Richard. By checking the “Conversion From” box in the “Conv” menu, you are asking STR2STR to convert the incoming RTCM3 messages into its internal format and then back to RTCM3 before streaming them to the output port. It can do that for the other messages but not for the 1230 messages. It should not normally be necessary to do this. You should be able to relay them without converting them, which you can do by unchecking the “Conversion From” box. Have you tried this?

          1. Oh for sure, this works, but then I don’t have MT1004 and 1012. An Ublox f9p can’t process these, so my thought was to use RTKLIB to create those types from the MSM messages. It works for 1004 and 1012 but not for 1230. In the end it’s not even necessary to create the 1230 message because it’s already in the stream, but I understand it doesn’t relay that message type because of the convert checkmark.

          2. Hi Richard. OK, I understand now what you are trying to do but I don’t believe RTKLIB can currently do this. I do hope to add more general support for the 1230 message to the RTKLIB code at some point which would allow you to do this.
