A few simple debugging tips for RTKLIB

Unfortunately, error reporting is not one of RTKLIB’s strengths.  Sometimes it will halt with an error message that is less than helpful, other times it will just continue through the entire solution with Q=0 or Q=2 without ever getting a fix because of a minor error in the input file configuration or parameters. I suspect anyone who has used RTKLIB with any regularity has run into cases like this innumerable times. It still happens to me all the time, almost always because I have made a silly mistake somewhere. For a new RTKLIB user, it can be quite difficult to figure out why they are not getting a valid solution in cases like this.

In this post, I will demonstrate some examples of common errors and some techniques to help identify them. First let’s take a quick look at a couple of real-time solution examples with RTKNAVI, then I will jump to post-processing cases for the rest of the examples.

In my experience, the most common error when using RTKNAVI is to improperly specify the base location. To demonstrate this, I go to the “Positions” tab on the “Options” menu in RTKNAVI and switch the Base Station position type from “Average of Single Position” to “RTCM Position”. In this example, my base stream is u-blox binary and contains no RTCM messages so RTKLIB will not get any valid base location information and can not calculate a solution. However, rather than halting and letting the user know there is a problem, RTKNAVI just runs through the whole solution without ever reporting either an error or a valid position. To help understand what happened in a case like this, the first step is to click on the tiny unlabeled square above the “Start/Stop” button. This will bring up the RTK Monitor window. Choose “Error/Warning” from the Monitor options as shown in the image below to bring up additional error and warning information.

In this case, every sample reports an “initial base station position error”. This should be enough information to track down the problem fairly quickly. Often this will be the case. If the additional information in this window doesn’t help, then the next step is to enable the debug trace and look at the trace file. This is covered in more detail in the post-processing examples further below.

In my experience, the second most common reason to get no solution with RTKNAVI is that navigation messages haven’t been enabled for either base or rover receiver. If this is the case, then all the SNR bars will be gray. When navigation data is available, the bars will be colored as in the image above. Lack of SNR bar color is the easiest way to detect this problem but another symptom is that the “Error/Warning” window will report “No common satellites” for every sample even though satellites are appearing in the SNR window.

A more general troubleshooting technique that will work for either real-time or post-processing and for GUI or command line RTKLIB applications is the debug trace file. I will demonstrate a couple of examples of using this with RTKPOST but the same basic technique can also be used with RTKNAVI, RNX2RTKOP, RTKCONV, CONVBIN, or STRSVR.

To enable the debug trace, set “Output Debug Trace” in the “Options” menu to something other than “OFF”. For simple errors, level 2 is enough and will produce a less overwhelming amount of information. For more detailed information about the GNSS solution, especially the ambiguity resolution, level 3 is better. For more details about the communication streams, level 4 can be useful. Level 5 includes many of the solution matrices and is too detailed for most general debug. Level 1 only contains the error messages available in the error window and is not useful. The image below from the RTKPOST Options menu shows the trace level set to 3. I usually set it to level 3 since I am often looking at the details of the ambiguity resolution, but for simple debug I would recommend starting with level 2 and that is what I will use in the examples below.

Next, I will intentionally make some silly mistakes and show how they appear in the error window and the trace file.

Let’s start with an example where RTKPOST actually does a good job of reporting the error to demonstrate how it should work, and occasionally does. To keep the example simple I will use files in the root directory, with names: rover.obs, base.obs, and rover.nav.

In this first case, I have made a typo in the name of the .nav file. Now, when I click on “Execute” to start the solution, it almost immediately halts and reports “Error: no nav data” in the error window near the bottom of the window. If only RTKLIB would always be as straightforward as this!

Next, I intentionally make a typo in the rover observation file and click on “Execute”. Again, the solution halts almost immediately but in this case the error message in the error window is much less useful. Not only does it not tell me which file has a problem, but it also suggests there is something specific wrong with the input data, not that the file is completely missing. Fortunately, clicking on another tiny unlabeled square (circled in red below) brings up the trace file, assuming trace debug was enabled as described above. This tells us exactly what went wrong, that the input rover observation file could not be opened.

For the next example, I will correct the typo in the file name but then set the location in the base rinex file header to all zeros. This is another common problem and usually happens because the rinex file was generated by RTKCONV or other application from a binary receiver output file that did not contain navigation messages. Without navigation information, RTKCONV can not estimate the receiver location and sets the approximate position field in the rinex header to all zeros. Again, the solution halts almost immediately, and the error message is identical to the previous case. However, looking at the trace file, there is now no mention of a file open error. To understand the difference between these two trace files, it’s important to know that the entries in the trace file beginning with 2 are warnings, and those beginning with 1 are errors. RTKLIB treats the missing file as a warning since there may be other observation files that are present, and doesn’t get an actual error until it looks for the base position information. With just the error, it is impossible to tell these two case apart, but by looking at the warnings as well, it is fairly easy to differentiate them.

Another fairly common error is to use the wrong base observation file. If the base observations do not overlap the rover observations in time, then the solution will run all the way through the data, without error, but the solution status will always be Q=0. Here’s the trace output for this example. I aborted the run before it completed, after it was obvious that something is wrong which is why the error windows shows “aborted”. The “age of differential” is the time between rover observation and closest base observation so again in this case, looking at the trace file makes the problem much more obvious

If the wrong navigation file is used, the solution will again run to the end with no error, and the status will remain Q=0, just as above, but this time the trace file will report “no common satellites”

Another mistake I make fairly often, is when I am specifying the precise base location rather than using the approximate location in the rinex header, and I forget to change it when moving to a different data set with a different base location. In this case, the solution will run without error, but the fix status will remain in float (Q=2) for the entire solution. There are several reasons why you might never get a fix, including just poor quality data, but in this case when I look at the trace file I see extremely large outliers right from the first sample. If all the satellites are reporting outliers and they are as large as this, that almost always means that the base location has been specified incorrectly.

These are only a few examples but probably cover over 80% of cases I look at where people are unable to get a solution. Most of the rest are caused by low quality data.

If you have other examples of common failures that I haven’t covered here, please describe them in a comment, along with the signature seen in the trace file.

Advertisements

Glonass Ambiguity Resolution with RTKLIB Revisited

To get a high precision fixed solution in RTKLIB we need to resolve the integer ambiguities that come from the carrier phase measurements.  Resolving the integer ambiguities for the GLONASS satellites is more challenging than resolving them for the other constellations.  This is because, unlike the other constellations, the GLONASS satellites all transmit on slightly different frequencies.  This introduces an additional bias error in the receiver hardware.

These hardware biases are constant, generally the same for all receivers from the same manufacturer, are proportional to carrier frequency and are similar between L1 and L2.

In the demo5 version of RTKLIB, there are four choices for how to handle GLONASS ambiguity resolution (AR). I will cover all four briefly, but then focus on the “autocal” setting which I have enhanced in the most recent version (b29c) of the demo5 code.

Off:  If Glonass  AR is set to “Off”, then the raw measurements from the Glonass satellites will be used for the float solution but ambiguity resolution will be done only with satellites from the other constellations.  If you are not using the demo5 version of RTKLIB, this is usually your only choice when using receivers from different manufacturers for the rover and the base.  However, you are giving up a significant amount of information by ignoring the GLONASS ambiguities and so I would not recommend this setting if you are using the demo5 code, unless of course your receivers don’t support the Glonass satellites.

On:  If Glonass AR is set to “On” then RTKLIB will treat the Glonass ambiguities the same as the ambiguities from the other constellations and will not make any attempt to account for the additional hardware biases.  Use this setting if your base and rover receivers are from the same manufacturer, since in this case, the biases will cancel and can be ignored.  There are also some cases in which different manufacturers have equal or nearly equal biases as we will see later, in which case you can also use this setting.  This is your best solution for dealing with Glonass ambiguities.  I always try to use matched receivers for base and rover if possible.

Fix-and-Hold: This is an option I have added to the demo5 code for Glonass AR.  It is an extension to the “fix-and-hold” method used for other constellations but instead of using the additional feedback to track the ambiguities, it uses it to null out the hardware biases.  I recommend this setting when using the demo5 code with unmatched receivers.  It takes advantage of the additional information in the Glonass ambiguites most of the time.  However, fix-and-hold is not enabled until after a first fix has been achieved, and so the Glonass ambiguities are not available until then.  This can mean longer time to first fix and less robustness compared to the “On” option, so don’t use this option for matched receivers.

Auto-cal:  This option adds additional states to the kalman filter to estimate the receiver hardware biases as a function of carrier frequency, one state for L1, another for L2.  In previous experiments I had not had any success with it.  Recently, however, I discovered that if I adjusted the filter settings, it can be effective for a zero baseline case, where base and rover are both connected to the same antenna so that almost all other errors are completely cancelled.  With a little more experimentation I also found that for short baselines it can also be effective if the kalman filter state is pre-set to something close to the final value before the solution is started.  It will then usually converge to the correct bias value.  However, there is currently no mechanism in the code to adjust any of these values, so I have not found this mode to be useful in its current implementation.

To make the auto-cal option more flexible, and hopefully more useful, I made a few modifications to it in the b29c code.  I added the capability to pre-set the initial state value and also to adjust the internal filter settings, specifically the initial variance and process noise for this state.  The units for the state, and hence for the initial value are in meters per frequency channel and values generally are within +/-5 cm per channel.  I used some existing config parameters that are currently unused to reduce the amount of code I needed to change.  Unfortunately it means that the names are not as descriptive as they could be.  The new config parameters are:

pos2-arthres2 = relative GLONASS hardware bias in meters per frequency slot
pos2-arthres3 = initial variance of GLONASS hardware bias states
pos-arthres4 = process noise for GLONASS hardware bias states

Bias values have been published for some of the most popular geodetic quality receivers but are generally not available for lower-cost or less popular receivers.  Here is a table of values from a paper published by Lambert Wanninger in 2011 for nine receiver manufacturers.

biases

I was able to verify these results for Trimble, Leica, and Novatel, but I found a much lower value for Septentrio so I suspect the biases may have changed in their newer receivers.

To demonstrate the modified autocal option, I will start with a zero baseline case between a ComNav receiver and a Tersus receiver.  It is easiest to measure the hardware biases in the zero baseline case because most other errors will cancel and the hardware biases will be the dominant error.  In this case, I have significantly reduced the initial variance setting from the original value of 1.0 to 1E-7 and increased the process noise from 1E-6 to 1E-3.

I have run the solution several times with the initial bias value set to different numbers between -.05 and 0.06.  Here are the results for both L1 and L2.

biases1

The convergence occurs just after first fix is achieved.  If a fix is not achieved, then the state will not converge as you can see above for the 0.06 example.   In this case, the initial value was too far from the correct value and prevented getting a fix.  As you can see, all the other cases converged towards a single value around -.022, both for L1 and for L2.

Another way to visualize the error in the initial value is to look at the GLONASS residuals after first fix is achieved.  The plot below shows the GLONASS L1 carrier phase residuals  for different initial values, for 0.03 on the left, -0.05 in the middle, and what I believe is the correct value for this receiver combination of -.022 on the right.

 

acal1

Here are the same plots for the L2 carrier phase residuals.

acal2

Through a slightly tedious process, I am fairly easily able to iterate the residuals down to near zero for different pairs of receivers in my possession.  Note that this gives me the relative difference in biases for each receiver pair, and not absolute values for each receiver, unlike Wanniger’s table which is for absolute biases.

Extending the table to receivers used in nearby CORS stations is a little more challenging because the initial bias value needs to be fairly close to get a first fix and hence a convergence, but still possible if the base station is not too distant.   I found data sets that included CORS data from Leica, Novatel, Trimble, and Septentrio receivers.  Using the above procedure to iterate the residuals down to near zero, I was then able to extend my table and make the values absolute by choosing the unknown offset to make my bias pairs align with Wanninger’s table.  This is the resulting table I created.

ComNav    =   2.3 cm
Leica          =   2.3 cm
Novatel      =  2.3 cm
Septentrio = -0.3 cm
SwiftNav   = -0.2 cm
Tersus        = -0.1 cm
Trimble      = -0.7 cm
u-blox         = -3.2 cm 

To generate an initial value for the bias state from this table for an RTKLIB solution, subtract the base station bias from the rover bias, then divide by 100 to convert from centimeters to meters.  This value can then be used to set the “pos2-arthres2” config parameter in the config file.  For the RTKPOST and RTKNAVI GUI option menus I have labeled this “Glo HW Bias”.

To test this code on an independent set of data after generating the table, I used a data set recently sent to me by a reader.  It consists of a u-blox  M8T receiver for rover and Leica receiver just a few kilometers away for base, and was collected in Europe.  The rover position was static but I ran the solution in kinematic mode to make the solution a little more challenging and to make any errors in the solution more visible.

To generate the correct config value for RTKLIB I  subtracted the Leica bias of 2.3 cm from the above table from the u-blox bias of -3.2 cm to get a relative bias between receivers of -5.5 cm or -0.055 m.  I added “pos2-arthres2=-0.055” to the config file and then ran the solution four times, with pos2-gloarmode set to “off”,”fix-and-hold”,”autocal”, and “on”.  Although I left the bias value set for all runs it is ignored unless gloarmode is set to autocal.

Here are the times to first fix, the number of satellite pairs used for the initial fix, and the number of satellite pairs being used for fix after 10 minutes.

  Time to # sat pairs used # sat pairs used for
GLO AR mode first fix for initial fix fix after 10 min
OFF 4:10 7 7
Fix&Hold 4:10 7 11
Autocal 1:05 14 14
On 6:47 14 14

As you would expect, the time to first fix for gloarmode=”off” was the same as “fix-and-hold” since “fix-and-hold” does not use the GLONASS satellites for initial fix.  After 10 minutes it was still only including four of the GLONASS satellites in the ambiguity resolution which was a little unusual, typically I would have expected more GLONASS satellites to be included.

With gloarmode=”autocal”, the time to first fix was reduced from 250 seconds to 65 seconds and the number of satellites included in the first fix increased from 7 to 14., both significant improvements.

The most surprising thing in this data is that when gloarmode was set to “on” it acquired a fix at all.  In many similar cases it will never get a fix.  The GLONASS carrier phase residuals after initial fix were very high though as can be seen below.  The left plot is with gloarmode set to “on”, and the right plot is with it set to “autocal”.

biases3

The ambiguity resolution ratio was also much higher when autocal was enabled as can be seen below (yellow/green=autocal, olive/blue=on) which improves robustness.

biases2

The large residuals did not affect the solution position, as the two solution did not differ by more than 2 mm at any time.  The autocal solution however is much more robust in the sense that it is less likely to lose fix.

Although I have found the results with autocal enabled are generally excellent with relatively short baselines (<10 km), I have found the results less encouraging for longer baselines (>25 km).  In these case I have found that I often get better results with pos-gloarmode set to “fix-and-hold” then I do with “autocal”.  I don’t understand exactly why this is, but suspect that the fix-and-hold correction is more general and may be correcting for more than just the GLONASS hardware biases.

The code changes for this feature are included both in my Github repository and in the newest (demo5 b29c) executables available to download from the rtkexplorer website.   If you choose to experiment with this feature, please let me know if you find any errors in my table, or can add values for any additional receivers.

[Note 6/17/18:  I had a issue with uploading the executables to the website.   If you downloaded them prior to 6/17/18, please download again to get the updated version.] 

RTKLIB: Code comments + new Github repository

If you’ve spent any time perusing the RTKLIB source code you will surely be aware that it is very much what I would call “old-school” code. Comments are sparse and variable names tend to be short and not always very descriptive. The upside is that the code is quite compact, significant pieces of it can be seen on a single page which can sometimes be quite nice. The downside is it can be difficult to follow if you have not been poring over it for a long time. To help myself understand the code I have added a fair bit of commenting to most of the code used to calculate the carrier-phase solution, which is really the heart of the RTKLIB kinematic solution.  This is mostly in the rtkpos.c file.

The changes are too extensive to spell out here so I have forked the RTKLIB code in Github and posted my commented code there. If you are interested, you can download it from the demo1 branch at https://github.com/rtklibexplorer/RTKLIB.  

All the previous code changes are also there as well as a couple I have not yet posted about.  The receiver data I have been using for these posts is also there in the rtklib/data/demo1 folder.  The solution file updates for VS2010 c++ are in the rtklib/app/rnx2rtkp/msc folder and the modified executable is in the rtklib/bin folder.

Building RTKLIB 2.4.3 code with Visual C++ 2010

One of my goals in this project is to make some changes to the RTKLIB solution algorithms and evaluate them specifically for my set of inputs. RTKLIB is intended to work in many different environments. I would like to customize it to be optimal for calculating differential solutions for two single frequency receivers assuming low velocities and open skies. Some of the changes I would like to make are based on what other people have already tried but for which code is not available. Other changes are based on looking at cases in my data where RTKLIB did not provide a good solution, understanding why, and changing the algorithm appropriately.

To make these changes, I need to be able to modify the RTKLIB code and rebuild it. Since I am doing this initial evaluation work on a Windows PC I will need to build the code in that environment. The GUI versions of the RTKLIB programs require the Embarcadero C++ compiler which I do not have and which is fairly expensive to purchase. The CUI versions can be built with Visual C++, which is available for free, so I have chosen to go that path. As mentioned before, I also find the CUI interface with a matlab wrapper better for tracking configuration information for multiple runs.

The project files for Visual C++ are in the rtklib\app\”app name”\msc folders. They are configured for Visual C++ 2008. Since I already have Visual C++ 2010 loaded on my machine I had to make a few changes to make the project files work. Some of the changes would also be required with VS 2008 since it does not look like the project files have been kept up to date with recent RTKLIB changes.

  1. Convert solution files to VS 2010: Click on msc.sln file in rtklib\app\rnx2rtkp\msc folder. This will bring up the visual studio conversion wizard which will convert the VC 2008 project files to VC 2010.
  2. Add new src files to project: Add tides.c and ppp_corr.c to the list of source files in the “Solution Explorer” window by right-clicking on the source folder and selecting “Add”. This prevents build errors from unfound symbols.
  3. Modify Include Directories: Select “Properties” from “Project” tab. Select “General” under “C/C++” under “Configuration Properties” in the menu on the left hand side. Replace the existing entry with “..\..\..\src” to make it independent of code path. This enables the compiler to find the rtklib.h file.
  4. Modify Target Name: Select “General” under “Configuration Properties” in the the menu on the left hand side and change “Target Name” to “rnx2rtkp” to avoid linking errors.
  5. Add winmm library: Select “Input” under “Linker” under “Configuration Properties” in the menu on the left hand side, and add “winmm.lib” to “Additional Dependencies”. This avoids linker errors for an unresolved TimeGetTime symbol.
  6. Fix LPCWSTR conversion warning: Select “General” under “Configuration Properties” in the the menu on the left hand side and change “Character Set” from “Use Unicode” to “Not Set”

To run these apps from the data folders you will need to add the paths for the executables to your Windows path variable. The executables are in app/”app name”/Release/”app name”.exe or app/”app name”/Debug/”app name”.exe depending on whether you built in Debug or Release mode. I always build in Debug mode and point my path variable to that folder just to make it easier to switch between debugging with VS and running stand-alone, but either will work.

Kinematic solution with RNX2RTKP

 

In the last post we used the RTKPOST GUI to generate a kinematic position solution using “ebay.obs” for the rover data and “zdv13480.15o” for the base station data. To do the equivalent run from RNX2RTKP we use the following command line, run from the folder containing the data files.

rnx2rtkp -k test1.conf -o out.pos ebay.obs zdv13480.15o ebay.nav

The -k option is to specify the configuration file, the -o option is the output file. In this case we use the configuration file we saved from RTKPOST in the previous post.

To plot the calculated solution use the following command line:

rtkplot out.pos

This calls up the same plot GUI we previously accessed by clicking on the “Plot” button in the RTKPOST GUI.

The plots should be identical between this solution and the previous solution generated with RTKPOST.

To keep track of changes between various runs with the same data, I use a matlab wrapper to create a sub-folder, copy the config file, trace file, stats file, and result file to that sub-folder, input a short description of the run, and also save that to the sub-folder.