New b27 release of demo5 code

I’ve just released a b27  version of the demo5 code that incorporates all the changes in the b27 release of the official code.  I’ve also included some fixes for the Galileo satellites.  The biggest fix I added was for an array overrun that was causing solutions to intermittently report zero status when Galileo satellites were enabled.  Another fix was to correctly decode the range accuracy estimates in the Galileo ephemeris data.  The code can be downloaded from the code download section of the rtkexplorer.com website.   The source code is available in my Github repository.  Please let me know if you find any issues with this code, especially if it was something that was working in the previous code.

One issue I am aware of is the time tag feature which enables simulated real-time runs in RTKNAVI appears to be broken in the new code.  There was an existing incompatibility  between time-tag files that were created in the win32 version of RTKLIB and run in the win64 version or vice-versa.   It looks like an attempt to fix this in the latest release code actually made things worse and now the time-tag files don’t work at all, at least in the win32 version.

By the way, for anyone using the free starter edition of the Embarcadero compiler, be aware that it does not support building win64 executables.  The code will build, but the option to specify win64 is not available, so the compiler will just build the win32 code.  I hadn’t realized this until I tried to build the win64 version recently.

Also, I’ve recently got a hold of two low cost dual frequency receivers, one from Swift Navigation and one from Tersus, so I am collecting and analyzing data with them now.  I should have some results and comparisons to share in my next post soon.

Advertisements

New firmware, new satellites, new code

CSG Shop is now shipping all of their M8N and M8T u-blox receivers with the latest version 3 firmware.  This is not such good news for the M8N units since the raw measurements are scrambled and these receivers need to be downgraded to the previous firmware version before using with RTKLIB.  For the M8T receivers though, the new firmware is good news because it contains support for the Galileo satellite system.

I now have two of their M8T receivers with the new firmware and did a little testing to see how RTKLIB works with the Galileo measurements.  I did have to make a couple small changes to get things working.

First of all, the RNX2RTKP compile options for including the Galileo code was not enabled.  For some reason, all the other apps did have this option enabled.  To enable it, I had to add “ENAGAL” to the “Preprocessor Defintions”  for C/C++ in the Project menu in Visual Studio.

The second issue I ran into was in the decode_rxmrawx() function that decodes the raw u-blox RXM-RAWX messages.  There is a line of code in this function that sets the code type based on the system.

raw->obs.data[n].code[0]=
       sys==SYS_CMP?CODE_L1I:(sys==SYS_GAL?CODE_L1X:CODE_L1C);

This line sets the code to L1X for Galileo, but that code type doesn’t seem to be supported by RTKLIB and the measurements in the RINEX file for the Galileo satellites get left blank.  Changing the “L1X” in the above statement to “L1C” resolves the problem.  That leaves an unnecessary check in the code but I will leave it there at least until I understand what it was supposed to do.  After that everything else worked fine including ambiguity resolution with the Galileo satellites, so that was quite encouraging.

Next,  I put the two receivers outside in the front yard to collect a longer set of data.  Not an ideal environment because they were close to the house but fairly open skies otherwise.  In an hour of data collection I got measurements from 11 GPS satellites, 8 GLONASS satellites, 5 Galileo satellites, and 3 SBAS satellites. After collecting the data, I processed it with various constellation options to see how they compared.  For all the solutions, I set ambiguity resolution mode to “continuous”, position mode to “kinematic”, and opened up the position variance threshold for AR (arthres1) to allow the solution to lock up as early as possible.  I also enabled all constellations for ambiguity resolution in each case.  Here’s how they compared:

satcombos1a

satcombos1b

Note that the time scale on the GPS-only plot is very different than the others since it took much longer to lock up than any of the other combinations.  With the GPS satellites only, there was an initial short false fix after 14 minutes, then a good fix at 27 minutes that lasted a few minutes but it did not get a solid fix until 43 minutes after it started.  That’s a long time to wait!  Adding a second constellation significantly improved the results, with solid fixes coming after two minutes with GLONASS added, five minutes with SBAS added, and 7 minutes with Galileo added.  Adding a third consellation improved things even more, with times to first solid fix varying from 12 secs for GPS+SBAS+GLO, 3.5 min for GPS+GAL+GLO, and 6 min for GPS+SBAS+GAL.  Using all four constellations gave a time to first solid fix of 2 minutes, not the fastest time, but better than two out of three of the three constellation answers.

It is risky to conclude too much from one data set, but these results are consistent with other data I’ve looked at (for three constellations) that show the more satellites you use the better the answer.  This seems to make sense to me since more information should be better than less information.  However, I often hear or read recommendations to use only the GPS data for better results which I don’t understand.  If anyone has data to support that recommendation I would like to see it to understand it better.

I do sometimes see that one bad satellite can prevent or delay a solution no matter how many good satellites there are and this may be part of the answer.  The more satellites you use, the higher chance there is of having a bad one and RTKLIB is not great at rejecting a bad satellite.  The “arlockcnt” and “ARFilter” features do help prevent bad satellites from getting into the AR solution but they do not reject a satellite if it goes bad after being accepted into the solution.  I have added a new feature starting with the demo5 b26a code that does try to reject bad satellites after they have been accepted into the AR solution but have not had a chance to do a lot of testing on it yet.  It was enabled for the test above and may possibly have helped, I did not look into the details.  The feature is enabled by setting the “pos2-mindropsats” to a value lower than the number of satellites in the solution, in which case it will cycle through dropping all the satellites, one by one, one each epoch, and reject a satellite that has a large negative effect on the AR ratio.  If you try this feature, be careful not to set the minimum satellite threshold too low or you will increase the chances of a false fix.  I would recommend values no lower than 10 satellites.

I have released a new version of the demo 5 code (b26b) with the fixes for Galileo, a couple of new features and fixes, and GUI updates for RTKPOST and RTKNAVI for all the new input parameters for both b26a and b26b codes.  The binaries and a list of the changes are available here.  The source code is available on my Github page.

 

 

 

Pseudorange corrections for the u-blox TRK-MEAS command

When configuring u-blox GPS receivers to output raw pseudorange and carrier phase data, we normally use the RXM-RAWX command for the M8T receiver. For the M8N receiver, since it does not support the RXM-RAWX command, we normally use the undocumented TRK-MEAS command instead. These are what the “xxx.cmd” scripts in RTKNAVI or STR2STR are doing when starting up the receivers.

Something I did not realize until fairly recently was that the M8T also supports the TRK_MEAS command and furthermore, both commands can be enabled simultaneously.

This is normally not a good thing to do since it confuses the RTKLIB decode apps and will result in a cycle slip reported for every epoch. In fact I first discovered the capability to run both at once accidentally by running the M8N “cmd” script on an M8T after previously running the M8N script. I then spent an embarrassingly long time figuring out why the data was so garbled. If you ever see two outputs in your observation file for every epoch, this is probably the cause.

But, for learning purposes, this is actually quite an interesting configuration because it allows us to directly compare the TRK-MEAS output to the RXM-RAWX output and verify if the two are providing the same information.

To do this experiment I first combined the two command scripts to create one that would enable both raw measurement commands. I then collected some data with an M8T receiver using this script to configure the receiver. Running an unmodified version of RTKCONV on the resulting raw output file allowed me to confirm I was collecting output from both commands. Here’s one epoch from the observation file with output from both commands. I’ve deleted some of the satellites to make it easier to read. The last column is SNR which is reported with a resolution of 0.25 by the TRK-MEAS command and 1.0 by the RXM-RAWX command which makes results from the two commands easy to distinguish.

trkmeas1

Psuedorange and carrier phase are the two 13 digit fields. At first glance, the two command outputs look quite different. This is at least in part because the TRK-MEAS command reports transmission time rather than pseudorange which RTKLIB converts to a sort of pseudo-pseudorange by subtracting an arbitrary constant time off each satellite in the epoch. This error will be removed later as part of the receiver clock offset so it can be ignored. It will be constant for all satellites so we can determine it’s value by removing the common offset between all satellites.

My analysis tools don’t have an easy way to separate the two sets of data from the combined observation file and I also wanted to run solutions on each data set separately so rather than work with the combined file I chose to modify RTKCONV to separate the two sets of data. I compiled two temporary versions of this app, one with the TRK-MEAS decode function disabled and one with the RXM-RAWX decode function disabled. I then ran each on the combined raw data file to create two observation files, one with the TRK-MEAS output and one with the RXM-RAWX output.

Here’s the difference between the pseudorange measurements from the two commands, the top plot is the GPS and SBAS satellites and the bottom plot is the GLONASS satellites. In both cases they are actually differenced with a reference satellite (GPS) to remove the common clock error mentioned above.

trkmeas2

What this plot shows is that once the common error is removed, the pseudorange outputs of TRK_MEAS and RXM-RAWX are identical for the GPS and SBAS satellites but differ by constant integer number of meters for the GLONASS satellites.

Halfway through this exercise I remembered someone had left a comment a few months ago with a link referencing something similar. At the time I hadn’t fully understand the significance of the comment but going back to it I realized they were talking about exactly the same thing. The link was to the OpenStreetMap project and this is the table from the link. Apparently someone there must have done a very similar experiment and their results match mine exactly.

trkmeas3

It turns out, not surprisingly, that the integer offsets correspond directly to the carrier frequency of each GLONASS satellite, and the above table can be expressed more concisely that way. They also found that the integer offsets are different between the 2.30 u-blox firmware and the 3.01 firmware which is what the column titles refer to.  My M8T receivers are running the 2.30 firmware.

trkmeas4

I find the arbitrariness of the numbers in this table quite strange but the data suggests very strongly that they are real. If they are real and unchanging then they can quite easily be corrected in the RTKLIB TRK-MEAS decode function. Of course, the data only shows they are different, it doesn’t show which is correct. It seems more likely that the RXM-RAWX would be more accurate since it is the documented command, but we can verify this.

To do that, I ran solutions for both data sets. In the plot below, the yellow/green trace is the solution for the RXM-RAWX data, and the green/blue trace is the solution for the TRK-MEAS data. The pseudorange measurements will have a steadily diminishing effect on the solution as the kalman filter converges and the carrier-phase measurements start to dominate. What this plot shows is that the initial errors, when psuedorange dominates, are smaller in all 3 axes with the RXM-RAWX data and presumably because of this, the fix occurs earlier with this data.

trkmeas5

I then modified the TRK-MEAS decode function in ublox.c to include an adjustment to the pseudorange measurements based on the values in the table above. Rerunning the solutions with this change gave the following result. As you can see the two commands now give virtually identical solutions.

 

trkmeas6

It would not be very useful if this correction to the TRK-MEAS results applied only to the M8T and not to the M8N, but since the two modules share the same hardware, it is very likely that the corrections apply to both.

I had collected some data with an M8N receiver running 2.01 firmware at the same time I collected the M8T data above, so let’s take a look at that data next to see if we can confirm this. I converted the M8N TRK-MEAS raw data to two sets of observation files using the unmodified and modified version of RTKCONV to give uncorrected and corrected measurements. I then ran solutions on both sets of data.

In the plot below, the yellow/green trace is the solution for the corrected measurements and the green/blue trace is the uncorrected measurements. Again all three axes show smaller initial errors with the corrected measurements, although in this case it didn’t result in an earlier fix. On average, smaller position errors result in earlier fixes, but the correlation is not 100% and the fix times tend to be quite discrete so I believe the reduced initial position errors are significant even though they did not result in an earlier fix in this particular example.

trkmeas7

Just to be sure though, let’s dig in a little deeper to see if we can find some more convincing data.  Let’s look at the pseudorange residuals first.  We would expect that reducing errors in the GLONASS pseudorange measurements should be most visible in the pseudorange residuals.  Plotting the pseudorange residuals for just the GLONASS satellites, uncorrected on the left, corrected on the right confirms this.

trkmeas9

The same plot for the GPS satellite pseudorange residuals shows little change between corrected and uncorrected, again as we would expect since we have not corrected those.

trkmeas10

All of this is quite encouraging and suggests it would be worthwhile to add this correction feature to the code. So I’ve gone ahead and done this to the demo5 RTKLIB code by adding a receiver option for the u-blox receivers. Receiver options are entered from the “Options” menu from RTKCONV or RTKNAVI and from the command line with CONVBIN. Below is an example of setting the new option with RTKCONV. The name of the option is “-TRKM_ADJ” and it is set to “2” to apply the 2.01 or 2.30 firmware offsets and “3” to apply the 3.01 firmware corrections. (Note:  the second dash in the option specified in the image below is incorrect, it should be an underscore.)

trkmeas8

The options used by RTKLIB during decoding are listed in the header of the observation file so you can verify which options were used by looking there.

This change should help the M8N solutions get to a first fix quicker and more reliably.

I’ve already uploaded the source code changes to my Github page and plan to update the executables soon.

 

AR Filter:A RTKLIB cycle-slip enhancement

Some of you may remember, one of the first code changes I made to RTKLIB was fixing a bug in the arlockcnt feature. Arlockcnt is an input parameter that specifies how many samples delay occurs before a new satellite (or a satellite that just recovered from a cycle-slip) is used for ambiguity resolution. Holding off use of the new phase-biase estimate from the kalman filter until it has had enough time to converge prevents corruption of the ambiguity resolution integer set. This in turn prevents a loss of fix.

Although waiting a fixed number of samples is fairly effective, it is not an optimal solution. Ideally we would use information from a new satellite as soon as it was converged and not after a fixed amount of time since some satellites will converge faster than others. When your data looks like this one, then every additional sample you can process is going to help.

arfilt0

This is what the AR filter attempts to do. In the current code implementation, a new satellite is unconditionally added to the integer ambiguity set when the arlockcnt expires regardless of the effect it has on the AR (ambiguity resolution) ratio. This means that the arlockcnt must be set conservatively, to insure the slowest satellite has converged, and means that most satellites will not be used for ambiguity resolution as early as they could be. In the case of frequent cycle-slips, this could mean loss of fix from having too few satellites available or it could mean a false fix since less satellites gives a less robust ambiguity resolution.

When the AR filter is enabled, a new satellite is still added to the integer ambiguity set when the arlockcnt expires but the effect of adding each new satellite is evaluated and if it causes a significant degradation in the AR ratio, that satellite’s use in ambiguity resolution will be delayed for a few more samples before being re-evaluated. Exactly how to define “significant degradation” is a bit subjective. I have chosen to disqualify a new satellite if it causes the AR ratio to drop below the AR fix threshold or if drops by more than a factor of two and the result is within 10% of the AR fix threshold. Exactly how long a satellite should be delayed is also subjective. I chose to delay a disqualified satellite for five samples plus a stagger of one sample for additional satellites. If two satellites are disqualified on the same sample, it could be either satellite or both that caused the disqualification. By adding a stagger to the delay for the second satellite, they will be re-evaluated independently on different samples.

To evaluate the change, I ran two solutions on the Ublox M8T data from my previous series of “M8N vs M8T” posts. This is my most challenging data set from a cycle-slip perspective. The solution below on the left is with arlockcnt=0 and AR filter disabled. The solution on the right is with arlockcnt=0 and AR filter enabled. As always, the yellow represents a float solution, and the green, a fixed solution. As you can see, enabling the AR filter significantly improved the number of fixes. Normally I would not set arlockcnt to zero if the AR filter was not enabled, this was for comparison purposes only.

arfilt1

As you would expect, with the AR filter disabled, increasing arlockcnt from 0 to 75 samples (15 sec) improves the solution for this data set as shown below but it still loses fix relatively often compared to the solution above with the AR filter enabled.

arfilt2

The plot below compares the number of satellites available for ambiguity resolution between the “arlockcnt=75/filter off“ solution and the “arlockcnt=0/filter on” solution”. Notice that we have significantly reduced the number of samples with less than 10 satellites available for AR by enabling the AR filter. More satellites should mean less chance of losing fix and also less chance of a false fix.

arfilt3

In this example, the accuracy of the fixed solution points did not seem to be noticeably affected by enabling the AR filter. As usual, I evaluate accuracy by comparing the receiver position relative to the position of a second receiver mounted on the same rover, both relative to the same base receiver. The difference between the two rovers should be a perfect circle, so any errors will appear as deviations from the circle. Plotting for only the fixed points, the “arlockcnt=75/filter off“ solution is on the left and the “arlockcnt=0/filter on” solution on the right. In both cases the errors appear to be very similar and within a few centimeters. This probably makes sense since the same satellites were used to calculate position in both cases, it was only the ambiguity resolution that differed. Any advantage from having more satellites in the AR calculations could be offset by the fact that the additional satellites were probably noisier since they may not have been fully converged. Also, the plot on the left does not include many of the points on the right, since samples without a fix are not included.

arfilt4

I actually created the AR filter feature quite a while ago but never got around to describing it or even fully testing it by reducing arlockcnt to zero. I have now done that, and made some small improvements to it in the last few days. I have updated my Github repository and executables folder with the latest version.

That pretty much completes my general explanation of this feature but there are a few details to be aware of if you are interested in trying it out yourself.

First of all, enabling the AR filter will slightly increase the code execution time since if a satellite is rejected, the ambiguity resolution has to be re-run without the rejected satellite. The difference is small enough however that I don’t think it will be an issue in the vast majority of cases.

The second thing is to understand is how the arlockcnt interacts with the half-cycle valid bit. A typical cycle-slip (at least on a Ublox receiver) looks like the plot below. There is usually a gap of no data, then a cycle-slip (red tick), then a number of half-cycle invalid samples (gray tick), then a final cycle-slip. The second cycle-slip is actually not reported by the receiver, but is added during the translation from raw data to RINEX format when the half-cycle valid bit transitions. Any time the half-cycle status is invalid, that satellite will not be used for ambiguity resolution regardless of the arlockcnt. The arlockcnt will be reset by the second cycle-slip and count from there. So, in this example, if arlockcont were set to 10, all the samples from the beginning of the gap until 10 samples after the second cycle-slip will be ignored for ambiguity resolution.

arfilt5

The last thing to mention is that one of the recent code changes I referred to above was to add a pseudo half-cycle invalid bit for the SBAS satellites for the M8T receiver. For some reason, the Ublox receivers don’t seem to report the half-cycle status for the SBAS satellites. The change I made was in the raw data to RINEX translation where I set the half-cycle invalid bit for a fixed delay after a cycle-slip on a SBAS satellite.  This makes cycle-slips on the SBAS satellites look very similar to the rest of the satellites.  I had previously done this for the M8N receiver and that change has been migrated to the release code but hadn’t got around to doing it for the M8T. This attempts to avoid the half-cycle uncertainties from possibly causing a false fix if the SBAS satellites were used too early for ambiguity resolution.

RTKLIB: Demo 4 vs 2.4.3 B16 code

Over the last six months or so I have made a number of changes to the RTKLIB code. A couple of these (ArlockCnt bug and M8N half-cycle bit) have made it back into the officially released code on Github but most of them have not. While I was in the mode of comparing solutions, I thought it would be worthwhile comparing solutions from the latest demo4 code from my Github repository to the latest released code (2.4.3 B16) to see how much they differ. The demo4 code is based on the 2.4.3 B12 released code but I do not believe the differences between releases B12 and B16 should affect this comparison.

I used the same data sets and input configuration files I used for the M8T vs M8N comparisons I described in the last few posts. The only differences being that I removed the options and input parameters that do not exist in the released version from the configuration files for the released code runs.

Here are the solutions from the previous post for both the pair of M8N receivers and the pair of M8T receivers using the latest demo4 code.

     M8N                                                                             Reach M8Trelease1

Here are the solutions for the same data sets using the 2.4.3 B16 code, both RTKCONV and RNX2RTKP.

     M8N                                                                             Reach M8Trelease2

A quick look at the vertical scales as well as the shape of the plots shows that the M8N solution is completely invalid whereas the M8T solution looks reasonably close although with a noticeably lower fix ratio. The difference between the M8N and the M8T is much greater with the release code than it was with the demo4 code. This is consistent with observations from other users that the M8T is significantly better than the M8N, something I did not really see in the demo4 solutions.

Let’s also look at the differences between solutions since they can indicate errors. Here is the difference between the demo4 M8N and M8T solutions from the previous post. Remember that this difference should be a circle with a radius equal to the distance between the rover antennas since the data is from two different receivers mounted on the same rover. I have only plotted data for the fixed points since those are the ones we have highest confidence in.

release3

There is no point in looking at the release M8N solution since it is so poor, but we can compare the M8T solutions from the two code versions. In this case, because we are using two solutions from the same data set, the difference should be zero. Here is a plot of the difference between those two solutions for the M8T data set, again only for the fixed solution points.

release4

Here is the same data, plotted by axis.

release5

The differences are quite large, up to nearly a meter in x and y, and more in the z axis. Based on the previous comparison between the M8N and the M8T with the demo4 code, I have fairly high confidence in the demo4 solution so I suspect the errors are in the release solution. We can verify this though by looking for discontinuities in the solutions at the erroneous points.

Let’s pick the point at 01:22:47 which shows a large difference between solutions in all 3 axes. Here are the two solutions zoomed into this point in time.

Reach M8T: demo4                                Reach M8T: 2.4.3 B16release6

Notice the large jump in the released code solution. Since this data is from a car driving on roads, the large discontinuities in the y and z axes are not real and confirm the error is in the release code and not the demo4 code.

So, several people now have asked if these changes will make it into the official code. As I’ve mentioned before in one of my comments, I’ve chosen not to submit all of my code changes to the official code repository. This is because my changes are focused on one small corner of the world of GPS (low cost) and tested on an even smaller corner (low-cost Ublox) and I am not sure how well they all will translate into the larger realm that RTKLIB supports. At minimum, it seems they would require a fair bit of testing on multiple hardware platforms which I don’t have the time or resources to do. I’m also not sure some of them would all be considered sufficiently mathematically rigorous for general widespread use. But I will continue to make them available in my Github repository for anyone that would like to incorporate them into their code.

I have just submitted my most recent change, the improvement in decoding cycle-slips from the M8T raw data, as an issue to the official Github repository.

 

Update 8/28/16:  I now have a demo5 version of this code in which I have made the GUI versions (RTKNAVI and RTKPOST) fully functional with all of the demo4 features.  The code is in a new demo5 branch in my Github repository and the executables are here

Ublox M8N vs M8T: Part 2

In my last post, I compared data from a pair of Ublox M8N receivers with data from a pair of Emlid Reach M8T based receivers, and found issues in both data sets. In this post I will look into those issues more closely.

To do this, I collected another data set with a few differences from the previous one.

  • On the M8N rover receiver, I replaced the original antenna that had only a 1” cable with a Ublox ANN-MS-0-005 antenna with a much longer cable. This allowed me to move the receiver from the car roof to inside the car, a friendlier thermal environment. The goal was to see if this would avoid the all-satellite cycle-slips I saw last time at the beginning of the data set. This antenna cost me $20 from CSG but is available from other places for less.
  • On both Reach M8T receivers, I modified the dynamic platform model setting from “Airborne < 4g” to lower acceleration settings; “Pedestrian” for the rover, and “Stationary” for the base. I also changed the setting on the M8N base receiver from “Pedestrian” to “Stationary” to make them consistent. The goal here was to avoid the erroneous fixed solution points I saw in the solution from the previous M8T data. To insure the modifications occurred correctly, I used the u-center software to read back the settings from the receivers after the data was taken.
  • I chose a measurement route with much more tree cover than the previous one to increase the difficulty of the solution and to help differentiate the two receiver sets.

Here is a Google map of the measurement route. In this area there are few native trees, so by driving through residential areas, rather than next to them as I did in the previous data set, I encountered many more trees. There were numerous locations with tall trees on both sides of the road as well as places where I drove directly under large tree branches.

walker1

Here’s an example of some of the more challenging tree cover encountered in the route. The route locations are only approximate since the base locations are not calibrated, that is why the lines are not actually on the road.

walker2

Here is the observation data from both rovers. The red ticks are cycle-slips, the gray ticks are half-cycle invalids.

                             M8N                                                                          Reach  M8Twalker3

The first thing to notice is that there are no simultaneous cycle-slips in the M8N data. Of course, this doesn’t prove we will never get them but it suggests that maybe moving the receiver inside the car did help. Unfortunately, there is one simultaneous cycle-slip in the M8T data very close to the start at 00:54. Apparently, both the M8N and the M8T are susceptible to these slips, which is not surprising, since the chips are part of the same family. We do see only a single slip, which is an improvement over the previous data. Since they always occur near the beginning of the data collection, I still believe they are most likely caused by thermal transients. I suspect that the best way to avoid them would be to turn on the receivers 15 minutes before starting to collect data in an environment as similar as possible to the data collection environment.

Next, let’s look at the SNR vs. elevation plots. Last time we saw noticeably better numbers with the Reach setup because of the more expensive antenna. This time, with the Ublox antenna on the M8N receiver, the two are much more similar. SNR isn’t everything, and there still are reasons why the more expensive Reach antennas are likely better than the Ublox antennas, but we’ve at least closed the gap some between them.

                            M8N                                                                          Reach  M8Twalker4

Here are the position solutions from both receiver sets.

                            M8N                                                                          Reach  M8Twalker5

The M8N solution is very good, with nearly 100% fix after the initial acquire until the very end when I parked the car under some trees in the driveway. This is despite a very challenging data set with many cycle-slips.

Unfortunately, the M8T solution is not as good. The initial acquire is delayed because of the simultaneous cycle-slip I mentioned earlier. It does eventually acquire, but then loses fix for quite a long period near the end of the data (the yellow part of the line in the plot above)

Even more concerning, when we look at the acceleration plots, we see the same spikes in the M8T data as we did in the previous data set.

                       M8N                                                                          Reach  M8Twalker6

Again, these spikes align with erroneous fixed solution points in the M8T data as can be seen in deviations from the circle in the difference between the two solutions.

walker7

Clearly, changing the dynamic platform model setting in the receiver did not fix the problem. A couple of experienced users have commented that this setting does affect the front end of the receiver on earlier Ublox models and it’s still possible it affects the M8T as well, but it does not appear to be the cause of this problem. We will need to look elsewhere for the solution.

We are running identical RTKLIB solution code on the two data sets and we have verified that the receiver setup is nearly identical for both data sets. So what else can be different between them? One possibility is the conversion utility, RTKCONV, that translates the raw binary output from the receiver into the RINEX observation files that are used as input to the solution. Since the raw measurements are output by different commands on the two receivers, there are two different functions in the RTKCONV code that process them.

Let’s look first at the RTKCONV code to convert the data from the UBX_TRK_MEAS command used by the M8N receiver. I won’t show the code here but just describe functionally what happens in the code. Each data sample from the receiver contains a status bit indicating carrier-phase lock and a count of consecutive phase locks. RTKCONV sets the cycle-slip flag for that sample if the carrier-phase is valid and an internal code flag is set. If the carrier-phase is not valid, then the cycle-slip flag is left in its previous state. The internal code flag is set if the phase lock count is zero or less than the phase lock count for the previous sample and is only reset for the next sample if the carrier-phase is valid.

For the UBX_RXM_RAWX command used by the M8T receiver, there is also a status bit indicating carrier-phase lock. Instead of a count, there is a time for consecutive phase locks but functionally it is equivalent. The RTKCONV code for this command does not use an internal code flag and the cycle-slip flag is set or cleared directly every sample that the carrier-phase is valid. The cycle-slip flag is set if the phase lock time is zero or less than the previous sample and cleared otherwise.

I know that’s a bit confusing and the difference is fairly subtle. The most important thing to understand is that RTKCONV is doing more than just translating from binary to text, it is deciding which samples to set cycle-slips for based on a somewhat complicated algorithm, and that algorithm is different for M8N and M8T.

Basically, the effect of this difference is that for the M8N, if a cycle-slip is followed by an invalid phase then the next valid phase will always be flagged as a cycle-slip while for an M8T it won’t necessarily be so. It’s easier to understand by looking at a picture. The observation plots below show the location of one of the acceleration spikes in the M8T solution which is caused by a cycle-slip on satellite G05. The plot on the left shows which samples were flagged as cycle-slips with the existing code, the plot on the right shows the cycle-slips after modifying the code to be functionally equivalent to the M8N code. Note the extra two cycle-slips with the change.  The extra cycle-slips in this case are a good thing because they are flagging samples with large phase errors and preventing RTKLIB from incorporating them into the solution.

              Reach M8T before change                                   Reach M8T after changewalker8

Modifying the RTKCONV code for the UBX_RXM_RAWX command in this way and re-running the solution gives us the the position plot on the right and the acceleration on the left.

walker9

Much better than before! Not quite as good as the M8N but still a significant improvement. The difference between the M8N solution and the improved M8T solution is shown below.

walker10

Remember this should be a circle if both solutions are free of errors.  Actually in this case not quite a circle because there is also a separation between the two base stations ,but that effect is small and we can ignore it for now.  This is much better than the equivalent plot from the original M8T solution shown earlier. There are obviously some erroneous fixes even after the improvement, so this is still a work in progress, but I think it is a big step in the right direction.

This data set was quite a bit more challenging than the previous one and in reality both solutions are quite good given the number of cycle-slips we saw, but there is always room for improvement.

The other thing to note with this data set is that the better quality antenna made a big difference on the quality of the M8N solution.  I suspect I will not be going back to the old antenna!  I knew the nicer antenna would help but I have resisted using it till now because of the extra cost.  There are similar looking antennas with similar gain specs available for as little as $4 so I may try some of these to see how they compare.

I’m not going to post the code to my Github repository or my binaries until I’ve had a little more time to understand the remaining errors but if anyone wants to take a closer look now, here are the code modifications to ublox.c that I made in the decode_rxmrawx function.

walker11

RTKLIB: Static-start feature

The static-start feature is something I added to the demo4 code a while ago but never got around to explaining, so I will do it now.

As I’ve mentioned before, I always like to initially leave the rover stationary after first turning on the receiver long enough to get a first fix-and-hold before allowing it to move. This significantly reduces the chances of getting an initial false fix. However, we could benefit even more from doing this if we could tell RTKLIB that the rover was stationary during this time. This is because we would remove the uncertainty of rover motion from the solution and the kalman filter would converge faster. By doing this we should be able to reduce the time to first fix.

RTKLIB has a solution mode for a moving rover (kinematic) and one for a stationary rover (static) but currently no way to switch between the two. What the new static-start mode I have added does is to start the solution in static mode and then switch to kinematic mode after the solution first qualifies for a fix-and-hold. It does not require fix-and-hold to be enabled, it will still make the switch to kinematic mode when that qualification occurs, even when fix-and-hold is not enabled.

Below is the only functional code modification and it is in the relpos() function in rtkpos.c. The new line of code is in red. There are also a handful of bookkeeping changes to handle the new option in the input configuration file that I won’t include here but can be seen in the demo4 code.

/* hold integer ambiguity if meet minfix count */
if (++rtk->nfix>=rtk->opt.minfix) {
    if (rtk->opt.modear==ARMODE_FIXHOLD)
        holdamb(rtk,xa);
    /* switch to kinematic after qualify for hold if in static-start mode */
    if (rtk->opt.mode==PMODE_STATIC_START)
        rtk->opt.mode=PMODE_KINEMA;
}

This feature is enabled by changing “pos1-posmode” in the input configuration file from “kinematic” to “static-start”.

So, let’s do a little testing to see how it does. I used my latest demo4 data set and ran three solutions. The initial acquire for all three are plotted below.

static_start1

The first two are both in kinematic mode. For the first one I have used the default value of 100 for “stats-eratio1”, the ratio of variances between pseudorange and carrier-phase measurements. In the second plot, I have used my preferred value of 300. This is unrelated to the static-start modification but is relevant to the acquire time so I have brought it up again. As you can see changing this value reduced the time to first fix from roughly eight minutes to a little less than five.

In the third plot, I changed “pos1-posmode” from “kinematic” to “static-start” to enable the static start feature and again used 300 for “stats-eratio1”. This reduced the time to first fix from a little less than five minutes to about two and a half minutes, nearly a factor of two. The actual times to first fix will vary fairly significantly depending on signal quality, number of satellites and quality of the sky view but the trends shown here should hold even if the times change.

To understand this change a little better it would be good to understand what is the difference between “static” mode and “kinematic” mode in RTKLIB, so let’s discuss that next.

In my earlier post describing the RTKLIB receiver dynamics feature, I explained how the kalman filter makes two estimates each sample, where the first estimate is a prediction of how far the rover has moved since the last sample. If receiver dynamics is enabled, this estimate is made using the estimated velocity and acceleration of the rover, if not, RTKLIB simply using the pseudorange measurement for the current sample along with its large uncertainty.

In either case, the uncertainty (or variance) of the position states are increased after this update because of the unknowns of what happened during that time interval. The second kalman filter estimate (the measurement update) will then reduce the variances again using the additional information from the carrier-phase measurements. So for each sample, the uncertainties of the position states will be increased by the prediction estimate, and then decreased by a larger amount with the measurement update, zigzagging towards zero The smaller the variances, assuming they are accurate, the more likely it is that the integer ambiguity resolution will resolve a fix.

If however, we know that the rover did not move during that time, there is no need for the kalman filter to increase the variances with passing time, since the rover will be exactly where it was one sample earlier., and the variances can converge towards zero more quickly. That is exactly what RTKLIB does. If static mode is enabled, then RTKLIB skips the prediction update step entirely and just assumes the rover is where it was last sample and that the uncertainty in that position has not changed. Here is a plot of the variances from the last two solutions above comparing kinematic to static-start.  I show only the z-axis because it tends to be the limiting factor to finding a fix,at least when the rover is stationary.  This is because the geometric diversity of the satellites is always most limited in the up-down direction, since all the satellites below the antenna are obscured by the earth.

static_start2

The sudden drop in variance in both traces occurs when the integer ambiguities are first resolved and the solution switches from float to fix. The step up in the static-start trace occurs when fix-and-hold first occurs and the solution switches from static to kinematic. At this point, the uncertainties from a moving rover are introduced and the variance jumps accordingly. The reason the line is smooth for the kinematic case and not zigzagging as I described above is because the data I am plotting is from the output position file and includes only one variance value for each sample. This is the value at the end of the sample, after the measurement update.

So what happens if the rover is not left stationary long enough and there is no fix before the rover starts moving? If static start is not enabled, a fix will very likely be found while the rover is moving. The chance of this fix being wrong can be fairly high however, which leads to an incorrect solution that RTKLIB reports with high confidence of being correct (i.e. fixed), although it is not.

If static-start is enabled and no fix is found before the rover starts moving, then the chance of finding any fix, correct or incorrect, while the rover is moving is almost zero.  This is because the model the kalman filter is using so poorly matches the actual rover. If it did find a fix, it would not be maintained for more than a very short time, since again, the kalman filter is incorrectly assuming that the rover is not moving.

I would argue this is a win-win situation. Not only have we reduced the time to first fix, but we have also reduced the chance of RTKLIB reporting data as being valid when it is not. Any time RTKLIB reports a bad solution as good, it can significantly undermine confidence in the entire solution.