RTKLIB on a drone with u-blox M8T receivers

Drones are a popular application for RTKLIB and quite a few readers have shared their drone-collected data sets with me, usually with questions on how they can get better solutions. In many cases, the quality of this data has been fairly poor and it has been difficult to get satisfactory results. I was curious to understand why this environment tends to be so challenging since in theory a drone should have more open skies than just about any other application.

To do an experiment, I bought an inexpensive consumer drone from Amazon. I chose the X8C from Syma since it is beginner model and a little larger than some options. I figured the larger size should make it better able to carry some extra weight.

After a few practice flights to get the hang of flying it, I used some duct tape and double-sided foam adhesive to attach a u-blox antenna and 90 mm diameter ground plane to the top of the drone and a u-blox M8T receiver with my custom CHIP data logger underneath where the camera usually goes. I used the landing gear as a spool to wind the unnecessary five meters of antenna cable which was the heaviest part of the whole setup. From a weight perspective, the Emlid Reach units would have been a better choice, but I wanted to collect data from the Galileo constellation of satellites as well as GPS and GLONASS so I used my CSG receiver with the newer 3.0 firmware. I used a second CSG receiver mounted on top of my car as the base station.  Here’s a stock photo of the drone on the left and after my modifications on the right.

drone1drone2a

Although the drone was able to lift the extra weight fairly easily, it seemed to affect the stability of the flight control system and after a few minutes the prop motors would start to fight each other. At that point the drone would start to descend even at full throttle and the drone would land hard enough to usually bounce on its side or back. Still I was able to make a number of short flights which were adequate for testing purposes.

Here’s the observation data for the first set of flights, base station on the left and drone on the right. Red ticks are cycle-slips and gray ticks are half-cycle ambiguities. Ideally, the drone data would look as clean as the base but as you can see it is significantly worse and it turned out to be unusable for any sort of reliable position solution.  The regions without cycle-slips in the drone observations are the times in between flights in which the drone is sitting on the ground.

drone3

Clearly, while the drone is flying, something is interfering with the GPS receiver or antenna, most likely either EMI or mechanical vibration. I could have used a fancy test stand and RF sniffer to try and locate the source of interference but since this blog is focused on low-cost solutions I just used some duct tape, a steel bar, and the RTKLIB code instead.

I used two types of duct tape, both the polyester/fabric type that everyone calls duct tape, and also the metal foil type that is actually used to repair or install ducts. I first used the non-metal duct tape to securely attach the landing gear to the heavy steel bar. The steel bar was convenient because it was easy to attach but anything heavy enough to prevent the drone lifting off under full throttle would work fine.

I then started an instance of RTKNAVI on my laptop and connected it to the receiver on the drone.  The goal was to simulate a complete drone flight while the drone was sitting on the ground and at the same time watch the RTKNAVI observations to detect any degradation of the measurements.  I used a wireless connection but a USB cable would have worked too.

Unfortunately RTKNAVI won’t plot the observation data real-time, but by selecting the tiny “RTK Monitor” box in the bottom left corner of the main RTKNAVI screen, then choosing “Obs Data” from the menu, I was able to get a continuously updating listing of the observations.  Cycle-slips show up as non-zero values in the first column with the I heading. I chose a location outdoors with open enough skies that any degradation in the observation data would be obvious.

drone4

I first observed the cycle-slip column with the drone powered down to verify I wasn’t getting any cycle-slips on all but the lowest elevation satellites. I then continued to observe the cycle-slip column while sequencing through the steps required to fly the drone. I first powered on the drone, then powered on the transmitter, then issued the calibration/connection sequence, then turned on the throttle to low. So far, so good, no sign of cycle-slips. I then started moving the joysticks to issue steering commands which caused the motors to change speeds. All of a sudden I started getting cycle-slips, the more aggressive the steering commands, the more cycle-slips I saw. Aggressive changes in throttle also caused cycle-slips but full throttle with no adjustments or steering commands was fine.

Next I moved just the antenna, then just the receiver away from the drone while issuing steering commands. Moving the antenna away had no effect but moving the receiver away eliminated the cycle-slips.

At this point my guess was that the interference was coming from the relatively high power switching in the motor control circuits and that the antenna ground plane was shielding the antenna from this interference but nothing was shielding the receiver. To test this theory, I attached a layer of the metal duct tape to the bottom of the drone to act as a shield between the drone controller board and the receiver.  I then re-attached the receiver to the bottom of the drone and re-ran the experiment. This time there were almost no cycle-slips even with the most aggressive steering.

I then removed the steel bar and ran a second set of short flights with the layer of metal tape still in place. I was a little concerned that the new shield would interfere with commands sent from the transmitter to the drone so I first tested everything while still on the ground and then kept the drone fairly close during the flight. Fortunately I didn’t see any sign of commands not getting through.

The drone data looked much cleaner in this flight!  Unfortunately, this time the base data was no good with many simultaneous cycle-slips throughout the observation data. At this point I realized that I had placed the base station receiver directly on the top of the car when collecting the data which was very hot at the time. Usually I keep the receiver in the car to avoid this and only place the antenna on the roof. I have seen this kind of severe temperature effects cause simultaneous cycle-slips before but never to this extent. Again the data was completely unusable.

So, back out there again for a third round of flights. This time, everything looked much better. I still saw a few cycle-slips, especially when first applying the throttle at take-off, so my shielding was not perfect but a dramatic improvement over the first flight. The plots below show the results. The top two plots are position solutions for the z-axis. The top plot is with continuous ambiguity resolution and the middle plot is with fix-and-hold enabled. The bottom plot is the drone observation data.

drone5

I made three adjustments to the input configuration file from what I would normally use for my car based measurements.  First of all, since the drones have very open skies, I adjusted the minimum elevation angles from 15 degrees to 10 degrees.   Then, after plotting and observing the accelerations from an initial solution, I increased the vertical acceleration dynamics estimate (stats-prnaccelv) from 0.25 to 1.0.  Finally, because I was seeing slightly higher position variances in the initial solution than I usually do, I adjusted the position variance AR threshold (pos2-arthres1) from 0.004 to 0.1  Both of these last two changes would make sense if the level of vibration were higher in the drone than I am used to seeing, which is quite likely.

Each time the drone landed/crashed due to the unstable flight control system it would bounce to the side or upside-down and that is what is causing the cycle-slips and switch from fix to float at the end of each flight. As you can see though in every case I quickly get another fix after I put the drone upright again. The fixes are solid enough to hold through the entire flight even in continuous mode for all but one of the flights. With fix-and-hold enabled all flights maintained 100% fix rate. The data is as good as or better than similar experiments where I have mounted the rover on top of a car.

This is not surprising since the skies are more open in this experiment. Having over twenty satellites available for ambiguity resolution also helped. I used all the satellites (GPS/GLONASS/Galileo/SBAS) for ambiguity resolution and took advantage of the new feature in the demo5 b26 code that cycles through all the satellites and will throw a single one out if it is preventing a fix. This will automatically occur anytime the number of satellites available for ambiguity resolution is greater than the config parameter “pos2-mindropsats” which defaults to twenty.

I have added the raw data and the configuration file to the  sample data set section at rtkexplorer.com

I imagine different drones will have different issues and not all will be as easy to fix as this one, but the method described here or something similar should be helpful any time drone data is not looking as clean as the base station data.

The fix I chose was very easy to implement but a better fix would probably have been to wrap just the receiver in a shield rather than placing a shield between the control board and the receiver. This would protect the receiver better and avoid affecting commands sent from the transmitter.  In fact, based on these results, I suspect shielding the GPS receiver on a drone is always a good idea.

Receiver warm-up glitches

I’ve described before the occasional glitches that both the M8N and M8T seem to be susceptible too in their first few minutes of operation, but my previous description was buried in one of my more technical posts and maybe not seen by people more interested in just the practical side of using RTKLIB, so I thought it was worth bringing them up again.

Here is an example of one of these glitches which was in a data set recently sent to me by a reader, and one that was giving him trouble finding a solution.  The data is very clean, except for a nearly simultaneous cycle-slip (shown by red ticks) on every satellite.

rec_glitch

Here is a zoomed in image of the same glitch.

rec_glitch2

I see these glitches on both the M8N and the M8T receivers.  Every occurrence I have seen, the glitch occurred within a few minutes of turning on the receiver, and was present on every satellite.  In this example it occurred seven minutes after starting up, usually I see it within in the first five minutes.

These glitches are very disruptive to the RTKLIB solution.  Since the cycle-slips affect every satellite, all the phase-bias kalman filter states are reset and the solution has to start again from the beginning.  In some cases, the phase-biases initial values may have larger than normal errors in which case it is even worse than starting over.

I don’t have any good suggestions on how to deal with these other than to avoid them in the first place.  From my experience I believe they are more likely to occur if the external environment of the receiver has just changed.  For example if it went from hot to cold, or into the sun.  Once the receiver has had time to stabilize, everything is usually OK.

Giving the receiver time to adapt to it’s current environment before collecting data and protecting the receiver from sudden changes should help avoid these glitches.  Using external antennas with cables rather than the small antennas that come with the receivers helps because it allows you to place the receiver in a more protected location than the antenna.  For example, when I collect data from a moving car, I place the antenna on the roof but keep the receiver in the car.

For information on plotting the observations with cycle slip enabled see this post.  For another post where I discuss this problem in more detail, see this post.

Does anyone else have more information on what causes these glitches and maybe other steps that can be taken to avoid or deal with them?

 

 

AR Filter:A RTKLIB cycle-slip enhancement

Some of you may remember, one of the first code changes I made to RTKLIB was fixing a bug in the arlockcnt feature. Arlockcnt is an input parameter that specifies how many samples delay occurs before a new satellite (or a satellite that just recovered from a cycle-slip) is used for ambiguity resolution. Holding off use of the new phase-biase estimate from the kalman filter until it has had enough time to converge prevents corruption of the ambiguity resolution integer set. This in turn prevents a loss of fix.

Although waiting a fixed number of samples is fairly effective, it is not an optimal solution. Ideally we would use information from a new satellite as soon as it was converged and not after a fixed amount of time since some satellites will converge faster than others. When your data looks like this one, then every additional sample you can process is going to help.

arfilt0

This is what the AR filter attempts to do. In the current code implementation, a new satellite is unconditionally added to the integer ambiguity set when the arlockcnt expires regardless of the effect it has on the AR (ambiguity resolution) ratio. This means that the arlockcnt must be set conservatively, to insure the slowest satellite has converged, and means that most satellites will not be used for ambiguity resolution as early as they could be. In the case of frequent cycle-slips, this could mean loss of fix from having too few satellites available or it could mean a false fix since less satellites gives a less robust ambiguity resolution.

When the AR filter is enabled, a new satellite is still added to the integer ambiguity set when the arlockcnt expires but the effect of adding each new satellite is evaluated and if it causes a significant degradation in the AR ratio, that satellite’s use in ambiguity resolution will be delayed for a few more samples before being re-evaluated. Exactly how to define “significant degradation” is a bit subjective. I have chosen to disqualify a new satellite if it causes the AR ratio to drop below the AR fix threshold or if drops by more than a factor of two and the result is within 10% of the AR fix threshold. Exactly how long a satellite should be delayed is also subjective. I chose to delay a disqualified satellite for five samples plus a stagger of one sample for additional satellites. If two satellites are disqualified on the same sample, it could be either satellite or both that caused the disqualification. By adding a stagger to the delay for the second satellite, they will be re-evaluated independently on different samples.

To evaluate the change, I ran two solutions on the Ublox M8T data from my previous series of “M8N vs M8T” posts. This is my most challenging data set from a cycle-slip perspective. The solution below on the left is with arlockcnt=0 and AR filter disabled. The solution on the right is with arlockcnt=0 and AR filter enabled. As always, the yellow represents a float solution, and the green, a fixed solution. As you can see, enabling the AR filter significantly improved the number of fixes. Normally I would not set arlockcnt to zero if the AR filter was not enabled, this was for comparison purposes only.

arfilt1

As you would expect, with the AR filter disabled, increasing arlockcnt from 0 to 75 samples (15 sec) improves the solution for this data set as shown below but it still loses fix relatively often compared to the solution above with the AR filter enabled.

arfilt2

The plot below compares the number of satellites available for ambiguity resolution between the “arlockcnt=75/filter off“ solution and the “arlockcnt=0/filter on” solution”. Notice that we have significantly reduced the number of samples with less than 10 satellites available for AR by enabling the AR filter. More satellites should mean less chance of losing fix and also less chance of a false fix.

arfilt3

In this example, the accuracy of the fixed solution points did not seem to be noticeably affected by enabling the AR filter. As usual, I evaluate accuracy by comparing the receiver position relative to the position of a second receiver mounted on the same rover, both relative to the same base receiver. The difference between the two rovers should be a perfect circle, so any errors will appear as deviations from the circle. Plotting for only the fixed points, the “arlockcnt=75/filter off“ solution is on the left and the “arlockcnt=0/filter on” solution on the right. In both cases the errors appear to be very similar and within a few centimeters. This probably makes sense since the same satellites were used to calculate position in both cases, it was only the ambiguity resolution that differed. Any advantage from having more satellites in the AR calculations could be offset by the fact that the additional satellites were probably noisier since they may not have been fully converged. Also, the plot on the left does not include many of the points on the right, since samples without a fix are not included.

arfilt4

I actually created the AR filter feature quite a while ago but never got around to describing it or even fully testing it by reducing arlockcnt to zero. I have now done that, and made some small improvements to it in the last few days. I have updated my Github repository and executables folder with the latest version.

That pretty much completes my general explanation of this feature but there are a few details to be aware of if you are interested in trying it out yourself.

First of all, enabling the AR filter will slightly increase the code execution time since if a satellite is rejected, the ambiguity resolution has to be re-run without the rejected satellite. The difference is small enough however that I don’t think it will be an issue in the vast majority of cases.

The second thing is to understand is how the arlockcnt interacts with the half-cycle valid bit. A typical cycle-slip (at least on a Ublox receiver) looks like the plot below. There is usually a gap of no data, then a cycle-slip (red tick), then a number of half-cycle invalid samples (gray tick), then a final cycle-slip. The second cycle-slip is actually not reported by the receiver, but is added during the translation from raw data to RINEX format when the half-cycle valid bit transitions. Any time the half-cycle status is invalid, that satellite will not be used for ambiguity resolution regardless of the arlockcnt. The arlockcnt will be reset by the second cycle-slip and count from there. So, in this example, if arlockcont were set to 10, all the samples from the beginning of the gap until 10 samples after the second cycle-slip will be ignored for ambiguity resolution.

arfilt5

The last thing to mention is that one of the recent code changes I referred to above was to add a pseudo half-cycle invalid bit for the SBAS satellites for the M8T receiver. For some reason, the Ublox receivers don’t seem to report the half-cycle status for the SBAS satellites. The change I made was in the raw data to RINEX translation where I set the half-cycle invalid bit for a fixed delay after a cycle-slip on a SBAS satellite.  This makes cycle-slips on the SBAS satellites look very similar to the rest of the satellites.  I had previously done this for the M8N receiver and that change has been migrated to the release code but hadn’t got around to doing it for the M8T. This attempts to avoid the half-cycle uncertainties from possibly causing a false fix if the SBAS satellites were used too early for ambiguity resolution.

Ublox M8N vs M8T: Part 3

 

This is the third part in a series of posts comparing data from a pair of Ublox M8T receivers with data from a pair of Ublox M8N receivers. In the first post I identified a problem with erroneous fixes in the RTKLIB solution for the M8T data. In the second post I found that the source of the problem appeared to be in the handling of cycle slips in the translation from the receiver binary output to the text input used by RTKLIB. I also showed that adopting an algorithm more similar to the one used to translate the M8N binary improved but did not entirely fix the problem. In this post I will show that if the cycle slip translation is more precisely adapted to the differences in binary output between the M8N and the M8T then we can virtually eliminate the erroneous fixes and get an M8T solution that is now slightly more accurate than the M8N solution.

One thing I may not have made clear in the previous posts is the way I am collecting and processing the data from the Emlid Reach M8T receivers. I am using their ReachView software to collect the raw GPS data but all the processing is done afterwards using my own demo4 version of RTKLIB. I don’t have access to Emlid’s RTKLIB source code and don’t know if they have made any changes that might have improved this issue in their own real-time version of RTKLIB. So this is more a comparison between the M8N and the M8T receivers using my version of RTKLIB and not an evaluation of anything specific to the RTKLIB version of Reach.

Last post I finished with this plot showing the difference between the M8N solution and the M8T. For reasons I’ve explained before, we know this should be a perfect circle with a radius equal to the distance between the receivers.

walker12

By looking at spikes in the z-axis accelerations we were able to show that the deviations from the circle in the above plot were due to errors in the M8T solution and not the M8N solution. This plot was taken after I had modified the binary to text translation in RTKCONV to make the M8T translation more similar to the M8N translation. Before that, the errors were much larger.

Let’s start by taking a closer look at the binary outputs of the two receivers. The details for the M8T are available in the M8 receiver spec and for the M8N in this document from Tomoji Takasu but I will summarize them here.

Both receivers provide three values for each satellite output sample that can be used to evaluate the quality of the carrier phase measurement. The first is the carrier-phase valid bit. Both receivers provide this in the tracking status byte. It indicates that the receiver is currently locked to the carrier-phase for this satellite. Note that it does not tell us if the receiver has been continuously locked since the previous output sample and it also does not indicate anything about the quality of the measurement.

The second output (locktime or lock2) indicates how long the receiver has been locked to the carrier-phase for this satellite. If this count is lower than the count for the previous sample, then an unlock, i.e. cycle-slip, has occurred. Both receivers also provide this value, and this is what RTKCONV uses to detect a cycle-slip.

The third value (mesQI or cpStdev) is a quality indicator and this is where the receivers differ. On the M8N this appears to be more of a state indicator than a true quality indicator. The comments indicate that it’s value corresponds to:

0:idle
1:search
2:acquired
4-7:lock

For the M8N, the RTKCONV code throws out any phase measurement for which this quality indicator is not between 4 and 7.

On the M8T, the quality indicator is actually a numeric estimate of the standard deviation of the carrier phase measurement. If the value is between 1 and 7, then the estimate of standard deviation is this value multiplied by 0.004 meters. If this value is 15, then there is no valid estimate. This value is currently ignored by the RTKCONV code and that is the root of the problem.

Before discussing how to modify the code to incorporate this information, we need to understand exactly how the cycle-slip bit is interpreted by RTKLIB. On a sample for which this bit is set, RTKLIB will reset the phase-bias state for that satellite based on that carrier-phase measurement. Note that the reset is done using the measurement from the current sample, not the following sample. What this means is that the cycle-slip bit should not be used to indicate that there is a cycle-slip on the current sample. Rather it should indicate that there has been a cycle-slip since the last good measurement and that the quality of the current sample is now high enough to use it to reset the phase-bias state.

So how high should the quality indicator be before resetting the phase-bias? It will be a trade-off between time and accuracy. I chose the value of four since it’s in the middle of the scale and somewhat equivalent to what is used on the M8N, but it could be argued that this value should be higher or lower. Forcing the cycle-slip to be set on a good sample also eliminated the problem we previously saw with samples subsequent to the cycle-slip not being flagged, so I was able to remove the code I added in the previous post.

Here are the changes I made to the original code for the M8T receiver. The new variable cpstd is the standard deviation of the carrier phase measurement as described above. The constant STD_SLIP is the quality threshold described above and is set to 4.

walker13

Here is the position solution calculated with the above code change.

walker14

Here is the difference in position between the two solutions. As you can see, the deviations from the circle have been significantly reduced.

walker15

The noise in the z-axis accelerations has also been significantly reduced and is now slightly lower than in the M8N data.

       M8N                                                                           Reach M8Twalker16

I have posted the raw data and config files to my data set library in the niwot2_car folder and the code changes to my Github page.

Ublox M8N vs M8T: Part 2

In my last post, I compared data from a pair of Ublox M8N receivers with data from a pair of Emlid Reach M8T based receivers, and found issues in both data sets. In this post I will look into those issues more closely.

To do this, I collected another data set with a few differences from the previous one.

  • On the M8N rover receiver, I replaced the original antenna that had only a 1” cable with a Ublox ANN-MS-0-005 antenna with a much longer cable. This allowed me to move the receiver from the car roof to inside the car, a friendlier thermal environment. The goal was to see if this would avoid the all-satellite cycle-slips I saw last time at the beginning of the data set. This antenna cost me $20 from CSG but is available from other places for less.
  • On both Reach M8T receivers, I modified the dynamic platform model setting from “Airborne < 4g” to lower acceleration settings; “Pedestrian” for the rover, and “Stationary” for the base. I also changed the setting on the M8N base receiver from “Pedestrian” to “Stationary” to make them consistent. The goal here was to avoid the erroneous fixed solution points I saw in the solution from the previous M8T data. To insure the modifications occurred correctly, I used the u-center software to read back the settings from the receivers after the data was taken.
  • I chose a measurement route with much more tree cover than the previous one to increase the difficulty of the solution and to help differentiate the two receiver sets.

Here is a Google map of the measurement route. In this area there are few native trees, so by driving through residential areas, rather than next to them as I did in the previous data set, I encountered many more trees. There were numerous locations with tall trees on both sides of the road as well as places where I drove directly under large tree branches.

walker1

Here’s an example of some of the more challenging tree cover encountered in the route. The route locations are only approximate since the base locations are not calibrated, that is why the lines are not actually on the road.

walker2

Here is the observation data from both rovers. The red ticks are cycle-slips, the gray ticks are half-cycle invalids.

                             M8N                                                                          Reach  M8Twalker3

The first thing to notice is that there are no simultaneous cycle-slips in the M8N data. Of course, this doesn’t prove we will never get them but it suggests that maybe moving the receiver inside the car did help. Unfortunately, there is one simultaneous cycle-slip in the M8T data very close to the start at 00:54. Apparently, both the M8N and the M8T are susceptible to these slips, which is not surprising, since the chips are part of the same family. We do see only a single slip, which is an improvement over the previous data. Since they always occur near the beginning of the data collection, I still believe they are most likely caused by thermal transients. I suspect that the best way to avoid them would be to turn on the receivers 15 minutes before starting to collect data in an environment as similar as possible to the data collection environment.

Next, let’s look at the SNR vs. elevation plots. Last time we saw noticeably better numbers with the Reach setup because of the more expensive antenna. This time, with the Ublox antenna on the M8N receiver, the two are much more similar. SNR isn’t everything, and there still are reasons why the more expensive Reach antennas are likely better than the Ublox antennas, but we’ve at least closed the gap some between them.

                            M8N                                                                          Reach  M8Twalker4

Here are the position solutions from both receiver sets.

                            M8N                                                                          Reach  M8Twalker5

The M8N solution is very good, with nearly 100% fix after the initial acquire until the very end when I parked the car under some trees in the driveway. This is despite a very challenging data set with many cycle-slips.

Unfortunately, the M8T solution is not as good. The initial acquire is delayed because of the simultaneous cycle-slip I mentioned earlier. It does eventually acquire, but then loses fix for quite a long period near the end of the data (the yellow part of the line in the plot above)

Even more concerning, when we look at the acceleration plots, we see the same spikes in the M8T data as we did in the previous data set.

                       M8N                                                                          Reach  M8Twalker6

Again, these spikes align with erroneous fixed solution points in the M8T data as can be seen in deviations from the circle in the difference between the two solutions.

walker7

Clearly, changing the dynamic platform model setting in the receiver did not fix the problem. A couple of experienced users have commented that this setting does affect the front end of the receiver on earlier Ublox models and it’s still possible it affects the M8T as well, but it does not appear to be the cause of this problem. We will need to look elsewhere for the solution.

We are running identical RTKLIB solution code on the two data sets and we have verified that the receiver setup is nearly identical for both data sets. So what else can be different between them? One possibility is the conversion utility, RTKCONV, that translates the raw binary output from the receiver into the RINEX observation files that are used as input to the solution. Since the raw measurements are output by different commands on the two receivers, there are two different functions in the RTKCONV code that process them.

Let’s look first at the RTKCONV code to convert the data from the UBX_TRK_MEAS command used by the M8N receiver. I won’t show the code here but just describe functionally what happens in the code. Each data sample from the receiver contains a status bit indicating carrier-phase lock and a count of consecutive phase locks. RTKCONV sets the cycle-slip flag for that sample if the carrier-phase is valid and an internal code flag is set. If the carrier-phase is not valid, then the cycle-slip flag is left in its previous state. The internal code flag is set if the phase lock count is zero or less than the phase lock count for the previous sample and is only reset for the next sample if the carrier-phase is valid.

For the UBX_RXM_RAWX command used by the M8T receiver, there is also a status bit indicating carrier-phase lock. Instead of a count, there is a time for consecutive phase locks but functionally it is equivalent. The RTKCONV code for this command does not use an internal code flag and the cycle-slip flag is set or cleared directly every sample that the carrier-phase is valid. The cycle-slip flag is set if the phase lock time is zero or less than the previous sample and cleared otherwise.

I know that’s a bit confusing and the difference is fairly subtle. The most important thing to understand is that RTKCONV is doing more than just translating from binary to text, it is deciding which samples to set cycle-slips for based on a somewhat complicated algorithm, and that algorithm is different for M8N and M8T.

Basically, the effect of this difference is that for the M8N, if a cycle-slip is followed by an invalid phase then the next valid phase will always be flagged as a cycle-slip while for an M8T it won’t necessarily be so. It’s easier to understand by looking at a picture. The observation plots below show the location of one of the acceleration spikes in the M8T solution which is caused by a cycle-slip on satellite G05. The plot on the left shows which samples were flagged as cycle-slips with the existing code, the plot on the right shows the cycle-slips after modifying the code to be functionally equivalent to the M8N code. Note the extra two cycle-slips with the change.  The extra cycle-slips in this case are a good thing because they are flagging samples with large phase errors and preventing RTKLIB from incorporating them into the solution.

              Reach M8T before change                                   Reach M8T after changewalker8

Modifying the RTKCONV code for the UBX_RXM_RAWX command in this way and re-running the solution gives us the the position plot on the right and the acceleration on the left.

walker9

Much better than before! Not quite as good as the M8N but still a significant improvement. The difference between the M8N solution and the improved M8T solution is shown below.

walker10

Remember this should be a circle if both solutions are free of errors.  Actually in this case not quite a circle because there is also a separation between the two base stations ,but that effect is small and we can ignore it for now.  This is much better than the equivalent plot from the original M8T solution shown earlier. There are obviously some erroneous fixes even after the improvement, so this is still a work in progress, but I think it is a big step in the right direction.

This data set was quite a bit more challenging than the previous one and in reality both solutions are quite good given the number of cycle-slips we saw, but there is always room for improvement.

The other thing to note with this data set is that the better quality antenna made a big difference on the quality of the M8N solution.  I suspect I will not be going back to the old antenna!  I knew the nicer antenna would help but I have resisted using it till now because of the extra cost.  There are similar looking antennas with similar gain specs available for as little as $4 so I may try some of these to see how they compare.

I’m not going to post the code to my Github repository or my binaries until I’ve had a little more time to understand the remaining errors but if anyone wants to take a closer look now, here are the code modifications to ublox.c that I made in the decode_rxmrawx function.

walker11

Ublox M8N vs M8T

Thanks to the generosity of Emlid, I am now the proud owner of two of their Reach precision GPS units. At a list price of $570 for a pair, these fall more into the category of low-cost rather than the ultra-low cost receivers I have been working with, but they do allow me to do some comparisons between the two. Their units use the more capable (and more expensive) Ublox M8T receivers rather than my Ublox M8N receivers and also have higher quality (and more expensive) Tallysman 4721 antennas, rather than the cheap antennas that were included with my receivers.

To compare the two I collected a fairly challenging data set, with maximum distances from the base station of over two kilometers and maximum velocities of over 70 km/hr, with relatively unobstructed rovers, but partially obstructed base stations. I mounted one of my receivers and a Reach unit on top of my car and also one of each at the base location. Here is a Google earth plot of the ground track. The Google Earth plots are a really nice feature of RTKPLOT I have not used until now but have quickly become a fan of. I could not make this feature work with the 2.4.3 version of RTKPLOT so had to go back to the 2.4.2 version. I also found I had to specify the solution format (out-solformat) as “xyz” instead of the “enu” that I usually use to get the solution in a format Google can use. The track was over a mix of dirt and paved roads running through agricultural fields and next to residential areas.

union1

Here is a plot of the base station location, marked in red below. It is somewhat obstructed by the sheds and boats around it. I selected this location in part because it was very windy that day and this spot was fairly sheltered from the wind, but also thought it would make the solution a bit more challenging and therefore help differentiate the two receiver sets.

union2

Most of the rover’s route was relatively unobstructed, but there were a few scattered trees near the road in some locations. The plot below shows an example of one of these spots. Note also, that the ground track goes off the road a bit at the end of the loop. I suspect this is due to inaccuracies in the base station location (and maybe the Google maps) rather than the RTKLIB solution since I did not make any attempt to calibrate the base station against any known reference.

union3

Here’s the base station observation data, the M8N is on the left, the Reach M8T data on the right. From a cycle slip perspective they are fairly similar but we see a few of the satellites (G15, R16, and R18) are noticeably better with the Reach. This is most likely because of the higher quality antenna.

                                  M8N                                                              Reach M8Tunion4

Looking at the SNR vs elevation for both receivers, we see the Reach has noticeably higher signal strength especially in the lower elevation (10-20 degrees) region where it is most important. Again, this is to be expected with the higher quality antenna.  For some reason, the M8T SNR numbers are rounded off to the nearest integer. I don’t know why that is, but that is what makes the plots look like they are formatted differently.

                                 M8N                                                                       Reach M8Tunion5

Looking next at the rover data, here are the observations for the two receivers.

     M8N                                                                              Reach M8Tunion6a

The rovers were stationary until 00:34 and then moving till 00:57, then stationary again. While they were moving and at the end when they were stationary, the two look fairly similar from a cycle-slip perspective with maybe a slight advantage for the Reach. However, during the initial stationary period, there were several slips that occurred simultaneously on every satellite in the M8N data. I’ve circled one of them in blue above.  I believe these must be caused by some sort of discontinuity in the receiver and have nothing to do with the satellite signals themselves. My best guess is that they were caused by temperature fluctuations in the chip. It was a very hot day, with an intense sun heating the dark car roof, combined with a strong wind that would create a cooling effect. Because the antenna that was connected to the M8N receiver has only a one inch lead, it was mounted on top of the car in the sun, while the M8T receiver, with a longer antenna lead, was mounted inside the car and not subject to the same temperature fluctuations. Also, notice that the simultaneous slips all occurred near the beginning of the data set, possibly while the receiver was still reaching some sort of thermal equilibrium. To try and avoid this in the future, I will either switch to another inexpensive antenna I have with a longer lead, or let the receiver sit longer before starting to collect data.

Unfortunately these slips occurred during the initial stationary period I use for the first acquire and prevented that acquire from occurring. Rather than give up on the data, though, I decided to try running a solution with the M8N rover and the M8T base. I don’t know if it was because of the better signal strength, or because I just got lucky, but I was able to get an initial acquire with that combination. So for this exercise, the rest of the data is all based on a comparison between the M8N rover and the Reach M8T rover, both referenced to the M8T base station. It turns out that having a single base station also makes the accuracy analysis a little easier as I will describe later. This is not exactly the comparison I wanted to make, but one I think is still worth doing.

So how did they do? Here’s the solutions using my demo4 code. The input configurations were identical with one exception. Since with the M8T receivers, RTKLIB can resolve the GLONASS integer ambiguities without assistance, while with the M8N receivers, it can not, I set the input parameter “gloarmode” to “on” for the M8Ts and to “fix-and-hold” for the M8Ns. This enables my extension to the fix-and-hold feature to adjust for the additional errors on the GLONASS and SBAS satellites.

                             M8N                                                                        Reach M8Tunion7

The Reach solution (on the right) looks very clean with a very fast acquire and then 100% fixed values after that. The M8N solution (on the left) also acquired quickly but then ran into the simultaneous cycle-slips, causing problems until it re-acquired at 00:40. After that it stayed fixed for nearly 100% of the time with a short dropout around 00:49.

The next questions, of course, are: Do the solutions match? And are the fixes all accurate? To check this, I will use a similar technique I did earlier when I had only two receivers, both mounted on the rover. For that case, I solved for the distance between the two receivers which forms a circle equal to the distance between the receivers. In this case, I will solve for the distance between each rover and the base, then difference the two. This should also give us a circle with radius equal to the distance between the two rover receivers. Having only one base simplifies things by avoiding addition of a second term caused by the separation between the two base receivers. Here’s the result of that operation, plotted only for the fixed solution points since those are the ones we are counting on to be accurate. On the left is the position difference (x,y,z) and on the right is the ground track difference.

                           M8N                                                                            Reach M8Tunion8

The separation between the two receivers on the car roof was 15 cm and we see here that most of the points fall on the circle of that radius. However, there are several deviations, up to half a meter in error. They are also easy to see as spikes in the z-axis on the left plot. These are unexpected and quite concerning since we really rely on the fix status to let us know which points are valid and accurate.

Zooming in on the observation data for the largest of these spikes at around 00:51 shows that both rover receivers saw cycle slips on satellites G02 and G13 at this time, presumably from passing a tree or other obstruction.

                                   M8N                                                                Reach M8Tunion9

Zooming in on the position data for this point, shows discontinuities in the Reach data but not the M8N data which is surprising, I had expected the opposite. As the driver of the car, I am certain that I did not drive over any meter high cliffs during the test, so the Reach M8T data must be wrong.

                                 M8N                                                                          Reach M8Tunion10

Looking at other error samples shows the same thing. It is more easily seen as spikes in the accelerations as shown here, especially the z-axis accelerations which only occur on the M8T data and not the M8N data

                               M8N                                                                       Reach M8Tunion11

So what could be causing these errors? The two solutions used identical code and input parameters except for the fix-and-hold for GLONASS integer ambiguities setting. I reran the M8T solution with this parameter enabled to make them completely identical but this did not fix the problem. With the code being identical we have to suspect the difference is in the receivers themselves or at least in their setup. The next step I took was to use the Ublox u-center evaluation software to examine all the registers for both receivers. It’s a little trickier to do this with the Reach receiver since we don’t have direct access to the M8T chip, but there are instructions on setting up a tcp port in the Reach documentation, which is what I did.

For the most part, the differences between the two receiver setups were slight but I did find one significant difference. The receiver dynamic platform model settings are quite different. I have my M8N receivers set up for a “Pedestrian” model, while the Reach M8T receivers are set up for the “Airborne <4g” setting. According to the Ublox M8 Receiver Description document, that setting is only recommended for extremely dynamic environments. Here’s the details from the manual.

union12

I have heard differing opinions on whether this setting affects the front-end of the receiver or whether it only affects the back-end position calculations. If it only affected the back-end, then this setting would not matter since we use only the raw GPS measurements from the receiver. However, if it does affect the front-end, we might expect it to have an effect like this since it would most likely be opening up the bandwidths of the phase lock loops tracking the carrier-phases and thus minimizing any filtering effects.

So that’s where things stand at the moment with this data set. I see issues with both receiver sets and have speculated on what may be causing the problems but have not had time to verify that either one of my guesses is correct. I had hoped to do that before posting this article but it’s been almost two weeks since my last post, so I’ve decided to just go ahead and post what I have.

This also gives me the chance to ask if anybody else has seen similar issues and/or might have other possible explanations for what is going on?

I have also uploaded this data to my data set library.

Ublox M8N half cycle valid bit

In my last post I introduced a new more challenging data set with a higher number of cycle slips than in previous sets. Even after improving the RTKLIB code to better handle missing data samples, there are still a large number of points in the solution with only a float status (yellow in the plot). In addition, the last 15 minutes appears to have at least a meter of error in the vertical axis since the data was taken in a series of loops and the vertical measurements should be the same as earlier loops.

 

half_cycle1

By plotting the observations of the rover, we can see that there are many cycle-slips on a large number of the satellites. This is the primary cause of the poor solution. Here’s the rover observations with a 15 degree elevation mask.

half_cycle2

In order to improve this solution, we are going to have to take a closer look at the cycle slips. The first thing I noticed when examining a few of these cycle slips is that there are a lot of half cycle errors in the carrier-phase measurements following a reported cycle slip. This is true whether or not an actual cycle slip occurred. I had seen this earlier as well when doing an exercise of looking at the double differences. Here’s a plot from that previous post showing what I am talking about. The second red circle contains a number of samples all in error by almost exactly one half cycle.

half_cycle3

Let’s go back to the raw ublox data before it is converted into RINEX format to see if that can shed any light on what is going on. To see the raw unconverted ublox data I enabled the trace option when running convbin by adding “-trace” to the command line. I also had to change an “if #0” in the code in the decode_trkmeas() function in ublox.c to an “if #1” to output the full debug information. Below on the top is the rover observations zoomed into a time period around 6:45:27.0 and below is the raw ublox data for that sample.

half_cycle4a

half_cycle4b

The flag byte is what we are interested in. Since the M8N does not officially support raw output, there is no documentation for this data byte. From the existing ublox.c code though we can see how RTKLIB is using it. It is interpreting bit 5 (0x20) as phase lock and bit 6 (0x40) as the half cycle bit. When the half cycle bit is set, one half cycle is added to the carrier-phase measurement. The other 6 bits are being ignored.

Looking at the documentation for the M8T which does officially support raw data provides a hint as to what might be going on. In the description of the RAWX command one of the bits in the trkStat byte is used to indicate whether the half cycle bit is valid or not. Since the M8N and the M8T share the same core, it would be reasonable to assume there was an equivalent bit in the M8N.

We know from the data that the half cycle errors occur shortly after a reported cycle slip so let’s look at those satellites. From the observation plot we can see that in this case satellites G30, R03, R22, and R23 all have reported cycle slips in the last few seconds before the sample we have raw data for (6:45:27.0). If we look at the flag byte for those satellites, we can see that bit 7 (0x80) is clear for all four satellites. It is set for all the other satellites except G21 which has no valid data and I120 which is the SBAS satellite. Each line in the raw data is one satellite, “sys” is the system (0=GPS,1=SBAS,6=GLO) and “prn” is the satellite number. We can also see that when bit 7 is clear, the half cycle bit (0x40) is never set which would also be consistent with bit 7 being the “half cycle valid” flag.

I added the following line of code to the decode_trkmeas() function in ublox.c to update the observation with the half cycle valid status in the same way it is done for the M8T.  RTKLIB refers to this bit as the “Parity Unknown” bit but in a comment in the RTKPLOT part of the manual explains that this mean the half‐cycle ambiguities in carrier‐phase are not resolved.

     raw->obs.data[n].LLI[0]|=flag&0x80?0:2; /* half cycle valid */

I then reran the conversion using the convbin raw data conversion CUI. Plotting the updated observation data using RTKPLOT with the “Parity Unknown” option set to “ON” gave the following result:

half_cycle5

The gray ticks indicate unknown parity (i.e. half cycle invalid). Unfortunately the “unknown parity” ticks seem to take precedence over the “cycle slip” ticks, so any sample that has both shows up as “unknown parity”. This is why there seems to be less cycle slips in this plot than the original plot above.

So, let’s rerun the solution on the improved observation data using the same configuration settings as previously. The result is plotted below.

half_cycle6

This looks a lot better. Not only do we see a lot more fixes, but the obvious vertical errors between 7:15 and 7:30 have gone away. The two yellow sections in the ground track plot on the left align with the two locations where the car went directly underneath overhanging branches so it is not surprising that those spots are going to be more difficult to maintain fixes for.

There may be more opportunity for improvement here if we can figure out how to take better advantage of the half cycle status. The current RTKLIB code simply resets the phase bias estimate for that satellite every time this bit changes state.

I have uploaded this change to the demo4_b12 branch in my GitHub repository. I have also posted the modified rtkconv executable here.

The reason that I posted rtkconv (the GUI version) instead of convbin (the CUI version) is that I am now able to build the GUIs with my new Embarcadero compiler and because I have found that convbin does not seem to work for the newer (3.xx) formats of RINEX.  I prefer the newer formats because they are easier to parse.