Limitations of the RTCM raw measurement format

In the last post I described a process to troubleshoot problems occurring in real-time solutions that are not seen in post-processing solutions for the same data.  I collected a data set demonstrating this issue, and traced the problem to the conversion of the measurement data from raw binary format to the RTCM format.  This conversion is typically done in real-time applications to compress the data and minimize bandwidth requirements for the base to rover real-time data link.  In this post I will look into that example in more detail and also explore some of the limitations of the RTCM format.

First, it is important to understand that the conversion to RTCM is not a lossless process. There are several ways in which information is lost in this process.  In some cases these losses are probably insignificant, but in other cases it is not so clear that they are.

So let’s look at some of those differences.  We actually have three formats to compare here: the raw binary format from the u-blox receiver, the RTCM format, and the RINEX format.  Both the RTCM and RINEX formats contain less information than the raw binary format, and information is lost when the conversion is made to either format.  The reason I include the RINEX format here is because in the post-processing procedure, the measurements, whether they come from the raw binary format or the RTCM format, must first be converted to RINEX format before being input into the solution.  What I see with my example data set that fails in real-time is that it looks good in post-processing if the raw measurements are converted directly from raw binary to RINEX, but fails if the raw measurements are first converted to RTCM and the RTCM is then converted to RINEX.  Therefore it is very likely that there is something critical that is lost in the conversion to RTCM that is not lost in the conversion to RINEX.

The official RTCM spec is not freely available on the internet (it must be purchased), so I have relied on this document from Geo++ for the RTCM details.  Here is a chart of the most significant differences I am aware of between the three formats.  In the case of RTCM, these numbers apply only to the older 1002/1010 messages used by Reach and most other systems, not the newer MSM messages.

                           u-blox binary                      RINEX 3.0                 RTCM 3.0
Pseudorange resolution     double-precision floating point    0.001 m                   0.02 m
Carrier phase resolution   double-precision floating point    0.001 cycles (~0.2 mm)    0.5 mm
Doppler resolution         single-precision floating point    0.001 Hz                  Not supported
Time stamp resolution      double-precision floating point    100 nsec                  1 msec
Lock time                  1 ms resolution                    Lock status only          Variable (> 1 ms)
Half-cycle invalid flag    Supported                          Supported                 Not supported
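
To make the resolution differences concrete, here is a small sketch (Python; the measurement values are made-up examples, not real receiver data, and the field layouts are simplified rather than the actual RTCM bit encoding) of the rounding that the fixed-point RTCM fields impose:

```python
def quantize(value, resolution):
    """Round a measurement to the nearest multiple of a field resolution,
    as happens when it is packed into a fixed-point RTCM field."""
    return round(value / resolution) * resolution

# made-up example measurements (not from a real receiver)
pseudorange = 22_345_678.1234567   # meters, double precision in u-blox binary
time_stamp  = 50.999584            # epoch time in seconds

P_rtcm = quantize(pseudorange, 0.02)    # RTCM 1002 pseudorange: 0.02 m steps
t_rtcm = quantize(time_stamp, 0.001)    # RTCM epoch time: 1 msec steps

print(P_rtcm)   # pseudorange error bounded by +/-0.01 m
print(t_rtcm)   # 51.0 -- the entire 0.416 msec offset is discarded
```

The pseudorange and carrier phase round-off is tiny compared to the measurement noise, but the time stamp round-off, as described below, turns out to matter.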


To figure out which (if any) of these differences is responsible for the failure I needed a way to run the solution multiple times, each run done with only a single difference injected into the conversion.

I already had a Matlab script I had previously written to parse a RINEX observation file into a set of variables in the Matlab workspace.  So I wrote a second script that goes the other way, from variables in memory to a RINEX observation file.  Once I had done this, I could read in the good RINEX observation file translated directly from the u-blox binary file, modify a single measurement type, write it back to a new RINEX observation file, then run this file through a solution.

My first guess was that the missing “Half Cycle Invalid” flag would prove to be the culprit, since I have seen this before with the M8N receiver as described in this post.  Although I suspect that this probably is true in some cases, it did not make a difference with this data set.  My next suspect was the missing doppler measurements, since RTKLIB uses the doppler measurements when estimating the receiver clock bias, but again, that was not the case.  In the end it turned out to be my very last guess that made the difference, and that was the time stamp resolution.  So much for me thinking I was starting to get the hang of this RTK stuff!  The differences in the time stamps were so small relative to the distance between them that I had unconsciously ignored them.  For example, the first two time stamps in the good measurements were 49.999584 and 50.999584, but the time stamps in the failing measurements had been rounded off to 50.0000000 and 51.0000000.  Even after discovering that this round-off error makes a difference, it is still not obvious to me why this is true.  In any GPS solution, the receiver clocks are assumed to lack sufficient accuracy to be relied upon without correction, and the clock errors are one of the unknowns in the solution along with the three position axes.  I don’t know why RTKLIB does not correctly estimate this error in its clock bias estimate and remove it.  Maybe one of you guys who has been doing this a lot longer than I have can explain this?
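
The size of the effect can be sketched quickly (Python; the 800 m/s figure is a rough upper bound on GPS line-of-sight range rate that I am assuming for illustration, not a value taken from the data):

```python
CLIGHT = 299792458.0                    # speed of light, m/s

t_good = [49.999584, 50.999584]         # good time stamps, to the precision quoted
t_rtcm = [round(t, 3) for t in t_good]  # RTCM's 1 msec resolution rounds them off

for tg, tr in zip(t_good, t_rtcm):
    err = tr - tg   # 0.000416 s of round-off
    # The observables still correspond to the original epoch, so they are now
    # inconsistent with the reported time tag.  Geometrically, a pseudorange
    # changes by range-rate * err (GPS range rates run up to roughly
    # +/-800 m/s), i.e. a few tenths of a meter here; treated as a clock
    # term instead, err maps to err * CLIGHT, about 125 km of apparent bias.
    print(f"{tg} -> {tr}: {err * 1e3:.3f} msec round-off, "
          f"up to ~{abs(err) * 800:.2f} m of geometric mismatch")
```

A few tenths of a meter is far larger than the carrier phase noise, which may be why the ambiguity resolution fails when the inconsistency is present.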

Just to be sure it wasn’t a fluke, I started the data processing at three different times in the data set, and I also ran additional solutions with the sign of the error in the time stamps reversed.  In every case, regardless of sign or starting location, the solution failed to get a fix when the error was present and succeeded when the error was not there.

I have read somewhere that more expensive receivers will typically align their time stamps to round numbers, which would avoid the need for as much resolution.  The only expensive receivers I have access to are the CORS stations, so I took a look at data from a couple of them.  Sure enough, it appears to be true that they do use round numbers for their time stamps.  If this is more generally true, it might explain why the RTCM spec does not have sufficient resolution for the u-blox data but works fine for more commonly used, higher priced receivers.
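
A quick way to check this on any observation file is to test how far each epoch time falls from an integer second.  This is a sketch; the two epoch lists are illustrative values in the style of what I saw, not actual CORS or u-blox data:

```python
def aligned_to_round_seconds(epochs, tol=1e-7):
    """True if every epoch time (in seconds) sits within tol of an
    integer second, as the CORS receivers appear to do."""
    return all(abs(t - round(t)) < tol for t in epochs)

cors_like  = [50.0, 51.0, 52.0]                 # illustrative CORS-style epochs
ublox_like = [49.999584, 50.999584, 51.999584]  # illustrative u-blox-style epochs

print(aligned_to_round_seconds(cors_like))    # True
print(aligned_to_round_seconds(ublox_like))   # False
```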

I was curious why the u-blox time stamps don’t occur at round numbers, so I took a look at the hardware description spec and found this explanation:

“In practice the receiver’s local oscillator will not be as stable as the atomic clocks to which GNSS systems are referenced and consequently clock bias will tend to accumulate. However, when selecting the next navigation epoch, the receiver will always try to use the 1 kHz clock tick which it estimates to be closest to the desired fix period as measured in GNSS system time”

I interpret this to mean that the receiver is aware of the alignment error in its clock source relative to GPS system time, and it adjusts the time stamp values to include its estimate of that error.
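
A toy model of that behavior, where all names and numbers are my own illustration rather than anything from the u-blox spec, reproduces the kind of time stamps seen in the data: epochs land on 1 msec ticks of the local clock, but are reported in GNSS system time.

```python
def next_epoch_tick(desired_gnss_time, clock_bias, tick=0.001):
    """Pick the receiver's 1 kHz local clock tick closest to the desired
    fix time, and return that tick expressed in GNSS system time.
    Model: local time = GNSS time + clock_bias, so ticks sit at integer
    multiples of `tick` in local time, i.e. at k*tick - clock_bias in
    GNSS time."""
    k = round((desired_gnss_time + clock_bias) / tick)
    return k * tick - clock_bias

# with ~0.416 msec of accumulated clock bias, a fix requested at 51.0 s
# lands on the tick at 50.999584 s GNSS time, like the observed stamps
print(next_epoch_tick(51.0, 0.000416))
```

Under this model the fractional part of the time stamp is simply the receiver’s current clock bias estimate, which drifts slowly, consistent with the slowly changing fractional parts in the data.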

Something else I am curious about but have not had time to investigate in any detail is how this issue is affected by differences between the RXM-RAWX messages, which are what is normally used with the M8T receiver, and the debug TRK_MEAS messages, which also contain the raw measurements and are the only raw measurement messages available on the M8N receiver.  Looking at several data sets from both the M8N and M8T, it appears that the TRK_MEAS time stamps for both receivers are aligned to round numbers, while the RXM-RAWX time stamps are not.  This means that the TRK_MEAS messages would not be affected by the lack of resolution in the RTCM format.  However, the TRK_MEAS measurements lack the compensation for inter-channel frequency delays in the GLONASS measurements and so would not be a good substitute.  Maybe it’s possible to combine the two into a single set of measurements?  The two use different references and clock errors, so it is not obvious whether that is possible.  Below is an example of partial TRK_MEAS and RXM-RAWX outputs for the same epoch when both were enabled, TRK_MEAS on the top and RXM-RAWX below.

[Image: partial TRK_MEAS (top) and RXM-RAWX (bottom) outputs for the same epoch]

Another avenue I considered is using the newer MSM messages (1077, 1087) in the RTCM format instead of the current 1002/1010 messages that Reach and most other users are using.  These have higher resolutions for the pseudorange and carrier phase, and include doppler and half cycle invalid flags.  Unfortunately, the resolution for the time stamps does not seem to have changed, or if it has, it hasn’t changed enough to see a difference in the output for the small deltas in my example.

There also appears to be a bug in the RTKLIB implementation of the encode or decode of these messages which sometimes causes the number of integer cycles in the carrier phase measurements to be incorrect (the fractional part is fine).    This bug appears to be present in both the official 2.4.3 release and the demo5 code but some of the changes I have made to the u-blox translation in the demo5 code seem to have increased the frequency of these incorrect measurements.

Reach does use the MSM messages for the SBAS measurements, although it does not need to since the 1002 message supports SBAS as well as GPS.  It is possible this could introduce a problem for users in North America, where the WAAS satellites used for SBAS corrections include carrier phase measurements.  Users in Europe would not see this problem because the EGNOS satellites used for SBAS corrections in Europe don’t provide the carrier phase.  I did not see any corruption in the SBAS carrier phase measurements in the initial RTCM data in this example, but after I enabled the 1077 and 1087 messages, I did see corruption in the measurements in all three systems.

So, unfortunately this is still somewhat a work in progress and I don’t have any easy answer how to fix this.  I am hoping some of the experts out there can comment and help put some of the pieces of the puzzle together.

In the meantime I would suggest using the u-blox binary format for the base-rover data-link instead of the RTCM format.  The bandwidth requirements will be 2.5 to 3 times higher, but some of this can be offset by reducing the measurement sample rate for the base station.

I believe a long term fix is going to require two things.  First, a workaround to the time tag resolution issue described in this post.  But even with that fixed, the half cycle valid flag and doppler information will still be lost.  I haven’t done any tests to understand how critical the doppler measurements are, but I have demonstrated in the post I referenced above that losing the half cycle valid flag can definitely degrade the solution.  Fortunately, the newer MSM RTCM messages do include both the half cycle valid flag and doppler.  They do not appear to be usable until the bug in the encode/decode of the carrier phase data is fixed, so that will have to happen as well.
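
As a sketch of what the time-tag part of such a workaround might look like, this mirrors the sign convention of the L/P compensation in RTKLIB’s existing -TADJ code, but the function itself and its numbers are my own illustration, not tested against real data:

```python
CLIGHT = 299792458.0   # speed of light, m/s
FREQ1  = 1.57542e9     # GPS L1 carrier frequency, Hz

def round_time_tag(t, P, L, tadj=1.0):
    """Round the epoch time t (s) to the nearest multiple of tadj and shift
    the pseudorange P (m) and carrier phase L (cycles) so the observables
    stay consistent with the new time tag."""
    tn = round(t / tadj) * tadj
    toff = t - tn                  # e.g. 50.999584 - 51.0 = -0.000416 s
    return tn, P - toff * CLIGHT, L - toff * FREQ1
```

With the time tags adjusted to round numbers this way, the 1 msec RTCM time stamp field no longer discards anything, since the residual has been moved into the pseudorange and carrier phase, where the RTCM resolutions can represent it.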

On the other hand, I suspect most real-time RTK systems do use RTCM and manage to live with its limitations so maybe I am overreacting here.  I would be interested in other people’s opinions and experiences with RTCM on u-blox or other receiver types.


12 thoughts on “Limitations of the RTCM raw measurement format”

  1. Hi, I’m using a radio to transmit base station M8T RTCM data to a rover M8T. Due to the radio air speed, what’s the minimal message set for RTK in the str2str conversion?
    This is my option:
    GPS:1077+1019 Galileo:1097+1046 QZSS:1117+1044 Beidou:1127+1042 SBAS:1107
    Is it right?
    Thanks, yours.

    Like

    1. Hi Pony. The M8T does not have an internal RTK solution engine like the M8P or F9P so sending the messages directly to the M8T will not be useful. I assume you meant to send them to an M8P or real-time RTKLIB solution? In that case you will need observation and ephemeris for each constellation you plan to use, I would suggest a minimum of GPS and Galileo, but more is better. You will also need either a base position message (e.g. 1005) or you will need to specify the base location in the solution config parameters.

      Like

      1. Hi, glad to see you again.
        Thanks for your reply. You are right, I’m using rtklib to solve. Following your guide, I added 1005 message to str2str parameters and choose MSM5 rtcm messages of base station. But there is a strange thing, rtkrcv always returns “initial base station error”. The errror/warning dialog shows RTCM 1005: staid=0 pos=-90.00000000 0.00000000 -6373187.000.
        At same time, both rover and base station can get around MSM5 of 28-32 satellites. I can’t find what I’m doing wrong. Please figure me out. Thanks.
        Sincerely.

        Liked by 1 person

        1. ant2-postype already set to rtcm. But still get RTCM 1005: staid=0 pos=-90.00000000 0.00000000 -6373187.000
          Should I configure something in U-Center like UBX-SFRBX or UBX-RAWX? I think the str2str cannot get properly values from M8T receiver due to my corrupt configuration.

          Like

        2. Hi Pony. I’m guessing that you are trying to use the conversion option in STR2STR to generate RTCM3 1005 messages. STR2STR can only generate valid 1005 messages if it has a valid position for the base receiver which you can setup in the STR2STR command line options. Ideally though, you would not be using the conversion options in STR2STR to generate messages and instead would just be relaying the messages from the base receiver which should be setup to output the 1005 and other messages directly. The numbers you are seeing in the 1005 location output correspond to 0,0,0 in ECEF XYZ coordinates.

          Like

          1. Thanks again.
            Do you mean that I should configure M8T to generate rtcm 1005 directly?
            Does str2str passthrough 1005 when received?
            It is difficult to set every unknown points in str2str parameters. Do you have another solution for this?

            Like

          2. I have an idea. Using the “averaging single base pos” algorithm in strsvrstart. If norm(sta.pos) = 0 then fetch maxaveep times of raw u-blox messages.

            Like

  2. Hi Tim and Felipe,
    Well, between you I think you have come up with the solution we need here. I have used MSM messages in the past with both u-blox and NVS receivers, and still saw errors. It’s clear to me now from your analysis (hindsight is wonderful!) that although the MSM7 messages contained more detailed information, they did not increase the time stamp resolution, which remains at 1 msec for all MSM levels. Doppler seemed to help but it still left errors due to the conversion.
    MSM is of course necessary for Galileo and the other constellations.
    Trimble and others use their own proprietary data transmission formats, possibly to get around some of these problems, or maybe for other reasons; nevertheless RTCM3 works well on NTRIP BKG base stations, so it’s not a fault of RTCM3 per se, as long as the data set is aligned as Felipe suggests.

    Like

    1. Hi Felipe, Igor, Anthony. Thanks, that helps! I tried adjusting the time-tags, P, and L to align the time-stamps as Felipe suggested using my matlab script starting from the good observation file and then running a solution on the adjusted observations. That works great. I then tried using the RTKLIB time tag adjust option (-TADJ) as Igor suggested. This option looks like it is intended to do the same thing but the adjusted L and P values don’t quite match my matlab numbers and the resulting observation file does not give a good solution. So not quite there yet, but feel like I’m going in the right direction.

      Like

      1. OK, it’s working now! The time tag adjust option had a bug in it that caused it to improperly handle bad carrier phase observations. With that fixed, I can now translate to RTCM with the time tag option enabled and then to RINEX and get a good solution with the result. I hope to publish a post later today with the details. Thanks again for everyone’s help!

        Liked by 1 person

  3. RTKLIB seems to support adjustment of time tags via -TADJ parameter, however it is only available in RXM-RAW, but not for RXM-RAWX. It will also account for that in the pseudorange and phase:
    raw->obs.data[n].L[0] = R8(p)   - toff*FREQ1;
    raw->obs.data[n].P[0] = R8(p+8) - toff*CLIGHT;
    We could probably implement it in an automated fashion.

    By the way, there seems to be a change in time tags between 2.3 -> 3.01. In 3.01 they are much closer to a full second, which resulted in inability to generate RTCM messages at all when defining a certain output period (time interval check would not pass). https://github.com/emlid/RTKLIB/commit/b724ca5899429f31ccb0dac009d5300e33090549

    Like

  4. My suggestion is to (1) calculate the time stamp rounding error, (2) subtract that value from pseudorange and carrier-phase observations, and only then (3) adopt the rounded time tag. For example, for a time stamp of 50.999584, the error would be 51-50.999584=0.000416. I’d also suggest to compare the result with that of teqc +smtt -ublox. A similar issue was discussed at length in the teqc mailing list, from which I quote:
    “The main point is that there be consistency between time, phase, and pseudoranges.
    Therefore either:
    rx ms clk jumps in time tags (== receiver time), smooth phase and pseudorange
    (e.g. teqc -smtt output, Berne translation output)
    or
    smooth time tags (== GPS time), rx ms clk jumps in phase and pseudorange
    (e.g. teqc +smtt output, clockprep output, GIPSY input)
    are equally valid representations of the observables in RINEX.”
    http://postal.unavco.org/pipermail/teqc/2005/000221.html

    Like
