Blips in MS5611 sensor data #23
These blips (and incorrect pressure reading spikes) are an unsolved mystery and further investigation would be more than welcome. One known issue is RF noise from the VHF transmitter but this seems to be something different. |
@mihu-ov My glider has a Dittel FSG-2T but the unit was off. It is also equipped with a Stratux (which has no transmitter) and a SoftRF unit, both of which were on. The SoftRF is capable of transmitting at 900-ish MHz, although it's a pretty low power output. I also have multiple switching converters going from 12V to 5V to, among other things, power the SoftRF and Stratux... But their state should have been unaffected by stopping and re-starting the daemon, so this seems unlikely. So you are saying this is a known issue? I'm surprised more hasn't been done about this; it seems like a pretty significant problem. The blips do get through the Kalman filter and are quite noticeable. |
In my setup I have no issue with blips and only minimal effects when transmitting on VHF. In one of our club gliders (DG300) there is an occasional blip and the vario goes wild when transmitting. Your traces are probably an extreme case - that's great for investigating the problem (RF issues or anything else). |
Again, I see absolutely no reason to believe this is RF-related (at least from outside the OV). My VHF transceiver was off at the time. Those waveforms are 100% repeatable. When I start the OS, the red, orange, and purple traces are what happen. You can see it is regular: the blips are occurring every 92 to 94 samples. When I kill the process and restart, the blips either mostly or completely go away. I do not see how an external device could be impacting that. Blue is TE, orange is static... out of the same data file. |
I agree, I don't believe your case is RF induced. |
There are some signs that point to this, but others that don't:
None of this explains why only temperature conversions are corrupted, why it's periodic, or why restarting makes the problem go away. But here's the other thing... Even if the data is somehow corrupted, why is it consistently increasing for a couple samples and then consistently decreasing for more samples? I mean, okay, maybe we have a bad temperature conversion, this results in a bad compensation value, and the next pressure read is consistently high, and the temperature compensation corruption consistently decreases it until the next temp conversion... but there are multiple high samples prior to multiple low samples, and the shape isn't very consistent. Anyway, I'll try a few experiments:
|
Okay, here's what I found: I tried everything listed above and found nothing of use. Increasing the usleep time may have decreased the magnitude of the problem. The problem occurred at random meas_counter values, and was not tied to a temperature. I tried adding a usleep at the beginning, and tried resetting in the middle of the run. And while the time between conversion and read was not terribly consistent, it was always longer than the requested usleep value. In other words, that whole exploration was a bust. But I did find out something very interesting. If sensord is started with systemctl start sensord.service (or restart), it has the blips. This seems 100% repeatable. Doesn't matter if the service is disabled or enabled, doesn't matter if it's started, stopped, started again, or restarted, or whatever. Similarly, when I run it from the command line, it appears to run clean. If I edit /lib/systemd/system/sensord.service to change it from Type=forking to Type=simple, and change the command line to /opt/bin/sensord -f -c /opt/conf/sensord.conf, it does NOT have the blips. I disabled the service and added "/opt/bin/sensord -c /opt/conf/sensord.conf" to the ovmenu-ng.sh script, and it appears that is a workaround. So something about systemctl Type=forking is causing the problem. My solution is clearly a hack work-around, but hopefully someone who knows Linux better than me might have some ideas. |
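For reference, the workaround above amounts to editing /lib/systemd/system/sensord.service roughly like this (a sketch of only the lines being changed; the rest of the stock unit stays as-is):

```ini
[Service]
# Run sensord in the foreground instead of letting it fork
Type=simple
ExecStart=/opt/bin/sensord -f -c /opt/conf/sensord.conf
```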
Is sensord printing anything to stdout periodically? |
I'm not sure what you mean. I didn't pipe stdout to anything, and if it's run through systemctl I won't see stdout. I don't recall seeing anything when running it from the command line, and wouldn't think I would unless I put in -f mode, which I didn't try. What were you expecting/looking for? |
Just trying to think up anything that might be different between running from systemd and console. |
I don't know, I was talking to a friend who is way more Linux-knowledgeable than me, and he was stumped too, but thought maybe it's run with different privileges or environment variables or something. But since it only fails as "Type=forking", that doesn't make much sense... And I'm completely stumped why I'm the only person who seems to be having this problem. |
If you can think of something specific to try, I can certainly perform an experiment, but I went through everything obvious. I still have no idea what is causing the erroneous readings let alone why systemctl has anything to do with it. |
Wow, great investigation, thank you! Ignoring obvious outliers was a rough hack but finding the root cause would be much better. There is some interesting reading on the effects of sample rate jitter on https://www.freesteel.co.uk/wpblog/2014/11/26/ms5611-altitude-sensor-very-sensitive-to-own-working-temperature/ and also http://blueflyvario.blogspot.com/2012/06/latency-and-timing.html |
I didn't get much out of the BlueFly thing. A lot of his blog seems to be him noticing that filters can cause latency. Stuff like that is well understood, and manageable. The freesteel one, on the other hand, was definitely interesting. That doesn't look exactly like what I am seeing, but it's definitely similar. When I was looking at timing data I was looking for a short time, not an irregular time. All the times are irregular (although typically by less than 1 ms), and by looking at the data I've already collected I can't see an obvious pattern (but the data I have is time from conversion to read, not from conversion to conversion). I think others have pointed out that the way we are accessing the sensors isn't great. It's very difficult to precisely control timing on a non-real-time system, and to do this properly the conversions would have to be done in hardware. Despite that, I think we could improve timing control if that were necessary. One obvious thing would be to monitor the time between samples and adjust usleep to better target a 12.5 ms interval. In other words, check the time immediately after waking up from usleep, check the time again right before going to sleep, and adjust the usleep time to be (target time - awake time). Another possible thing would be to run this as nice -10. If for some reason timing is getting screwed up by a disk access (as might be implied by the fact that not writing to the log seemed to fix the problem), I'm not sure that changing the nice level will help, since the disk access might have higher priority than even nice -19 (or whatever the maximum un-niceness is). What really bothers me is that I seem to be the only person who is affected. It's hard to believe... I'm using the Stefly hardware and the stock 20193 release and seeing the problem. |
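A minimal sketch of that adjust-the-sleep idea, using standard POSIX calls (the 12.5 ms target is from this thread; everything else is illustrative, not sensord's actual code):

```c
#include <stdint.h>
#include <time.h>
#include <unistd.h>

#define PERIOD_US 12500  /* target 12.5 ms between conversions */

static int64_t now_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

void sample_loop(void)
{
    int64_t next = now_us();
    for (;;) {
        next += PERIOD_US;
        /* ... trigger conversions, read results, process ... */
        int64_t remaining = next - now_us();  /* subtract time spent awake */
        if (remaining > 0)
            usleep((useconds_t)remaining);
        /* if remaining <= 0 we overran; the fixed grid self-corrects */
    }
}
```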
Alright, I'm out of ideas at this point. I tried nice -15 and it didn't make much, if any, difference. The glitches that I described two comments ago were caused by the code change, which is what fixed the blips. I'm out of ideas; it seems these things are finicky. As a workaround, I am successfully running it with systemctl as Type=simple with the -f command line option. No glitch, no blips, and it's only a minor change (an edit to /lib/systemd/system/sensord.service) and it's good-to-go. That's how I plan to do it going forward. I did notice we are recalculating the sensitivity and offset (the MS5611 internals) each time we read the pressure. Since these are uniquely tied to data that is calculated after a temperature read, it seems like it makes more sense to calculate these after each temperature read. This shouldn't change the results; we're just repeating calculations that don't need to be repeated. I still would like to experiment with higher sampling rates (and averaging) and improving the accuracy of timing. If we sampled on each 12.5 ms clock, in theory that would reduce the standard deviation of the error by a factor of 2 with no major cost. Another question is... is there a good reason why we are putting out updates every 50 ms? Does a vario actually benefit from being that fast, or would it be worth trading off some response rate for better filtering (albeit I don't have a real good concept of the response rate of the Kalman filter)? |
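To make the recalculation point concrete: per the MS5611 datasheet, OFF and SENS depend only on dT from the last D2 (temperature) conversion, so they can be computed once per temperature read and reused for every pressure read. A first-order-only sketch (function names are illustrative, not sensord's actual API):

```c
#include <stdint.h>

static uint16_t C[8];       /* factory PROM coefficients C1..C6, read at init */
static int64_t  SENS, OFF;  /* updated only after a temperature conversion */

void ms5611_update_temp(uint32_t D2)
{
    int32_t dT = (int32_t)D2 - ((int32_t)C[5] << 8);
    SENS = ((int64_t)C[1] << 15) + ((int64_t)C[3] * dT) / 256;  /* datasheet */
    OFF  = ((int64_t)C[2] << 16) + ((int64_t)C[4] * dT) / 128;
}

int32_t ms5611_calc_pressure(uint32_t D1)  /* result in Pa (= 0.01 mbar) */
{
    return (int32_t)(((int64_t)D1 * SENS / 2097152 - OFF) / 32768);
}
```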
If removing logging removes the blips, I'd suspect filesystem write cache flushes for this effect. Just for fun, try to redirect log to
and restart the device. Will it make any difference? |
Also, what is the contents of that log, actually, when you see the blips. Is sensord writing to that log when blips are happening? |
It's not clear to me that removing logging fixes it. But I can definitely try the symlink next time I try. I don't think it's writing anything to sensord.log... just a bunch of stuff when it's kicked off. There's a -r command line option (or something like that) which records the output. That's what I was using. I modified the string -r puts out. I can't tell that the -r write is affecting things. I don't think much of anything is being written to sensord.log (/home/root/.xcsoar/logs/sensord.log). In general I have not tried correlating sensor data to vario output. But I have no reason to believe the blips in the raw data weren't causing the vario blips. When the sensor is "quiet" there are no vario blips. |
Hi guys, had a similar issue some time ago. Periodically one of my pressure sensors spiked the vario. Not 5 m/s but a solid 2 or 3 m/s every 3 or 4 seconds. In my case it was a bad solder joint. |
Thanks for chiming in. Where did you get the hardware (and when)? The blips I was seeing on the vario were about 0.5 m/s tops. I assume you didn't get to the point that you looked at the data as closely as I did (ie could confirm the simultaneous blips on static/TE and that the blips looked like my blips)... |
You're welcome! |
I was using the stefly hardware, which I presume was assembled in a factory, so it seems unlikely there was a soldering issue. I was thinking if you had ordered from Stefan that perhaps there might have been a bad batch. |
I recorded the NMEA output and it was a bit disturbing when I looked at the graphs: This is a recording of the POV,E statement (i.e. vario output). I have no idea what that hash is from 6500 to 10000. After 70000 it gets noisier. At some point toward the end (which I assume is that portion) I re-started the daemon and turned on temperature readings (see my next PR). Now, I did something a bit stupid here. I initiated the temperature conversion AFTER initiating the MS5611 conversions. That's probably a no-no given the comments made on the Freesteel website. I tried a different variant which initiated the temperature conversion before the pressure conversions and... std deviation with no temp conversions ~ .019 As you might expect, the standard deviation of temperature increases from .06 degrees C when initiated before pressure to .11 degrees after. And yes, when looking at the data in the second run, which was only 4582 samples (4 minutes?), there was still a single blip -- with a magnitude of about +/- .15 mbar in the static (this of course is impacted by the IIR filter). I'm thinking all these things are dumping noise into ground and it is sensitive to the quality of the ground. Right now, we are performing both MS5611 conversions simultaneously, reading them out, and then waiting. There's no reason why we can't interleave the pressure readings. The more I am looking at this, the more it feels like the hardware should be redesigned. |
@hpmax I assume you recorded a sensord log file and are tracing it with Excel, right? It is nearly at the bottom of the thread. Perhaps you can analyze my file and see if my device also has these spikes. |
All the plots were done in Octave (the GNU copy of Matlab). Probably could have used Excel or LibreOffice, but I like Octave. The first few plots were from data out of sensord, using the equivalent of the -r command line option (ie raw and unfiltered). The last plot was from "Debug" of the NMEA stream from XCSoar, with a Perl script to strip out unnecessary values. I'm about half done with a re-write that will poll the sensors more frequently, regularly, and fully interleaved. I really hope I'm not trying to chase down a hardware problem in software. I don't understand how others don't seem to be running into this problem. |
So that doesn't look like the "blip" to me but it does look like it might be a "spike". The blips were in the magnitude of .3-.5. I never saw one that large and they always started with a positive component. But the spikes that were happening more irregularly do look kind of like that. Also interestingly, yours looks like it jumps downward afterward. |
I made a few variants of the software to test different sampling strategies. Running the daemon from the command line, I used: I've never seen the chips below 30C, so the second order compensation doesn't do squat, but I see little reason not to use it just in case. I'm not even sure why this is set as an option to turn it on. It should just be on, there's virtually no penalty when you don't need it, and the penalty is well worth it if you do. I took a moving average (+/- 32 samples) and subtracted that out from the original data to try to remove the sensor drift. My three sampling strategies were:
There should be a benefit to running the sampling algorithm faster. Assuming the noise is uncorrelated and IID, the standard deviation will go down with the square root of the number of samples included. In other words, if you take 4000 samples and average them out into 1000 bins of 4, the standard deviation should be half what it was if you only had 1000 samples taken at 1/4th the rate. When I took the standard deviation of the static and TE sensors, I got:
Side note: These numbers were calculated based on "digoutP", "sens", and "off" in Octave. Sensord's p number has a slightly higher standard deviation (0.02 - 0.03 Pascals). I assume this is because of quantization error from doing all-integer calculations. So, now I tried averaging (2 in the case of #2, and 4 in the case of #3). Conclusions: Clearly, the noise is NOT uncorrelated. Looking at the data, the results with a ~1.4 STD seemed to show an interference pattern that looked a little like envelope modulation. The ~2.1 STD showed what looked like a triangular waveform superimposed on it. It was very distinct, and fairly large. It is bizarre to me that the two MS5611s behaved differently in sampling variant #2. I did not see any of the previously described behaviors this time. I collected 92800 samples on variant #1, 28300 samples on variant #2, and 103700 samples on variant #3. Worst cases were (based on subtracting the moving average):
I'll also point out something interesting about the AMS5915. While datalogging, I have sometimes seen absurd dynamic pressures, -8 billion or something like that. I have to assume this is some sort of numeric nonsense, since the sensor is limited to 50 mbar (5000 Pascals). I was trying to get to the bottom of it, so I checked over the ams5915.c code. It appears to be implementing things correctly, but it seems like they could have been implemented more intelligently. I wish I could catch it outputting something funny. But it did output something funny. My dynamic pressure is -0.3318 Pascals with an output word of 1551. The datasheet provides the following formula: (1551 - 1638) * 50 / 13107 = Pressure, which is exactly what is calculated -- except that 1638 is supposed to be 0 pressure (see the sketch after this comment). Kinda weird. I guess it's possible it works slightly below 0 and this is just some sort of sensor error/drift, but it has been pretty consistent down there. I have a few more experiments on tap. I'm going to make the following variants:
If anyone else has suggestions on experiments, I'm all ears. |
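Regarding the AMS5915 conversion above, the datasheet transfer function for the 0-50 mbar part works out to this (a sketch; the function name is illustrative):

```c
#include <stdint.h>

/* AMS5915-0050-D transfer function per the datasheet:
 * digout 1638 corresponds to 0 mbar, 14745 to full scale (50 mbar). */
static double ams5915_pressure_mbar(uint16_t digout)
{
    return ((double)digout - 1638.0) * 50.0 / 13107.0;
}

/* The reading discussed above: 1551 -> (1551 - 1638) * 50 / 13107
 * = -0.33, i.e. slightly below the nominal zero code. */
```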
Thank you for putting such an effort in getting the best data out of our sensors! Frankly, I don't understand all the details yet but from a first try in interpreting your data:
So my current summary would be - please correct me if I am wrong:
|
It might be overkill, but I figure... future expansion. It's hard for me to gauge how real-time a task audio output is, for instance. The next question is, do we change the sensors... If we replace the MS5611s, the best choice seems to be the LPS22HH. It might be better, I'm not sure. I am fairly happy with the AMS5915, and don't see many other good choices for this one, but I also don't see where you can buy them. |
Okay, I've had it running for several hours and am relatively happy with the results. I think it probably could still use some improvement, and it definitely could use a critical eye, but here goes: First, I changed the ms5611 routines slightly. The Sensitivity and Offset calculations were moved into the read_temp routine instead of read_pressure. It really makes no sense to have them in read_pressure. The sens/off variables should only change as a result of a change in temperature, so why not calculate them after doing a temperature read rather than after a pressure read. Second, I separated out the read_pressure and the compensation routines (i.e. there is now a ms5611_calculate_pressure). Third, the sens/off calculation in read_temperature is not always performed; it depends on the value of another parameter passed into it. As for the pressure_measurement_handler: I added a new variable called "glitch". Glitch is normally 0. If glitch is 0, I record the raw temperature data (D2, sens and off) for each sensor into an 8-slot circular buffer, based on the last read value. If too much delay (>3 ms beyond anticipated) is detected, I take the average of all values in the circular buffer (if glitch was already 0 when detected) and increment glitch by (4 + (glitch magnitude / 50 ms)). Measure the pressure of one sensor and the temp of the other, and alternate each time through based on meas_counter&1. If (glitch > 0) { if (glitch < 2) process the calculated pressure, else discard }. So basically the algorithm is: if a glitch is detected, save the last known good D2 (raw temp) values. Keep doing what you're doing, but compensate the D1 value by the difference in D2 between current and last known good, times a constant (see the sketch after this comment). When reading the temperature out, you are just doing it to get the D2 value, not to update sens/off. Also, depending on how big the glitch is, we are throwing out the first couple pressure samples (not passing them through to the filter), since the impact of the glitch is proportional to both size and time since the glitch. This is running at nearly twice the speed that the previous program was running, so we now need to adjust the way we are filtering; in the case of pitot and static, there are three obvious ways:
The third option produces slightly better results, although I suspect it has a slightly longer time constant than the others. I'm inclined to go with option #3 because in many ways it's the simplest. Like the IIR, the Kalman filter can just be run more often, although some tweaking of the gain may be needed. Standard deviation pre-filter is about 0.86, post IIR filter about 0.28 Pascals. Worst case is about +/- 6 Pascals pre-filter and +/- 1.4 Pascals post-filter. Some tweaking may be needed for the nmea_message handler and the meas_handler; deciding when to dump out data would be good. Anyway, this is the basic algorithm. Unfortunately, the devil is in the details, and it's easy to miss something and use old data when you're not supposed to, or vice versa. Actual code is uglier than the pseudocode. I'm not sure my glitch discard is working correctly, and I'm also not sure what K is supposed to be. I picked .297 (30/101) based on the data I collected before. I'm concerned that it may be sensor-specific and/or temp-specific. I.e. there are two compensation routines (TE_D1 += (Stat_D2-Avg_Stat_D2)*K1) and (Stat_D1 += (TE_D2-Avg_TE_D2)*K2); if the K values are different it means I will need a calibration routine to determine the correct K values. Probably not a bad idea either way. Another question I just thought of is that perhaps we don't want to just read the temperature value and calculate it, but rather use either a rolling average (FIR filter) or an IIR filter. Might be worth checking if this reduces noise a little. I presume the same amount of noise is in the temperature measurement as the pressure measurement, so filtering the temperature measurement might actually work well with this. Plus, it works really well with this scheme (it actually makes it simpler than my current method). I guess I should try that. |
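A condensed sketch of the glitch scheme described above. The 8-slot buffer, the 3 ms threshold, the 4 + magnitude/50 ms increment, and K = 0.297 come from this thread; the names, the sign convention, and the exact decrement policy are illustrative and would come out of calibration:

```c
#include <stdint.h>

#define SLOTS 8
static const double K = 0.297;      /* empirical D2 -> D1 coupling */

static uint32_t d2_ring[2][SLOTS];  /* recent raw temps, one ring per sensor */
static int      pos[2];
static double   d2_good[2];         /* ring averages, frozen at glitch start */
static int      glitch;             /* 0 = clean, >0 = compensating */

/* Each 12.5 ms cycle reads pressure (D1) of sensor s and raw temperature
 * (D2) of the other sensor; overrun_ms is how late the read was. */
void handle_cycle(int s, uint32_t d1, uint32_t d2_other, long overrun_ms)
{
    if (overrun_ms > 3) {           /* timing slipped: (re)enter glitch mode */
        if (glitch == 0)
            for (int i = 0; i < 2; i++) {   /* freeze last-known-good D2 */
                double sum = 0;
                for (int j = 0; j < SLOTS; j++)
                    sum += d2_ring[i][j];
                d2_good[i] = sum / SLOTS;
            }
        glitch += 4 + overrun_ms / 50;      /* bigger glitch, longer hold */
    }

    double d1c = (double)d1;
    if (glitch > 0) {
        /* The other sensor's bogus D2 excursion predicts this D1's excursion. */
        d1c += ((double)d2_other - d2_good[1 - s]) * K;
        glitch--;
        if (glitch >= 1)
            return;                 /* discard: still too close to the glitch */
    } else {
        d2_ring[1 - s][pos[1 - s]] = d2_other;  /* record D2 only while clean */
        pos[1 - s] = (pos[1 - s] + 1) % SLOTS;
    }
    /* ... convert d1c to pressure and pass it downstream ... */
}
```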
I think I am close to done messing with it... Standard deviation of the Kalman output (TE pressure) improved from 0.02 m/s to 0.0144 m/s. Note: Averaging (on either the input or output) improves filtered data only slightly, so there isn't too much point to averaging if you are filtering. The new program outputs pressures twice as often as the old, however the time constants should be close to the same. I'm not sure about the Kalman filter, since I don't understand the dynamics of it. It produces twice as much data as it used to, but since all output is filtered, I may just decimate the data. The Kalman filter takes < 4 seconds to settle to less than .05 m/s. In several hours of testing, I've seen only two "blips". One of them I suspect was me closing up the trailer. The other... I don't know what it was... Pressure dropped about 4-5 Pascals and came back up, in a triangular shape, over the course of a few seconds. It looked nothing like the timing jitter noise I had seen, and was perfectly correlated between static and TE. I'm still working on getting a new Kalman filter. The new filter will use all three sensors to filter all three pressures simultaneously. That means, in theory, you will only need two sensors (albeit I'm not sure that's actually true, since there is some drift between the sensors). |
I have posted a forked version (hpmax/sensord). I am hoping it will have substantially better noise performance but otherwise function like the current baseline. Please try it out! |
Hi guys. This is the port I was thinking might be used for the hardware sensor driver. |
Before we go too far off the deep-end of a new sensorboard, please try my new software. I think it's a significant improvement over the old one... and I think I can squeeze out more improvements (which I think may be pretty significant).
1. Right now, my compensation scheme is to simply multiply by .297. One thing I'd like to do is change sensorcal to continuously read off the sensor and periodically throw in delays to induce the glitches. I'd then take repeated measurements and perform linear regressions, not just first order, but second, or third, to see if I can better dial in the compensation, and do it as part of the calibration routine.
2. I am hoping to roll out a new Kalman filter which I think will be better able to reject the glitches as well as allowing it to work without TE. (Note that having a second MS5611 is vital to my current compensation scheme.) Incorporating both sensors should in theory further reduce noise, since we now essentially have twice as much data to work with and the noise between them should be completely uncorrelated. In other words, I'm hopeful we could reduce standard deviations another 40% or so. However, if we do decide to go to new hardware, I think to make it worth the effort we should go all out and ensure we are not only using pressure measurements but also accelerometer, gyro, magnetometer, and GPS to do a full IMU. This will require significant software effort. It may require software changes to XCSoar to communicate IAS, TAS, compass direction, altitude, etc. and/or two-way communication with XCSoar. It will require serious questions... Should the board be external? Should it have the audio in it, or a signal to drive a remote display (or multiple remote displays, or multiple XCSoars -- imagine having two XCSoar units, one in front and one in back, with a common sensor box feeding both)? I personally like the idea of moving it outside of OV, and connecting it via USB. I'm not sure if ESP32 supports USB though, and I'd like to see it have multiple USBs, where it could drive multiple XCSoars, as well as be able to be directly connected to a GPS. Using an external GPS receiver would give the option of upgrading as newer versions come out with better performance. This is a big undertaking. We also should probably decide on which specific sensors to use: MS5611 vs LPS22HH, for instance? MS4515DO vs AMS5915? LSM6DSM and MMC5983A for the acc/gyro/magneto? This is a big undertaking, is everyone in? |
I agree with testing the software approach first. No questions about it! I think the new hardware is a long-term undertaking too. In the meantime I was thinking about already-built boards and OpenVarios. They already have the sensors we have, including the old IMU (MPU9150). Reading your work led me to ponder how using the MPU9150 would impact the improvements you achieved. If adding a driver and data processing for the IMU impacts your work, it won't be nice. That's why a dedicated hardware interface that can feed preprocessed NMEA on its own timing seemed a good idea to me. With a simple mod we can use the full sensors while new hardware is ready, and still benefit from your improvements. My ESP32 proposal is an experiment to see if it can be done. I'm not saying it is the way to go. Do you believe a separate issue to discuss hardware is a good idea? Thanks again for the work guys!! Cheers!
|
I think the MPU9150 could probably be used without much impact on what I did. Although I do not have interrupt-driven behavior, my usleeps are now semi-intelligent, waiting only enough time to get to when it needs to wake up again. That means even if it takes up 3 ms of time calculating stuff, the program should compensate. John Wells did have a variant with the MPU9150. It's worth pointing out that the MPU9150 is pretty old. They aren't being sold any more, the 9250 (which is also end-of-life) has significantly better performance, and the LSM6DSM is much better still. However, I think it's JUST an AHRS. In other words, the 9150 isn't aware of what's going on with the pressure sensors or vice versa, and neither is aware of what's going on with GPS. These are really pretty critical to integrate together. In fact, in theory, they should be integrated at a lower level with the GPS raw data. I think some GPS units can do that. Again, that would require some serious expertise, which I'm not sure any of us has. My friend has some, but he's not going to be willing to devote that much time to it. I really don't have a good feeling for how much we are loading down the Cubieboard. In theory, I wish we could start again with a Raspberry Pi, because it's actively developed, it's far newer, and it incorporates WiFi and BT. Of course, it doesn't have the LVDS interface and it probably consumes more power. I am not clear what that board you were showing was. It doesn't appear to be either the Cubieboard or the Adapterboard (at least not the one I got from Stefan). I am confused though. Are you proposing to stick an ESP32 on the Adapterboard and then use that to interface to the current Sensorboard? |
That's the idea. Connecting an IDC cable to the ESP32 and the ESP32 to the Cubieboard with another IDC cable. The adapterboard I posted is the design you can download from Openvario.org. I wanted to change I2C pins to UART pins for the IDC connector, but that would lead us to lose the ADC converter (voltage) that is soldered to the adapterboard. Using that free port would allow us to have an internal UART and still have voltage readings. I think that enabling functions one by one can be beneficial too. We can tackle the full IMU step by step. |
I thought the ESP32 was Timo's idea. However, I don't really see the benefit of an adapter-adapter-board. It makes the whole contraption more mechanically complicated. Now you have to figure out a way to mount another board and route more cables... And yes, it's another board, because even if you don't need 3.3V because you already have it, you'll still need two IDC cables and two mounting points. Personally, I think it would be better to just redesign the sensorboard. Obviously, you'd want to design it in such a way that you could re-use the manifold block. The only real pricey part is the AMS5915 or MS4515DO; the rest are probably less than thirty dollars total. You'd still need a custom board regardless. Plus, changing the sensorboard allows fewer mechanical changes, a more straightforward construction, and the opportunity to upgrade to newer sensors that aren't EOL. It is pretty frustrating how much of this project is EOL. My audio amplifier apparently died, and the one in there is an AS1702, which appears to be EOL. I found one company selling it, and they have 25 left. To make things more complicated, like I said, I don't think Stefan's boards match what you posted. I don't see those holes on the board. I do see an unused header that is mounted between the IDC connector and the Cubieboard connector, but I think it has smaller-pitch holes than yours. Why not just bite the bullet, and do a proper redesign with modern hardware, and get all this done in one go? Imho, the big task here is not the hardware, it's the software. It's going to be a complete rewrite of sensord, possibly of variod, and there will likely be changes to xcsoar, and all-new software in the ESP32. And that is the case whether we use the existing hardware + ESP32 condom or not. |
Yes, Timo proposed ESP32, connected outside on an IGC RJ45 connector. |
I'm not in favor of doing a stop-gap update. Like I said, the cost to spin a new board isn't that high in the grand scheme of things, although assembling isn't that fun. I'd like to use JLCPCB but they have a limited selection of parts. :( I think my software changes, along with the other potential improvements that I'd like to roll out within the next few weeks, should be more than adequate to justify not monkeying with the hardware unless we are doing a thorough update. And that would require someone to commit to working on all the software. Preferably someone who has real experience with IMUs and Kalman filters. Also, that brings up another interesting thing. I think Stefan's audio amp is on a daughter board attached to the adapter board; it's not on the sensor board. He's also using a linear amp; I don't know if it would be better to switch to a Class D amplifier. I wish there wasn't so much forking on the hardware. It's also worth pointing out that if we modify the hardware (whether with the ESP32 adapter or a full new sensorboard), we're going to wind up with two software versions which are incompatible with one another. |
I have committed a new update to hpmax/sensord. The new change is the addition of a program called "compdata". Compdata logs data off the MS5611s, recording it in memory, and when finished (after about 1.5 hours) it dumps it all to disk in two text files. If someone wants to download this, compile it, and run it for me, that'd be great. I'd like to collect data on setups other than my own. Also, mine will possibly be out of commission for the next 1-2 weeks. Please try to keep the air around the sensors as still as possible. Long-term drift isn't going to bother me, but short-term fluctuations because of movement or something may disturb it. |
Hi @hpmax. Tried to compile and run compdata to gather information but make fails.
error message reads:
|
Not sure what happened there, I may not have pushed all the files or something. Try again. I also included a pre-compiled compdata which should work. Unfortunately, my OV is down for repairs so I can't test it, I'm hoping to get it fixed Monday or Tuesday. |
@hpmax I used your .bin file. The typo in the makefile is still there and I couldn't compile the code. |
@iglesiasmarianod Thanks for the data. Unfortunately, I don't have my development or testing setups available to me at all; hopefully I should be back to "normal" in a week or so. I did analyze your data... and if you want to, you can too. The data format is: The first row is C data (ie stuff stored in the EEPROM of the ms5611), which you can ignore for now; the remainder of the data looks like: Where F stands for filtered. Filtering stops as soon as the glitch starts, so the F values will be defined at the start of each glitch and not change again until the next glitch. What I'm looking for is a relationship between (D2F-D2) on one sensor and (D1F-D1) of the other sensor. In other words, if we assume that D2 should stay constant during the glitch (D2 is proportional to temperature), how much does D1 change relative to the change in D2, which we know to be bogus? It looks like every sensor is different, but not very different. It's possible that the differences are proportional to the C values, but I haven't looked into that yet. As a result we could probably use a default value and allow people to calibrate if they wish to improve performance. It's possible that if the relationship to the C values were understood the default values could be improved, but I doubt it would help if you cal'ed the individual sensors. There is a temperature relationship. At the start of your run, it appears the linear/scale constant is -.32923, whereas at the end of your run it was -.31188 (and your sensor clearly heated up). I had measured 0.31 on mine. The other sensor went from -.32975 to -.30881. There is also a scalar term that varies from 37 to 18.24 on the first sensor, and from 46.09 to 19.15 on the second. I am hoping this range is adequate to extrapolate. From what I can tell, adding a second-order term doesn't help much. |
Brown trace is StaticD1-StaticD1f. For some perspective, the brown trace has a mean of -400 and a std deviation of 445. By comparison, the compensated data has a mean of 1 and std deviation of 75. Target for both is 0. Note that these are both in raw D1 values. D1 is about 8.7e6; if we assume 1:1 linearity, that means 1 Pascal is 87 in D1 space. Also, note that I only captured data during glitches, so this is not an accurate reflection of what the average and std deviation would truly look like in actual usage. |
Just thinking, could the amplifier on the sensorboard cause issues due to switching noise? Any change in sensor data if power and input signal is removed from the amp? |
@mihu-ov I'm using Stefan's hardware. On it, the amplifier is on a daughter board off the adapter board and isn't on the sensor board. It is also a Class AB amplifier, which shouldn't throw a lot of noise out. Mine failed, and I am in the process of replacing it with a Class D, because I'm hoping it will be more robust. That said, those spikes you see on the brown trace were deliberately injected by adjusting timing delays. The reason the amplitude is different is because I randomized the delay. The longer the delay, the bigger the spike. Obviously you can see there is some correlation between the orange trace and the blue trace, but I did my best to compensate. There is noise in the D1 data. But I think we have to expect it. The question is, how much is caused by electrical noise? I really don't know. At this point I can't tell that it's a real problem, or that @iglesiasmarianod's data is worse than mine. Right now I'm only recording data during timing glitches. Perhaps we could try recording data under normal circumstances and see how much difference there is. Looking at the schematic, I'm actually fairly happy with it, although I might throw in a 10uF cap on the input to all four regulators, and an extra 1uF on the output, next to the .1uF. However... looking at the BOM, I don't have a lot of confidence in the capacitors. Unfortunately, with MLCC capacitors it's important to know about the specific capacitor you are using, because some have very substantial capacitance loss over the working voltage, and there can be wide variation in self-resonant frequency and ESR as well. Some lose capacitance with frequency, up until they gain it back again when they go self-resonant. |
Just to let you guys know, I bought some LPS22HHs, ESP32-WROOM32Ds, and LSM6DSMs to play around with. I'll try to design a board and see how good the results are compared to what we have now. I'll also want to get MMC5983A and MS4515DO (pin-compatible equivalent to the AMS5915, but claims somewhat better performance -- price appears to be slightly better and it is more available in the US, but it's still expensive and not widely available) but unfortunately Newark didn't have those. I'm also building up an amplifier based on the MAX98304 to see how bad it is from an EMI perspective. |
I'm basically done with a new software update. There is nothing Earth-shattering here, but I have added configurable quadratic compensation, and the compensation for the two sensors can be sensor-specific (i.e. 2 sets of coefficients). I don't think the quadratic (versus linear) is that important, but it's basically free, so why not? Configurable individual coefficients are a big deal though. One of my sensors was using a linear term of -.33, whereas the other was -.29. And swapping the coefficients made the jitter noise readily apparent when graphing the data, whereas using the correct ones it was MUCH harder to spot. I have significantly enhanced compdata, so it will now run tests and calculate a set of coefficients and write them into the configuration file (note it just appends them to the end -- so you could end up running it 5x and getting 5 different coefficient sets). It's actually a pretty slick setup, and I think we want to include this in the ov-menu. Now the problem... It always gets within 10%, but it doesn't always get better than that, and 10% isn't that great. So you may still want to hand-tweak this, and I am trying to figure out how to make this easy for the average user. I was thinking of adding a feature "-nj" which deliberately adds noise jitter. You can then use "-r" to record data and plot it in Matlab/Octave/Excel/whatever. If you are seeing high jitter blips you want to increase the magnitude of the linear term. If you see low jitter blips you want to decrease the magnitude of the linear term. Perhaps a binary search could be done, although I'm not sure how you would handle the quadratic and constant terms. The other question is how much the compensation coefficients drift with temperature/pressure/time/whatever. Anyone got any ideas here? |
I'm starting to re-think this. After tuning it up as well as I could, I let it run overnight and looked at the results. It started out real good, but over time it got worse. A lot worse. It's possible it's temperature-sensitive, but I can't find much evidence that's the case. I'm still working on that, but I'm not expecting to see a clear correlation. And I've looked at the temperature of both sensors, as well as the change in the ratiometric difference between the sensors (and there is some over time, but it's pretty subtle). One thought I've had is dynamically adjusting the compensation. In other words, after each detected jitter event, I analyze the data generated during the event, and re-calculate the coefficients to use for the next event. But of course, I don't know if the frequency or magnitude of the jitter affects this. Magnitude presumably doesn't have a huge impact, because that would show up in the quadratic term, and the compensation is almost entirely in the linear term. One problem I see is that I am deliberately creating timing jitter to measure the results. This can be done with intentionally large amounts, which gives nice clear results; using smaller amounts of jitter, I think, might result in a worse fit. And unfortunately, in actual use I don't have control over how large or frequent the timing error is, and I certainly don't want to deliberately induce it. |
I'm working on a new release that should be done in the next couple days unless something happens. First some definitions: My current method of measuring noise is to look at the difference between the current sample and the previous one. I'm ignoring the first 1000 samples, during which it is settling. When I am talking about filtering something, in general I am referring to an IIR filter in which Y(k) = (7*Y(k-1) + X(k))/8. The new software implements the following changes:
First, when a glitch is encountered, the program attempts to estimate (overestimate) the length of the glitch based on the size of the glitch (in D2 - D2 filtered) and time constants that have been observed. When a glitch is first observed it looks for the maximum change in temperature vs filtered temperature over the first few cycles; it then calculates, conservatively, how long it thinks it will take to return to normal. It exits glitch mode after the timer elapses (in cycles, not seconds) OR the temperature reading is within 15 counts of the filtered temperature -- note that it can't exit glitch compensation mode during the first few cycles while it's observing the glitch magnitude, as I've seen it do weird things in the first or second cycle after the glitch. Secondly, there is an additional counter, which is essentially a watchdog counter. Glitch mode is re-triggerable. I worry that, in theory, it could stay in glitch mode forever, and that the filtered temp value will remain stationary while the actual temperature moves. As such, if it stays in glitch mode for too long, it will now exit glitch mode and update the filtered temperature values to the current temperature value as a fail-safe. It may produce some temporary ugliness on the outputted values, but hopefully it should resolve in a fraction of a second and then it will be over. I believe the problem in the previous post was a rapid change in temperature which caused it to never exit glitch mode. Once it fails to leave, the filtered value becomes more and more wrong, and any future glitches screw up the compensation badly. My first fix likely goes a long way toward addressing this by guaranteeing it gets out after a fixed period of time. But in theory it's still possible that it could get stuck if it keeps getting re-triggered. The watchdog addresses this unlikely possibility.
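For reference, the "filtered" values here use the IIR smoother defined above; in integer D2 counts it is just (a sketch):

```c
#include <stdint.h>

/* Y(k) = (7*Y(k-1) + X(k)) / 8 -- an exponential smoother with a
 * time constant of roughly 8 cycles. */
static int32_t iir8(int32_t y_prev, int32_t x)
{
    return (7 * y_prev + x) / 8;
}
```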
I've been running the new code nearly continuously for the past 24 hours and have been very happy with what I'm seeing: 47.5% has less than 1 mbar sample-to-sample variation. Filtering with the (7*Y(k-1)+X(k))/8 IIR filter we get: The future: I have no plans for additional software updates to address the glitches (except maybe point 5)... however, after this, my next software update will probably add the temperature sensor (DS18B20) back in. More long term (yeah, I know, I've been promising it for a while), before the end of the year I'm hoping to have new Kalman filter code. The Kalman filter code should be a major improvement. First, the noise in the two sensors appears to be almost completely uncorrelated, which means, along with the standard deviation data collected by compdata on the two sensors, both sensors can be used to estimate both static and TE pressure. This should decrease noise by nearly 40%. But this also means that if you don't have a TE probe, you should plumb the static line to both static and TE ports -- and this will be an option in the configuration. It'll be interesting to see which gives better results. I also plan to use the glider's polar to better improve the estimate... NOTE: Comparing the standard deviation to the previous standard deviations I listed, it seems something got worse... But in reality, it's just how the data was presented. The data shown above was based on taking the standard deviation of the delta between subsequent samples. Previously, I had taken the standard deviation of the data minus a +/- 32 point moving average. I went back and recalculated the data using the old method and got the following: 61.8% has less than 1 count difference from average. I also ran it a second time with twice as much data and the std deviation further decreased to 1.12, and the rest of the distribution improved as well. So the new code in fact appears to be about 10% better than my August 22nd numbers, which in turn were better than my August 16th numbers. |
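Since those two statistics aren't directly comparable, here is a sketch of both noise metrics as described above (the delta-based one and the residual from a +/- 32 sample moving average); the array handling is illustrative:

```c
#include <math.h>
#include <stddef.h>

/* New metric: std deviation of sample-to-sample deltas. */
double std_of_deltas(const double *x, size_t n)
{
    double mean = 0.0, ss = 0.0;
    for (size_t i = 1; i < n; i++)
        mean += x[i] - x[i - 1];
    mean /= (double)(n - 1);
    for (size_t i = 1; i < n; i++) {
        double d = (x[i] - x[i - 1]) - mean;
        ss += d * d;
    }
    return sqrt(ss / (double)(n - 2));
}

/* Old metric: std deviation of the residual from a +/-32 sample moving average. */
double std_of_residual(const double *x, size_t n)
{
    double ss = 0.0;
    size_t cnt = 0;
    for (size_t i = 32; i + 32 < n; i++) {
        const double *w = x + i;
        double avg = 0.0;
        for (int k = -32; k <= 32; k++)
            avg += w[k];
        avg /= 65.0;
        ss += (x[i] - avg) * (x[i] - avg);
        cnt++;
    }
    return sqrt(ss / (double)(cnt - 1));
}
```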
Bad news... I hooked up a vacuum pump and dropped the pressure to about half an atmosphere (about 18,000 feet) and the coefficients changed. I guess maybe that shouldn't have been a shock. So now I have to figure out the compensation relationship as a function of pressure. I'm hoping it's something simple like compensation = measured compensation * (D1 / measured D1). This also calls into question whether a changing D2 will impact that as well. I guess that's next. Ughh... |
As luck would have it (or maybe not), my vacuum source had a slow leak -- which means I got a nice gradient through my data collection. Looking at the data (visually, not exhaustively)... D2 (raw temp data) glitches seem pretty consistent over pressure (haven't checked over temperature). As expected, D1 glitches weren't consistent. I tried: (D1 - D1 filtered) * max(D1 filtered) / D1 filtered. This helped, but it undercompensated the data. That is, at low pressure, the glitches were still less than they were at normal pressure. Then I tried: (D1 - D1 filtered) * max(Pressure) / Pressure. This OVERcompensated the data. That is, at low pressure glitches were now LARGER than they were at normal pressure. Then I tried: (D1 - D1 filtered) * max(D1 filtered - X) / (D1 filtered - X) for different values of X. On the static sensor, X = 3.2e6 gave pretty consistent results (at least visually). But on the TE sensor, I think it was closer to 3.5e6. Then I tried: (D1 - D1 filtered) * max(Pressure + X) / (Pressure + X). I thought X = 23000 gave pretty consistent results on both the static and TE sensors. There are a few problems here... |
Looking at the temperature data I saw these blatantly obvious undulations in the temperature data. I was thinking... maybe this was the effect of the glitch: while it's glitching, the temperature was dropping. But the peaks were about 25 minutes apart. I suspect this was the furnace turning on and off in my house. If we assume there was maybe a 1C swing from on to off, and that was about 3200 counts... Let's say the max temperature difference we'll ever see is 80C (hard to imagine external temp being more than +/- 40C, and the internal temp, while potentially higher, should have less swing than that). That puts the max delta in D2 at about 256k, which means (8.84e6+256e3)/8.84e6 gives a max swing of about 2.9%. So... maybe temperature compensation is unnecessary? Anyone got any rabbits to pull out of hats here? |
Okay, I think I am getting pretty good results with: Compensation = ((D2 - D2f)^2*K0 + (D2 - D2f)*K1 + K2) * (Pressure + K3). I think I have compdata dialed in to give accurate K0, K1, and K2. In fact, as a result of a bug, I was writing out the TE and static compensation values swapped. When I looked at the data, I could clearly see noise blips. After realizing they were swapped from looking at the code, I switched them around in the config file, re-ran, and could no longer see clear noise blips. They were only 10% apart. I'm very happy about that. The question is what to do about K3... On my sensors, the values I am using are 15365 and 17322. If someone didn't want to measure K3, I'd suggest just using 16343.5 as a starting point. How critical is that difference... Well, let's take a look. At 44 kPa, that's an error of about 1.66% between using the average value and the correct value. On the other hand, I have a sample size of 2 here. The way I determined K3 was to connect the static and TE lines (and I should have connected pitot but didn't -- don't be an idiot and do what I did) to a hand vacuum pump and dropped the pressure to some low level (44 kPa). I then started collecting data with intentional glitches (300 ms). The vacuum setup wasn't perfect and did leak, which allowed pressure to slowly rise back to ambient. The data of interest was D1-D1f, and D1f-based pressure. I filtered it such that I only looked for (D1(i) - D1f(i)) - (D1(i-1) - D1f(i-1)) > 1500*Pressure/100e3. In other words, I was only looking for the peak of the glitches, and only the intentional glitches that I injected. I then separated the data into three bins: one at low pressure (<50 kPa), medium pressure (50 < P < 95), and normal pressure (>95 kPa). I took the mean of (D1-D1f)/(Pressure+X) for each value of the filtered data in each bin, for all X values between 0 and 40e3, and then found X such that the low and high bins had the same mean, and then double-checked the medium bin. If I were to automate it, I'd likely do something similar but create more bins (variable based on how many data points I have; too few data points could cause the means to be unreliable), and then find X that minimizes the standard deviation of the means across the bins. Would anyone use such a tool, does anyone care? |
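Written out as code, the compensation above looks roughly like this (a sketch; K0..K3 as produced by compdata and the vacuum test, names illustrative, sign convention per the calibration):

```c
typedef struct { double k0, k1, k2, k3; } comp_coef;

/* comp = ((D2-D2f)^2*K0 + (D2-D2f)*K1 + K2) * (Pressure + K3),
 * applied to the other sensor's D1 during a timing glitch. */
double glitch_comp(double d2, double d2f, double pressure_pa,
                   const comp_coef *c)
{
    double dd2 = d2 - d2f;
    return (dd2 * dd2 * c->k0 + dd2 * c->k1 + c->k2) * (pressure_pa + c->k3);
}
```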
When I turn on my OpenVario, I will see a lot of +/- blips on the vario, every few seconds, with a magnitude in the neighborhood of .5 m/s.
I decided to record the data out of the MS5611s and found that upon boot up, I'd see a spike high (of about 35-40 Pascals) followed by an immediate low spike of perhaps 10-15 Pascals. The high pulse lasted about 2 samples, and the low pulse lasted perhaps 10 samples.
Interestingly, the effect occurred on BOTH MS5611s simultaneously, and it repeated every 92-94 cycles (it varied).
I killed the daemon and restarted it, and the blips mostly or completely went away. I recorded twice, once for over 20 minutes and never saw a blip, once for over a minute and only saw one blip. This seems almost completely repeatable for me.
Blue trace is the restart, the other 3 were what it does on bootup.
blips.zip