Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MTi3-DK - Timeouts only on 100 Hz #43

Open
hittingray opened this issue Nov 30, 2018 · 6 comments
Open

MTi3-DK - Timeouts only on 100 Hz #43

hittingray opened this issue Nov 30, 2018 · 6 comments

Comments

@hittingray
Copy link

hittingray commented Nov 30, 2018

Hi,

I keep getting timeout messages on the MTi3-DK, but only when the update rate is set to 100 Hz. At 50 Hz, it works properly.

It still seems the board is giving me proper 100 Hz output, albeit the abnormally long fetch times. The values still seem appropriate though. This does not happen with the included examples in the MT SDK (when run on Windows 10 on the same machine, anyway).

It seems to only happen in modes 1 and 2 (i.e. raw data), and not in mode 3.
Edit: The irregular timing still seems to happen in mode 3, but the waitfor(length+1) does not timeout.

I added some extra code and found that it seems to happen approximately at 22-23 Hz, and that the apparent time to receive these messages is bimodal, with the two times being roughly 0.039 s and 0.045 s. This is obviously much longer than the interval for 50 Hz (0.020 s), which works correctly and has
a constant time between packets! The interval between these events looks fairly random but spreadout. I have attached an Excel spreadsheet with this data. Sheet 1 details apparently timed out messages, and sheet 2 details the time taken per message (for all messages).

Edit:
The sequence is something like 10 us, 10 us, 10 us, 10 us, 39 ms, 10 us, 10 us, 10 us, 10 us, 44 ms.
For mode 3, the sequence is something like 10 us, 10 us, 30 ms.

I have narrowed it down to being these two waitfor():

# second part of preamble
waitfor(3) 		
# read contents and checksum
waitfor(length+1) 

According to the code, there should be 2 waitfor() called per read_msg() if everything runs properly. In my spreadsheet, I see 200 calls per second, so this lines up appropriately.

I've had a look through the code and I'm not sure why exactly this behaviour occurs.

Timeout.zip
(For some weird reason, it straight up wouldn't let me upload an Excel file even though it says they're supported)


Edit:
I've done some more investigating, it seems that the it only happens in the waitfor(length+1). It seems to be stalling when reading data into the buffer for whatever reason. A seemingly arbitrary number of bytes seem to get read in before a stall. Once the stall is over, the rest of the bytes magically come flooding in all at once.

I forgot to mention, but I am using ROS Melodic on Ubuntu 18.04, and the MTi3 is running firmware 1.2.7. I have spent quite a bit of time digging, and I simply cannot find the culprit. I am using Python 2.7.15 and PySerial 3.4.

@Steven-GH
Copy link

Hello,

We suspect that this is related to delays caused either by the ROS node itself, or the PC (e.g. when using a virtual machine). I recommend trying the following:

  • increase the MID.additionalTimeOutOffset parameter in mtdef.py.
  • Strip the source code by removing unnecessary content. The current version of the ROS node contains quite some overhead that you might want to get rid of in order to improve dataflow.

@hittingray
Copy link
Author

Hi, I have already increased the timeout parameter to just prevent it from constantly outputting warning messages. This doesn't prevent the non-constant time between readings though, which is what we're after.

I am not using a virtual machine, and the computer I am using should be much more than capable.

I've already had quite a look through the source code, and it seems it gets stuck at that waitfor(length+1) only. Even when this line is removed, the data still takes longer to come into the buffer (which makes sense, since waitfor() just checks the buffer to see when it has filled up).

			c = self.device.read()  # read 1 byte
			while (not c) and ((time.time()-new_start)<self.timeout):
				c = self.device.read()
			if not c:
				rospy.logwarn("Timeout waiting for message - asked for read but no message received")
			if ord(c)<>0xFA:
				continue
			# second part of preamble
			waitfor(3)
			if ord(self.device.read())<>0xFF:	# we assume no timeout anymore, read 2nd byte
				continue
			# read message id and length of message
			mid, length = struct.unpack('!BB', self.device.read(2)) # read 3rd/4th byte, message id and length
			if length==255:	# read another 2 bytes if extended length
				waitfor(2)
				length, = struct.unpack('!H', self.device.read(2))
			# read contents and checksum
			waitfor(length+1)  # wait for remaining bytes to come in												
			buf = self.device.read(length+1)  # read them into buffer
			while (len(buf)<length+1) and ((time.time()-start)<self.timeout):  # continue reading while buffer isn't full
				buf+= self.device.read(length+1-len(buf))

The preamble, BID, MID and length (i.e. first 4 bytes) read correctly with no timeout. It is only the data bytes which seems to have some issue. My main suspicion would be something do with either the serial library or USB/FTDI drivers, and some kind of unusual buffering on their end (which doesn't really make sense).

So to re-iterate, the main thing which seems strange to me is at 100 Hz, the times between reads are something like:

10 us, 10 us, 10 us, 45 ms, 10 us, 10 us, 10 us, 39 ms, 10 us, etc.

But at 50 Hz, the times are nice and constant:

20 ms, 20 ms, 20 ms, 20 ms, 20 ms, etc.

I otherwise get the correct amount of readings per second; they just take very random amounts of time to come in at 100 Hz.

@Steven-GH
Copy link

Hello,

Yes, I can confirm this behavior. My apologies but we cannot find a solution to this problem right now.
It indeed seems to be related to the serial to USB conversion, as the same problem shows up for the MTi 10/100-series devices using serial to USB converter cables.
We are looking into solving this issue in a next release of the ROS node.

@hittingray
Copy link
Author

Hello,

Thanks for the confirmation. If you do find a solution for this, it would be much appreciated if you could post details separately here, as I have already modified the code a fair amount, and would rather implement the fix into my code (provided it is reasonably small, of course). Thank you.

@Steven-GH
Copy link

Hello,

After some further analysis we found that indeed it has to do with the USB/FTDI driver settings. We plan on tuning these settings in order to improve behavior, but unfortunately I cannot guarantee any fix date for this.

In case you are mostly interested in the sample time of the MTi itself (e.g. for further processing), we recommend using the sensor reported timestamps. These can be enabled in the xsens.yaml configuration file.

@Steven-GH
Copy link

@hittingray A possible fix for this issue can be found on Xsens' community forum:
https://base.xsens.com/hc/en-us/community/posts/360029341694

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants