Author Topic: Odd behavior of Gateway after last F/W update  (Read 282 times)

Nick_W

  • Full Member
  • ***
  • Posts: 215
    • View Profile
Re: Odd behavior of Gateway after last F/W update
« on: January 14, 2017, 10:49:21 am »
Update from Nick stuck in his time loop.

After monitoring this overnight (to see how frequent it is), I have a synopsis of my log files:
Note I cut out all the good data, this is just showing delays, so it's not as bad as it looks!

The first entry is example typical good data.

Code: [Select]
[I 2017-01-14 10:14:41,196] MCThings/000111C2/Data           : Decoded timestamp: 2017-01-14 10:14:38.556000
                                                               WARNING: Publishing Delay of : 2.640 s
--
[I 2017-01-14 10:14:59,743] MCThings/000111BC/Data           : Decoded timestamp: 2017-01-14 10:14:29.870000
                                                               WARNING: Publishing Delay of : 29.873 s
--
[I 2017-01-14 10:14:59,860] MCThings/000111BC/Data           : Decoded timestamp: 2017-01-14 10:14:59.919000
                                                               Publishing Delay of          : -0.059 s
--
[I 2017-01-14 10:15:40,380] MCThings/000111B4/Data           : Decoded timestamp: 2017-01-14 10:14:38.728000
                                                               WARNING: Publishing Delay of : 61.652 s
--
[I 2017-01-14 10:15:40,621] MCThings/000111B4/Data           : Decoded timestamp: 2017-01-14 10:15:38.777000
                                                               Publishing Delay of          : 1.844 s
--
[I 2017-01-14 10:16:59,822] MCThings/000111BC/Data           : Decoded timestamp: 2017-01-14 10:15:29.968000
                                                               WARNING: Publishing Delay of : 89.854 s
--
[I 2017-01-14 10:17:00,074] MCThings/000111BC/Data           : Decoded timestamp: 2017-01-14 10:15:59.017000
                                                               WARNING: Publishing Delay of : 61.057 s
--
[I 2017-01-14 10:17:00,184] MCThings/000111BC/Data           : Decoded timestamp: 2017-01-14 10:16:30.074000
                                                               WARNING: Publishing Delay of : 30.110 s
--
[I 2017-01-14 10:17:00,318] MCThings/000111BC/Data           : Decoded timestamp: 2017-01-14 10:17:00.123000
                                                               Publishing Delay of          : 0.195 s
--
.
.
. long period of the same good/delayed data
.
.
.
[I 2017-01-14 11:00:33,974] MCThings/000111BC/Data           : Decoded timestamp: 2017-01-14 11:00:04.399000
                                                               WARNING: Publishing Delay of : 29.575 s
--
[I 2017-01-14 11:00:34,099] MCThings/000111BC/Data           : Decoded timestamp: 2017-01-14 11:00:34.448000
                                                               Publishing Delay of          : -0.349 s
--
[I 2017-01-14 11:02:04,186] MCThings/000111BC/Data           : Decoded timestamp: 2017-01-14 11:01:34.546000
                                                               WARNING: Publishing Delay of : 29.640 s
--
[I 2017-01-14 11:02:04,315] MCThings/000111BC/Data           : Decoded timestamp: 2017-01-14 11:02:04.599000
                                                               Publishing Delay of          : -0.284 s
--
[I 2017-01-14 11:02:41,863] MCThings/000111B4/Data           : Decoded timestamp: 2017-01-14 11:01:40.128000
                                                               WARNING: Publishing Delay of : 61.735 s
--
[I 2017-01-14 11:02:41,987] MCThings/000111B4/Data           : Decoded timestamp: 2017-01-14 11:02:40.178000
                                                               Publishing Delay of          : 1.809 s
--
[I 2017-01-14 11:02:42,912] MCThings/000111C2/Data           : Decoded timestamp: 2017-01-14 11:01:40.924000
                                                               WARNING: Publishing Delay of : 61.988 s
--
[I 2017-01-14 11:02:43,046] MCThings/000111C2/Data           : Decoded timestamp: 2017-01-14 11:02:40.973000
                                                               WARNING: Publishing Delay of : 2.073 s

Module BC is a battery monitor, so delayed messages don't really matter, so much. C2 and B4 are door monitors (running the same code), delays here are a problem (but the beacons arrive immediately, so there is a sort of workaround).

It seems to me there are a couple of possibilities (OK more than a couple):
  • This is behavior that was always happening, but messages were being lost, not cached, so in fact this is an improvement, in that the messages do eventually arrive.
  • This is some sort of continuation of my Gateway WiFi issues, that have not been properly explained yet, and messages that can't be sent immediately (via WiFi) are cached for later retry. I really need a second gateway to see if this has the same problem.
  • A fix for the "stops publishing" problem has introduced another problem where messages are now delayed (instead of the module stopping publishing) which is better, but still something of an issue.
  • This is a QoS issue, which I could fix if I understood QoS better. So far QoS seems to make no difference to the delay problem.
  • This is a LP network issue, which is now showing up as I'm sending longer (json) messages, and the retry mechanism is "resend failed messages on the next connection".

What I don't understand is whether the delayed message is cached on the module, or the gateway. If it was the gateway, it could resend immediately (and QoS would help). If it's cached on the module, does this indicate a LP network connectivity issue (in which case why do beacons arrive immediately, and mqtt messages don't?). Is this a new thing I'm seeing since I switched to sending json (with timestamps), because json is much longer to send than beacons/short data - ie just publishing a value works, but sending a long json string fails partway through and is queued for retry the next time the module sends? I'm tending to think this is what is happening (ie option 5 above).

Uptime continues to be fantastic, literally weeks without a reset (well done whoever figured that out).
Battery life on the 120 module is holding good for now at 2.881V after 3 days (I need a few weeks to be able to say for sure this problem is fixed), but so far so good.

I may actually be able to get my 120 module accelerator library finished, and unlimber my 205 device...

Keep up the good work guys, and sorry to be the reporter of "weird stuff".
« Last Edit: January 14, 2017, 11:02:31 am by Nick_W »