mc-Things

mc-Products => mc-Module => Topic started by: Nick_W on November 03, 2016, 09:12:31 pm

Title: Getting reboots during interrupts
Post by: Nick_W on November 03, 2016, 09:12:31 pm
Hi,

I'm using acceleratorInt1 to read "click" events (110 module). It's set quite sensitively, so tapping the module produces a lot of interrupts (I'm using the green LED to show the event being triggered).

If I tap the module 20-30 times, the module reboots. I've put a 300ms delay in the interrupt routine to slow the events down, which seems to help, but it could just be that it now takes longer to get 20-30 taps!

I think the module is running out of RAM (no run time errors in debugging, just "lost communications" when the module reboots).

Any suggestions, or is there a limit to how many times an interrupt can be triggered in a short time period?

Thanks,
Title: Re: Getting reboots during interrupts
Post by: mc-Abe on November 07, 2016, 10:17:11 am
All the memory used in the scripts is managed memory, so if you run out of it you will receive a run-time error. We had some more issues with the garbage collection process on the managed memory; I believe these have been resolved, and a new version is going up today. Could you try it so we can rule out garbage collection as a source of the problem?

If the problem persists, we will look further into what the cause might be.
Title: Re: Getting reboots during interrupts
Post by: Nick_W on November 07, 2016, 10:35:22 am
Will do,

It's weird, the module can sit there for 9 hours just publishing every 30 seconds, then you get one interrupt and bang! it reboots.

On my bench, the module works fine for several hours, interrupts etc. no problem. But installed, overnight it fails on the first interrupt. Bit hard to test for - you can't really wait 9 hours to test out every time!
Title: Re: Getting reboots during interrupts
Post by: Nick_W on November 08, 2016, 07:49:00 am
Initial test (sample of one, so use with caution).

So far, so good.

I have been running two modules overnight. One rebooted during the night (don't know why), but the other ran all night and is still publishing. (I have a third module running different code that stopped publishing a couple of hours after booting, so the publishing problem is still there.)

So, testing the one module that has been publishing for 10 hours: reed switch and accelerometer interrupts no longer cause reboots.

Here are my logs from this morning (wife leaving for work):

Code: [Select]
MCThings/000111C2/BatteryVoltage 2910
MCThings/000111C2/Temperature 17.93
MCThings/000111C2/Rssi -93
MCThings/000111C2/KnockEnable True
MCThings/000111C2/PublishEnable True
MCThings/000111C2/Door False
MCThings/000111C2/Door 0.0
MCThings/000111C2/Rssi -86
MCThings/000111C2/Uptime 36778
MCThings/000111C2/UptimeString 0:10:12:58
MCThings/000111C2/Door 0.0
MCThings/000111C2/Rssi -90
MCThings/000111C2/Door True
MCThings/000111C2/Door False
MCThings/000111C2/Door 0.0
MCThings/000111C2/Rssi -91

Just to explain this log:

These:
Code: [Select]
MCThings/000111C2/Temperature 17.93
MCThings/000111C2/Rssi -93
Are decoded beacons. These:
Code: [Select]
MCThings/000111C2/KnockEnable True
MCThings/000111C2/PublishEnable True
MCThings/000111C2/Door False
Are actual publishing events.

So this:
Code: [Select]
MCThings/000111C2/Door 0.0
MCThings/000111C2/Rssi -86
is the door state decoded from a beacon (not a publishing event, so it's not real-time).

This is how I can tell whether publishing has stopped (or is still going, as in this case) while the program is still running: if publishing has stopped, I only get the decoded beacons. Uptime over 32768 is not encoded in the beacons (I only have 2 bytes to store the value in), so if publishing has stopped but uptime is over 9 hours, I can't tell how long the module has been running. I mention this because that is the current state of my third module.
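To make the arithmetic concrete, here is a minimal Python sketch (not the mcScript running on the module) that converts an Uptime value in seconds to the days:hours:minutes:seconds form seen in UptimeString, and checks whether a value still fits in a 2-byte beacon field. The function names are hypothetical, the zero padding is a guess from the two log samples, and the 32767 limit assumes the 2 bytes are treated as a signed 16-bit integer, which is consistent with the cutoff of roughly 9 hours mentioned above.

```python
# Assumption: the 2-byte beacon field is a signed 16-bit integer.
UPTIME_BEACON_MAX = 32767


def format_uptime(seconds: int) -> str:
    """Format seconds as days:hours:minutes:seconds (padding is a guess)."""
    days, rem = divmod(seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{days}:{hours:02d}:{minutes:02d}:{secs:02d}"


def fits_in_beacon(seconds: int) -> bool:
    """True if the uptime can still be stored in the 2-byte beacon field."""
    return 0 <= seconds <= UPTIME_BEACON_MAX


print(format_uptime(36778))   # 0:10:12:58, matches the first log
print(format_uptime(38240))   # 0:10:37:20, matches the second log
print(fits_in_beacon(36778))  # False: past the roughly 9 hour limit
```

Both logged Uptime values are past the limit, which is why they can only have come from a publishing event and not from a beacon.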

The last Door True/Door False is her leaving, and would normally have caused a reboot (without publishing). Notice uptime at 10 hours, 12 minutes 58 seconds. This is good. The module did miss a transition a few minutes later (when she came back for something), but this is just a failure to publish; the module continued to work afterwards.

I did notice the signal strength was a little low, so I repositioned the gateway slightly, and conducted a few more tests. This is my log of some door open/close tests and a "knock" (accelerometer interrupt) test:

Code: [Select]
MCThings/000111C2/BatteryVoltage 2910.0
MCThings/000111C2/Rssi -77
MCThings/000111C2/Door True
MCThings/000111C2/Door False
MCThings/000111C2/Temperature 18.187500
MCThings/000111C2/Door True
MCThings/000111C2/Door False
MCThings/000111C2/Doorknock True
MCThings/000111C2/Door 0.0
MCThings/000111C2/Rssi -78
MCThings/000111C2/Door 0.0
MCThings/000111C2/Rssi -79
MCThings/000111C2/Door 0.0
MCThings/000111C2/Rssi -79
MCThings/000111C2/Door False
MCThings/000111C2/KnockEnable True
MCThings/000111C2/PublishEnable True
MCThings/000111C2/Door 0.0
MCThings/000111C2/Rssi -81
MCThings/000111C2/Door 0.0
MCThings/000111C2/Rssi -79
MCThings/000111C2/Uptime 38240
MCThings/000111C2/UptimeString 0:10:37:20

I opened/closed the door twice, waited, then "knocked" on the door to generate an accelerometer interrupt ("Doorknock True"), and waited for the uptime report to prove the module had not rebooted.

I have another module running the same code (but that is the one that rebooted during the night). That also worked correctly, but has only 4 1/2 hours uptime.

So altogether much better results with the new version of firmware.

Keep up the good work! Now if we could just get the "stop publishing" bug fixed...
Title: Re: Getting reboots during interrupts
Post by: mc-Abe on November 14, 2016, 10:53:54 am
I am glad to see that things are improving. We are aware of the random reboot issue and are working on it.

As for the publishing stopping, we found a big problem that was solved in version v0.7-417 of the Gateway Host Processor which was released on Nov 10th. Could you try this improved version to see if your results are better?
Title: Re: Getting reboots during interrupts
Post by: Nick_W on November 18, 2016, 07:56:27 am
Tried the new gateway code, but still have the same problem where the modules stop publishing after a while.

My workaround (slow publishing down to no more than once every 3 seconds) does seem to work though. So re-implementing this for the moment.
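
For anyone wanting to try the same workaround, the idea can be sketched roughly like this. This is Python, not mcScript, and `PublishThrottle`, `try_publish` and the `publish` callback are hypothetical names for illustration only. The sketch simply drops anything that arrives less than 3 seconds after the last publish; whether to drop or queue such events is a design choice not covered here.

```python
import time


class PublishThrottle:
    """Allow at most one publish per min_interval_s seconds."""

    def __init__(self, min_interval_s=3.0, clock=time.monotonic):
        self.min_interval_s = min_interval_s
        self._clock = clock   # injectable so the logic can be tested
        self._last = None     # time of the last successful publish

    def try_publish(self, publish, topic, value):
        """Call publish(topic, value) unless the last publish was too recent.

        Returns True if the message was published, False if it was dropped.
        """
        now = self._clock()
        if self._last is not None and now - self._last < self.min_interval_s:
            return False
        publish(topic, value)
        self._last = now
        return True


if __name__ == "__main__":
    throttle = PublishThrottle(min_interval_s=3.0)
    # Stand-in publish callback that just prints the message.
    throttle.try_publish(lambda t, v: print(t, v), "Door", True)
    # An immediate second attempt is inside the 3 second window and is dropped.
    throttle.try_publish(lambda t, v: print(t, v), "Door", False)
```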