APs Randomly Restarting

  • Updated 2 years ago
Hi all Wireless Gurus,

I'd like to share a problem on our WLAN and ask for suggestions that could help us solve it. We are using a ZD3000 with about 60 APs. We constantly see APs restarting at random, seconds apart. Most of our APs are the 7363 model, and our ZD firmware is 9.8.2.0 build 15. It is giving us a headache: users are raising tickets for slow or intermittent connections. Hoping someone can help us.

Regards,

4Jonjon20

Posted 2 years ago


Andrea Coppini

Check the ZD logs; they will give you an idea of why the APs are rebooting. Share them here in the forum.

4Jonjon20

Hi Andrea,

Here are some logs from our ZD:



Jan 11 15:38:35 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-17th [email protected]:01:7c:37:0b:d0] heartbeats lost
Jan 11 15:41:14 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-17th [email protected]:01:7c:37:0b:d0] joins with uptime [63] s and last disconnected reason [AP Restart : application reboot]
Jan 11 15:41:19 AEV-NAC-Ruckus-ZD stamgr: stamgr_update_reboot_info():AP[c4:01:7c:37:0b:d0] reboot detail:The apmgr timer thread stock,and let rsmd reboot the AP   
Jan 11 16:07:15 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-16th [email protected]:0c:90:00:1f:50] heartbeats lost 
Jan 11 16:07:27 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-17th [email protected]:01:7c:37:0d:10] heartbeats lost 
Jan 11 16:07:29 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-12th [email protected]:0c:90:04:73:70] heartbeats lost 
Jan 11 16:09:37 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-12th [email protected]:0c:90:04:73:70] joins with uptime [283369] s and last disconnected reason [Heartbeat Loss] 
Jan 11 16:10:06 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-17th [email protected]:01:7c:37:0d:10] joins with uptime [59] s and last disconnected reason [AP Restart : application reboot] 
Jan 11 16:10:07 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-16th [email protected]:0c:90:00:1f:50] joins with uptime [62] s and last disconnected reason [AP Restart : application reboot] 
Jan 11 16:10:11 AEV-NAC-Ruckus-ZD stamgr: stamgr_update_reboot_info():AP[c4:01:7c:37:0d:10] reboot detail:The apmgr timer thread stock,and let rsmd reboot the AP   
Jan 11 16:10:11 AEV-NAC-Ruckus-ZD stamgr: stamgr_update_reboot_info():AP[8c:0c:90:00:1f:50] reboot detail:The apmgr timer thread stock,and let rsmd reboot the AP   
Jan 12 08:25:45 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[Ruckus 7363 [email protected]:3d:37:3c:89:70] detects excessive probe requests on radio [11b/g]. 
Jan 12 08:27:48 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[Ruckus 7363 [email protected]:3d:37:3c:89:70] detects excessive probe requests on radio [11b/g]. 
Jan 12 08:38:01 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-21th [email protected]:0c:90:04:75:20] heartbeats lost 
Jan 12 08:38:10 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-17th [email protected]:01:7c:37:0b:d0] heartbeats lost 
Jan 12 08:39:58 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-17th [email protected]:01:7c:37:0b:d0] joins with uptime [61189] s and last disconnected reason [Heartbeat Loss] 
Jan 12 08:40:27 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-21th [email protected]:0c:90:04:75:20] joins with uptime [59] s and last disconnected reason [AP Restart : application reboot] 
Jan 12 08:40:32 AEV-NAC-Ruckus-ZD stamgr: stamgr_update_reboot_info():AP[8c:0c:90:04:75:20] reboot detail:The apmgr timer thread stock,and let rsmd reboot the AP   
Jan 12 09:26:28 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-20th [email protected]:93:96:05:32:60] heartbeats lost 
Jan 12 09:27:27 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-16th [email protected]:0c:90:04:03:d0] heartbeats lost 
Jan 12 09:27:46 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-21th [email protected]:0c:90:04:75:20] heartbeats lost 
Jan 12 09:28:47 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[Ruckus 7363 [email protected]:3d:37:3c:89:70] detects excessive probe requests on radio [11b/g]. 
Jan 12 09:30:24 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-16th [email protected]:0c:90:04:03:d0] joins with uptime [62] s and last disconnected reason [AP Restart : watchdog timeout ] 
Jan 12 09:30:35 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-21th [email protected]:0c:90:04:75:20] joins with uptime [60] s and last disconnected reason [AP Restart : watchdog timeout ] 
Jan 12 09:30:57 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[Ruckus 7363 [email protected]:3d:37:3c:89:70] detects excessive probe requests on radio [11b/g]. 
Jan 12 09:46:29 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():Lost contact with AP[NAC-Wifi-20th [email protected]:93:96:05:32:60] 
Jan 12 09:51:47 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-16th [email protected]:0c:90:00:1f:50] heartbeats lost 
Jan 12 09:52:21 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-17th [email protected]:01:7c:37:0d:10] heartbeats lost 
Jan 12 09:52:28 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-20th [email protected]:93:96:1b:8e:00] heartbeats lost 
Jan 12 09:54:04 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-20th [email protected]:93:96:1b:8e:00] joins with uptime [73742] s and last disconnected reason [Change State Response Loss] 
Jan 12 09:54:22 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-17th [email protected]:01:7c:37:0d:10] joins with uptime [52] s and last disconnected reason [AP Restart : application reboot] 
Jan 12 09:54:27 AEV-NAC-Ruckus-ZD stamgr: stamgr_update_reboot_info():AP[c4:01:7c:37:0d:10] reboot detail:The apmgr timer thread stock,and let rsmd reboot the AP   
Jan 12 09:57:01 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-20th [email protected]:93:96:05:32:60] joins with uptime [64] s and last disconnected reason [AP Restart : watchdog timeout ] 
Jan 12 09:57:47 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-16th [email protected]:0c:90:00:1f:50] joins with uptime [64124] s and last disconnected reason [Join Fail] 
Jan 12 10:18:48 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-21th [email protected]:0c:90:04:75:20] heartbeats lost 
Jan 12 10:21:10 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-21th [email protected]:0c:90:04:75:20] joins with uptime [59] s and last disconnected reason [AP Restart : application reboot] 
Jan 12 10:21:15 AEV-NAC-Ruckus-ZD stamgr: stamgr_update_reboot_info():AP[8c:0c:90:04:75:20] reboot detail:The apmgr timer thread stock,and let rsmd reboot the AP   
Jan 12 10:58:54 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-19th [email protected]:01:7c:35:fb:60] heartbeats lost 
Jan 12 11:01:52 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-19th [email protected]:01:7c:35:fb:60] joins with uptime [60] s and last disconnected reason [AP Restart : application reboot] 
Jan 12 11:01:56 AEV-NAC-Ruckus-ZD stamgr: stamgr_update_reboot_info():AP[c4:01:7c:35:fb:60] reboot detail:The apmgr timer thread stock,and let rsmd reboot the AP   
Jan 12 11:14:23 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-12th [email protected]:0c:90:04:73:70] heartbeats lost 
Jan 12 11:14:23 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-21th [email protected]:0c:90:04:75:20] heartbeats lost 
Jan 12 11:15:40 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-21th [email protected]:0c:90:04:75:20] joins with uptime [3328] s and last disconnected reason [Heartbeat Loss] 
Jan 12 11:15:41 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-12th [email protected]:0c:90:04:73:70] joins with uptime [352135] s and last disconnected reason [Heartbeat Loss] 
Jan 12 12:25:29 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-21th [email protected]:0c:90:04:75:20] joins with uptime [59] s and last disconnected reason [AP Restart : application reboot] 
Jan 12 12:25:33 AEV-NAC-Ruckus-ZD stamgr: stamgr_update_reboot_info():AP[8c:0c:90:04:75:20] reboot detail:apmgr, Receives reset command from ZD in Run state   
Jan 12 13:14:22 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-9th [email protected]:0c:90:03:ab:50] heartbeats lost 
Jan 12 13:14:42 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-21th [email protected]:0c:90:04:75:20] heartbeats lost 
Jan 12 13:16:33 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-21th [email protected]:0c:90:04:75:20] joins with uptime [3123] s and last disconnected reason [Heartbeat Loss] 
Jan 12 13:16:42 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-9th [email protected]:0c:90:03:ab:50] joins with uptime [99425] s and last disconnected reason [Heartbeat Loss] 
Jan 12 13:52:10 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-21th [email protected]:0c:90:04:75:20] heartbeats lost 
Jan 12 14:01:05 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-21th [email protected]:0c:90:04:75:20] joins with uptime [239] s and last disconnected reason [AP Restart : power cycle] 
Jan 12 14:12:18 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-16th [email protected]:0c:90:00:1f:50] heartbeats lost 
Jan 12 14:14:34 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-16th [email protected]:0c:90:00:1f:50] joins with uptime [54] s and last disconnected reason [AP Restart : application reboot] 
Jan 12 14:14:39 AEV-NAC-Ruckus-ZD stamgr: stamgr_update_reboot_info():AP[8c:0c:90:00:1f:50] reboot detail:The apmgr timer thread stock,and let rsmd reboot the AP   
Jan 12 14:38:39 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-9th [email protected]:0c:90:03:ab:50] heartbeats lost 
Jan 12 14:38:55 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-21th [email protected]:0c:90:04:75:20] heartbeats lost 
Jan 12 14:39:23 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-17th [email protected]:01:7c:37:0d:10] heartbeats lost 
Jan 12 14:39:28 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-15th FLr-[email protected]:0c:90:04:72:a0] heartbeats lost 
Jan 12 14:39:49 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-14th [email protected]:01:7c:37:0d:00] heartbeats lost 
Jan 12 14:41:14 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-17th [email protected]:01:7c:37:0d:10] joins with uptime [17261] s and last disconnected reason [Heartbeat Loss] 
Jan 12 14:41:14 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-15th [email protected]:0c:90:04:72:a0] joins with uptime [4068024] s and last disconnected reason [Heartbeat Loss] 
Jan 12 14:41:41 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-9th [email protected]:0c:90:03:ab:50] joins with uptime [57] s and last disconnected reason [AP Restart : application reboot] 
Jan 12 14:41:44 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-21th [email protected]:0c:90:04:75:20] joins with uptime [59] s and last disconnected reason [AP Restart : application reboot] 
Jan 12 14:41:44 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-14th [email protected]:01:7c:37:0d:00] joins with uptime [17169643] s and last disconnected reason [Heartbeat Loss] 
Jan 12 14:41:45 AEV-NAC-Ruckus-ZD stamgr: stamgr_update_reboot_info():AP[8c:0c:90:03:ab:50] reboot detail:The apmgr timer thread stock,and let rsmd reboot the AP   
Jan 12 14:41:48 AEV-NAC-Ruckus-ZD stamgr: stamgr_update_reboot_info():AP[8c:0c:90:04:75:20] reboot detail:The apmgr timer thread stock,and let rsmd reboot the AP   
Jan 12 15:05:37 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-17th [email protected]:01:7c:37:0c:70] heartbeats lost 
Jan 12 15:05:38 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-16th [email protected]:0c:90:00:1f:50] heartbeats lost 
Jan 12 15:05:38 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-20th [email protected]:93:96:1b:8e:00] heartbeats lost 
Jan 12 15:05:55 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-21th [email protected]:93:96:05:2c:e0] heartbeats lost 
Jan 12 15:07:58 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-21th [email protected]:93:96:05:2c:e0] joins with uptime [87976] s and last disconnected reason [Heartbeat Loss] 
Jan 12 15:08:24 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-16th [email protected]:0c:90:00:1f:50] joins with uptime [51] s and last disconnected reason [AP Restart : application reboot] 
Jan 12 15:08:28 AEV-NAC-Ruckus-ZD stamgr: stamgr_update_reboot_info():AP[8c:0c:90:00:1f:50] reboot detail:The apmgr timer thread stock,and let rsmd reboot the AP   
Jan 12 15:08:30 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-17th [email protected]:01:7c:37:0c:70] joins with uptime [51] s and last disconnected reason [AP Restart : application reboot] 
Jan 12 15:08:31 AEV-NAC-Ruckus-ZD syslog: eventd_to_syslog():AP[NAC-Wifi-20th [email protected]:93:96:1b:8e:00] joins with uptime [60] s and last disconnected reason [AP Restart : application reboot] 
Jan 12 15:08:34 AEV-NAC-Ruckus-ZD stamgr: stamgr_update_reboot_info():AP[c4:01:7c:37:0c:70] reboot detail:The apmgr timer thread stock,and let rsmd reboot the AP   
Jan 12 15:08:35 AEV-NAC-Ruckus-ZD stamgr: stamgr_update_reboot_info():AP[58:93:96:1b:8e:00] reboot detail:The apmgr timer thread stock,and let rsmd reboot the AP   

Andrea Coppini

APs are rebooting due to a 'heartbeat loss', which means they lost contact with the ZD. Check your cabling. I see most of the APs are upper-floor APs; could it be that your cable runs are too long (over 80 m)? What about your uplinks? Do you have any kind of broadcast storm control enabled on your switches? Finally, replace the patch lead going into the ZD.

I also see some alerts related to 'excessive probe requests'. This would cause the AP to temporarily blacklist that client as a form of protection. If the issue persists, either seek-and-destroy the client or turn off protection from the WIPS section of the ZD.
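
To see which disconnect reasons dominate, the 'joins with uptime' lines in the log can be tallied per AP. Here is a minimal Python sketch, assuming the syslog format matches the lines pasted earlier in this thread. The sample lines, AP names, and function names are made up for illustration.

```python
import re
from collections import Counter

# Pattern modelled on the "joins with uptime" lines pasted in this thread.
JOIN_RE = re.compile(
    r"AP\[(?P<ap>.+?)\] joins with uptime \[(?P<uptime>\d+)\] s "
    r"and last disconnected reason \[(?P<reason>[^\]]+)\]"
)


def parse_join_events(lines):
    """Return (ap, uptime_seconds, reason) for every join event found."""
    events = []
    for line in lines:
        match = JOIN_RE.search(line)
        if match:
            events.append(
                (match.group("ap"), int(match.group("uptime")), match.group("reason").strip())
            )
    return events


def reason_totals(lines):
    """Count how often each disconnect reason appears."""
    return Counter(reason for _, _, reason in parse_join_events(lines))


def restarts_per_ap(lines):
    """Count joins per AP whose reason starts with 'AP Restart' (a real reboot)."""
    return Counter(
        ap for ap, _, reason in parse_join_events(lines) if reason.startswith("AP Restart")
    )


# Hypothetical sample lines (not taken from the real log).
SAMPLE_LINES = [
    "Jan 12 09:30:24 ZD syslog: eventd_to_syslog():AP[AP-A@00:00:00:00:00:01] joins with uptime [62] s and last disconnected reason [AP Restart : watchdog timeout ] ",
    "Jan 12 10:21:10 ZD syslog: eventd_to_syslog():AP[AP-A@00:00:00:00:00:01] joins with uptime [59] s and last disconnected reason [AP Restart : application reboot] ",
    "Jan 12 11:15:41 ZD syslog: eventd_to_syslog():AP[AP-B@00:00:00:00:00:02] joins with uptime [283369] s and last disconnected reason [Heartbeat Loss] ",
    "Jan 12 11:14:23 ZD syslog: eventd_to_syslog():AP[AP-B@00:00:00:00:00:02] heartbeats lost ",
]

if __name__ == "__main__":
    for reason, count in reason_totals(SAMPLE_LINES).most_common():
        print(f"{count:4d}  {reason}")
    for ap, count in restarts_per_ap(SAMPLE_LINES).most_common():
        print(f"{count:4d}  reboots  {ap}")
```

A join with an uptime of about a minute and an 'AP Restart' reason means the AP actually rebooted, while a large uptime with 'Heartbeat Loss' means it only lost contact with the ZD and rejoined without rebooting.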

Sean

Re your comment:
"Then you must have network saturation somewhere between your AP and ZD"
Correct, this is what I am saying.

When it's busy, the heartbeat frame gets dropped, so it's best to set it to another value.

I have over 20,000 Ruckus APs in my network, varying from ZD to SCG deployments, and I see it all the time.

4Jonjon20

Yeah, you're right. When only a small number of users are connected to the WLAN, the APs don't experience this problem, but during regular working hours the problem appears. If network saturation is the root cause, what solution do you recommend? Thank you all for trying to help me.
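
One way to check whether the problem really follows working hours is to count 'heartbeats lost' events per hour of day. Here is a rough Python sketch, assuming the syslog timestamp format seen in the log lines in this thread. The sample lines and AP names are invented for illustration.

```python
import re
from collections import Counter

# Matches lines such as "Jan 12 09:26:28 <host> ... heartbeats lost".
HEARTBEAT_RE = re.compile(
    r"^\w{3}\s+\d{1,2}\s+(?P<hour>\d{2}):\d{2}:\d{2}\s.*heartbeats lost"
)


def heartbeat_losses_per_hour(lines):
    """Return a Counter mapping hour of day (0-23) to heartbeat-loss events."""
    counts = Counter()
    for line in lines:
        match = HEARTBEAT_RE.match(line)
        if match:
            counts[int(match.group("hour"))] += 1
    return counts


# Hypothetical sample lines (not taken from the real log).
SAMPLE_LINES = [
    "Jan 12 09:26:28 ZD syslog: eventd_to_syslog():AP[AP-A@00:00:00:00:00:01] heartbeats lost ",
    "Jan 12 09:27:27 ZD syslog: eventd_to_syslog():AP[AP-B@00:00:00:00:00:02] heartbeats lost ",
    "Jan 12 14:38:39 ZD syslog: eventd_to_syslog():AP[AP-A@00:00:00:00:00:01] heartbeats lost ",
    "Jan 12 14:41:14 ZD syslog: eventd_to_syslog():AP[AP-A@00:00:00:00:00:01] joins with uptime [17261] s and last disconnected reason [Heartbeat Loss] ",
]

if __name__ == "__main__":
    per_hour = heartbeat_losses_per_hour(SAMPLE_LINES)
    for hour in sorted(per_hour):
        # Simple text histogram: one '#' per event in that hour.
        print(f"{hour:02d}:00  {'#' * per_hour[hour]}")
```

If the counts cluster in office hours and drop off at night and on weekends, that supports the saturation theory. An even spread would point more towards cabling or the APs themselves.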

4Jonjon20

I also forgot to mention that the reboot timeout on our ZD is set to 60 minutes. Thanks.

Andrea Coppini

An occasional 'heartbeat loss' error is normal (although not desirable) if the AP is very busy. If the AP reboots, it means that it couldn't reach the ZD/SZ for an extended period, which isn't normal.

Start by running SpeedFlex between the ZD/SZ and the AP. Simply click the speedometer icon next to the AP in the AP list. This runs a throughput test, which should show 0% packet loss and good wired throughput. Then look at your switches: are your switchports all running at 1000/FDX, or are some of them running half duplex? Check the error counters on the switches. In a good wired network you should see zero errors.
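
However you collect the port status (switch CLI, SNMP, a spreadsheet export), the check itself is simple to automate. Here is a small Python sketch of the rule described above. The `PortStatus` record, its field names, and the sample ports are hypothetical and not tied to any particular switch vendor.

```python
from dataclasses import dataclass


@dataclass
class PortStatus:
    """Hypothetical record for one switchport; fill it from your own data source."""
    name: str
    speed_mbps: int
    duplex: str        # "full" or "half"
    error_count: int   # sum of the port's error counters


def find_suspect_ports(ports, expected_speed_mbps=1000):
    """Return {port name: [problems]} for ports not at 1000/FDX or showing errors."""
    suspects = {}
    for port in ports:
        problems = []
        if port.duplex.lower() != "full":
            problems.append("half duplex")
        if port.speed_mbps < expected_speed_mbps:
            problems.append(f"negotiated {port.speed_mbps} Mb/s")
        if port.error_count > 0:
            problems.append(f"{port.error_count} errors")
        if problems:
            suspects[port.name] = problems
    return suspects


# Made-up example data.
SAMPLE_PORTS = [
    PortStatus("Gi1/0/1", 1000, "full", 0),
    PortStatus("Gi1/0/2", 100, "half", 1532),
    PortStatus("Gi1/0/3", 1000, "full", 12),
]

if __name__ == "__main__":
    for name, problems in find_suspect_ports(SAMPLE_PORTS).items():
        print(f"{name}: {', '.join(problems)}")
```

Any port that shows up here is worth a closer look before changing settings on the ZD.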