Smartzone 100 and R730 Reboots/Offline Issue

  • 1
  • Question
  • Updated 4 days ago
  • Acknowledged
Is anyone having issues with R730s randomly going offline and requiring a reboot to get them back online?  We have been seeing this issue for about 30 days now, and support has been unable to tell us why the R730s are doing this.  The APs are inaccessible most of the time when the issue happens, but sometimes we can still ping the AP and even SSH to the login prompt, although the admin account/password fails.  Once we power cycle the AP it works fine again.  While the issue is happening the AP still accepts clients, but it has no network access, so those clients are broken.  It is very frustrating, and Ruckus support has been no help.  The R730s are connected to Ruckus ICX 7650s via 5Gb multigig ports.  The switches report no problems, and other APs on the same switch have no issue at the same time, so the problem is specific to individual access points.  The issue is completely random; we can find no pattern, other than support telling us they are seeing AP kernel panics and that they can't tell us why or how to make them stop.

Smartzone 100 version is 5.1.2.0.302, which support had us upgrade to because they said it would fix the kernel panics. It has not.

R730 version: 5.1.2.0.373

A few of the R730s have not been able to recover from this issue after a reboot and have had to be RMA'd.  Some of them will automatically reboot after 15-30 minutes, but if we manually reboot them they typically come back online and work.  I was curious whether anyone else is experiencing this issue with R730s, Smartzone 100s, and ICX 7650s.
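
For anyone who wants timestamps to attach to a support case, the symptoms described above (AP shown offline, yet sometimes still answering on SSH) could be recorded with a small probe. This is only a sketch: the function names, the state labels, and the choice of TCP port 22 as the reachability check are assumptions for illustration, and how you learn whether the controller reports the AP online is left to you.

```python
import socket
from datetime import datetime, timezone


def tcp_port_open(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def classify_ap(reported_online: bool, ssh_port_open: bool) -> str:
    """Map two observations onto the symptoms described in this thread.

    reported_online: whether the controller lists the AP as online
                     (obtaining this is not shown here).
    ssh_port_open:   result of tcp_port_open() against the AP.
    The state labels are hypothetical, chosen for this sketch.
    """
    if reported_online and ssh_port_open:
        return "healthy"
    if reported_online:
        return "online-but-unreachable"
    if ssh_port_open:
        # The odd case from the post: AP still answers, yet is shown offline.
        return "offline-but-reachable"
    return "offline-and-unreachable"


def log_line(ap_name: str, state: str) -> str:
    """Format a UTC-timestamped line that can be pasted into a case."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{stamp} {ap_name} {state}"
```

Run from cron every few minutes, something like `print(log_line("ap-lobby", classify_ap(False, tcp_port_open("192.0.2.10"))))` would build a timeline of when each AP changed state (the AP name and address here are placeholders).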


Kevin

  • 11 Posts
  • 0 Reply Likes
  • frustrated with support and Ruckus hardware

Posted 2 weeks ago


Mario

  • 11 Posts
  • 1 Reply Like

Hi Kevin, I have a similar situation. My AP and controller models are different, but the behavior is the same, and like you I could not get a diagnosis from Ruckus support.

Kevin

  • 11 Posts
  • 0 Reply Likes
Thanks Mario.  Would you mind telling me the AP model and the AP and controller versions you are running?  It may help me argue with support, as they have been very unhelpful and just keep asking for more and more logs.  The logs are all the same and show their APs kernel panicking, after which the AP either locks up completely or reboots.  Three times the units have bricked themselves, and even though we ask for the reason on the RMA, we are told that they destroyed the units and can't provide a reason because we have to ask for one when we open the RMA.  We do that and still don't get a reason.  Horrible support.

Mario

  • 11 Posts
  • 1 Reply Like

That's right, Kevin. I hope we have positive news soon that solves the issue. Best regards.


Sven Kessler

  • 5 Posts
  • 1 Reply Like
We have a similar issue with R730 APs. Everything works fine for some days, but then suddenly clients do not have network access when connected to these APs. Only an AP reboot fixes the issue for some time, until it happens again. We have APs on firmware 5.1.1.0.624 and vSZ-H on 5.1.1.0.598.
For now, we have replaced the APs with R510s and everything is working without issues.

Kevin

  • 11 Posts
  • 0 Reply Likes
Sven, do you have a case open with support?  If you don't, could you please open one so they understand they are impacting multiple customers?  They tried telling us we are the only customer with this issue.  I am now seeing that it appears to affect anyone running Smartzone 5.x and APs of 610 and higher.  I don't have the luxury of replacing all our brand new R730s with anything, so I need Ruckus to fix the issue.  I believe the issue is related to Wi-Fi 6 (11ax), since more and more 11ax devices have been coming online in the past 30 days, especially with the iPhone 11.  I think Ruckus has a memory leak that they don't know how to fix, and I have little faith they will fix this soon, since Mario posted about the problem 4 months ago and they still have not resolved it.  Their customer support for issues like this is horrible.

Sven Kessler

  • 5 Posts
  • 1 Reply Like
We do not have an open case regarding this right now, because I'm afraid the "standard solution steps" will take too much time compared to the outcome, so we are currently waiting for a fix, hopefully in the next firmware update.
But I get your point: if nobody opens a case, there will be no solution.
I will be on vacation next week and will open a call after that.

Mario

  • 11 Posts
  • 1 Reply Like
Hi Sven,

Thank you very much for the contribution. Like Kevin, I hope you will help us by opening a case with Ruckus support, since that will show that this is a general problem and will help get a quick solution.


Thank you,

Michael Brado, Official Rep

  • 3012 Posts
  • 425 Reply Likes
Mario, Kevin, Sven, please tell me your case numbers.  We ought to be able to collect logs and AP support info to identify the cause of any problem(s), especially if you see it happening frequently.

Michael Brado, Official Rep

  • 3012 Posts
  • 425 Reply Likes
Mario, I saw your ticket 985751, and it says you performed an SZ 5.1.2.0.302 upgrade.  Please let us know if you see another situation, and then try to grab AP support info and SZ logs for tech support, thanks!

Kevin

  • 11 Posts
  • 0 Reply Likes
Ticket 965817. As of late yesterday our issue has finally been escalated, and we are scheduled to work with an escalation engineer this afternoon who is planning to enable some additional debugging/logging on the R730s and Smartzone.  We are hoping they can figure this out soon, as we still have R730s randomly rebooting.  I will post updates as we make progress so that others hopefully don't have to deal with this issue.

Mario

  • 11 Posts
  • 1 Reply Like
Best regards to all,

I have not yet shared any information because the WiFi network is at a university. There was a long holiday weekend in Colombia and nobody works there on weekends, so it is not yet possible to draw conclusions about the service.


Thank you

Kevin

  • 11 Posts
  • 0 Reply Likes
R730, reboot, kernel panic, apHeartbeatLost, smartzone - Just adding some of the keywords that I was using to search for others with this problem, so that hopefully anyone else with this random access point issue might add a comment and Ruckus becomes aware of all the customers impacted by it.

ian johnson

  • 2 Posts
  • 0 Reply Likes
This is very interesting. I have three R730s: two show an uptime of 38 days, and one restarts daily. I've been meaning to troubleshoot, and this thread prompts me to do that sooner.
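
A quick way to tell a daily restarter from a stable AP is to sample each AP's uptime periodically and flag any sample where the uptime went down. The sketch below assumes you already collect `(wall_clock_seconds, uptime_seconds)` pairs by some means (not shown); the function name and sample format are made up for illustration.

```python
from typing import List, Tuple


def find_reboots(samples: List[Tuple[float, float]]) -> List[float]:
    """Estimate boot times from chronological (wall_clock, uptime) samples.

    A reboot is inferred whenever uptime is lower than in the previous
    sample. The boot time is estimated as wall_clock - uptime for the
    first sample taken after the reboot.
    """
    boots = []
    for (_, up_prev), (t_cur, up_cur) in zip(samples, samples[1:]):
        if up_cur < up_prev:
            boots.append(t_cur - up_cur)
    return boots


# Four samples taken 300 s apart; uptime drops between the 2nd and 3rd.
samples = [(0, 1000), (300, 1300), (600, 120), (900, 420)]
print(find_reboots(samples))  # [480]
```

Polling more often narrows the window in which a reboot can be missed; an AP that reboots twice between two samples is only counted once here.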

Kevin

  • 11 Posts
  • 0 Reply Likes
Thanks Ian.  Please open a case with them and tell them to reference my case 965817 for details.  They are now trying to capture the APs' memory and CPU state with a custom script prior to the reboots.  Hopefully, the more customers they see opening cases for R730s rebooting, the sooner they will figure out the issue and fix it.  We ran R610s and R710s for 3 years without ever experiencing a random reboot.  We were also running ZoneDirector instead of Smartzone back then, so I am not sure which is the actual culprit.

RF0V1K

  • 6 Posts
  • 1 Reply Like
Hey Kevin, the R730 I was having trouble with was powered by a secondary switch which doesn't appear to have been able to provide enough power to keep the R730 happy. Moving it to my primary Juniper ex4300-48p for power has resolved my reboot issues. I know this doesn't help with your issue, but I wanted to follow up anyhow.

Leonardo Ferreira

  • 1 Post
  • 0 Reply Likes
I am also having some problems with the R730 AP.
I have two separate sites with the following problems:
At the first site, with the same vSZ version and the same firmware, the graph in the Traffic tab of the Access Point view constantly shows clients disconnecting from and reconnecting to the AP, which looks to me like a false positive.
At the second site, with vSZ-H on firmware 5.1.1 and APs on 5.1.1, all clients on 2.4 GHz are disconnected and reconnected; each drop lasts at most 3 seconds.
In neither case did I get a solution from support, just log collection and more log collection.

Mario

  • 10 Posts
  • 0 Reply Likes
Hello Leonardo,

It is indeed a situation similar to everyone else's in this thread. If the update we are doing works, I will share the result so you can see whether it solves your problem. If you have an open case with Ruckus, please share it, since Michael Brado is collecting the cases for support to analyze.

Thank you

Malcolm Chai

  • 1 Post
  • 0 Reply Likes
We are running vSZ-H 5.1.1.0.589 and R730 5.1.1.0.3028

We had a known memory leak causing APs to reboot randomly.  They provided a temporary patch for it about a week ago.

Now we are dealing with random disconnecting.  Student and staff machines will have either a true disconnect from the AP, or will be connected to the AP but not be able to access the internet.  Still working with Ruckus on this.



We are still working on this as we are not sure if this is a compatibility issue with new hardware or another problem in the AP.  


Please continue to update as I am interested in seeing how everyone's problems get resolved.


Malcolm

Mario

  • 11 Posts
  • 1 Reply Like

Best regards to all,

 I can report that with version 5.1.2.0.373 on the APs and 5.1.2.0.302 on the vSZ, the service has been stable. If you want, you can try updating to these versions and tell us how it goes.

 Happy day.

Kevin

  • 11 Posts
  • 0 Reply Likes
They provided us 5.1.2.0.1013 for the APs on Wednesday, and we are hoping that fixes our AP kernel panic issues.  I believe there are still 2 other issues we are aware of that Ruckus support is trying to fix: apHeartbeatLost alerts that are not accurate, and APs going offline even though they still have connectivity to the Smartzone, per the debug logs they were able to capture on Wednesday.  We did not have any reboots yesterday while running Smartzone 5.1.2.0.302 and AP 5.1.2.0.1013.  We were definitely having problems with AP version 5.1.2.0.373.
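
With several builds in play in this thread (5.1.2.0.373, 5.1.2.0.1013, and so on), it helps to compare version strings field by field rather than as plain text, since a text comparison would sort 1013 before 373. The sketch below assumes every field is a plain integer and uses a made-up inventory dictionary; whether a higher build number always means a newer fix is also an assumption, as the thread only says that 1013 was supplied as a patch after 373.

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Split a dotted version such as '5.1.2.0.373' into integer fields."""
    return tuple(int(part) for part in version.split("."))


def aps_not_on(target: str, inventory: Dict[str, str]) -> List[str]:
    """Return the sorted names of APs whose firmware differs from target.

    inventory maps AP name -> firmware version string (hypothetical format).
    """
    wanted = parse_version(target)
    return sorted(name for name, ver in inventory.items()
                  if parse_version(ver) != wanted)


# Field-by-field comparison orders the builds numerically...
print(parse_version("5.1.2.0.1013") > parse_version("5.1.2.0.373"))  # True
# ...whereas comparing the raw strings does not.
print("5.1.2.0.1013" > "5.1.2.0.373")  # False
```

Feeding `aps_not_on` an export of AP names and firmware versions gives a quick list of units that missed a patch rollout.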

Mark Channer

  • 1 Post
  • 0 Reply Likes
Any updates with the latest firmware version on this issue? I'm delaying my purchases until I know the 730 is working as expected. Thanks and sorry for the issues all of you have experienced with these APs.

Michael Brado, Official Rep

  • 3008 Posts
  • 424 Reply Likes
Have you tried SZ 5.1.2.0.302 (MR2), like Mario above?

Kevin

  • 10 Posts
  • 0 Reply Likes
Yes, and that's when the issues seemed to get even worse.  They provided us an AP patch firmware, version 5.1.2.0.1013, on Wednesday, which we applied to all our APs, and yesterday we had NO APs reboot but still had several "apHeartbeatLost" alerts.  They were also able to capture additional debug logs on Wednesday from an AP that went offline, and confirmed that it still had an SSH tunnel connected to the Smartzone, so it obviously had network connectivity.  We have not been provided a fix for that issue but were told that 5.1.2.0.1013 does fix one of the R730 kernel panic issues.  We are just waiting for the offline issue or some other R730 problem to happen again so we can provide them more logs, and we hope they come up with a fix.  We currently have Smartzone on 5.1.2.0.302 and all APs on 5.1.2.0.1013.