Having major random drops/slow down with R700's

  • Question
  • Updated 2 years ago
  • (Edited)
Alright let me begin to explain a problem we are having with our Ruckus system. I'm just going to start at the very beginning and share all our experiences so I can get the best help on here. 

So I started a job about 8 months ago at a school, and one of the problems they have apparently always had was the wifi. It was slow, it would disconnect, etc. Well, the school finally gave up some money for us to buy a decent system. We purchased a bunch of R700's to go with our Ruckus system.

At the very start of the year we were having really bad connection issues with everyone. We updated our firmware and then everything was solved. It went 2 months with hardly a single problem! I was very impressed. Then after those 2 months problems started popping up again. For the past two weeks we have been having an issue where the computer shows that it is "connected" but has a yellow triangle with an exclamation mark over the wifi symbol. When that triangle shows up, the access point acts like it's flooded or can't communicate. I try pinging any other device on the network and many of the packets are lost; sometimes they take 500ms to 4000ms, and other times they go fast and I'll get a 20ms response. All of that can happen in a single ping session. I have seen this happen when there were fewer than 5 devices on the access point I was connected to.
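For anyone wanting to put numbers on this kind of behavior, here is a minimal sketch that summarizes saved ping output (loss percentage plus min/median/max round-trip time). It assumes the usual `time=20ms` / `time<1ms` / `time=0.5 ms` reply format printed by the Windows and Linux `ping` commands; the function name `summarize_ping` and the sample text are made up for illustration and are not real captures from this network.

```python
import re
import statistics

# Matches the RTT field in ping reply lines, e.g. "time=20ms", "time<1ms", "time=0.5 ms".
_RTT = re.compile(r"time[=<]\s*([\d.]+)\s*ms", re.IGNORECASE)


def summarize_ping(output: str, sent: int) -> dict:
    """Summarize ping output; any probe without an RTT reply counts as lost."""
    rtts = [float(m.group(1)) for m in _RTT.finditer(output)]
    lost = sent - len(rtts)
    return {
        "sent": sent,
        "received": len(rtts),
        "loss_pct": 100.0 * lost / sent if sent else 0.0,
        "min_ms": min(rtts) if rtts else None,
        "median_ms": statistics.median(rtts) if rtts else None,
        "max_ms": max(rtts) if rtts else None,
    }


# Hypothetical sample resembling the symptoms described above.
SAMPLE = """\
Reply from 10.0.0.1: bytes=32 time=20ms TTL=64
Request timed out.
Reply from 10.0.0.1: bytes=32 time=512ms TTL=64
Reply from 10.0.0.1: bytes=32 time=3987ms TTL=64
"""

if __name__ == "__main__":
    print(summarize_ping(SAMPLE, sent=4))
```

Running the same summary on a wired and a wireless client at the same moment makes it easier to compare the two than eyeballing the raw output.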

At first it was only reported by one person that they were having issues. I changed out the access point with a different one and everything seemed solved. Well it wasn't. The problems were happening all over the place the next day. 

So our first thought was that the network is being flooded somewhere or maybe there is a loop. So as the connection problem was happening on my laptop, I turned the wifi off and plugged it in via ethernet to the same switch that the access point was on. Everything worked great when I was connected on the wire. Sometimes I was getting pings of <1ms! We went to every switch we had and looked at the logs on the switches. Nothing pointed to any heavy amount of traffic. Every person we asked who was using an ethernet connection had no problem.

Another thing to note is the problem comes and goes seemingly randomly. We are struggling to understand why this problem happened when we didn't change anything with our wireless or the rest of the network.

ChannelFly is currently set to turn off after the access points have an uptime of 5 mins, so it's not like the access points are switching channels. Also, I understand that interference can cause problems like this, but it's happening not only to access points that have neighbors but also to access points that have no neighbors and are completely isolated. Considering that we were working so well for 2 months, I also wonder: if it was interference, why would it only show up now? One more thing. The connection problem we are having now is not the same as the one we had at the start of the year. At the start of the year you just couldn't even connect, and some access points would "freeze up" until you rebooted them. The current problem is very consistent in the sense that the same symptoms occur on every device when it happens.

We have also rebooted several of the access points and even the ZoneDirector after hours in case that would fix anything. We tried updating to the latest recommended firmware as well. No change.

If there are any suggestions let me know and if you have any questions or want me to troubleshoot something let me know. I didn't include every troubleshooting step we have done because there is too much to include. Please help.... :(

EDIT: Forgot to include that when the problem happens we aren't solely relying on ping results to show the problem. We are trying to load web pages etc. I understand pings are not a high priority and thus can be slow to respond if other traffic is being handled. In this case though they are showing a good example of what we are seeing.

Random Person

  • 7 Posts
  • 0 Reply Likes
  • like quitting my job if these problems can't be fixed.

Posted 2 years ago


Dave Watkins

  • 64 Posts
  • 13 Reply Likes
Can you include some more info: what model ZoneDirector you're using and the firmware version, how many AP's in total, and how they are powered (from POE switches or via injectors)? Is this restricted to one SSID or multiple SSID's, and finally, how many clients are on the AP when this happens? Any idea if this happens on both the 2.4GHz and 5GHz bands or just one of them?

Have you looked at the bandwidth graphs of the AP when this occurs?

Random Person

  • 7 Posts
  • 0 Reply Likes
It is a ZoneDirector 3000 running 9.8.3.0.14. There are about 36 total access points in two different buildings. The buildings are about 1 mile apart but have a fiber connection, so they are on the same LAN. In the one building we have R500 AP's and nobody has mentioned anything about the wifi being an issue there, but there aren't many people consistently using it. All the AP's are powered by POE. The issue occurs on any of our SSID's (there are 3 in total). The number of clients can range from a full classroom (30 students) down to cases where I'm the only one connected testing, and it still happens. The 2.4GHz and 5GHz bands are both affected. The 2.4GHz is usually worse but both have had the issue occur. I will monitor the bandwidth graphs more extensively tomorrow and look for anything suspicious.

Dave Watkins

  • 64 Posts
  • 13 Reply Likes
You may want to look at the release notes for 9.10 and 9.12 and see if anything stands out. I don't have R700's here, but I remember seeing various R700 fixes in one of the more recent major releases.

Random Person

  • 7 Posts
  • 0 Reply Likes
Alright. We have some extra R500's. We are going to test by swapping out some of the R700's in strategic spots to see whether the R500's have the same issue or not. If not, we know something is going on with the R700's. I'll make sure to post our findings here.

Andrea Coppini

  • 66 Posts
  • 29 Reply Likes
It's probably a broadcast or multicast storm on your wired LAN.

Plug a laptop into a switch and run Wireshark; you will probably see lots of packets with a destination of 255.255.255.255 or ff:ff:ff:ff:ff:ff when the issue happens.
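As a rough illustration of what to look for in such a capture, the sketch below classifies destination MAC addresses as broadcast, multicast, or unicast and reports the share of each. It assumes you have already exported the destination addresses from a capture as plain strings; `classify_dst_mac` and `traffic_mix` are hypothetical helper names, and what share counts as a "storm" is left to judgment rather than hard-coded.

```python
from collections import Counter


def classify_dst_mac(mac: str) -> str:
    """Classify an Ethernet destination address given as a hex string."""
    octets = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(octets) != 6:
        raise ValueError(f"not a 6-byte MAC address: {mac!r}")
    if octets == b"\xff" * 6:
        return "broadcast"  # ff:ff:ff:ff:ff:ff
    if octets[0] & 0x01:
        return "multicast"  # group bit set in the first octet
    return "unicast"


def traffic_mix(dst_macs) -> dict:
    """Return the fraction of frames per destination type."""
    counts = Counter(classify_dst_mac(m) for m in dst_macs)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: counts[kind] / total
            for kind in ("broadcast", "multicast", "unicast")}


if __name__ == "__main__":
    # Made-up addresses, only to show the output shape.
    frames = ["ff:ff:ff:ff:ff:ff", "ff:ff:ff:ff:ff:ff",
              "01:00:5e:00:00:fb", "00:11:22:33:44:55"]
    print(traffic_mix(frames))
```

Comparing the mix from a capture taken while the problem is happening against one taken while things are fine is more telling than any single snapshot.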

John D, AlphaDog

  • 499 Posts
  • 137 Reply Likes
I've seen these types of symptoms on 5GHz before on R700 with 9.8.0 and 9.8.1 firmware, but my understanding is that's been fixed in 9.8.3. Furthermore, hearing that both 2.4 and 5GHz are affected makes me think the problem you are experiencing is different and is more along the lines of heavy broadcast/multicast traffic.

Nonetheless it might be worth trying the 9.12 or 9.10 MR releases. With 802.11ac AP's, I've noticed much more consistent performance on the recent 9.12 MR2 release.

Random Person

  • 7 Posts
  • 0 Reply Likes
We are currently testing the R500's in the areas where we swapped them in. So far there have been no reported problems for people connected to them. We had a computer sit on one of the R500's and ping 150 times. Not a single packet was lost. Go to a spot with an R700 and about 18-20% of packets are lost. We have checked all our switching closets and logged into our switches and see no sign of a broadcast storm. We tried using Wireshark and didn't find anything strange. Also, our separate building (still on the same LAN and using the same ZoneDirector) that has only R500's has not had any issues. I'm starting to think there is something going on with the R700's.
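To judge whether a gap like 0% versus roughly 18-20% loss is more than chance, one option is a confidence interval on each loss rate. The sketch below uses the standard Wilson score interval; the 150-ping count comes from the test above, while the 28 lost packets for the R700 spot is an assumed figure chosen to land in the 18-20% range, not a measured one. Pings on a flaky link tend to fail in bursts, so treat the interval as a rough guide only.

```python
import math


def wilson_interval(lost: int, sent: int, z: float = 1.96):
    """Wilson score interval for a loss rate (z=1.96 gives roughly 95%)."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    p = lost / sent
    denom = 1 + z * z / sent
    center = (p + z * z / (2 * sent)) / denom
    half = z * math.sqrt(p * (1 - p) / sent + z * z / (4 * sent * sent)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the edges.
    return max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    # R500 spot: 0 of 150 lost (from the test above).
    print("R500:", wilson_interval(0, 150))
    # R700 spot: 28 of 150 lost is an assumed illustration of "about 18-20%".
    print("R700:", wilson_interval(28, 150))
```

If the two intervals do not overlap, the difference between the spots is unlikely to be sampling noise alone, though it says nothing about the cause.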

Random Person

  • 7 Posts
  • 0 Reply Likes
Also, to clarify further, we put the R500's in the exact same spots where the R700's were, using the same wiring, ethernet connection, etc.

Michael Brado, Official Rep

  • 2183 Posts
  • 301 Reply Likes
You're limited to 9.8.3 firmware if you have 7962 or 7025 model APs in use.

Otherwise, the latest R700/R710 enhancements are in both 9.10.2 and 9.12.2 maintenance releases.

Random Person

  • 7 Posts
  • 0 Reply Likes
We do not have any of those. Only R500's and R700's. We are going to test more to truly make sure the R500's are having a positive effect. If so, we will upgrade our firmware and see if the R700's get better. Is there a recommended version if we have only R700's and R500's? I just don't want to jump to another version that has its own set of problems.

Random Person

  • 7 Posts
  • 0 Reply Likes
Well, it's hard to tell if the R500's helped or not. A user still reported the same issue, and we verified that they were connected to one of the R500's. We are going to follow the upgrade path given to us by support and go to 9.10.2.0.11. I will report back on how things go after that for future readers experiencing the same issues.