116 Messages

2.3K Points

Mon, Mar 9, 2020 12:10 PM

Radius server unreachable events

vSZ version 5.2.0.0.699 with 411 R710 APs on campus. Every few days we get a burst of radius server unreachable events. What is odd is that the event details do not even point to our radius server. All events are reported like this:

AP [[email protected]:E7:1E:2A:A4:40] is unable to reach radius server [127.0.0.1].

Of course 127.0.0.1 is not our radius server. We are using Cloudpath as our radius server. Our wifi is rock solid so there are no other symptoms other than a rash of these events every few days.

Any ideas what causes this?

Responses

178 Messages

2.9K Points

9 months ago

We get the same thing, except that ours does show the correct RADIUS server (Windows NPS) in the message, and we are on vSZ 5.1.2.

We have noticed that VM stun sometimes occurs while backups are being performed, but the events also show up at other, random times with APs that are unable to reach the server.

74 Messages

1.2K Points

8 months ago

We're seeing the same thing. All our Radius requests are proxied through the vSZ, so there shouldn't even be any radius processing on our APs. It _appears_ to be causing radius server failovers for us, but it's not anything the end user notices, and we're busy enough at the moment that I haven't had time to dig any further into it.

It started when we upgraded to 5.2.0.0.699, so I'd say it is a bug in that code (or the relevant AP code).

178 Messages

2.9K Points

8 months ago

After upgrading to 5.2.0.0.699, we are seeing the same thing as well. It's not causing a server fail-over because we don't see attempted RADIUS connections on our secondary server (we log and graph this).

116 Messages

2.3K Points

8 months ago

This past Sunday, April 12th, starting at 2:20am and continuing to 12:45pm (about 10 hours), I received well over 8,000 emails from the controller about radius server unreachable events. These are scattered across nearly all of our APs (411 total), which are located in 6 separate buildings and attached to 22 different switch stacks. The emails stopped Sunday afternoon. I then opened a ticket with Ruckus on this. Working the ticket now, no resolution yet.
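
For anyone else buried under these emails, a quick tally by reported server and by AP can show whether the flood is concentrated or spread evenly. This is only a minimal sketch: the regular expression is modelled on the event text quoted at the top of this thread, the sample AP names and addresses are made up, and other firmware versions may word the event differently.

```python
import re
from collections import Counter

# Pattern modelled on the event text quoted in this thread (an assumption,
# not an official format): "AP [name@MAC] is unable to reach radius server [IP]."
EVENT_RE = re.compile(
    r"AP \[(?P<ap>[^\]]+)\] is unable to reach radius server \[(?P<server>[^\]]+)\]"
)


def tally_events(lines):
    """Count unreachable events per reported server and per AP."""
    by_server = Counter()
    by_ap = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue  # not an unreachable event, skip it
        by_server[match.group("server")] += 1
        by_ap[match.group("ap")] += 1
    return by_server, by_ap


# Hypothetical sample lines; real input would be the exported events or email bodies.
sample = [
    "AP [ap-lib-01@AA:BB:CC:00:00:01] is unable to reach radius server [127.0.0.1].",
    "AP [ap-lib-02@AA:BB:CC:00:00:02] is unable to reach radius server [127.0.0.1].",
    "AP [ap-lib-01@AA:BB:CC:00:00:01] is unable to reach radius server [10.0.0.5].",
    "some unrelated controller event",
]

by_server, by_ap = tally_events(sample)
for server, count in by_server.most_common():
    print(server, count)
```

If nearly every AP shows up with a similar count, that points away from a single switch stack or building and toward the controller or the RADIUS path.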

178 Messages

2.9K Points

Would be curious to know the outcome of said ticket.

111 Messages

2.4K Points

8 months ago

The problems with version 5 seem to surface on larger networks. We have several clients with very large production networks, each with several hundred sites, thousands of APs, and multiple multi-node clusters worldwide.  We are responsible for managing and maintaining these production networks so we are very cautious with upgrades.  We do our own testing and we've not found a single version 5 release that we like.  We've kept all client production networks on 3.6.2 with one exception - a single 4-node cluster managing APs and switches that's running 5.1.0.0.496 (the version we dislike the least).

We very much look forward to having a stable and tolerable v5 release one of these days, but because our neck is on the line, we intend to keep our clients on 3.6.2 until there is a release that passes our testing.  If you check the Ruckus support site, you'll also find that TAC's recommended version for SZ or vSZ is 3.6.2.0.222.

74 Messages

1.2K Points

8 months ago

Sadly we started our vSZ journey on 5, and we need support for new APs, so we have to keep up to date. We're at a couple of hundred APs and I've found issues on every release, but we need the new AP support, so we have to upgrade. I should go and look whether they have fixed the email address fields to accept gTLDs longer than 6 chars, or whether they have fixed the multiple realm support for radius based admin logins that they broke with the previous release.

116 Messages

2.3K Points

7 months ago

I was the one who opened the ticket with support about this issue and they still have not resolved it. We have 411 R710 APs and are seeing two things. Occasionally, maybe once or twice each week, we get a dozen or so radius server cannot be reached events. Twice now, though, this has turned into a cascade of these events. Just yesterday I received literally thousands of emails with this same radius server cannot be reached event. These continued overnight with thousands more events. The last time this happened, the only way I could stop them was to reboot both of my vSZ controllers, which I am in the process of doing right now.

I think this is a bug in the code, but I have not heard this from support.

74 Messages

1.2K Points

7 months ago

I've done a lot of digging on this one and I _think_ our primary issue is that NPS ignores/discards packets with attributes it doesn't support/understand. There isn't any way you can tell it to send reject messages in response to these requests, so if you deal with a lot of BYOD devices that aren't configured correctly, you're at their mercy. Enough of these requests cause failovers.

Longer description here
https://community.jisc.ac.uk/groups/eduroam/article/improving-reliability-microsoft-nps-authentication-provider-eduroam

Sadly we're also seeing GPO configured windows clients not being assigned the right VLAN when they roam between APs, which might be related to Radius failovers. They drop down to the default VLAN assigned on the SSID, not the radius assigned one, which of course changes their IP and causes havoc.
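
To make the silent-discard theory above concrete, here is a toy model of why a dropped request is worse than a reject from the client's point of view. It is not how the vSZ or the APs are actually implemented: the `FailoverClient` class and its `dead_threshold` value are hypothetical, chosen only to show that an Access-Reject proves the server is alive, while a discarded request looks exactly like a dead server.

```python
from dataclasses import dataclass


@dataclass
class FailoverClient:
    """Toy RADIUS client that fails over after consecutive unanswered requests."""

    dead_threshold: int = 2  # hypothetical: unanswered requests in a row before failover
    consecutive_timeouts: int = 0
    active: str = "primary"

    def send(self, server_replies: bool) -> str:
        """Model one request. server_replies=False means it was silently discarded."""
        if server_replies:
            # Any reply, even an Access-Reject, shows the server is alive.
            self.consecutive_timeouts = 0
            return "answered"
        self.consecutive_timeouts += 1
        if self.active == "primary" and self.consecutive_timeouts >= self.dead_threshold:
            self.active = "secondary"
            return "failover"
        return "timeout"


client = FailoverClient()
# True = server answered (accept or reject), False = request silently discarded.
traffic = [True, False, True, False, False]
outcomes = [client.send(replied) for replied in traffic]
print(outcomes)
print(client.active)
```

In this model a single discarded request is harmless, but two in a row from misconfigured devices are enough to push a healthy primary out of service.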

3 Messages

80 Points

6 months ago

I'm seeing the exact same problem with our vSZ with version 5.2.0.0.699. 
Unable to reach radius server (127.0.0.1).

After a vSZ reboot it works again, but after a while the problem comes back.
This really needs to be solved ASAP!

116 Messages

2.3K Points

6 months ago

I have had a case open for weeks about this. Ruckus support told me two things just a few days ago:
1. This is a bug in vSZ firmware 5.2.0.0.699, which is listed as GA (General Availability). This bug manifests itself in the AP trying to reach our radius server, which happens to be Cloudpath. The AP should only be trying to reach our vSZ and not trying to reach Cloudpath directly.
2. I should be using an MR (Maintenance Release) of vSZ, which as the name implies is a release that has had more bugs worked out.

On our call a few days ago the engineer disabled the emailing of radius unreachable events. I have since reached back out to them (I should have asked on the call but did not) to ask whether I should roll our controllers back to an MR release. I should hear an answer this week.

3 Messages

80 Points

Yeah, I know about the GA and MR releases, but Radius server support is pretty essential and "should" work correctly even in a GA release, in my opinion.

Anyway, if you get some more info on this case about solutions or updates/patches coming soon, please put it up here.

74 Messages

1.2K Points

6 months ago

I find it somewhat surprising you've been directed to use an MR release. How exactly are you supposed to use new APs on an MR? As far as I'm aware the R650 is only supported on 5.2.

Also, GA is GA, it's released to the public. Sure, an MR is going to have issues fixed, but any GA release should have been based off the previous MR. It's like they are building each GA from the ground up with new code.

Just disappointing really. I keep upgrading as I keep hoping more bugs are fixed than are introduced. So far, not a lot of luck in that regard.

129 Messages

2.4K Points

6 months ago

On my side, APs not using proxy are also showing the alarm (radius server unreachable, with the radius IP). From a user experience point of view it _seems_ to be ok, but the alarms are quite annoying to say the least.

It _seems_ as if the newer code is very aggressive about response latency for radius.  Also, they don't perform any liveness checks (at least not in non-proxy mode), which other devices usually do.

Plus, in direct mode at least, the timeouts / retries are not configurable at all.
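
For reference, the kind of liveness check other devices do is often a RADIUS Status-Server probe (RFC 5997). The sketch below builds such a packet and sends it over UDP. It is an illustration only, not something the vSZ or APs are known to do: the function names are my own, port 1812 is just the usual authentication port, the reply is not cryptographically verified, and the target server has to support Status-Server, which nobody in this thread has confirmed for NPS or Cloudpath.

```python
import hashlib
import hmac
import os
import socket
import struct
from typing import Optional

STATUS_SERVER = 12          # RADIUS packet code for Status-Server (RFC 5997)
MESSAGE_AUTHENTICATOR = 80  # attribute type, 18 bytes including its 2-byte header


def build_status_server(secret: bytes, identifier: int = 0,
                        authenticator: Optional[bytes] = None) -> bytes:
    """Build a Status-Server packet carrying only a Message-Authenticator."""
    if authenticator is None:
        authenticator = os.urandom(16)  # random Request Authenticator
    length = 20 + 18  # 20-byte header plus one 18-byte attribute
    header = struct.pack("!BBH", STATUS_SERVER, identifier, length) + authenticator
    attr_header = struct.pack("!BB", MESSAGE_AUTHENTICATOR, 18)
    # HMAC-MD5 over the whole packet with the attribute value set to 16 zero bytes.
    digest = hmac.new(secret, header + attr_header + bytes(16), hashlib.md5).digest()
    return header + attr_header + digest


def probe(host: str, secret: bytes, port: int = 1812, timeout: float = 2.0) -> bool:
    """Return True if any reply with a matching identifier arrives in time.

    The Response Authenticator is NOT checked here, so this only shows that
    something answered, not that the shared secret is correct.
    """
    packet = build_status_server(secret)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(packet, (host, port))
        try:
            reply, _ = sock.recvfrom(4096)
        except socket.timeout:
            return False
    return len(reply) >= 20 and reply[1] == packet[1]


# Build (but do not send) a packet with fixed inputs so the layout can be inspected.
pkt = build_status_server(b"example-secret", identifier=7, authenticator=bytes(16))
print(len(pkt), pkt[0], pkt[1])
```

A probe like this is cheap enough to run every few seconds, so a device can tell a slow server from a dead one without waiting for real authentications to time out.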

116 Messages

2.3K Points

6 months ago

I was told by Ruckus support that this is a bug in the newest vSZ code and will be rectified when an MR release for the 5.2 code comes out. He was not sure of the timeframe.

116 Messages

2.3K Points

5 months ago

Still waiting for the next release of firmware for vSZ which I was told was going to fix this issue

116 Messages

2.3K Points

5 months ago

Still waiting for the next release of firmware for vSZ that I was told would fix this issue