Is ChannelFly usable in real-world scenarios?

  • Question
  • Updated 2 years ago
  • Answered
While ChannelFly is a truly remarkable technology, I'm wondering if people have found that it's usable in real-world scenarios.

For better or worse, Wi-Fi environments tend to be BYOD almost by definition, so we don't have much control over what users show up with. And while one could work with people individually to update drivers, or have discussions about "It's not us, it's them", the end result, in our experience, has been end users saying "your Wi-Fi sucks", which is obviously the opposite of what we're trying to accomplish with a Ruckus deployment. (And according to Tom's Hardware it's supposed to be the other way around.) In public hotspot scenarios we'll never even hear from people whose connections didn't work; they just won't come back....

So I'm looking for guidance here from both Ruckus and from other VARs/Customers. Have you found that, in anything but a completely controlled environment, you can safely turn on ChannelFly and have a high success rate with end users? If not, do you see this changing over time?

Either way it seems like it might be a good idea for Ruckus to put a whitepaper out there with this discussion. Customers or new VARs that are deploying this for the first time who end up with ChannelFly on by default are going to get an extremely negative first impression of Ruckus, which is really too bad because the technology overall is truly remarkable, but this one checkbox, in our experience, can change the overall experience dramatically.

Jeff

Jeff Roback

  • 25 Posts
  • 8 Reply Likes

Posted 4 years ago


Primož Marinšek, AlphaDog

  • 413 Posts
  • 48 Reply Likes
AFAIK CF works fine. We just had a big conference this weekend: 3 APs in a basketball stadium, 2000 people, 300 concurrent users at peak, and no calls for support. CF was ON the whole time.

But I too was not fond of CF when it was first implemented (FW 9.4, I think). It just didn't seem to work as it should. So I'd suggest using FW 9.6.1.

But here's a quote from Keith Parsons (CWNE #3). He's like a WLAN god of gods.

"Lack of end-user complaints about WLAN do not show how well WLAN works no more than complaints show it is a WLAN issue.
Use Actual Metrics!"

and

"Users complaining is a metric to know SOMETHING - but not WHAT is wrong. They blame Wireless first because it is unseen."

What kind of metrics have you used to say that RW is at fault? I can tell you from many personal experiences that it's almost always anything but RW's fault.

Jeff Roback

  • 25 Posts
  • 8 Reply Likes
Thanks for sharing your experience with this. It's great to hear this is working better in 9.6.1. I was scared off by the early revs and haven't tried it again since. Are you using it in both the 2.4 and 5 GHz bands, or just 5 GHz?

I agree with your complaints vs. metrics thoughts, but if end users are getting disconnected, I think fault is less relevant than options: It's kind of like kid proofing a home; if I can stick a cover over an electrical outlet to keep the kids from getting zapped, that's a win, even if it does decrease performance (take longer to use the outlet). So if I have a 'setting' to use (outlet cover) that keeps the kid from getting zapped, I'd rather do that than explain to mom that it's the kid's fault and the outlet is working as designed :-)

So my overall thought is that if turning on ChannelFly will cause a certain % of BYOD users to have a bad experience, then I have to leave it off until a critical mass of end users have the right drivers, etc., to make it work. But based on your feedback it sounds like this isn't the case anymore, which is fantastic. I'll be curious to hear feedback from others.

On a side-note, recent Intel driver releases seem to be so buggy (random disconnects, etc) that people are being forced to get current drivers anyway just to stay connected to any WiFi, so perhaps this will get this problem solved....

Jeff

Bill Burns, AlphaDog

  • 203 Posts
  • 38 Reply Likes
I use this command on syslogged traffic to see AP channel-change events.
fgrep -C0 -i -e interference -e "switch from channel"
Often what I see is an individual AP hopping back and forth between channels.

I do not use channelfly and I still have issues with APs changing channels. (due to "interference")
I have about 66 APs and I see 14 AP channel changes so far today. (between 4:30am and 1:50pm)
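
To turn that grep into per-AP counts, a minimal Python sketch follows. Only the two search strings come from the fgrep command above. It assumes classic BSD-style syslog lines in which the fourth whitespace-separated field is the reporting host (which may be the controller rather than the AP, depending on how logging is set up); that layout, the sample lines, and the function name are assumptions, not actual Ruckus log output.

```python
from collections import Counter

# Same case-insensitive patterns as the fgrep command above.
PATTERNS = ("interference", "switch from channel")


def count_channel_events(lines, host_field=3):
    """Count matching lines per reporting host.

    host_field is the zero-based index of the hostname in a
    whitespace-split line (3 for "Mon DD HH:MM:SS host ...").
    This layout is an assumption; adjust it for your syslog format.
    """
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        if not any(p in lowered for p in PATTERNS):
            continue
        fields = line.split()
        if len(fields) > host_field:
            counts[fields[host_field]] += 1
    return counts


# Made-up sample lines, for illustration only.
sample = [
    "Mar  3 04:31:02 ap-lobby kernel: switch from channel 6 to channel 11",
    "Mar  3 04:35:40 ap-lobby kernel: Switch from channel 11 to channel 6",
    "Mar  3 05:02:17 ap-gym daemon: interference detected",
    "Mar  3 05:10:00 ap-gym daemon: client associated",
]

# Hosts with the most channel-change events come first.
for host, n in count_channel_events(sample).most_common():
    print(host, n)
```

A host that shows up near the top of this list day after day is a reasonable candidate for a closer look or a hard-coded channel.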

As far as I can tell, the Ruckus controller is not consulted and/or not very aware of where the APs are in relation to each other.
If it were, I'd prefer the controller to come up w/ a reasonable channel plan and only make changes when the wireless environment changes significantly.
(When new APs show up or disappear)

Instead, whether ChannelFly is enabled or not, APs seem to make these channel-hopping decisions on their own, without regard to what the other APs on the controller are doing.
ChannelFly changes can be made in response to relatively short-term bandwidth usage changes. (like a large video download)
This results in unstable channel assignments where one AP's channel change can trigger a channel change in an adjacent AP and, in turn, that AP's channel change can trigger a change in the next AP, and the next.

(if anyone has a different understanding of this, please let me know)

I already hard-coded channel assignments in 8 APs in one building to cut down on this.

Once I turned on channel-fly (for a few hours) and the number of AP channel changes became ridiculous.

I did not get any end-user complaints but (as Keith points out) this is no proof that problems did not result.

Ruckus says this should calm down after a day, or a few days.

IMHO: ChannelFly is more of a "carrier" type feature, useful in metro-wifi deployments, or in known "dirty" wifi environments where you'd expect to be competing w/ other wireless APs, where momentary client outages are expected/reasonable, and having your APs pick non-standard wifi channels is a reasonable strategy.

In a "cleaner" "corporate" "enterprise" type environment, ChannelFly seems undesirable.

Speaking of metrics... I did a search on Keith's name and "metrics" and got no significant results.
Also, I talked to a few resellers before making my Ruckus purchase re: some way (any way) to get an objective measure of how Ruckus wifi performed in my environment vs how the existing Cisco stuff performed.

Other than the "you should try a whole bunch of clients at the same time" style of stress test I got no useful feedback.
(for example, how do I *know* that an auditorium full of clients is working or not when I *know* I can't count on any of them bothering to report a problem?)

If anyone has suggestions re: gathering metrics for Ruckus, Cisco or other wifi networks, please let me know.

-Bill

Keith - Pack Leader

  • 860 Posts
  • 50 Reply Likes
Urgh? "Speaking of metrics... I did a search on keith's name and "metrics" and got no significant results. "

I'm 1.9 meters tall for the record.

Bill Burns, AlphaDog

  • 203 Posts
  • 38 Reply Likes
I was searching "Keith Parsons", but thanks for the initial data point.

When comparing to other manufacturers, should I prefer wireless reps who are taller or shorter than 1.9 meters?

Keith - Pack Leader

  • 860 Posts
  • 50 Reply Likes
I think you want: http://www.wlanpros.com/wi-fi-stress-...

(Further to the record I am slightly taller than the other Keith..but he is much wiser)

Bill Burns, AlphaDog

  • 203 Posts
  • 38 Reply Likes
Thanks for the link.
That's useful info. (which I've seen before)

But, as-per the doc:
"This test is NOT the best possible way to evaluate access points"
"This is a simple test"
"This isn’t reflective of real-world"

Did one of my posts get lost here?...
Let me try to reconstruct it:

What I'm really looking for (re: metrics) is something that will tell me how *my* wifi network is working.

I gather a lot of data, but actionable results are hard to come by.

Ideally (when I was comparing products and asking resellers about this) I would have found a tool that provided information like:
how many clients are associated, how many clients are trying to associate (but can't), how many clients are sleeping/power-saving/not-talking, how many clients are working well, how many clients are getting poor throughput/performance.
Bonus points for protocol or spectrum analysis that was correlated to these things.

Ideally, I'd be able to take a 3rd party tool like this, test w/ one set of AP gear, then test w/ another set of AP gear, compare results and then make a purchasing decision.
I'm pretty sure I'd have given my business to whatever reseller could have provided an objective testing method like this.
(as opposed to the "why don't you test with a lot of clients" kind of stress test)

I hear that wireless intrusion prevention tools may provide some of these features, but I have no specific info.
(and would like to hear more about it if anyone has this knowledge)

Now that I'm post-purchasing-decision, I'd still like to gather information to monitor the health+performance of my network because I know I can't count on my users to report wifi problems.

Jeff Roback

  • 25 Posts
  • 8 Reply Likes
Unfortunately, I'm not sure any sort of predictive analysis is possible.... No matter what you capture during testing, the air space won't be a constant, so there's no way to exactly simulate during a test what will happen during live scenarios. Unless it's a high-security installation where people check all RF devices at the door, there's no way to predict what the air will look like from one day to the next in terms of people showing up with MiFi APs, Bluetooth, etc. Or when everyone in the room will suddenly turn their iPad sideways because the presenter asks them to.... And then there's the variable of Mac/Win driver versions and iOS/Android versions... each of these is going to change hour by hour, forever. And of course in the process, Heisenberg's Uncertainty Principle kicks in: by the very act of polling individual devices or installing log capture agents you're modifying the environment, so you're still not capturing what would have been happening if you weren't monitoring.

But more importantly, I don't think it would really be possible to even do a definitive ongoing analysis of the environment. It would require an agent on every device to capture both client-side and AP-side logs and/or SNMP data and then cross-reference them, which would be impractical in all but the most controlled environments.

In my experience in a variety of different environments, WiFi has become very much an Art+Science endeavor again. The best we've been able to do is extensive testing during installation to confirm throughput (the Ruckus iOS tool works well for this). Post-install, we regularly look for clients with poor RSSI/SNR and check that we've got sufficient coverage over the noise floor where the clients are operating.

Ruckus has posted a detailed SNMP monitoring best practices document, but we haven't spent too much time investigating this yet. Syslog parsing would be interesting, but there's currently a bug/feature that spams the syslog with cluster events, so that's tough to do. Hoping that gets fixed someday. (Right now there's a funny tech note that says if you don't want 25,000 messages per hour telling you the cluster is OK, just set the log level to critical only so you don't see error messages https://support.ruckuswireless.com/an... ).

All this being said, we've been EXTREMELY successful with well-configured Ruckus gear. Where other vendors' solutions have left end users with constant frustration, post-Ruckus deployment the wireless just works.

Bill Burns, AlphaDog

  • 203 Posts
  • 38 Reply Likes
re: predictive analysis:
That's not what I'm looking for.

I want point-in-time analysis, either by SNMP/syslog or by way of some external tool. An external tool could be a wifi sniffer type device, and/or something that analyzes traffic coming through the wired ports the APs are connected to.

re: RSSI
Unless there's been a very recent update, the reported "RSSI" values are actually SNR values, which hoses-up geo-location if you're expecting real RSSI.
(maybe that's what you were referring to by "RSSI/SNR"?)
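
For anyone feeding these values into a location tool, the relationship is simple: SNR in dB is the received signal level in dBm minus the noise floor in dBm. A minimal sketch of the conversion, assuming you can obtain or estimate the noise floor (the -95 dBm figure below is an illustrative assumption, not a Ruckus value):

```python
def snr_to_signal_dbm(snr_db, noise_floor_dbm):
    """Recover an absolute signal level (dBm) from an SNR reading (dB)."""
    return snr_db + noise_floor_dbm


# Example: a client reported at "35" that is really SNR, with an
# assumed noise floor of -95 dBm, has a signal level of -60 dBm.
print(snr_to_signal_dbm(35, -95))
```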

re: SNMP best practices:
Huh? What?

I'm not aware of an SNMP best practices document.
Is this included in some other document?
Please send me a link if you have one.

I've been using the scripts here:
https://github.com/bot779
to gather SNMP data and analyze logs.

ruckus-zdroamers reads syslog data (it takes a long time to run) and shows me which devices are roaming the most.
There's certainly a large volume of syslog messages, but aside from having to wait for results, I haven't found that to be an issue.
If I first specify an individual client or AP for analysis, the results appear very quickly.

getruckus uses SNMP and shows me which clients are associated to which APs.
I schedule this via cron to log the number of clients I have per AP over time.
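
The logging step can be sketched roughly as follows. This is not the getruckus script; it assumes you already have a list of (client MAC, AP name) pairs from whatever SNMP poll you use, and the function names, CSV layout, and sample data are all hypothetical.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def clients_per_ap(associations):
    """Tally associated clients per AP from (client_mac, ap_name) pairs."""
    return Counter(ap for _mac, ap in associations)


def append_snapshot(out, associations, now=None):
    """Write one CSV row per AP: timestamp, AP name, client count."""
    stamp = (now or datetime.now()).isoformat(timespec="seconds")
    writer = csv.writer(out)
    for ap, count in sorted(clients_per_ap(associations).items()):
        writer.writerow([stamp, ap, count])


# Made-up poll result, for illustration only.
poll = [
    ("aa:bb:cc:00:00:01", "ap-lobby"),
    ("aa:bb:cc:00:00:02", "ap-lobby"),
    ("aa:bb:cc:00:00:03", "ap-gym"),
]

# In a cron job this would be a file opened in append mode.
buf = io.StringIO()
append_snapshot(buf, poll, now=datetime(2000, 1, 1, 12, 0))
print(buf.getvalue(), end="")
```

Running something like this on a schedule gives one row per AP per poll, which is easy to graph or to scan for APs that are consistently overloaded or empty.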

ruckusconf and buildclientos.txt
automate the downloading of arbitrary Ruckus command output and build a list of client MAC addresses, showing the operating system running on each client.

The ruckus wifi situation here is relatively good (judging by user feedback) but I'm still chasing methods to "prove" that's the case since I can never count on users giving adequate feedback.

(example being an auditorium of 200 to 300 ppl w/ flakey pre-ruckus wifi but no "complaints")

Rob Coote

  • 37 Posts
  • 6 Reply Likes
I too experienced some problems with the early iterations of CF, most notably APs changing channels way too frequently (20-30 times per hour). It has since been disabled, and I may turn it back on now that we are on version 9.7 just to see if any improvement has been made.

GT Hill, Employee

  • 1 Post
  • 0 Reply Likes
ChannelFly is definitely not something that should or needs to be turned on in every situation. It really shines in heavily congested Wi-Fi areas. For example, when there are a lot of other access points / networks it really rocks. In cleaner, non-high density environments it typically doesn't need to be enabled.

A new advancement in CF is an MTBC (Mean Time Between Change) setting. You can set it to not change channels as often.
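
To judge whether a setting like this is having an effect, you can compute the observed mean time between channel changes for an AP from its logged change times. A minimal sketch follows; it only measures what the logs show and makes no claim about how the MTBC setting works internally. The timestamps and the 10-minute threshold are made up.

```python
from datetime import datetime, timedelta


def mean_time_between_changes(timestamps):
    """Average gap between consecutive channel changes, or None if < 2 events."""
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        return None
    # The sum of consecutive gaps equals last minus first.
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)


# Made-up change times for one AP, for illustration only.
changes = [
    datetime(2000, 1, 1, 9, 0),
    datetime(2000, 1, 1, 9, 2),
    datetime(2000, 1, 1, 9, 6),
    datetime(2000, 1, 1, 9, 12),
]

observed = mean_time_between_changes(changes)
print(observed)
if observed is not None and observed < timedelta(minutes=10):
    print("this AP is changing channels more often than every 10 minutes")
```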

My opinion is that if your network is operating fine without it, don't turn it on. But in cases where interference is prevalent, ChannelFly is a network saver.

GT Hill

Primož Marinšek, AlphaDog

  • 413 Posts
  • 48 Reply Likes
For 24/7 network monitoring and performance I would suggest you check this out

http://www.7signal.com/

Bill Burns, AlphaDog

  • 203 Posts
  • 38 Reply Likes
I opened another question here:
https://forums.ruckuswireless.com/ruc...

Please add whatever experiences you have with 7signal there.

Thanks.

Keith - Pack Leader

  • 860 Posts
  • 50 Reply Likes
There may be a "best of both worlds" opportunity in that 9.7 adds a feature to have ChannelFly take effect when the AP first boots and then shut down after n minutes (user selectable). See below


Monnat Systems, AlphaDog

  • 704 Posts
  • 150 Reply Likes
There should be something similar for standalone APs too...

DSE

  • 59 Posts
  • 3 Reply Likes
I have used this option for a while, as Keith suggested, with good results. I set it to 1 hour.

ThX

  • 128 Posts
  • 2 Reply Likes
I cannot find the "Turn off channelfly if APs uptime is more...." option in 9.8.2.0 build 15

Anyone?

Max O'Driscoll, AlphaDog

  • 301 Posts
  • 70 Reply Likes
In the ZD GUI: Configure, Access Points, Access Point Groups... choose the group, edit... the 18th item is the ChannelFly setting.


Bob Williamson

  • 21 Posts
  • 2 Reply Likes
Keith,

I have enabled that "turn off channelfly if APs uptime is more than x minutes" and it works fine.

What is not clear to me is:

Does the AP continue to use background scanning after channelfly is turned off?

Do the APs benefit from Channelfly and choose the best channel based on the channelfly statistics?

etc.

Thanks,
Bob

Bill Burns, AlphaDog

  • 203 Posts
  • 38 Reply Likes
Here's my unofficial (best guess) info:

Information from background scanning of other channels (if it even occurs) is not used to initiate a channel change when channelfly is turned off.
(Note: I believe that ChannelFly does attempt to do some predictive stuff, e.g. if channel "n" was congested for the last 2 days at noon, then avoid being on that channel today at noon.)

However, when "interference" is detected, an AP might decide to do a channel change and then do a quick background scan before making that change.

AFAIK: these decisions are made entirely by each AP without consulting other APs or the controller.

Time for another entry on the wish list:
I wish there was a mode where the controller had more "control".

Chris Roberts

  • 3 Posts
  • 0 Reply Likes
In 9.7, after ChannelFly runs for 10 minutes, does background scanning start?

JB

  • 2 Posts
  • 1 Reply Like
I think the new ability to turn ChannelFly off after a period of time may be a good solution. You should now be able to get the benefit of ChannelFly, but then effectively stop it from continuing (unless you want it to). My experience in hospitality has shown that too many client devices still lack 802.11h capability, which affects their ability to work well with the sometimes rapid channel changes caused by ChannelFly.

I am a big fan of ChannelFly, but find old client devices appear to be the weak link and those are the ones typically complaining of connectivity drop-outs when CF is running.

With this new control, you should be able to have your cake and eat it too. Let CF run for a measured period of time, then disable it without having to remember to do it manually.

Jeff Roback

  • 25 Posts
  • 8 Reply Likes
I'm wondering if some of this behavior is hardware dependent. I've found recently that the 7363s seem to completely ignore the mtbc command. This is a big deal in standalone units since you can't turn ChannelFly off. We run into this when we're doing 1 or 2 APs in very small SMB installs or executive residential installs where a ZD isn't practical.

Is there a CLI equivalent for the turn-off-after-x-minutes option in the standalone APs? If so I'll give it a try.

Keith - Pack Leader

  • 860 Posts
  • 50 Reply Likes
Doesn't look like there's a CLI equivalent. Are you running the latest release? You can assign static channels - which has the effect of turning ChannelFly off.