R500 APs keep losing heartbeat, disconnecting and then reconnecting each day at the same time. Why?

  • Question
  • Updated 8 months ago
  • Acknowledged
Hi. My R500s keep disconnecting and then reconnecting each day at the same time (6:01pm) according to the monitor log. It's really odd, and they only started doing this recently. Perhaps I made a configuration change that caused this? I don't use mesh. I have 3 R500s and a ZD 1200 connected to Verizon FIOS Gigabit internet, and I have set up separate 2.4 GHz and 5 GHz networks. Thanks

Christopher Hayward

  • 8 Posts
  • 1 Reply Like

Posted 10 months ago


Michael Brado, Official Rep

  • 2990 Posts
  • 415 Reply Likes
It might take a wired packet trace between the AP and the controller, captured while these lost heartbeats are being reported, to get the clearest picture. Otherwise, do you see anything in the controller logs?

Michael Brado, Official Rep

  • 2990 Posts
  • 415 Reply Likes
From your ZD's Administer/Diagnostics page, click the link under AP Logs and save the file.
Search it for your AP's MAC address with an editor like Notepad++, and you should see connect/disconnect events.
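
If you would rather script the search than use an editor, here is a minimal Python sketch. It assumes only that each event sits on its own line and that the line contains the AP's MAC address; the function name, the MAC address, and the sample lines are made-up placeholders, not real ZD output, and the actual log format may differ.

```python
def lines_for_mac(lines, mac):
    """Return the lines that mention `mac`, ignoring case and ':' vs '-' separators."""
    def normalize(text):
        return text.lower().replace("-", ":")

    needle = normalize(mac)
    return [line.rstrip("\n") for line in lines if needle in normalize(line)]


# Placeholder lines for illustration only; they do not reproduce the real log format.
sample = [
    "placeholder event A for 00:11:22:AA:BB:CC\n",
    "placeholder event B for 00:11:22:dd:ee:ff\n",
    "placeholder event C for 00-11-22-aa-bb-cc\n",
]

matches = lines_for_mac(sample, "00:11:22:aa:bb:cc")
for line in matches:
    print(line)
```

To run it on a saved file, something like `lines_for_mac(open("ap_logs.txt", errors="replace"), "your-ap-mac")` should work, where `ap_logs.txt` is whatever name you saved the download under.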

Jeronimo

  • 337 Posts
  • 40 Reply Likes
Exactly 6:01? Every day at the same time?

Michael Brado, Official Rep

  • 2976 Posts
  • 415 Reply Likes
We'd need a wired network trace too. Is that the time the file servers at your company all do their backups, etc.? It sounds like something (multicast?) could be flooding the LAN at that time each day.

Christopher Hayward

  • 8 Posts
  • 1 Reply Like
Yes, every day at the same time. So weird. Did I mess up a configuration or something? I will check the logs too. Thanks

Jeronimo

  • 321 Posts
  • 37 Reply Likes
I think the same as Michael.

Check for excessive multicast packets at that time through packet mirroring.

Michael Brado, Official Rep

  • 2864 Posts
  • 399 Reply Likes
And not likely your configuration, if you didn't change anything.
Next question: where are the APs located relative to your ZD? Are they on the same LAN in the same building, or remote over a WAN link?
(Meaning, could it be just the remote APs that are disconnecting?)
Or, if the ZD and APs are all on the same LAN at the same location, are there other servers/services on the same subnet that could be broadcasting or transmitting large files around 6pm?

Christopher Hayward

  • 8 Posts
  • 1 Reply Like
My setup at home is simple: the ZD1200 and 3 wired R500s. No mesh. It has worked well for 3+ years. The only recent big change was upgrading to FIOS Gigabit internet speed, which is handled well by both my wired and wireless devices. I don't have any daily 6pm file transfers or anything like that. The logs just say heartbeat lost, and then the APs reconnect quickly, so there is no real service impact except for that one minute or so.
Thanks

Michael Brado, Official Rep

  • 2955 Posts
  • 414 Reply Likes
Hi Christopher, yep, that sounds like a simple home LAN, and I'm not sure why the APs would report that they are disconnecting and reconnecting.
Please pick the MAC address of one of your APs from the Monitor/Access Points page and enter it on the Administer/Diagnostics page, under the Debug Logs section, with the Access Points box checked, then click Apply.
Tomorrow (or when you see a bounce), click Save System Log under the System Logs section to get a file you can search with Notepad++.
You can also 'click here' and look for your AP activity in the box.
We'd like to know what events are occurring around 6pm, so take a look at 6:15 or so.

Christopher Hayward

  • 8 Posts
  • 1 Reply Like
Hi.  So I decided to reboot or power cycle my ZD 1200 to see if it might help.  Everything came back up online as it should including the 3 wired APs.  But now my 3 R500 APs lose heartbeat and reconnect at the time I power cycled the ZD!  So strange.  I did the original ZD power cycle at 9:27am and now for the last couple days the APs go offline and come right back online at that time.  I can't understand this behavior - so odd.  Any ideas?  Thanks.

Christopher Hayward

  • 8 Posts
  • 1 Reply Like
Hi.  The daily disconnecting and reconnecting of my 3 APs is still happening.  Each day precisely at the same time.  This is a home setup so nothing odd going on on my network. 
Example from my monitor logs:

2019/02/08  19:13:33  Medium  AP[Family Room] joins with uptime [2851541] s and last disconnected reason [Heartbeat Loss]
2019/02/08  19:13:31  Medium  AP[Susan Office] joins with uptime [2851512] s and last disconnected reason [Heartbeat Loss]
2019/02/08  19:13:22  Medium  AP[Harrison Room] joins with uptime [2851486] s and last disconnected reason [Heartbeat Loss]
2019/02/08  19:13:13  Medium  AP[Family Room] heartbeats lost
2019/02/08  19:13:11  Medium  AP[Susan Office] heartbeats lost
2019/02/08  19:13:02  Medium  AP[Harrison Room] heartbeats lost
2019/02/07  19:13:16  Medium  AP[Susan Office] joins with uptime [2765099] s and last disconnected reason [Heartbeat Loss]
2019/02/07  19:13:15  Medium  AP[Family Room] joins with uptime [2765124] s and last disconnected reason [Heartbeat Loss]
2019/02/07  19:13:03  Medium  AP[Harrison Room] joins with uptime [2765069] s and last disconnected reason [Heartbeat Loss]
2019/02/07  19:12:56  Medium  AP[Susan Office] heartbeats lost
2019/02/07  19:12:53  Medium  AP[Family Room] heartbeats lost
2019/02/07  19:12:43  Medium  AP[Harrison Room] heartbeats lost


Any ideas?
Many thanks.
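
To put a number on "precisely at the same time", the "heartbeats lost" entries above can be compared day to day. The following is a rough Python sketch, not a ZD tool: the regular expression is my own guess at the line layout based only on this excerpt (it allows the severity to be fused to the timestamp or separated by whitespace), and the function names are illustrative.

```python
import re
from datetime import datetime

# "heartbeats lost" entries taken from the monitor log excerpt in this post.
LOG = """\
2019/02/08  19:13:13Medium AP[Family Room] heartbeats lost
2019/02/08  19:13:11Medium AP[Susan Office] heartbeats lost
2019/02/08  19:13:02Medium AP[Harrison Room] heartbeats lost
2019/02/07  19:12:56Medium AP[Susan Office] heartbeats lost
2019/02/07  19:12:53Medium AP[Family Room] heartbeats lost
2019/02/07  19:12:43Medium AP[Harrison Room] heartbeats lost
"""

# Assumed layout: date, time, severity word, AP[name], then the event text.
PATTERN = re.compile(
    r"(\d{4}/\d{2}/\d{2})\s+(\d{2}:\d{2}:\d{2})\s*\w+\s+AP\[(.+?)\]\s+heartbeats lost"
)


def loss_times(text):
    """Map each AP name to its sorted list of heartbeat-loss timestamps."""
    times = {}
    for date, clock, ap in PATTERN.findall(text):
        stamp = datetime.strptime(f"{date} {clock}", "%Y/%m/%d %H:%M:%S")
        times.setdefault(ap, []).append(stamp)
    return {ap: sorted(stamps) for ap, stamps in times.items()}


def intervals_seconds(text):
    """Seconds between consecutive heartbeat losses, per AP."""
    return {
        ap: [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]
        for ap, stamps in loss_times(text).items()
    }


intervals = intervals_seconds(LOG)
for ap, gaps in intervals.items():
    print(ap, gaps)
```

For these six entries the gaps come out at 86420 s, 86415 s and 86419 s, i.e. within about 20 seconds of exactly 24 hours (86400 s) for all three APs.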

Tony Heung, Official Rep

  • 11 Posts
  • 3 Reply Likes
Hi Christopher,

From reading the above logs and checking the uptime: for example, AP [Susan Office] on Feb 7th reports 2765099 s. This equals 32 days of uptime, meaning the last time you rebooted the ZD and APs was 32 days before Feb 7th, at about 19:13 (assuming you had NTP and the timezone set up correctly).

And the next event for the same AP, on Feb 8th, has an uptime of 2851512 s, which is 33 days, i.e. ~86400 seconds after the first event.
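
As a quick check of that arithmetic, here is a small Python sketch using the two uptime values logged for AP [Susan Office] on Feb 7th and Feb 8th. The variable names are just for illustration.

```python
SECONDS_PER_DAY = 86400

# Uptime values (seconds) reported for AP[Susan Office] in the log excerpt.
uptime_feb7 = 2765099
uptime_feb8 = 2851512

days_feb7 = uptime_feb7 / SECONDS_PER_DAY
days_feb8 = uptime_feb8 / SECONDS_PER_DAY
gap = uptime_feb8 - uptime_feb7

print(f"Feb 7 uptime: {days_feb7:.2f} days")   # 32.00 days
print(f"Feb 8 uptime: {days_feb8:.2f} days")   # 33.00 days
print(f"Gap: {gap} s ({gap - SECONDS_PER_DAY:+d} s versus 24 h)")  # 86413 s (+13 s)
```

The two events are 86413 s apart by the AP's own uptime counter, which is within 13 seconds of one day.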

I would interpret this as the AP being up and running, serving clients with no service interruption (unless you use tunnel mode; otherwise, local breakout traffic is not affected by an AP management IP issue). It is just that the ping to the AP management IP address is somehow lost.

As the gap is almost exactly 86400 seconds, my one guess is the DHCP lease time. Are you running the DHCP server on the ZD to provide IP addresses to the APs? If so, please check the DHCP parameters, particularly the lease time. You can find it under Configure -> System -> DHCP Server -> Lease Time. The available options are 6-hr, 12-hr, 1-day, 2-day, 1-week or 2-week. If it is currently set to 1 day, it is highly likely that the loss of heartbeats is due to the expiry of the DHCP lease on the AP's IP address.

If that is the case, I suppose you have a flat network and all your home devices use the ZD as the DHCP server. Then check whether the pool size is big enough. Even at the end of the lease, the server should still allow the AP to keep its original IP address rather than force it to renew.

Let's see whether this is heading in the right direction toward the root cause before we discuss the best option going forward, including using an external DHCP server (e.g. the Verizon router).

--tony