76 Messages

 • 

1.1K Points

Mon, Aug 21, 2017 11:01 AM

Creating a 2-Node SCI Setup (Master Node + Data Node)

Hi, 
I have a few questions about a 2-node SCI setup where we have more than 3000 APs.

1.) Do both SCI nodes (master and data) have a web GUI? 
2.) Do we have to log into both SCI nodes to query data? 
3.) How does the controller select which SCI node to send data to? 
4.) Do these two nodes hold replicated data or unique data?
5.) Once an AP is sending data to one of the SCI nodes, is that assignment fixed permanently? 
6.) Can we have the master and data nodes in two geographical locations?
7.) Do we have to configure both SCI nodes in the controller? 

Thanks
Pamuditha

Responses

43 Messages

 • 

780 Points

3 years ago

Hi Pamuditha,

Quick reply below.

1.) Do both SCI nodes (master and data) have a web GUI? 
   => No, only the master node has a GUI. The data node is a headless node used to parallelise the load (though it does have a simple GUI for setup).

2.) Do we have to log into both SCI nodes to query data? 
   => No. You only have to access the master node.

3.) How does the controller select which SCI node to send data to? 
   => Job distribution between nodes is handled automatically by SCI, so it's transparent to the controller.

4.) Do these two nodes hold replicated data or unique data?
   => They will have replicated data for redundancy. The replication factor is 3 for SCI 3.1.

5.) Once an AP is sending data to one of the SCI nodes, is that assignment fixed permanently? 
   => APs do not send data directly to SCI. Data is aggregated at the controller, and then either SCI retrieves the data from the controller (for ZD and SZ below 3.4.1) or the controller pushes the data to SCI (SZ 3.5 and above).

6.) Can we have the master and data nodes in two geographical locations?
   => Yes, as long as there are no issues with network connectivity. Please refer to the Installation Guide for the full list of ports that have to be opened.

7.) Do we have to configure both SCI nodes in the controller? 
   => No, you don't have to.

76 Messages

 • 

1.1K Points

Hi See Ho Ting, 

Does a replication factor of 3 mean that the same data is copied to 3 nodes, or is it a form of replication similar to RAID? 
What I am thinking is: if I keep 1 physical server with (Master + Data) and 1 physical server with (Data + Data), is that fine for data redundancy? 

Thanks

43 Messages

 • 

780 Points

You are correct on the replication. From SCI 3.2 onwards, we have reduced the replication factor to 2, mainly to reduce storage requirements.

>>What I am thinking is: if I keep 1 physical server with (Master + Data) and 1 physical server with (Data + Data), is that fine for data redundancy? 
This is correct for SCI 3.1 and below, since the replication factor is 3: with only two nodes per physical server, three copies on three different nodes must span both servers. For SCI 3.2 onwards, with a factor of 2, the answer is yes and no. This is because HDFS may assign both copies of a data block to two nodes that are on the same physical server.
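
To make the replica placement argument concrete, here is a minimal Python sketch that enumerates every way copies of one block could be spread over the four nodes in the layout from the question. The node and server names are hypothetical, and the sketch assumes the master node also stores data blocks and that no rack awareness is configured, neither of which is confirmed in this thread. It only relies on each copy of a block living on a different node.

```python
from itertools import combinations

# Hypothetical layout from the question: two physical servers, two SCI nodes each.
# Assumption: the master node also stores data blocks.
NODE_TO_SERVER = {
    "master": "server-1",
    "data-1": "server-1",
    "data-2": "server-2",
    "data-3": "server-2",
}


def placements_on_one_server(replication_factor, node_to_server=NODE_TO_SERVER):
    """Return every replica placement whose copies all sit on one physical server."""
    risky = []
    # Each copy of a block goes to a different node, so a placement is a
    # combination of distinct nodes of size `replication_factor`.
    for nodes in combinations(sorted(node_to_server), replication_factor):
        servers = {node_to_server[node] for node in nodes}
        if len(servers) == 1:
            risky.append(nodes)
    return risky


for factor in (3, 2):
    risky = placements_on_one_server(factor)
    print(f"replication factor {factor}: {len(risky)} single-server placement(s) {risky}")
```

With a factor of 3 the list is empty, so losing one physical server never removes every copy. With a factor of 2 there are two placements where both copies share a server, which is the "yes and no" case.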

76 Messages

 • 

1.1K Points

3 years ago

Great, and thanks See Ho Ting. It really helps. 
One more thing. This question may depend on the number of APs in the system. Is there a recommended average value per AP for the items below? 
What is the expected bandwidth for communication between the master node and the data node? 
What is the expected bandwidth for communication between SCI and the vSZ? 

Thanks

43 Messages

 • 

780 Points

The bandwidth requirement will be about 15 kB per 15 minutes per AP. Of course, this is just a ballpark number. The actual number will depend on the number of clients, sessions, etc.
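
As a rough worked example, the ballpark figure can be scaled to the 3000 APs mentioned in the original question. This is only a sketch: the function name is made up for illustration, it treats 1 kB as 1000 bytes, and it gives an average rate rather than a peak. The reply above does not say which of the two links the figure applies to, so the result should be read as an order-of-magnitude estimate.

```python
KB_PER_AP_PER_INTERVAL = 15   # ballpark figure quoted above
INTERVAL_SECONDS = 15 * 60    # one 15-minute reporting interval


def estimated_bandwidth_kbps(ap_count,
                             kb_per_ap=KB_PER_AP_PER_INTERVAL,
                             interval_s=INTERVAL_SECONDS):
    """Average rate in kilobits per second, assuming 1 kB = 1000 bytes."""
    total_kilobytes = ap_count * kb_per_ap
    total_kilobits = total_kilobytes * 8
    return total_kilobits / interval_s


# 3000 APs * 15 kB = 45,000 kB (45 MB) every 15 minutes
print(estimated_bandwidth_kbps(3000))  # 400.0
```

So for 3000 APs the average works out to roughly 400 kbit/s, or about 45 MB every 15 minutes, before accounting for the number of clients and sessions.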

76 Messages

 • 

1.1K Points

Thanks See Ho Ting. 

76 Messages

 • 

1.1K Points

3 years ago

One additional query as well: with this kind of setup, what happens if the master node fails? 

43 Messages

 • 

780 Points

The master-data node cluster is not an HA setup, though it does provide data redundancy. So if the master node fails, SCI will stop collecting data, but the data in the data node can be retrieved once a separate master node is set up.

76 Messages

 • 

1.1K Points

So all communication with the nodes only happens through the master node? 
All the data fetched from the controller comes to the master node, and from there it is distributed between the master and data node DBs? 

43 Messages

 • 

780 Points

No, communication with the controller happens on all nodes, as the system automatically balances the load between them.