STFC SOC Project: network meeting

Europe/London

Tuesday 21 Sept 2021   9:30 – 10:40

 

STFC SOC technical Meeting Minutes

Attending

[David Crooks - DC]                           [Jonathan Churchill - JC]

[Greg Corbett - GC]                           [Alastair Dewhurst - AD]

[James Adams - JA]                           [Anish Mudaraddi - AM]

[Philip Garrad - PG]                           [Ian Colier - IC]

[Olivier Restuccia - OR]

Announcements

This will be a weekly meeting with a rotating set of attendees depending on needs

Agenda for today: go through where we are and what needs to be done next

Discussion

Where the rack will go and how to access it:

DC: bringing taps to HPD room, use of rack 146 in UPS room for hardware

PG: yes, if you can get to the HPD room you should be able to get to the UPS room as well

DC: rack location is final thing to decide with AD and Martin, need to check that there is capacity to run the compute load from the UPS room

JA: the rack should have different locks to the other racks and there should be a procedure for key handling

PG: good idea to have an audit trail, room is fairly well covered by CCTV

AD: rack 146 is under Martin Bly’s control, nothing powered on in that rack, in a good location, is empty, would add the issue of space if HPD was chosen. Are there any power limits to follow in UPS room – looking at 5-6kW for the rack?

JC: 6kW is at the top end maybe? Need to check with Ops

AD: the WNs have a power use of 2kW per node, which may push power use higher
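Note: the power figures above can be sanity-checked with a few lines of arithmetic. The following is a minimal sketch only, assuming the quoted 2kW is a per-node peak and the 5-6kW range is the whole-rack limit; neither is confirmed in these minutes, and the function names are hypothetical.

```python
# Rough rack power budget check.
# ASSUMPTIONS (not confirmed in the minutes): 2 kW is a per-node peak draw,
# and 5-6 kW is the limit for the whole rack. Function names are illustrative.

def max_nodes_within_budget(budget_w: int, per_node_w: int) -> int:
    """Return how many whole nodes fit under a rack power budget."""
    if per_node_w <= 0:
        raise ValueError("per_node_w must be positive")
    return budget_w // per_node_w


def rack_draw_w(node_count: int, per_node_w: int) -> int:
    """Return the total draw in watts for node_count identical nodes."""
    return node_count * per_node_w


PER_NODE_W = 2000  # 2 kW per WN, as quoted by AD

for budget_w in (5000, 6000):  # the 5-6 kW range under discussion
    fits = max_nodes_within_budget(budget_w, PER_NODE_W)
    print(f"{budget_w} W budget: {fits} node(s) at {PER_NODE_W} W each")
```

Working in whole watts keeps the division exact. The per-node figure should be replaced with a measured value once Ops confirm the limit.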

JC: thought the plan was to use an HPD rack – Jasmin is releasing racks. In UPS, need to see if that rack is next to another conveniently empty one. Rack on end of row 9 and row 10

 

PG: how easy would this be in the UPS room?

AD: based on previous Tier1 network knowledge and getting connections from HPD to site core, is mainly a logistical issue

PG: might be a balancing act, don’t rule out HPD room, fiber will need to be laid wherever they’re put

JC: J1 245 is full of ancient equipment, 269 opposite may also be free

PG: will look at ease of cabling to 146 in UPS room and 245 in HPD room; that plus availability will be the basis on which to decide which rack to use. Info by end of week

 

Proposed rack layout:

 

DC: traffic brought in from taps to SN2700 then split to the 12 zeek WNs; instead of 25Gbit into each node, this would allow each node to have 50Gbit full duplex. Each node also has a 1Gbit link, split into two virtual interfaces, which would go to the S3048 and then the core network firewall. Nothing going into SN2700 will be visible from the outside. Elasticsearch storage hosts currently being procured.
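Note: the ingest figures in this layout can be tallied as below. This is a rough sketch, assuming both 25Gbit ports on each of the 12 WNs are used for ingest and that the node links are provided by 4x25Gbit breakouts of the SN2700's 100Gbit ports; the breakout factor and the function name are assumptions, not decisions recorded here.

```python
import math

# Tally of ingest bandwidth and switch ports for the proposed layout.
# ASSUMPTIONS (not decided in the minutes): both 25 Gbit ports per node carry
# ingest traffic, and each 100 Gbit switch port is broken out into 4 x 25 Gbit.


def ingest_summary(node_count: int, ports_per_node: int, port_gbit: int,
                   breakout_factor: int = 4) -> dict:
    """Return per-node and aggregate ingest capacity plus switch port usage."""
    per_node_gbit = ports_per_node * port_gbit
    node_facing_ports = node_count * ports_per_node
    return {
        "per_node_gbit": per_node_gbit,
        "aggregate_gbit": node_count * per_node_gbit,
        "node_facing_ports": node_facing_ports,
        # Round up: a partly used breakout still occupies a whole switch port.
        "switch_ports_needed": math.ceil(node_facing_ports / breakout_factor),
    }


summary = ingest_summary(node_count=12, ports_per_node=2, port_gbit=25)
for key, value in summary.items():
    print(f"{key}: {value}")
```

With these inputs each node gets 50Gbit and the 12 nodes need 24 node-facing 25Gbit links, which is 6 switch ports under the 4x25Gbit breakout assumption.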

DC: also getting two contingency head nodes 

DC: a consequence of using both 25Gbit ports in the zeek WNs for ingestion is that they are unavailable for transfer to Elasticsearch. What about using the 1Gbit link instead?

JC: depends if connection to ES is in-band or not? If it is then could be fine but if not then no. Could create VLANs using the 25Gbit links instead?

JA: for the ones we are ordering probably, the ones we have now no.

DC: also think about this in the context of the plan for the zeek WNs: worst case, use 25Gbit for ingest and 25Gbit for internal comms. This has implications for the spec of the new zeek WNs and may require flexibility in these network cards

PG: could separate the 25Gbit into 2 VLANs

DC: need to maintain separation of traffic

AD: could buy extra 10Gbit network cards if that does not cause an issue in the config, as there should be plenty of space in the node.

DC: storage nodes are fine (waiting for final memory spec), only the new zeek WN need to be discussed: new network cards need to match the switch and what is needed for SN2700

JC: going into SN2700, want the 25Gbit port as makes it easier, if 1Gbit then need the S3048 port

AD: if going to SN2700, that has 100Gbit port and will get 25Gbit network card, S3048 will have 1Gbit network card

DC: SN2700 used for all internal traffic, S3048 only for admin and operator access, so add 25Gbit network cards to zeek WN spec

PH: SN2700 traffic only running in Layer2, and layer3 goes to S3048?

JC: SN2700 may need another port and VLAN if layer 3 info is needed

AD: is this a good way to do things? Need confirmation from more experienced person

JC: just get both a 25Gbit card and a 1Gbit card; even if you don’t use one of them, you can’t go wrong as you have all the options available.

PG: agree with JC, should also draw physical diagram (with ports and fibers) and another one with layer 2 detail on it for security purposes

DC: current plan is that Tier1 will lend us a set of WNs to use in the 1st phase of deployment, then as equipment arrives (storage hosts and head nodes), the block would belong to SOC and we would configure zeek WNs. End goal is that everything will be bought for purpose except the routers and will be badged for SOC use.

AD: sounds right, if not enough money then Tier1 hardware may become more permanent

 

Timescale for ordering hardware to use and pay for this FY

 

AD: order needs to be sent by mid-Oct at latest, with feedback from people by 1st Oct, or the hardware may not arrive in time. About one week’s grace period to change details once the order is placed.

Essentially copied Nikhef’s hardware (slightly newer gen)

DC: have another meeting with Liviu to discuss hardware details before 1st Oct

 

Actions

 

AD: check if there is enough capacity to run the rack in the UPS room

AD: Add 25Gbit and 1Gbit network cards to zeek WNs

DC: set meeting with Liviu to discuss final hardware specs

Draw additional diagrams: one physical diagram and one with layer 2 detail. DC to send current diagram in Visio to PG

PG: check ease of cabling to 146 in UPS room and 245/269 in HPD room, answer by end of week

 

 

    1. Summary of work so far
    2. Network Planning
      • Philip is checking the fibre runs from R26 to the HPD room
        • If these are in place, can we run taps from the Janet links and OPN link to the HPD room
        • This would allow the SOC cluster to be located in the HPD room
      • IF SO: draw diagrams identify cabling etc.
      • IF TIME: SOC cluster rack network design
    3. Next steps
      • Weekly technical meetings proposed for Tuesday 0930 (likely rotating attendance)
        • Starting Tuesday 21st
        • Final technical design review
      • Technical Design initial lock: Wednesday 22nd