Tuesday 21 Sept 2021, 9:30 – 10:40
STFC SOC Technical Meeting Minutes
[David Crooks - DC] [Jonathan Churchill - JC]
[Greg Corbett - GC] [Alastair Dewhurst - AD]
[James Adams - JA] [Anish Mudaraddi - AM]
[Philip Garrad - PG] [Ian Collier - IC]
[Olivier Restuccia - OR]
This will be a weekly meeting with a rotating set of attendees, depending on need
Agenda for today: go through where we are and what needs to be done next
DC: bringing the taps into the HPD room; proposing use of rack 146 in the UPS room for the hardware
PG: if the taps can get to the HPD room, they should be able to get to the UPS room as well
DC: rack location is the final thing to decide with AD and Martin; need to check that there is capacity to run the compute load from the UPS room
JA: the rack should have different locks from the other racks, and there should be a procedure for key handling
PG: good idea to have an audit trail; the room is fairly well covered by CCTV
AD: rack 146 is under Martin Bly’s control; nothing is powered on in that rack, it is empty and in a good location. Choosing the HPD room instead would add the issue of finding space. Are there any power limits to follow in the UPS room – looking at 5-6 kW for the rack?
JC: 6 kW is maybe at the top end; need to check with Ops
AD: the WNs draw about 2 kW per node, which may push the total power use higher
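As a rough sanity check of the figures above (the 6 kW ceiling and ~2 kW per WN come from the discussion; the node counts and the other per-host figures are illustrative assumptions only):

```python
# Rough power-budget check for the proposed rack. The 6 kW rack limit and
# ~2 kW per worker node come from the discussion above; node counts and the
# per-host figures for storage, head nodes and switches are assumptions.
RACK_BUDGET_KW = 6.0

assumed_loads_kw = {
    "zeek_worker_nodes": 2 * 2.0,   # e.g. 2 WNs at ~2 kW each (count assumed)
    "storage_hosts":     2 * 0.5,   # assumed
    "head_nodes":        2 * 0.3,   # assumed
    "switches":          2 * 0.2,   # SN2700 + S3048, assumed
}

total_kw = sum(assumed_loads_kw.values())
print(f"Estimated draw: {total_kw:.1f} kW of {RACK_BUDGET_KW:.1f} kW budget")
if total_kw > RACK_BUDGET_KW:
    print("Over budget - check with Ops before committing to the UPS room rack")
```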
JC: thought the plan was to use an HPD rack – JASMIN is releasing racks there. In the UPS room, need to see whether rack 146 is next to another conveniently empty one; it sits at the end of rows 9 and 10
PG: how easy would the UPS room option be in practice?
AD: based on previous Tier1 network experience of getting connections from the HPD room to the site core, it is mainly a logistical issue
PG: might be a balancing act; don’t rule out the HPD room, as fibre will need to be laid wherever the machines are put
JC: J1 245 is full of ancient equipment, 269 opposite may also be free
PG: will look at the ease of cabling to rack 146 in the UPS room and rack 245 in the HPD room; that, plus availability, will be the basis for deciding which rack to use. Info by end of the week
DC: traffic is brought in from the taps to the SN2700 and then split across the 12 Zeek WNs; instead of 25 Gbit into each node, this would allow each node 50 Gbit full duplex. Each node also has a 1 Gbit link, split into two virtual interfaces, which goes to the S3048 and then the core network firewall. Nothing going into the SN2700 will be visible from the outside. The Elasticsearch storage hosts are currently being procured.
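A minimal sketch of the data paths just described, kept as data so it can be compared against the eventual diagrams; the 12-node Zeek pool and the link speeds come from the discussion, while hostnames and interface descriptions are placeholders:

```python
# Sketch of the data paths described above. Speeds and the 12-node Zeek pool
# are taken from the minutes; hostnames and interface names are placeholders.
ZEEK_WORKERS = [f"zeek-wn{i:02d}" for i in range(1, 13)]

capture_plane = {
    "taps":   {"feeds": "SN2700"},                      # tapped traffic only
    "SN2700": {"feeds": ZEEK_WORKERS,
               "link_per_worker": "2 x 25 Gbit (50 Gbit full duplex)"},
}

management_plane = {
    worker: {"nic": "1 Gbit split into two virtual interfaces",
             "uplink": ["S3048", "core network firewall"]}
    for worker in ZEEK_WORKERS
}

# Nothing arriving at the SN2700 should be reachable from outside;
# only the S3048 path touches the core network firewall.
```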
DC: also getting two contingency head nodes
DC: a consequence of using both 25 Gbit ports in the Zeek WNs for ingest is that they are unavailable for transfer to Elasticsearch. What about using the 1 Gbit link instead?
JC: depends on whether the connection to ES is in-band or not; if it is, that could be fine, but if not then no. Could VLANs be created on the 25 Gbit links instead?
JA: for the ones we are ordering, probably; for the ones we have now, no.
DC: also think about this in the context of the plan for the Zeek WNs: worst case, use one 25 Gbit port for ingest and the other for internal comms. This has implications for the spec of the new Zeek WNs and may mean their network cards need to be flexible.
PG: could separate the 25 Gbit link into two VLANs
DC: need to maintain separation of traffic
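If the 25 Gbit links were split with VLANs as PG and JC suggest, the separation could look roughly like the following; this is a sketch only, the interface name and VLAN IDs are invented for illustration, and it prints candidate iproute2 commands rather than applying anything:

```python
# Illustrative only: prints iproute2 commands that would carve a 25 Gbit port
# into two VLAN sub-interfaces, keeping ingest and Elasticsearch transfer
# traffic separated. The interface name and VLAN IDs are assumptions.
PARENT_IFACE = "ens2f0"                      # hypothetical 25 Gbit port on a Zeek WN
VLANS = {"ingest": 101, "es_transfer": 102}  # VLAN IDs are placeholders

for purpose, vlan_id in VLANS.items():
    sub_iface = f"{PARENT_IFACE}.{vlan_id}"
    print(f"# {purpose}")
    print(f"ip link add link {PARENT_IFACE} name {sub_iface} type vlan id {vlan_id}")
    print(f"ip link set {sub_iface} up")
```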
AD: could buy extra 10 Gbit network cards if that does not cause issues in the config, as there should be plenty of space in the nodes.
DC: the storage nodes are fine (waiting on the final memory spec); only the new Zeek WNs need to be discussed: the new network cards need to match the switch and what is needed for the SN2700
JC: going into the SN2700, want the 25 Gbit port as it makes things easier; if 1 Gbit, then the S3048 port is needed
AD: if going to the SN2700, that has 100 Gbit ports and the node will get a 25 Gbit network card; for the S3048 it will be a 1 Gbit network card
DC: the SN2700 is used for all internal traffic and the S3048 only for admin and operator access, so add 25 Gbit network cards to the Zeek WN spec
PG: is SN2700 traffic only running at Layer 2, with Layer 3 going to the S3048?
JC: the SN2700 may need another port and a VLAN if Layer 3 information is needed
AD: is this a good way to do things? Need confirmation from someone more experienced
JC: just get both the 25 Gbit card and the 1 Gbit card; even if one of them goes unused you can’t go wrong, and you have all the options available.
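To record the decision in a checkable form, a small summary of the per-WN NIC plan as discussed (not a final spec):

```python
# Per-host NIC plan as discussed: every new Zeek WN gets both cards so all
# options stay open. Wording follows the minutes; this is not a final spec.
nic_plan = {
    "zeek_worker_node": [
        {"nic": "25 Gbit", "uplink": "SN2700", "use": "ingest / internal traffic"},
        {"nic": "1 Gbit",  "uplink": "S3048",  "use": "admin and operator access"},
    ],
}
```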
PG: agree with JC; should also draw a physical diagram (with ports and fibres) and another with Layer 2 detail on it, for security purposes
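As a complement to the Visio drawing, the physical diagram could also be generated from text; a minimal sketch assuming the Python graphviz package and the Graphviz binaries are available, with labels taken from the discussion above and port details kept illustrative:

```python
# Minimal sketch of the physical diagram PG asked for, generated with the
# Python graphviz package (assumption: graphviz and its binaries are installed).
from graphviz import Digraph

g = Digraph("soc_physical", graph_attr={"rankdir": "LR"})
g.node("taps", "Network taps")
g.node("sn2700", "SN2700 (internal capture traffic)")
g.node("s3048", "S3048 (admin / operator access)")
g.node("fw", "Core network firewall")
g.node("wn", "Zeek WNs x12")

g.edge("taps", "sn2700", label="tapped traffic")
g.edge("sn2700", "wn", label="2 x 25 Gbit per WN")
g.edge("wn", "s3048", label="1 Gbit (two virtual interfaces)")
g.edge("s3048", "fw")

g.render("soc_physical_diagram", format="png", cleanup=True)
```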
DC: the current plan is that Tier1 will lend us a set of WNs to use in the first phase of deployment; then, as equipment arrives (storage hosts and head nodes), the block would belong to the SOC and we would configure the Zeek WNs. The end goal is that everything except the routers will be bought for purpose and badged for SOC use.
AD: sounds right; if there is not enough money, the Tier1 hardware may become more permanent
AD: the order needs to be sent by mid-October at the latest, with feedback from people by 1st October, or it may not arrive in time. There is about one week’s grace period to change details once the order is placed.
Essentially copied Nikhef’s hardware (slightly newer gen)
DC: have another meeting with Liviu to discuss hardware details before 1st Oct
Actions:
AD: check if there is enough capacity to run the rack in the UPS room
AD: add 25 Gbit and 1 Gbit network cards to the Zeek WN spec
DC: set meeting with Liviu to discuss final hardware specs
Draw additional diagrams: one physical diagram and one with layer-use information. DC to send the current diagram in Visio to PG
PG: check ease of cabling to rack 146 in the UPS room and racks 245/269 in the HPD room; answer by end of the week