LC/NPE Substrate Control: Substrate Control Daemon
Fred Kuhns
fredk@arl.wustl.edu
Applied Research Laboratory
Washington University in St. Louis
Single Interface Example: LC
• LC Ingress
  • One queue per slice with reserved bandwidth (really one per scheduler)
  • One queue for best effort traffic to each GPE
  • One scheduler for CP with queues for reserved traffic plus BE
• LC Egress
  • At least one scheduler for each physical interface
  • One queue for each active slice with MI defined for the associated scheduler
  • One best effort queue for each board (GPE, CP, NPE?)
• Substrate sets scheduler rates according to aggregate allocations
  • Manage scheduler rates to control aggregate traffic to interfaces and boards.
[Figure: LC queueing for interface 1. Ingress: lookup maps fp:MI to a queue using {dip, dp, pr}; queues qxs1 ... qxsn feed WRR scheduler SchedNPE1 (NPE); queues qps1 ... qpsn and qBE feed WRR scheduler SchedGPE1 (GPE); SchedCP serves the CP. Egress: lookup on {src addr, proto, port/icmp}; queues qs1 ... qsn, qGPE and qCP feed WRR scheduler SchedI1 with bandwidth BWI1.]
Common Definitions and Types
• See slides titled Types.ppt
• For all commands the message context ID is used to identify the context within which a command is to be executed.
  • The special value of 0 (cid = 0) indicates a privileged operation performed by the substrate.
  • Otherwise the context ID identifies the user-specific context in which the command is to be interpreted and executed. In most cases this is either the fastpath ID or an internal slice ID.
• uint16_t miid : Meta-Interface number.
  • If fpid == 0 then miid is the endpoint ID (epid).
• int32_t retCode_t : Most calls return success (0) or an error code.
  • If a command results in an error then the first 4 bytes of the reply message contain an error code.
  • If a command doesn’t return any data then it should at least return a retCode_t code of 0, but this is not required.
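The reply convention above can be sketched from the client side. This is a minimal illustration for commands that return no data, not the SCD's actual message code: the helper name reply_ret_code is hypothetical, and big-endian (network byte order) encoding of the code is an assumption, since the slides do not state the byte order of retCode_t on the wire.

```c
#include <stddef.h>
#include <stdint.h>

typedef int32_t retCode_t;

/* Hypothetical helper: read the retCode_t carried in the first 4 bytes of a
 * reply to a command that returns no data.
 * ASSUMPTION: the code is sent in network byte order (big-endian). */
retCode_t reply_ret_code(const uint8_t *reply, size_t len)
{
    if (reply == NULL || len < 4) {
        /* A code of 0 "should" be sent but is not required, so an
         * empty reply is treated as success. */
        return 0;
    }
    uint32_t raw = ((uint32_t)reply[0] << 24) | ((uint32_t)reply[1] << 16) |
                   ((uint32_t)reply[2] << 8)  |  (uint32_t)reply[3];
    return (retCode_t)raw;
}
```

Treating an empty reply as success mirrors the "not required" wording; a stricter client could instead flag it.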
NPE Tables
[Figure: layouts of the NPE tables as 32-bit words (bit positions 31, 24, 16, 8, 0). Fields shown per table:]
• VLAN Table (indexed by VLAN): not used, copt (8b); Start of SRAM Block (Physical Address); SRAM Block Size; Max Buffer Limit; Current Buffer Count; Buffer Limit Exceeded Count
• GPE Info table: GPE IP Address (tunnel); NPE IP Address (tunnel); Exception UDP DPort, Exception UDP SPort; Local UDP DPort, Local UDP SPort; not used, qm, sid, Exception qid; not used, qm, sid, Local Delivery qid
• Encap Control Block (indexed by sid): (Tunnel) Source IP Address; Destination MAC Address, High Order 4 Bytes; Dest MAC, Not Used
• Scheduler Rate table (indexed by sid): rate (Kbps/683), interface number
• QParams Table (indexed by qid): unused, Length in Pkts (28b); Length in Bytes (32b); Threshold in Bytes (32b); Quantum (32b)
Line Card Tables or Both
[Figure: layouts of the tables as 32-bit words (bit positions 31, 24, 16, 8, 0). Fields shown per table:]
• Scheduler Rate table (indexed by sid): rate (Kbps/683), interface number
• Ingress/Egress Dynamic MAC Table (indexed by sid): destination MAC (high order 4 Bytes); dest MAC, Not Used; source MAC (high order 4 Bytes); src MAC, Not Used
• QParams Table (indexed by qid): unused, Length in Pkts (28b); Length in Bytes (32b); Threshold in Bytes (32b); Quantum (32b)
Fast Path Commands: SRM to SCD, NPE Only
• retCode_t set_fastpath(fpid, copt_t, VLAN, rcnts_t, Mem[])
  Message context ID = 0
  fpid : Fast-Path ID. Unique within SPP node.
  copt : Code option identifier; 0 is invalid, IPv4 = 1, I3 = 2
  uint16_t VLAN : Ethernet VLAN tag, used in datapath to identify fastpath instance.
  rcnts_t = {cntr_t #Qs, #Fltrs, #Buffers, #Stats}
  struct {uint32_t offset, size;} Mem[2] : SRAM and DRAM allocation; values are relative to the start of an SRAM block set aside for fastpath use.
    Mem = {{sram(offset, size)},   // SRAM: version 1 all same size
           {dram(offset, size)}}   // DRAM: not used in version 1
  uint32_t retCode_t : {0 = Success, app defined}
  • Create fastpath specific mappings for queues, TCAM filters and the stats table.
  • Initialize queue parameters to “reasonable” values (threshold, weight).
  • Update VLAN Table with the code option ID (copt) and SRAM address.
    • SCD knows the starting address of SRAM, so add the offset.
  • Assume there is an SRAM table read by the microengines which maps VLAN to buffer limits; this may be the VLAN table.
  • Treat as a transaction: either it completes fully or no changes are recorded.
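The VLAN table update and the all-or-nothing rule can be sketched as follows. This is only an illustration under stated assumptions: vlan_entry_t, its field names, and the nonzero error values are hypothetical (the slides only say error codes are application defined), and the real set_fastpath also sets up queues, filters and stats mappings that are omitted here.

```c
#include <stdint.h>

/* One Mem[] element as described on the slide. */
typedef struct { uint32_t offset, size; } mem_t;

/* HYPOTHETICAL VLAN table entry; field names are illustrative only. */
typedef struct {
    uint8_t  copt;       /* code option, 0 = invalid/disabled */
    uint32_t sram_addr;  /* physical start address of the fastpath's SRAM */
    uint32_t sram_size;
} vlan_entry_t;

/* Validate everything first, then commit in one step, so a failure leaves
 * the entry untouched. Returns 0 on success; nonzero values are made up. */
int32_t vlan_set_fastpath(vlan_entry_t *entry, uint8_t copt, mem_t sram,
                          uint32_t block_base, uint32_t block_size)
{
    if (copt == 0) {
        return 1;  /* copt 0 is invalid */
    }
    /* Overflow-safe check that [offset, offset + size) fits in the block. */
    if (sram.offset > block_size || sram.size > block_size - sram.offset) {
        return 2;
    }
    vlan_entry_t staged = *entry;
    staged.copt = copt;
    staged.sram_addr = block_base + sram.offset;  /* SCD adds the offset */
    staged.sram_size = sram.size;
    *entry = staged;  /* commit */
    return 0;
}
```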
Fast Path Commands: SRM to SCD, NPE Only
• retCode_t enable_fastpath(fpid_t)
  Message context ID = 0
  fpid_t : fast path identifier
  uint32_t retCode_t : {0 = Success, app defined}
  • Enable by setting the code option field of the VLAN table entry for the fastpath.
• retCode_t disable_fastpath(fpid)
  Message context ID = 0
  uint16_t fpid : fast path identifier
  retCode_t : {0 = Success, 1 = Pending, 2 = Invalid fpid, else error code}
  • First, disable the fastpath by setting the code option field of its VLAN table entry to 0.
    • It is not an error to call disable_fastpath() on an already disabled fastpath.
    • SRM will periodically call disable_fastpath() until Success or Error.
  • Then check all fast path queue lengths.
    • If any queue has non-zero length then return Pending.
    • If all queues are empty then return Success.
• retCode_t rem_fastpath(fpid_t)
  Message context ID = 0
  fpid_t : fast path identifier
  retCode_t : {0 = Success, 2 = Invalid fpid, else error code}
  • Fastpath must be disabled (disable_fastpath) before it can be removed.
  • Remove all allocations associated with fpid.
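The two-step disable logic (clear the code option, then poll the queues) might look like the sketch below. The fastpath_t record and the function name are hypothetical stand-ins for whatever state the SCD keeps per fastpath; only the return codes 0, 1 and 2 come from the slide.

```c
#include <stddef.h>
#include <stdint.h>

enum { FP_SUCCESS = 0, FP_PENDING = 1, FP_INVALID_FPID = 2 };

/* HYPOTHETICAL per-fastpath record; the SCD's real bookkeeping is not shown
 * in the slides. */
typedef struct {
    int             in_use;   /* nonzero if this fpid is allocated */
    uint8_t         copt;     /* VLAN table code option; 0 = disabled */
    const uint32_t *qlen;     /* current length of each fastpath queue */
    size_t          nqueues;
} fastpath_t;

int32_t disable_fastpath_sketch(fastpath_t *fp)
{
    if (fp == NULL || !fp->in_use) {
        return FP_INVALID_FPID;
    }
    /* Step 1: stop new traffic. Repeating this on an already disabled
     * fastpath is harmless, so repeated calls are not an error. */
    fp->copt = 0;

    /* Step 2: report Pending until every queue has drained. */
    for (size_t i = 0; i < fp->nqueues; i++) {
        if (fp->qlen[i] != 0) {
            return FP_PENDING;
        }
    }
    return FP_SUCCESS;
}
```

Because the call is idempotent, the SRM can simply retry it on a timer until it stops returning Pending.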
BW and Scheduler Control: SRM to SCD
• Common:
  Message context ID = 0
  uint16_t sid : Scheduler ID. There are 5 schedulers per QM and 4 QMs, so 0 <= sid < 20 (NBO)
  uint8_t MAC[6] : 6-byte Ethernet address
LC and NPE: scheduler parameters
• retCode_t set_sched_params(sid, ifn, BWmax, BWmin)
  int32_t ifn : Interface number
  uint32_t BWmax : Kbps, maximum scheduler rate
  uint32_t BWmin : Kbps, minimum assignable rate for this scheduler
  • If BWmin is different from a previous call then the SCD may need to reassign weights for all associated queues.
  • Must convert the rate sent in the command (Kbps) to that expected by the hardware: Value = X Mbps / 0.683 = BW Kbps / 683 (so no floating point is needed).
  • Update the scheduler parameters table in SRAM. The low 16-bit word is the converted rate value and the upper 16 bits are the interface number (ifn). The interface number must be the same value that TX uses.
    To set only the rate (preserves the existing interface number):
      SParams[sid] = (SParams[sid] & 0xFFFF0000) | (0xFFFF & Value)
    To set the interface number (ifn, preserves the existing rate value):
      SParams[sid] = (SParams[sid] & 0xFFFF) | (ifn << 16)
    To set both:
      SParams[sid] = (ifn << 16) | (0xFFFF & Value)
NPE Encap Control Block
• retCode_t set_encap_cb(sid, srcIP, dMAC)
  • Must update the table which associates encapsulation header IP source addresses with a scheduler:
      SchedAddrs[sid] = {ipaddr, destination MAC}
• retCode_t create_mi(fpid, mi, sid)
• retCode_t delete_mi(fpid, mi)
• retCode_t set_mi_bw(fpid, mi, bw)
LC Dynamic MAC Table
• retCode_t set_sched_mac(sid, dstMAC, srcMAC)
  • Must update the table which associates Ethernet source and destination addresses with a scheduler:
      SchedAddrs[sid].smac = srcMAC (6 bytes over two 4-byte words)
      SchedAddrs[sid].dmac = dstMAC (6 bytes over two 4-byte words)
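The rate conversion and the three SParams update forms can be written as small helpers. The function names are hypothetical; the arithmetic follows the expressions on the slide, with the one addition that ifn is masked to 16 bits before shifting so an out-of-range value cannot spill into undefined behaviour.

```c
#include <stdint.h>

/* Convert a rate in Kbps to the hardware unit (Kbps / 683).
 * Integer division truncates, so no floating point is needed. */
uint32_t kbps_to_hw_rate(uint32_t kbps)
{
    return kbps / 683u;
}

/* Replace only the rate (low 16 bits); the interface number is preserved. */
uint32_t sparams_set_rate(uint32_t word, uint32_t value)
{
    return (word & 0xFFFF0000u) | (value & 0xFFFFu);
}

/* Replace only the interface number (upper 16 bits); the rate is preserved. */
uint32_t sparams_set_ifn(uint32_t word, uint32_t ifn)
{
    return (word & 0x0000FFFFu) | ((ifn & 0xFFFFu) << 16);
}

/* Build a complete table word from interface number and converted rate. */
uint32_t sparams_pack(uint32_t ifn, uint32_t value)
{
    return ((ifn & 0xFFFFu) << 16) | (value & 0xFFFFu);
}
```

For example, 683000 Kbps converts to a hardware value of 1000 (0x3E8), and packing it with interface 3 gives the word 0x000303E8. Converted values above 0xFFFF would be truncated by the mask, so a caller may want to range-check BWmax first.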
System Configuration: SRM to SCD
• Common:
  Message context ID = 0
SRM to SCD; NPE Only
• retCode_t start_mes()
  • The SCD starts the IXP microengines running the code that is automatically loaded when the SCD is launched.
  • Returns {Success (0), Error (-1)}
• retCode_t stop_mes()
  • The SCD stops the IXP microengines if they are running.
  • Returns {Success (0), Error (-1)}
NPE: GPE Info
• Common:
  Message context ID = 0
  uint32_t IP : 4-byte IP address (NBO)
  uint16_t PORT : UDP port number (NBO)
  uint16_t QID : Slice-relative queue ID; must be converted by the SCD to the tuple of qid, QM ID and scheduler ID.
• SRM to SCD, NPE Only
  retCode_t set_encap_gpe(fpid, gpeIP, npeIP)
  retCode_t unset_encap_gpe(fpid)
    unset_encap_gpe(fpid) == set_encap_gpe(fpid, 0, 0)
  uint16_t fpid : Fast path ID, globally unique
• RMP to SCD, NPE Only
  retCode_t set_gpe_info(exPort, ldPort, exQID, ldQID)
  retCode_t unset_gpe_info()
  Set context ID to the global fast path ID.
Queues
• Common:
  • If cid == 0 then qid is absolute; otherwise the SCD must convert it from fastpath relative.
  uint16_t miid : Meta-Interface number.
  uint32_t qlen : Length of packet queue in bytes
  uint32_t threshold : The maximum number of packets queued before dropping
  uint32_t bw : Kb/s; SCD must convert to the associated weight.
  uint8_t list_type : Explicit list (0) or range specification (1)
  uint16_t qid_list[] : List of queue identifiers to associate with meta-interface mi. In C or C++ this would be an array of integers {0, ..., n}. If using a range then the list has the first and one past the last ID of the range. For example {4, 8} is the same as {4, 5, 6, 7}.
• RMP to SCD, NPE Only
  • retCode_t bind_queue(u16 miid, u8 list_type, u16[] qid_list)
  • retCode_t unbind_queue(u8 list_type, u16[] qid_list)
LC and NPE
  • bw_t actual_bw set_queue_params(u16 qid, u32 threshold, u32 bw)
    • If a parameter is -1 then do not update it in the table.
  • {u32 threshold, u32 bw} get_queue_params(u16 qid)
  • {u32 pktCnt, u32 byteCnt} get_queue_len(u16 qid)
• SRM to SCD; LC (Ingress and Egress)
  • retCode_t bind_queue_sched(u16 qid, u16 sid)
    • Assign qid to scheduler sid.
  • retCode_t unbind_queue_sched(u16 qid)
    • Deallocate the queue’s bw from its assigned scheduler and disassociate it from that scheduler.
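The two qid_list encodings can be normalized into a flat list before binding. The sketch below is one possible way to do it; expand_qid_list and its return convention (-1 for malformed input or insufficient output space) are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Expand a qid_list into explicit queue IDs.
 * list_type 0: list holds the IDs themselves.
 * list_type 1: list holds {first, one_past_last}.
 * Returns the number of IDs written to out, or -1 on malformed input or
 * if out (capacity cap) is too small. */
int expand_qid_list(uint8_t list_type, const uint16_t *list, size_t n,
                    uint16_t *out, size_t cap)
{
    if (list_type == 0) {
        if (n > cap) {
            return -1;
        }
        for (size_t i = 0; i < n; i++) {
            out[i] = list[i];
        }
        return (int)n;
    }
    if (list_type == 1) {
        if (n != 2 || list[1] < list[0]) {
            return -1;
        }
        size_t count = (size_t)(list[1] - list[0]);
        if (count > cap) {
            return -1;
        }
        for (size_t i = 0; i < count; i++) {
            out[i] = (uint16_t)(list[0] + i);
        }
        return (int)count;
    }
    return -1;  /* unknown list_type */
}
```

With list_type 1 and the list {4, 8}, the function writes 4, 5, 6, 7 and returns 4, matching the example on the slide.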
Lookup Table (TCAM)
• Common: LC cid = 0, NPE cid = fpid
  • If cid == 0 then fid is absolute; otherwise it is fastpath relative and must be converted.
  uint32_t fid : Filter ID. If cid == 0 then absolute, otherwise relative.
  uint16_t dbid : Database ID. NPE: dbid = 0; LC: Ingress dbid = 0, Egress dbid = 1
  struct fltr {
    uint8_t key[N] : N-byte key value, defined by code option.
    uint8_t mask[N] : N-byte mask for key lookups, defined by code option.
    uint8_t result[M] : M-byte result vector, defined by code option.
  }
• SRM to SCD (ctx = 0); RMP to SCD (ctx = fpid); LC and NPE
  • ret_t write_fltr(dbid, fid, key, mask, result)
    • The SCD can look up the database object with id = dbid to get the key and result byte widths.
    • The RMP will prepend the VLAN tag and change the slice MI numbers to the correct values.
  • ret_t update_result(dbid, fid, result)
    • If fid is a valid entry then updates the result vector. Otherwise no change to the database.
  • fltr get_fltr_bykey(dbid, key); fltr get_fltr_byfid(dbid, fid)
    • Returns a filter if key/fid matches a valid entry, otherwise 0 (does not return an error).
  • result lookup_fltr(dbid, key)
    • Returns a result if key matches a valid entry, otherwise 0 (does not return an error).
  • retcode rem_fltr_bykey(dbid, key); retcode rem_fltr_byfid(dbid, fid)
    • Returns fid if the filter was removed, otherwise -1.
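A software model of a key/mask lookup helps when reasoning about what lookup_fltr returns. The sketch below uses the usual ternary-match convention, which is an assumption here: a mask bit of 1 means the bit is compared, 0 means don't care, and the lowest-index valid entry wins. KEY_BYTES, tcam_entry_t and tcam_lookup are illustrative names, not part of the SCD interface.

```c
#include <stddef.h>
#include <stdint.h>

#define KEY_BYTES 4  /* illustrative N; the real width comes from the code option */

typedef struct {
    int     valid;
    uint8_t key[KEY_BYTES];
    uint8_t mask[KEY_BYTES];  /* ASSUMPTION: 1 bits are compared, 0 bits ignored */
} tcam_entry_t;

/* An entry matches when key and entry agree on every unmasked bit. */
static int entry_matches(const tcam_entry_t *e, const uint8_t *key)
{
    for (size_t i = 0; i < KEY_BYTES; i++) {
        if ((key[i] & e->mask[i]) != (e->key[i] & e->mask[i])) {
            return 0;
        }
    }
    return 1;
}

/* Returns the index of the first valid matching entry, or -1 if none. */
int tcam_lookup(const tcam_entry_t *db, size_t n, const uint8_t *key)
{
    for (size_t i = 0; i < n; i++) {
        if (db[i].valid && entry_matches(&db[i], key)) {
            return (int)i;
        }
    }
    return -1;
}
```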
NPE Filter Interface
• SRM to SCD (ctx = 0); RMP to SCD (ctx = fpid); LC and NPE
  subKeyWrap_t { type, rxip, rxport, coptKey[14]; }
  subResult_t { actions, sindx, daddr, dport, sport, qid; }
  subFltr_t { subKeyWrap, coptMask[14], subResult }
  • scdCode_t write_npe_fltr(fid, subFltr)
    • The SCD can look up the database object (dbid = 0 on the NPE) to get the key and result byte widths.
    • The RMP will prepend the VLAN tag and change the slice MI numbers to the correct values.
  • scdCode_t update_npe_result(fid, subResult)
    • If fid is a valid entry then updates the result vector. Otherwise no change to the database.
  • fltr get_npe_fltr_bykey(subKeyWrap); fltr get_npe_fltr_byfid(fid)
    • Returns a filter if key/fid matches a valid entry, otherwise 0 (does not return an error).
  • scdCode_t lookup_npe_fltr(subKeyWrap)
    • Returns a result if the key matches a valid entry, otherwise 0 (does not return an error).
  • scdCode_t rem_npe_fltr_bykey(subKeyWrap); scdCode_t rem_npe_fltr_byfid(fid)
    • Returns 0 on success or an error.
IPv4 TCAM Filter Formats (on NPE)
[Figure: key and result bit layouts; field widths in bits are given in parentheses.]
• Key
  • Substrate defined (32 bits): T (1), if (4), vlan (11), RX port (16)
    • T = 0: normal lookup; T = 1: substrate-only lookup
    • Represents the input meta-interface
  • Defined by the IPv4 Code Option (112 bits): daddr (32), saddr (32), sport (16), dport (16), tcp/proto (16)
    • tcp/proto when not TCP: 00 (2), RSV (6), proto (8)
    • tcp/proto when TCP: 01 (2), 00 (2), flags (12)
• Result (128 bits): rsv (3), D (1), L (1), rsv (11), sindx (16), TX IP daddr (32), TX dport (16), TX sport (16), rsv (12), QM (2), Sch (3), qid (15)
  • D: Drop packet; L: Local delivery
  • sindx: global stats index (SCD maps slice’s sindx to global value)
  • TX IP address and sport represent the output meta-interface. The dport is provided by the slice. (RMP maps miid to tx tunnel params, use dport provided by slice)
  • QM, Sch, qid: 20-bit internal qid (SCD maps slice’s miid to QM and Sch; SCD also maps slice’s qid to global qid value)
• Slice parameters:
  • Key: input miid, IPv4 fltr {daddr, saddr, sport, dport, tcp/proto}
  • Result: Flags {Drop, GPE}, sindx, output miid, QID
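The 20-bit internal qid in the result vector is made of QM (2 bits), Sch (3 bits) and qid (15 bits). A possible packing is sketched below; the function names are hypothetical, and placing QM in the most significant bits with qid in the least significant bits is an assumption based on the left-to-right field order in the figure.

```c
#include <stdint.h>

/* Pack QM (2b), Sch (3b) and qid (15b) into a 20-bit internal qid.
 * ASSUMPTION: QM occupies bits 19..18, Sch bits 17..15, qid bits 14..0. */
uint32_t pack_internal_qid(uint32_t qm, uint32_t sch, uint32_t qid)
{
    return ((qm & 0x3u) << 18) | ((sch & 0x7u) << 15) | (qid & 0x7FFFu);
}

/* Field extractors for the same layout. */
uint32_t internal_qid_qm(uint32_t v)  { return (v >> 18) & 0x3u; }
uint32_t internal_qid_sch(uint32_t v) { return (v >> 15) & 0x7u; }
uint32_t internal_qid_qid(uint32_t v) { return v & 0x7FFFu; }
```

The widths fit the scheduler numbering given elsewhere in the deck: 2 bits cover the 4 QMs and 3 bits cover the 5 schedulers per QM.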
Reading Statistics
• Flags (bits 7 to 0): X X X X X T L W
  • W (bit 0) : {0 - Packets, 1 - Bytes}
  • L (bit 1) : {0 - PreQ, 1 - PostQ}
  • T (bit 2) : {0 - Push, 1 - Pull}
  • X (bits 7 to 3) : Don’t care
• Common:
  • If cid == 0 then sindx is absolute; otherwise the SCD must convert the fastpath relative index.
  uint8_t flags
  uint32_t handle : Opaque reference to periodic event
  uint32_t sindx : Stats index
• RMP to SCD, RLI/GIU to SCD; LC and NPE
  • stats = read_stats(sindx, flags)
    • struct stats {uint32_t cnt, tstamp;}
    • Uses flags L and W
  • result = clear_stats(sindx)
    • result : success or error code
  • handle create_periodic(sindx, P, cnt, flags)
    • uint32_t P : Periodic interval in milliseconds
    • uint16_t cnt : Number of samples to keep in a history buffer
    • Uses all 3 flags
  • retcode del_periodic(handle)
  • retcode set_callback(handle, udp_port)
    • Push model: every (cnt·period) milliseconds send the last cnt samples to the client’s UDP port udp_port.
  • stats = get_periodic(handle)
    • Pull model: returns the last cnt samples.
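Decoding the flags byte is a matter of testing the three low bits and ignoring the rest. The constant and function names below are hypothetical; the bit positions (W = bit 0, L = bit 1, T = bit 2) follow the layout on the slide.

```c
#include <stdint.h>

/* Bit positions from the slide: X X X X X T L W (bit 7 down to bit 0). */
enum {
    STATS_W_BYTES = 1u << 0,  /* 0 = packets, 1 = bytes */
    STATS_L_POSTQ = 1u << 1,  /* 0 = PreQ,    1 = PostQ */
    STATS_T_PULL  = 1u << 2   /* 0 = push,    1 = pull  */
};

int stats_counts_bytes(uint8_t flags) { return (flags & STATS_W_BYTES) != 0; }
int stats_is_postq(uint8_t flags)     { return (flags & STATS_L_POSTQ) != 0; }
int stats_is_pull(uint8_t flags)      { return (flags & STATS_T_PULL) != 0; }
```

A flags value of 0x05, for instance, selects byte counts at the PreQ point with the pull model; bits 7 to 3 have no effect.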
Accessing Memory
• Common:
  Message context ID : For the NPE it equals fpid; for the LC it must equal 0.
  uint32_t offset : Offset from the start of the assigned SRAM block.
  uint32_t len : Number of bytes to read/write
  uint8_t data[X] : Data buffer with X bytes.
  uint32_t kpa : Kernel physical address
• SRM to SCD; LC and NPE
  • result write_mem(kpa, len, data)
    • Message context ID must == 0.
    • Can write any valid physical address on the XScale.
  • data read_mem(kpa, len)
    • Message context ID must == 0.
    • Can read any valid physical address on the XScale.
• RMP to SCD; NPE Only
  • result write_sram(offset, len, data)
    • Offset is relative to the starting address of the SRAM block allocated to the slice. SCD must verify the write operation is within bounds.
  • data read_sram(offset, len)
    • Read len bytes from the SRAM block and return them to the client. First verify that offset and len are within bounds for the slice.
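The bounds verification for read_sram deserves care because offset + len can wrap around in 32-bit arithmetic. The following sketch shows one overflow-safe way to check; read_sram_sketch, its parameters and its -1 error value are hypothetical, and the slice's SRAM block is modelled as an ordinary byte array.

```c
#include <stdint.h>
#include <string.h>

/* Copy len bytes starting at offset from a slice's SRAM block into out.
 * block/block_size stand in for the slice's allocated SRAM region.
 * Returns 0 on success, -1 if the request falls outside the block. */
int32_t read_sram_sketch(const uint8_t *block, uint32_t block_size,
                         uint32_t offset, uint32_t len, uint8_t *out)
{
    /* Written so that offset + len is never computed, avoiding wraparound. */
    if (offset > block_size || len > block_size - offset) {
        return -1;
    }
    memcpy(out, block + offset, len);
    return 0;
}
```

The same check applies unchanged to write_sram, with the copy direction reversed.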
New Commands
• NPE Only
  • retCode_t set_src_hwaddr(hwaddr_t)
    • Context must be 0
  • retCode_t set_iface_table(ipAddr_t[16])
    • Context must be 0