
Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas



A Front-End Readout Architecture for the CMS Barrel Muon Detector: A Feasibility Study

P. Aguayo J. Alberdi J.M. Barcala J. Marín A. Molinero J. Navarrete J.L. de Pablos L. Romero C. Willmott

Informes Técnicos Ciemat 782 (1995)


# **Dirección de Tecnología**

All correspondence relating to this work should be addressed to the Servicio de Información y Documentación, Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas, Ciudad Universitaria, 28040-MADRID, SPAIN.

Requests for copies should be addressed to the same Service.

The descriptors have been selected from the DOE Thesaurus to describe the subject matter of this report for retrieval purposes. Cataloguing follows the document DOE/TIC-4602 (Rev. 1), Descriptive Cataloguing On-Line, and classification follows the document DOE/TIC.4584-R7, Subject Categories and Scope, both published by the Office of Scientific and Technical Information of the United States Department of Energy.

Reproduction of the abstracts appearing in this publication is authorised.

Depósito Legal: M-14226-1995 NIPO: 238-96-001-0 ISSN: 1135-9420

Editorial CIEMAT

DOE CLASSIFICATION AND DESCRIPTORS

426000

FEASIBILITY STUDIES, MUON DETECTION, RELIABILITY, TRIGGER CIRCUITS

#### **"A Front-End Readout Architecture for the CMS Barrel Muon Detector: A Feasibility Study"**

Aguayo, P.; Alberdi, J.; Barcala, J.M.; Marín, J.; Molinero, A.; Navarrete, J.; Pablos, J.L. de; Romero, L.; Willmott, C. 29 pp. 8 figs. 0 refs.

#### **Abstract**

A feasibility study of a possible architecture for the CMS barrel muon detector readout electronics is presented. Some aspects of system reliability are discussed. Sizes are given for the FIFOs required to store data during the first-level trigger latency.

#### **"Un posible diseño de la Electrónica del Barril del Detector de Muones de CMS: Estudio de Viabilidad"**

Aguayo, P.; Alberdi, J.; Barcala, J.M.; Marín, J.; Molinero, A.; Navarrete, J.; Pablos, J.L. de; Romero, L.; Willmott, C. 29 pp. 8 figs. 0 refs.

#### **Resumen**

Se presenta un estudio de viabilidad de una posible arquitectura para la electrónica frontal del detector de muones del experimento CMS. Se hacen algunas consideraciones sobre la fiabilidad del sistema y se dan valores para las memorias FIFO necesarias para almacenar las señales durante el tiempo de latencia del trigger de primer nivel.





## **Introduction**

The CMS Technical Proposal indicates that, "The TDCs, the meantimers, the correlators, and the readout electronics receiving the primitive signals from the front-ends on chambers will sit outside the iron yoke."

However, there has been some discussion about placing part or all of the front-end electronics next to or inside the chambers. At first sight there seem to be benefits and limitations to both approaches. A decision has to be made soon, as the chamber design, and hence the possible placement of electronics within the chamber volume, will be frozen within one or two years. The purpose of this study is therefore to set up a preliminary baseline for a well-founded discussion of the problem. We would like to state that this is not a proposal, which can only be made once all the difficulties are understood. We have simply undertaken the task of working out one of these approaches; the other will have to be considered in due time. In this sense, this is a feasibility study of one possible architecture.

## **Overview**

In order to make a quantitative evaluation, the starting point of this study is the following:

- 4 wires will be connected to one ASIC. This chip will contain 4 TDCs and 1 MT. This is the TDC/MT chip. (Other integration factors are considered in Appendix 2.)
- 10 TDC/MT chips will be placed on one PC board. This board will also carry one Read Out Controller (ROC) and one Meantimer Controller (MTC). This is the ROC board. Connections will be provided on this board for the next and previous MTCs.
- Depending on the chamber type, 17 to 27 ROC boards will be connected by point-to-point serial links to one Read Out Server (ROS) board. There will be one ROS board per chamber.
- Every ROS board will have an Event Packer, which produces one data block per event, and a Link Sender, which sends these data to an Optical Digital Data Link (DDL).
- One DDL will collect data from one barrel sector (4 chambers, i.e., MS1, MS2, MS3, and MS4) and will take them out of the magnet volume.
- 8 DDLs will be connected to one Front End Driver (FED) board.
- 8 FED boards will be placed in one Readout Crate.
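
The fan-in figures in the list above can be multiplied out to see how many wires and chambers each level serves. The following is a minimal sketch of that arithmetic only; the constant and function names are illustrative and do not come from the report.

```python
# Fan-in figures quoted in the list above (names are illustrative).
WIRES_PER_CHIP = 4
CHIPS_PER_ROC_BOARD = 10
ROC_BOARDS_PER_CHAMBER = (17, 27)  # minimum and maximum, by chamber type
CHAMBERS_PER_DDL = 4               # one barrel sector: MS1..MS4
DDLS_PER_FED = 8
FEDS_PER_CRATE = 8


def wires_per_roc_board():
    """Wires read out by one ROC board."""
    return WIRES_PER_CHIP * CHIPS_PER_ROC_BOARD


def wires_per_chamber(roc_boards):
    """Wires read out by one ROS board (one chamber)."""
    return wires_per_roc_board() * roc_boards


def chambers_per_crate():
    """Chambers whose data end up in one Readout Crate."""
    return CHAMBERS_PER_DDL * DDLS_PER_FED * FEDS_PER_CRATE


if __name__ == "__main__":
    low, high = ROC_BOARDS_PER_CHAMBER
    print(wires_per_roc_board())                         # wires per ROC board
    print(wires_per_chamber(low), wires_per_chamber(high))
    print(chambers_per_crate())
```

With these figures one ROC board serves 40 wires, which is the number used for the ROC board later in the text.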

The following two tables and Fig. 1 summarise this architecture. It is worth stressing that this is only a starting point for discussion.

#### **Barrel Muon Readout**



#### **Estimated number of units**





**Figure 1:** Chamber readout architecture.

## **TDC/MT chip**

Fig. 2 shows a block diagram of the readout logic for one TDC channel. Some parts may be shared by all channels in the chip. The chip has a single connection to the readout ROC\_BUS through a 4-to-1 multiplexer with a 3-state output. The operation is as follows:

#### *Input signals*

- At each bunch crossing, the contents of TDC\_Register and BX\_Counter are written to TDC\_FIFO, but only when the former is different from zero. TDC\_Register is then reset and BX\_Counter is incremented.
- When a trigger arrives, after a fixed delay (the trigger latency), it is stored in TRG\_FIFO.

#### *Comparator*

There is one Bunch\_Crossing\_Comparator per TDC channel (Fig. 3). The parameters OFF1, OFF2 and TRGLAT will be loaded into the chip by the slow control. The comparator performs 3 comparisons simultaneously so that one decision is taken per clock cycle. According to the results of COMP\_X, COMP\_Y and COMP\_Z, one of the following actions is executed:

- 1. Write EV#, TBXN, TDCN, and BXN into RO\_FIFO and read next datum from TDC\_FIFO.
- 2. Read next datum from TRG\_FIFO.
- 3. Read next datum from TDC\_FIFO.

The decision logic is the following:

```
IF EXISTS BXN THEN
       IF EXISTS TBXN THEN
              IF TBXN-OFF1 <= BXN <= TBXN+OFF2 THEN
                     WRITE TDC_FIFO & TRG_FIFO INTO RO_FIFO
                     NEXT TDC_FIFO
              ELSE IF BXN > TBXN+OFF2 THEN
                     NEXT TRG_FIFO
              ELSE IF BXN < TBXN-OFF1 THEN
                     NEXT TDC_FIFO
              ENDIF
       ELSE IF BXN <= BXC-TRGLAT THEN
              NEXT TDC_FIFO
       ENDIF
ELSE IF EXISTS TBXN THEN
       NEXT TRG_FIFO
ENDIF
```
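
The decision logic can also be written as a small executable function, which makes it easy to check each branch against sample values. This is a minimal sketch under stated assumptions: the function name, the `Action` names and the `IDLE` outcome (no FIFO is advanced) are illustrative, and wrap-around of the 12-bit bunch-crossing counters is ignored.

```python
from enum import Enum


class Action(Enum):
    WRITE_AND_NEXT_TDC = 1  # action 1: write into RO_FIFO, read next TDC_FIFO datum
    NEXT_TRG = 2            # action 2: read next datum from TRG_FIFO
    NEXT_TDC = 3            # action 3: read next datum from TDC_FIFO
    IDLE = 0                # assumption: nothing to do in this clock cycle


def decide(bxn, tbxn, bxc, off1, off2, trglat):
    """Return the comparator action for one clock cycle.

    bxn  -- bunch-crossing number at the head of TDC_FIFO, or None if empty
    tbxn -- trigger bunch-crossing number at the head of TRG_FIFO, or None
    bxc  -- current value of the bunch-crossing counter
    Counter wrap-around is deliberately ignored in this sketch.
    """
    if bxn is not None:
        if tbxn is not None:
            if tbxn - off1 <= bxn <= tbxn + off2:
                return Action.WRITE_AND_NEXT_TDC  # hit lies in the trigger window
            if bxn > tbxn + off2:
                return Action.NEXT_TRG            # hit is newer than this trigger
            return Action.NEXT_TDC                # hit is older than the window
        if bxn <= bxc - trglat:
            return Action.NEXT_TDC                # hit outlived the trigger latency
        return Action.IDLE
    if tbxn is not None:
        return Action.NEXT_TRG                    # trigger with no hits left
    return Action.IDLE


if __name__ == "__main__":
    print(decide(bxn=100, tbxn=100, bxc=300, off1=1, off2=2, trglat=120))
```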
#### *Readout*

Two mechanisms have been considered: interrupt-driven and polling. Here we describe the second one, in which the readout of RO\_FIFO is driven by the ROC. The sequence can be the following:

- ROC presents on the RO\_EV# lines the event number to be read.
- With address lines S0 and S1, ROC selects one RO\_FIFO.
- The presence of an event in RO\_FIFO with the same event number is signalled to ROC by the EVENT\_HIT line. (If the event stored in RO\_FIFO is older than RO\_EV#, it is discarded. This should never happen, but the provision prevents RO\_FIFO from being blocked forever.)
- With the COE line, ROC enables the 3-state output of one chip. One word is unloaded from the RO\_FIFO.
- The sequence is repeated for all four RO\_FIFOs.

#### Chip size

The estimated number of gates for the TDC readout logic—including FIFOs (see Appendix 1) and comparators— is of the order of 4000 per channel. In addition the chip will include 4 TDC and 1 MT.

#### TDC/MT package

The TDC/MT chip will need connections for the following signals:

- 9 Wire\_in: 4 are required for the TDCs and 5 additional ones for the MT.
- 15 TTC signals: 1 CLK, 1 LV1 accept, 1 BXC reset, and 12 data lines.
- 4 Fast status (FIFO almost full). These 4 lines can be ORed into 1 line.
- 29 ROC\_BUS data.
- 24 RO\_EV# data.
- 3 address lines: S0, S1, and COE.
- 1 COMP2\_EN.
- 1 EVENT\_HIT.
- 8 MT output signals for MTCs.
- 2 Slow Control lines (?).
- 1 Chip Reset.
- 8 Power lines.

The total number of lines is 105, or 102 if the 4 Fast\_status lines are ORed. Multiplexing ROC\_BUS and RO\_EV# could be possible, reducing the total to 78 plus some control lines.
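
The pin totals can be re-derived from the list with a few lines of arithmetic. The sketch below assumes that multiplexing means the 24 RO\_EV# lines share pins with the 29 ROC\_BUS lines; the dictionary keys and the function name are illustrative.

```python
# Pin counts taken from the list above (keys are illustrative names).
TDC_MT_PINS = {
    "wire_in": 9,
    "ttc": 15,
    "fast_status": 4,
    "roc_bus": 29,
    "ro_ev": 24,
    "address": 3,
    "comp2_en": 1,
    "event_hit": 1,
    "mt_out": 8,
    "slow_control": 2,
    "reset": 1,
    "power": 8,
}


def pin_total(pins, or_fast_status=False, multiplex_bus=False):
    """Total package lines, optionally with the two savings discussed."""
    total = sum(pins.values())
    if or_fast_status:
        # The 4 fast-status lines collapse into a single ORed line.
        total -= pins["fast_status"] - 1
    if multiplex_bus:
        # Assumption: the narrower of the two buses reuses the wider one's pins.
        total -= min(pins["roc_bus"], pins["ro_ev"])
    return total


if __name__ == "__main__":
    print(pin_total(TDC_MT_PINS))
    print(pin_total(TDC_MT_PINS, or_fast_status=True))
    print(pin_total(TDC_MT_PINS, or_fast_status=True, multiplex_bus=True))
```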

## **ROC board**

Fig. 4 shows a block diagram of the ROC board logic. One board will collect data coming from 40 wires. In the  $\phi$  plane, 20 will be from the top superlayer and 20 from the bottom. At the ROC board there will be one ROC chip to control 10 TDC/MT chips and one MTC. A point-to-point serial data link will connect each ROC board to one ROS board.

## Board layout

Fig. 5 shows an "artistic view" of the ROC board. It measures 22 cm by 27 cm. The ROC board will have connectors for the following signals:

- 2 connectors for 40 wire signals (20 each).
- 2 connectors (40 signals wide) for the previous and next MTCs.
- 2 connectors (40 signals wide) from the previous and next MTs.
- 1 connector (16 signals wide) for TTC.
- 1 (or 2) connector(s) for the Serial Data Link.
- 1 connector for Slow Control Bus.
- Power connections.

## **ROC chip**

ROC could be implemented as one ASIC or as a programmable device. Fig. 6 shows a logic block diagram. The operation is as follows:

- At each trigger, the trigger event number (EV#) and the trigger bunch crossing (TBX#) will be stored in TRG\_FIFO.
- After the appropriate delay, EV# will be presented on the RO\_EV# lines and the COMP2\_EN line will be set.
- With address lines S0 and S1, ROC will select one set of 10 RO\_FIFOs. RO\_FIFOs will signal the presence of hits with the EV\_HIT lines.
- ROC will read sequentially the RO\_FIFOs with hits using the COE lines. Data will be stored in the Event Storage memory, together with the corresponding Wire\_Code.
- The sequence will be repeated for the 4 sets of RO\_FIFOs.
- ROC will write in Event Storage the event number (EV#), a word count and an End-of-event word.
- Once the Event Block is completed it will be sent to ROC\_FIFO. It may contain no hits.
- As soon as there is an Event Block in ROC\_FIFO, the Parallel Serial Register will start to send it over the Serial Data Link.
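
The assembly of an Event Block from the steps above can be sketched as follows. The word order (event number, word count, hits, end-of-event) and the convention that the word count includes the three framing words are assumptions made for illustration; the Event Block format table is the authority on the actual layout.

```python
def build_event_block(ev_number, hits):
    """Assemble one Event Block as a list of (kind, payload) words.

    hits -- list of hit payloads, e.g. (wire_code, tbx, bxn, tdc) tuples.
    Assumed layout: EV# word, word-count word, one word per hit,
    End-of-event word. The word count here includes the framing words.
    """
    framing_words = 3  # EV#, word count, End-of-event
    block = [("EV#", ev_number), ("WORD_COUNT", len(hits) + framing_words)]
    block.extend(("HIT", hit) for hit in hits)
    block.append(("END_OF_EVENT", None))
    return block


if __name__ == "__main__":
    # Hypothetical hit payloads: (wire_code, tbx, bxn, tdc)
    sample_hits = [(wire, 1200, 1201, 17) for wire in range(8)]
    block = build_event_block(ev_number=42, hits=sample_hits)
    print(len(block))  # total words in the block
```

With 8 hits this gives an 11-word block, and an event without hits still produces a 3-word block, consistent with the block sizes used in Appendix 1.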

The following table shows the structure of the Event Block:



#### **Event Block format**

#### ROC package

The ROC chip will need connections for the following signals:

- 14 TTC signals: 1 CLK, 1 LV1\_accept, and 12 data lines.
- 29 ROC\_BUS data.
- 24 RO\_EV# data.
- 2 address lines: S0, S1.
- 10 Chip\_output\_enable (COE).
- 1 COMP2\_EN.
- 10 EV\_HIT.
- 2 Serial Data Link.
- 2 Slow Control lines (?).
- 1 Chip Reset.
- 8 Power lines.

The total number of signals is 103. Multiplexing ROC\_BUS and RO\_EV# could be possible, reducing the total to 79 plus some control lines.

# **Serial data link**

A point-to-point serial link will transmit data from the ROC FIFO to the ROS FIFO. It will consist of a coaxial cable, and the transmission speed will be 5 Mbits per second.

## **ROS board**

Fig. 7 shows a block diagram of the ROS board. At the ROS board, data coming from each ROC board will be received by a Serial Parallel Register (SPR), which will store the data in a ROS FIFO. There will be one SPR and one ROS FIFO per ROC board connected to the ROS board.

The Event Packer will collect the data from the ROS FIFOs and pack them into one block per event. The event block will be sent to the Link Sender.

The Link Sender will send the data through the Optical Digital Data Link to the Front End Driver.

# **Optical DDL**

This link is the "optical backplane" solution proposed in *Front End Drivers in CMS DAQ*, CMS TN/95-020, Rev. 1.03, Feb. 1995. One DDL will collect data from one sector: MS1, MS2, MS3, and MS4.

# **Slow Control**

The following information will be required at different levels of the readout system:

- LV1 trigger latency: a parameter, required at TDC/MT chip.
- OFF1 and OFF2: parameters, at TDC/MT.
- Clock deskew: parameter. Required at every TDC/MT chip.
- Wire enable/disable: one control bit at TDC/MT. Is it necessary?
- Maximum number of bunch crossings: at TDC comparator?

Also the following functions can be performed by Slow Control:

- Start chip internal test
- Read chip status
- Reset chip

The Slow Control will be implemented as one controller sitting at the ROS board. The controller can be a programmable device. A slow serial link will connect this controller to every chip.

## **Reliability**

Once installed in the detector, damaged readout components cannot be replaced or repaired, except for FEDs and other components placed outside the magnet. The following table summarises a crude estimate of the risk factor due to IC failure under the following assumptions: the operation time is 10 years; one IC has a FIT<sup>†</sup> value of 20; the operation of one ROC board depends on 2 ICs, including the serial link; the operation of one ROS board depends on 3 ICs; the FIT value for a DDL is unknown, so 1 lost unit is assumed:

#### **Risk factors due to IC failure in 10 years operation**



These numbers cannot be taken in absolute terms. They only show the relative weight of the different parts of the system. Therefore, to avoid unacceptable losses, redundancy has to be seriously considered at least at some level.

In addition, failures due to connector malfunction cannot be disregarded.
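
The order of magnitude behind such risk factors can be reproduced from the stated assumptions (FIT of 20 per IC, 10 years of operation). The sketch below assumes continuous operation, 365-day years and a constant failure rate (exponential model); these are illustrative choices, and the resulting figures are not taken from the report's table.

```python
import math

HOURS_PER_YEAR = 24 * 365  # assumption: continuous operation, 365-day years


def failure_probability(fit_per_ic, n_ics, years):
    """Probability that a unit depending on n_ics ICs fails within `years`.

    One FIT is one failure in 1e9 hours. A constant failure rate is
    assumed, and the unit is taken to fail if any one of its ICs fails.
    """
    rate_per_hour = fit_per_ic * 1e-9 * n_ics
    return 1.0 - math.exp(-rate_per_hour * years * HOURS_PER_YEAR)


if __name__ == "__main__":
    for name, n_ics in [("single IC", 1), ("ROC board", 2), ("ROS board", 3)]:
        p = failure_probability(fit_per_ic=20, n_ics=n_ics, years=10)
        print(f"{name}: {p:.4%} of units expected to fail")
```

Multiplying these per-unit probabilities by the number of units and by the number of wires each unit serves gives the relative weight of each level, which is the quantity of interest here.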

## **Mechanics**

Fig. 8 shows a possible enclosure and thermal shield for the readout electronics sitting inside the chambers. A first estimation indicates that it would be possible to extract all the heat produced at the boards by proper water cooling.

† One FIT (Failure In Time) is equivalent to one failure in $10^9$ hours.

# **Conclusions**

As a general statement, we conclude that the present architecture is feasible. It is an elegant solution that avoids the cumbersome problem of 2·10<sup>5</sup> cables and 150 crates, with their corresponding cost. However, the final evaluation can only be performed after the alternative readout architecture has been studied.

Nevertheless we have detected the following issues:

#### Integration and board size

The present integration of 4 TDCs per chip is not adequate: the resulting board size is unmanageable. Integration factors of 20 or 40 TDCs per chip would be more convenient. Power dissipation per chip and gate count may become limiting factors once the TDC and MT architectures are known.

Placing the boards in two layers, or overlapping them, has to be considered. Cooling does not seem to be a problem, apart from the risk of water leaks.

#### **Reliability**

This is probably our major concern. A careful study has to be carried out, including an evaluation of the required redundancy and costs. Comparison with other possible architectures is mandatory.

# **Appendix 1 Readout Memories Design Considerations**

#### **Rates**

The maximum particle rate at the barrel chambers is 1 Hz/cm<sup>2</sup> for MS1 at $1.1 < \eta < 1.4$. For other regions or chambers the rate is 1 to 2 orders of magnitude lower. This gives a maximum hit rate of 0.8 kHz for a single wire (MS1, $\theta$ plane, 2 m × 4 cm). In addition, the maximum rate of particles traversing any chamber would be 50 kHz (MS1, 2 m × 2.5 m). Therefore the maximum hit rate for any given chamber will be 0.6 MHz (50·10<sup>3</sup> × 12).

The LV1 trigger rate is expected to be 300 kHz. Let us assume, for the time being, that at the level of a chamber one event corresponds to one particle traversing the chamber. This gives an average event size of 12 hits per event for a chamber. Also, at the level of a Readout Controller (ROC, grouping 40 wires), the average event rate will be 5 kHz with 8 hits per event in the $\phi$ plane (20 cm × 2.5 m), and 8 kHz with 4 hits per event in the $\theta$ plane (40 cm × 2 m).
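
All the rates quoted above follow from multiplying the maximum flux by the relevant area. A short sketch of that arithmetic, with illustrative names, is:

```python
MAX_FLUX_HZ_PER_CM2 = 1.0  # maximum particle rate quoted for MS1


def rate_hz(length_cm, width_cm, flux=MAX_FLUX_HZ_PER_CM2):
    """Particle rate through a rectangular area at uniform flux."""
    return flux * length_cm * width_cm


wire_rate = rate_hz(200, 4)         # single wire, theta plane, 2 m x 4 cm
chamber_rate = rate_hz(200, 250)    # MS1 chamber, 2 m x 2.5 m
chamber_hit_rate = chamber_rate * 12  # 12 hits per traversing particle
roc_phi_rate = rate_hz(20, 250)     # ROC in the phi plane, 20 cm x 2.5 m
roc_theta_rate = rate_hz(40, 200)   # ROC in the theta plane, 40 cm x 2 m

if __name__ == "__main__":
    print(wire_rate, chamber_rate, chamber_hit_rate)
    print(roc_phi_rate, roc_theta_rate)
```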

In what follows the previous numbers will be rounded up when appropriate. In view of the nature of this study, all probabilities are calculated assuming that a Poisson distribution is applicable. For simple cases the memory size is evaluated for $P(N) = \lambda^{N} e^{-\lambda}/N! < 10^{-6}$ ($\lambda < 0.01$). In a few cases a simulation is made.

## **TDC FIFO**

The hit rate of a single wire is  $1 \text{ kHz}$  and the trigger latency  $3 \mu s$ . The required memory depth is 3 words. In case of a noisy wire (10 kHz) it would require 4 words.

The word width would be  $TDC+BXC = 5+12 = 17$  bits.
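
The quoted depths can be checked against the Poisson criterion. The sketch below reads the criterion as "the depth is the smallest $N$ for which $P(N) < 10^{-6}$", with $\lambda$ equal to the hit rate times the trigger latency; this reading and the function names are assumptions.

```python
import math


def poisson_pmf(n, lam):
    """P(N = n) for a Poisson distribution with mean lam."""
    return lam ** n * math.exp(-lam) / math.factorial(n)


def fifo_depth(rate_hz, window_s, threshold=1e-6):
    """Smallest N whose Poisson probability falls below the threshold.

    lam is the mean number of hits arriving within the storage window.
    """
    lam = rate_hz * window_s
    n = 0
    while poisson_pmf(n, lam) >= threshold:
        n += 1
    return n


if __name__ == "__main__":
    latency_s = 3e-6
    print(fifo_depth(1e3, latency_s))   # normal wire at 1 kHz
    print(fifo_depth(10e3, latency_s))  # noisy wire at 10 kHz
```

Under this reading the function returns 3 words for a 1 kHz wire and 4 words for a 10 kHz noisy wire, matching the depths given above.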

## **Trigger Tag FIFO**

The LV1 trigger rate is 100 kHz with a dead time of 3 clock cycles (75 ns). As the comparator will take 1 clock cycle (25 ns) to treat one trigger, the required memory depth would be 1 word. Probably there will be one register for the comparator to operate and one register ready to receive the next trigger.

The word width is $\text{EV\#} + \text{TBX\#} = 24 + 12 = 36$ bits.

# **TDC Readout FIFO**

The Readout FIFO will receive hits at 1 kHz. The overall throughput at this level will be 40 kwords/s (5 kHz x 8 words/event). Assuming this FIFO could be read at  $2.5$  Mwords/s, i.e., 1 word at any FIFO every  $3.2 \, \mu s$ , a memory depth of 3 words would be sufficient.

The word width is $\text{EV\#} + \text{TBX\#} + \text{TDC} + \text{BXN} = 24 + 12 + 5 + 12 = 53$ bits.

# **ROC FIFO**

#### $\phi$  plane

The ROC FIFO would receive events at a rate of 5 kHz. The number of hits per event will be 8. Events will be packed with a header (EV#+WordCount) and a trailer, having in total 11 words. Assuming the FIFO is read at 5 Mbits/s and the word width is 35 bits (see later), one word will be read in  $7 \text{ }\mu\text{s}$ . A simulation of 100 seconds of operation shows that the required memory depth would be 85 words with an average occupancy of 4 words.

The word width would be $\text{TBX\#} + \text{BXN} + \text{TDC} + \text{WN} = 12 + 12 + 5 + 6 = 35$ bits.

#### $\theta$ plane

The ROC\_FIFO would receive events at a rate of 8 kHz. The number of hits per event will be 4. Events will be packed in blocks of 7 words. With the same readout speed as before, the simulation shows that the required memory depth would be 50 words.
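
The kind of simulation referred to here can be sketched as a simple queue model: event blocks arrive at random (Poisson) times, each adds a fixed number of words, and the serial link drains one word every 7 µs (35 bits at 5 Mbits/s). This is not the authors' simulation; the model, the function name and the seed are assumptions, so the peak values it produces need not reproduce the figures quoted above.

```python
import math
import random


def peak_occupancy(event_rate_hz, words_per_event, word_time_s,
                   duration_s, seed=1):
    """Peak number of words stored in a FIFO drained at a constant rate.

    Events arrive as a Poisson process; each deposits words_per_event
    words at once. A word counts as stored until fully transmitted.
    """
    rng = random.Random(seed)
    now = 0.0
    backlog_s = 0.0  # time still needed to drain the stored words
    peak_words = 0
    while True:
        gap = rng.expovariate(event_rate_hz)
        now += gap
        if now > duration_s:
            break
        backlog_s = max(0.0, backlog_s - gap)       # drain since last event
        backlog_s += words_per_event * word_time_s  # new block arrives
        stored = math.ceil(backlog_s / word_time_s - 1e-9)
        peak_words = max(peak_words, stored)
    return peak_words


if __name__ == "__main__":
    word_time = 35 / 5e6  # 35-bit word on a 5 Mbits/s link
    print(peak_occupancy(5e3, 11, word_time, duration_s=100.0))  # phi plane
    print(peak_occupancy(8e3, 7, word_time, duration_s=100.0))   # theta plane
```

The link utilisation in this model is the event rate times the block size times the word time, about 0.39 in both planes, so the FIFO empties frequently and the peak depth is driven by rare bursts of closely spaced events.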

# **ROS FIFO**

Events will arrive at each ROS FIFO at a rate of 5 kHz (11 words/event in the $\phi$ plane and 7 in the $\theta$ plane). The overall throughput at this level will be 50 kevents/s, with a mean size of 18 words/event. Assuming this FIFO will be read at 20 Mwords/s, this gives 1 µs per event. The required memory depth would be 4 events, i.e., 44 words deep.

The word width would again be $\text{TBX\#} + \text{BXN} + \text{TDC} + \text{WN} = 12 + 12 + 5 + 6 = 35$ bits.

## **Costs**



The following table shows the total FIFO cost in bits:

TDC RO FIFO is the most expensive due to its word width. Reducing its depth from 3 to 2 words could be investigated with a proper simulation. Also multiplexing at this level may help, although the 4 channels are strongly correlated. Instead, the depth of ROC FIFO could be reduced by increasing its readout speed. Even if the total size of ROS FIFO is comparatively not large, it may condition the possible integration at the level of the ROS board. The total number of bits required would be 40 kbits per board.
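
The per-FIFO cost is simply depth times word width, using the depths and widths derived in the preceding sections. The sketch below recomputes those products and the ROS board total for the largest chamber (27 ROC boards); the dictionary layout and names are illustrative, and the totals in the report's table may be aggregated differently.

```python
# (depth in words, width in bits) as quoted in the preceding sections.
FIFOS = {
    "TDC_FIFO": (3, 17),
    "TRG_FIFO": (1, 36),
    "RO_FIFO": (3, 53),
    "ROC_FIFO_phi": (85, 35),
    "ROC_FIFO_theta": (50, 35),
    "ROS_FIFO": (44, 35),
}


def fifo_bits(name):
    """Storage cost of one FIFO instance in bits."""
    depth, width = FIFOS[name]
    return depth * width


def ros_board_bits(roc_boards):
    """Total ROS FIFO storage on one ROS board (one FIFO per ROC board)."""
    return roc_boards * fifo_bits("ROS_FIFO")


if __name__ == "__main__":
    for name in FIFOS:
        print(name, fifo_bits(name))
    print(ros_board_bits(27))  # largest chamber type
```

For 27 ROC boards the ROS total comes to 41 580 bits, in line with the figure of about 40 kbits per board.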

## **Showers and radiation background**

Until now we have assumed that one event is equivalent to one particle traversing the chamber. Showers are expected to make up 5% of the events. For the time being we have no information about the number of particles or the angular distribution of showers. However, if the particles hit wires belonging to different ROCs, what has been said should remain valid. Otherwise a more careful study has to be made at the level of the ROC boards.

The CMS Technical Proposal indicates (p. 122) that "the visible background flux in MS4 is  $\approx 10 \text{ cm}^{-2} \text{ s}^{-1}$  and significantly less in the inner barrel stations". Only a proper simulation can decide whether a deeper TDC FIFO is required.

# **Appendix 2 Integration factors in the Readout Architecture**

## **Introduction**

Up to now, the number of channels per chip has been set to 4. However, other integration factors are possible and may even be more convenient. The final number will depend on several factors such as power dissipation, chip size, pins required for the package, technology used, and others. Some of these issues are discussed here.

## **Numerology**

First of all, it appears convenient to integrate TDCs and MTs together. The alternatives would be the necessity to provide connections for 17 bits for every TDC in the chip, or the duplication of TDCs.

Therefore, as the MTC allocation goes in steps of 5 MTs from the inner SL, the possible numbers of TDCs per chip would be 4, 8, 16, 20, 40, 60, and so forth, with 1, 2, 4, 5, 10, 15 MTs respectively.

In addition, one subunit—the group of TDCs controlled by one ROC—would have 1 or 2 MTCs. This means 40 or 80 channels. The following table summarises the possible combinations:



## **Gate count**

As one channel requires about 4000 gates—plus TDCs and MTs—the following table gives the corresponding gate counts:

| TDCs | Gates ($\times 10^3$) |
|------|------------------------|
| 4    | 16                             |
| 8    | 32                             |
| 16   | 64                             |
| 20   | 80                             |
| 40   | 160                            |
| 80   | 320                            |

**Gate counts as a function of the number of TDCs per chip.**

The final count will depend on the gates required for the TDCs and MTs.

## **Pin requirements**

The signals concerned are the following:

- Wire\_in. One per TDC.
- Address lines: Sn, where 2<sup>n</sup> is the number of TDCs.
- MT signals. 8 per MT.

The following table summarises the required number of lines for each case. The number of lines required for the operation of the chip independently of the number of channels is 66:



**Number of pins required as a function of the number of TDCs per chip.**

## **Final comments**

For the time being, not much can be said. Integration factors like 20 or 40 look very promising. Further considerations, such as power dissipation, may decide the most convenient configuration.



Figure 2: TDC Readout logic.



Figure 3: Bunch-crossing comparator block diagram



**Figure 4: ROC board block diagram.**








**Figure 6: ROC Chip block diagram.**



**Figure 7: ROS block diagram**



