MULTIPROCESSOR DATA ACQUISITION SYSTEM

J. R. Haumann and R. K. Crawford

Argonne National Laboratory, Electronics Department and Intense Pulsed Neutron Source Division, 9700 S. Cass Avenue, Argonne, Illinois 60439

#### Abstract

A multiprocessor data acquisition system has been built to replace the single processor systems at the Intense Pulsed Neutron Source (IPNS) at Argonne National Laboratory.[1][2] The multiprocessor system was needed to accommodate the higher data rates at IPNS brought about by improvements in the source and changes in instrument configurations. This paper describes the hardware configuration of the system and the method of task sharing and compares results to the single processor system.

## Introduction

#### Background


The Intense Pulsed Neutron Source (IPNS) at Argonne National Laboratory began operation in May 1981. At that time a distributed processing data acquisition system [3] was implemented, comprising a central VAX 11/780 linked to multiple PDP 11/34's, each closely coupled to one or more Multibus based data acquisition systems (see Figure 1). The data acquisition systems were programmed to control and acquire data from Camac hardware and to histogram these data by various methods requiring on-the-fly calculations dependent on lookup table parameters down loaded to the Multibus system from a PDP 11/34. The Multibus based data acquisition system was made up of a 4 MHz Zilog Z-8001 CPU board manufactured by Central Data Corporation, a custom interface board to a Camac crate controller, an interface to the PDP 11/34, a SIO/PIO board, and multiple memory boards providing between 128K and 2M bytes of memory. The PDP 11/34 computer is the user interface and is used to generate the lookup table parameters, control data acquisition by passing commands to the Multibus data acquisition system, display and store the histogrammed data collected in Multibus memory, and do preliminary analysis of the data.

By August 1986 there were seven PDP-11 computers with ten Multibus data acquisition systems in operation at IPNS. Because of improvements in the neutron source and increased numbers of detectors per instrument, the Multibus data acquisition systems on some instruments were reaching an overload condition, with data rates higher than the 4 kHz rate they could histogram. Furthermore, an additional design change in the neutron source being implemented in the summer of 1987 is predicted to increase the data rate by another factor of 3. For these reasons a redesign of the data acquisition system was necessary.

#### New System Requirements

The primary design requirement for the new data acquisition system for IPNS was that it be able to handle data rates up to 30 kHz with the histogramming algorithm in use.

CONF-870552--19

The existing data acquisition systems at IPNS include about \$90,000 in Multibus hardware and 7 to 10 man-years of software development. Because of this considerable investment, the new data acquisition system was to be as compatible as possible with existing software and use as much of the previously developed and purchased hardware as possible.

It was also desirable to avoid a hardware development project to achieve the higher data rates, because of the uncertainty of the final performance and the costs of reproduction.

### System Description

The decision was made to keep the new data acquisition system a Multibus based system, thus allowing the use of existing hardware. With increased speed and hardware and software compatibility being the prime design requirements, the most natural upgrade would be to replace the existing Z-8001 processor with a higher speed processor of the same type. Upgrading to a higher speed processor would give some increase in performance (about a factor of 2), but would not give the increase desired. It was concluded that to achieve the full desired increase it would be necessary to use multiple high speed processors on the Multibus to share the work. This requires processors which have on-board RAM memory, preferably dual-ported to the Multibus, and bus arbitration logic to resolve contention for the Multibus. At the time this design was undertaken we could not find a commercially available processor based on the Z-8001 family which met these requirements, and we did not desire to develop such a board except as a last resort. Thus it was necessary to find a suitable Multibus microprocessor configured such that multiple processors could operate concurrently on a common Multibus. In addition to high speed, the desired features of such a Multibus processor board are:

- 1. The board should contain at least 65K bytes of on-board dual ported RAM with Multibus addresses and CPU bus addresses independent.
- 2. The CPU should have memory addressing capability to at least 8M bytes.
- 3. The processor board should have parallel bus arbitration logic to the Multibus.

Work performed under the auspices of U. S. Dept. of Energy.

The submitted manuscript has been authored by a contractor of the U.S. Government under contract No. W-31-109-ENG-38. Accordingly, the U.S. Government retains a nonexclusive, royalty-free license to publish or reproduce the published form of this contribution, or allow others to do so, for U. S. Government purposes.


DE87 011429



Figure 1. Original IPNS Data Acquisition System.

- 4. The processor board chosen should have software cross-development tools available compatible with DEC RSX or VMS operating systems, and have some means of software on-line debug capability.


Processor boards based on the Intel 286, Motorola 68020 and National 32016 CPU chips were evaluated. The evaluations were based on the inclusion of the above desirable properties and the running of a benchmark program to evaluate the performance of these processors.

The benchmark run on the various processors was a routine written in-house, which included the types of instructions used in building histograms, such as memory-to-memory moves, table look-ups via indexing, logical shifts, integer addition and multiplication, and memory incrementing. Table 1 summarizes the benchmark results obtained.

Table 1. Processor Board Benchmark Comparisons.

| Processor type              | Intel 286 | Motorola 68020 | National 32016 |
|-----------------------------|-----------|----------------|----------------|
| Processor speed             | 6 MHz     | 10 MHz         | 10 MHz         |
| Relative benchmark run time | 16        | 12             | 13             |
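The in-house benchmark routine itself is not reproduced in this paper. As a rough illustration only, the following Python sketch exercises the same kinds of operations the text lists (moves, indexed table look-ups, shifts, integer add and multiply, memory increments). The names `histogram_kernel` and `run_benchmark`, the synthetic event stream, and the `shift` and `scale` values are all assumptions made for this sketch and are not taken from the original routine.

```python
import time


def histogram_kernel(events, lookup, histogram, shift=2, scale=3):
    """Run a histogramming-style inner loop over raw event words.

    shift, scale and the table layout are illustrative assumptions,
    not values from the in-house benchmark.
    """
    staging = [0]  # stand-in for a fixed memory location
    for raw in events:
        staging[0] = raw                                   # memory-to-memory move
        index = staging[0] >> shift                        # logical shift
        base = lookup[index % len(lookup)]                 # table look-up via indexing
        channel = (base + index * scale) % len(histogram)  # integer add and multiply
        histogram[channel] += 1                            # memory increment
    return histogram


def run_benchmark(n_events=100_000):
    """Time the kernel on synthetic events; returns (seconds, histogram)."""
    events = [(i * 7919) & 0xFFFF for i in range(n_events)]
    lookup = list(range(0, 1024, 4))
    histogram = [0] * 4096
    start = time.perf_counter()
    histogram_kernel(events, lookup, histogram)
    return time.perf_counter() - start, histogram
```

Running the same routine on each candidate board and normalizing the elapsed times would give relative run times of the kind reported in Table 1, where a lower number means a faster board.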

The three processor boards all met the first 3 desired features above. Only the board produced by National had extensive software cross-development tools available which could be hosted on a VAX computer. The other processor systems offered specialized development systems based on the type of CPU used on the board being developed.

We determined that a Multibus development board based on the National 32016 CPU, model NSP-DB32016-10, sold by National Semiconductor Corporation best met our requirements. Though this board did not have the best results in our benchmark test, it best met the other desired properties.

The National CPU board selected is configured as follows:

- 1. A 10MHz National NS32016 microprocessor and support circuitry. The NS32016 contains a 32-bit internal architecture with a 16-bit external dataway and 24-bit address bus.
- 2. A National NS32202 interrupt controller chip.
- 3. 128K or 512K bytes of on-board dual-port RAM with Multibus addressing space selectable independent of on-board address space. The amount of on-board memory accessible by the Multibus is selectable in 1/4 total memory increments.
- 4. Four sockets for up to 96K bytes of ROM or EPROM.
- 5. 24 programmable parallel I/O lines.
- 6. Two RS232 compatible serial data ports.
- 7. Multibus interface with multimaster capability.
- 8. 24-bit addressing allowing access of up to 16M bytes of combined system memory and I/O.
- 9. A set of readable hardware configuration switches.
- 10. A BLX expansion bus.


The software development tools available include a cross-assembler, Pascal compiler, linker, and symbolic debugger, all operating on a DEC VAX computer under VMS. The board is supplied with an on-board "Tiny Development System" for assembling and testing small sections of code on-line from a terminal connected directly to one serial input on the Multibus board. Source code is also available for an EPROM based monitor, which can be used for on-line debug, down loading of code from the VAX, and interactive operation with the symbolic debugger.

The hardware and software come with extensive documentation and complete schematics.

#### System Configuration

Figure 2 shows the Multibus hardware configuration for the multiprocessor data acquisition system. The only new components in the system are the National CPU boards. The remainder of the boards are from the old data acquisition system. The SIO/PIO boards are no longer needed since these functions are included on the National boards.

The only hardware modification necessary to the Multibus system was a change in the busing of pins on the P2 connectors on the Multibus card cage for the slots used by the National CPU boards. The old system had all pins bused across. The National CPU boards have diagnostic signals brought out on the P2 connector which would conflict with the signals from other similar boards on the same bus. Therefore these lines cannot be bused in the multiprocessor system. These lines were not used on the old processor system so modification of this busing did not preclude the use of the old Z-8001 CPU boards in the modified Multibus card cages.

The upgraded data acquisition system can be configured with one to four CPU boards, depending on the data histogramming rate required for each particular instrument. The amount of histogramming memory is also variable from 128K to 7.5M bytes, depending on the histogram size needed. Table 2 shows the Multibus physical memory map for the systems.



Figure 2. Current IPNS Data Acquisition System.

Table 2. Multibus Physical Memory Map.

| Multibus Memory Address | Description                                  |
|-------------------------|----------------------------------------------|
| 0000H-7FFFFH            | CPU 1 (Master) on-board dual ported memory.  |
| 80000H-87FFFH           | CPU 2 (Slave 1) on-board dual ported memory. |
| 88000H-8FFFFH           | CPU 3 (Slave 2) on-board dual ported memory. |
| 90000H-97FFFH           | CPU 4 (Slave 3) on-board dual ported memory. |
| 0A0000H-7FFFFFH         | Histogram Memory                             |
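The memory map in Table 2 can be expressed as a small address decoder. The Python sketch below is illustrative only: the names `MEMORY_MAP`, `region_for` and `region_size` are hypothetical, and the Master range is read from the table as 0 through 7FFFFH.

```python
# Address ranges transcribed from Table 2 (inclusive bounds).
MEMORY_MAP = [
    (0x000000, 0x07FFFF, "CPU 1 (Master) dual ported memory"),
    (0x080000, 0x087FFF, "CPU 2 (Slave 1) dual ported memory"),
    (0x088000, 0x08FFFF, "CPU 3 (Slave 2) dual ported memory"),
    (0x090000, 0x097FFF, "CPU 4 (Slave 3) dual ported memory"),
    (0x0A0000, 0x7FFFFF, "Histogram memory"),
]


def region_for(address):
    """Return the Table 2 description for a Multibus address, or None if unmapped."""
    for low, high, name in MEMORY_MAP:
        if low <= address <= high:
            return name
    return None


def region_size(name):
    """Return the size in bytes of the named region."""
    for low, high, region in MEMORY_MAP:
        if region == name:
            return high - low + 1
    raise KeyError(name)
```

Read this way, each Slave exposes a 32K byte window to the Multibus (for example `region_size("CPU 2 (Slave 1) dual ported memory")` is 32768), and addresses between 98000H and 9FFFFH are not assigned.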

We used the EPROM based monitor and cross assembler along with the symbolic debugger as our system development tools. The histograming software used with this data acquisition system is essentially a translation of the Z-8001 assembly language software to equivalent 32016 assembly language instructions, with the addition of routines to handle the multiprocessor task sharing protocol.

#### Multiprocessor Task Sharing Protocol

The multiprocessor data acquisition system is configured to operate with one to four processors: a Master and up to three Slaves. Each processor has its own on-board RAM with unique Multibus addresses, as shown in Table 2. The on-board RAM is divided into areas dedicated to the look-up tables containing the histogram configurations, the program memory, and the raw data buffers.

Two methods of operating multiple processors sharing a single bus were tested. In both methods one processor acts as a Master with up to three Slave processors. The Master processor is responsible for initiating the DMA transfers from Camac to the raw data areas on all processor boards, and for executing commands from the PDP 11 computer. The major difference between the two methods tested is the way in which the histogram memory is updated by the Slave processors.

Method "A" has each processor reduce the raw data and build the histograms directly in common memory. Each processor accesses this memory by arbitrating for use of the Multibus. In the Multibus arbitration scheme a processor issues a bus request and then waits until a bus grant is received. As more processors are added to the bus, each one is more likely to have to wait for access to the Multibus, which raises the average dead time per event analyzed by each processor. With this method, adding processors does not significantly increase the data analysis rate once the bus duty cycle becomes high.

In method "B" one processor, the Master, controls the Multibus activity, and the Slave processors build tables in their on-board memory of the histogram addresses that need to be incremented. The Master processor scans these tables on each Slave board and increments the histogram memory locations they indicate. With this method there is no bus arbitration, since only the Master processor has access to the bus, but the Master processor spends a larger percentage of its time servicing the Slaves and therefore has less time available to reduce data.

Table 3 shows comparative results, with the IPNS histograming algorithm, for the two methods of operation using up to four processors.

Table 3. Bus Sharing Comparisons

|                   | Total events histogrammed per second, method "A" | Total events histogrammed per second, method "B" |
|-------------------|--------------------------------------------------|--------------------------------------------------|
| One 32016 CPU     | 9,000                                            | 9,000                                            |
| Two 32016 CPU's   | 15,000                                           | 17,000                                           |
| Three 32016 CPU's | 19,500                                           | 25,500                                           |
| Four 32016 CPU's  | 22,000                                           | 32,500                                           |
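The scaling behaviour implied by Table 3 can be made explicit by computing, for each method, the speedup over a single 32016 and the efficiency per processor. The short Python sketch below uses only the rates in Table 3; the names `RATES` and `scaling` are introduced here for illustration.

```python
# Events histogrammed per second for 1 to 4 CPUs, taken from Table 3.
RATES = {
    "A": [9_000, 15_000, 19_500, 22_000],
    "B": [9_000, 17_000, 25_500, 32_500],
}


def scaling(rates):
    """Return (n_cpus, speedup, efficiency) tuples relative to one CPU."""
    single = rates[0]
    rows = []
    for n_cpus, rate in enumerate(rates, start=1):
        speedup = rate / single
        rows.append((n_cpus, round(speedup, 2), round(speedup / n_cpus, 2)))
    return rows
```

With four processors, method "A" reaches a speedup of about 2.44 (efficiency about 0.61), while method "B" reaches about 3.61 (efficiency about 0.90), which is consistent with the bus contention argument given for method "A".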

Because of the higher performance achieved using method "B", this is the method used in the upgraded data acquisition system; it is described below.

<u>Master Processor</u> The Master processor is responsible for executing commands received from the PDP, down loading the program and histograming tables to the Slave processors, initiating the DMA transfers of the raw data from Camac to the Slave processors, and incrementing the calculated histogram addresses.

On reset, the Master processor will begin operation by executing the monitor program supplied by National. This monitor initializes the I/O ports and interrupts and polls for serial input. The monitor has been modified to also test the state of a hardware configuration switch. This switch is used to indicate whether the data acquisition program is resident in on-board EPROM. If the program is resident the CPU will bootstrap this program into fast on-board RAM and begin executing that program. If the program is not resident, the CPU will remain in the monitor polling the serial port and testing a memory location for a unique word value indicating that a program has been down loaded, from the PDP or VAX, to the program memory area of the Master processor. The CPU will then begin executing this program. While the CPU is executing the monitor program, the VAX program development computer can communicate with the National board via the serial port, for debug and testing.
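The start-up decision made by the modified monitor can be summarized as a small decision function. This Python sketch is a simplification under stated assumptions: the marker value `PROGRAM_LOADED_WORD` is hypothetical (the paper does not give the actual word), and the function name and returned strings are illustrative only.

```python
PROGRAM_LOADED_WORD = 0xA55A  # hypothetical marker value; the paper does not give it


def monitor_decision(eprom_switch_on, marker_word):
    """Return the monitor's next action for the current switch and marker state."""
    if eprom_switch_on:
        # Program is resident in on-board EPROM: bootstrap it into RAM.
        return "copy EPROM program to RAM and run"
    if marker_word == PROGRAM_LOADED_WORD:
        # A program has been down loaded from the PDP or VAX.
        return "run down loaded program"
    # Otherwise stay in the monitor, available to the VAX for debug.
    return "keep polling serial port"
```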

When execution of the data acquisition program has started, the CPU determines whether it is a Master or Slave. This is indicated by the setting of a second configuration switch on the board. If a Master, it tests to see how many Slaves are present on the system. This test is done by trying to write and read a Multibus memory location assigned to each Slave (see Table 2). If the test is successful the Master will down load a copy of the program to each Slave and cause the Slave to begin executing that program.
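The write-and-read-back test for Slave presence might look like the following Python sketch. It is a toy model, not the 32016 code: the `Multibus` class, the test pattern, and the assumption that unpopulated addresses read back as all ones are introduced here for illustration. The base addresses are the Slave ranges from Table 2.

```python
# First Multibus address of each Slave's dual ported memory (Table 2).
SLAVE_BASES = {1: 0x80000, 2: 0x88000, 3: 0x90000}


class Multibus:
    """Toy bus model: only addresses backed by an installed board keep written data."""

    def __init__(self, installed_ranges):
        self._ranges = installed_ranges
        self._memory = {}

    def write(self, address, value):
        if any(low <= address <= high for low, high in self._ranges):
            self._memory[address] = value

    def read(self, address):
        # Assumption: absent memory reads back as all ones.
        return self._memory.get(address, 0xFFFF)


def detect_slaves(bus, pattern=0x5AA5):
    """Return the Slave numbers whose memory holds a written test pattern."""
    found = []
    for slave, base in SLAVE_BASES.items():
        saved = bus.read(base)
        bus.write(base, pattern)
        if bus.read(base) == pattern:
            found.append(slave)
            bus.write(base, saved)  # restore the original contents
    return found
```

For example, `detect_slaves(Multibus([(0x80000, 0x87FFF), (0x90000, 0x97FFF)]))` reports Slaves 1 and 3 as present.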

When the Slaves, if any, are initialized, the Master processor waits for commands from the PDP. These include commands to control the Camac modules, clear sections of Multibus memory, and start and stop data acquisition. When the command to start data acquisition is received from the PDP, the Master CPU assumes that histogram configuration tables have been loaded to the appropriate locations in memory and the Master transfers these tables to the Slave processors. It then starts the data acquisition program. The Master processor is notified via an interrupt from its parallel input that the polling module has detected that data is present in a Camac data acquisition module. The Master processor will read the module number from the parallel input and cause a DMA transfer of this data to a Slave processor via the Camac interface. Each processor has a section of memory dedicated to raw data storage with fill and empty pointers into the storage area. The Master processor checks for free space in the raw data buffer of one of the Slave processors present in the system, and sets up the DMA transfer from Camac to the first available buffer. See Figure 3 for a flowchart of this interrupt procedure. If there is no free space in any of the Slave processor buffers, or if no Slave processor is present on the system, the Master will DMA the raw data to its own buffer.
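The buffer selection step of this interrupt procedure can be sketched as follows. The paper states only that each raw data area has fill and empty pointers; treating it as a circular buffer with one slot kept open is an assumption of this sketch, as are the names `RawBuffer` and `choose_dma_target`.

```python
class RawBuffer:
    """Raw data area with fill and empty pointers (circular layout assumed)."""

    def __init__(self, size):
        self.size = size
        self.fill = 0   # next position to be written by DMA
        self.empty = 0  # next position to be read by the processor

    def free_words(self):
        used = (self.fill - self.empty) % self.size
        return self.size - 1 - used  # one slot kept open to tell full from empty

    def reserve(self, n_words):
        """Advance the fill pointer as a DMA transfer of n_words would."""
        self.fill = (self.fill + n_words) % self.size


def choose_dma_target(slaves, master, n_words):
    """Pick the first Slave buffer with room, else fall back to the Master's own."""
    for name, buffer in slaves:
        if buffer.free_words() >= n_words:
            return name, buffer
    return "master", master
```

The fallback in the last line mirrors the behaviour described above: with no Slave present, or with every Slave buffer full, the raw data goes to the Master's own buffer.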


When the Master processor is not busy setting up DMA transfers, it checks the Slave buffer areas containing addresses to be histogrammed to determine whether these areas are filled. There are two buffers on each Slave processor with the first word of each indicating the number of addresses stored in the buffer. When a buffer is full the Master processor will begin emptying that buffer while the Slave is filling the second buffer. If no buffers are full the Master processor checks its raw data area and histograms this data, if any. It also checks to see if any commands are received from the PDP and executes these commands. Figure 4 contains a flowchart of this procedure.
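One pass of the Master's acquisition loop can be sketched in Python as below. This is a simplified model under stated assumptions: buffers are plain lists whose first word is the address count, `CAPACITY` is a hypothetical buffer length, `reduce_event` stands in for the histogram address calculation, and command handling from the PDP is reduced to the returned string "idle".

```python
CAPACITY = 4  # hypothetical number of addresses per Slave buffer


def master_pass(slaves, histogram, master_raw, reduce_event):
    """Perform one pass of the loop and return a label for the action taken.

    slaves is a list of [buffer_0, buffer_1] pairs, one pair per Slave.
    """
    for buffers in slaves:
        for buf in buffers:
            count = buf[0]  # first word holds the number of stored addresses
            if count == CAPACITY:
                for address in buf[1:1 + count]:
                    histogram[address] += 1  # increment histogram memory
                buf[0] = 0  # hand the emptied buffer back to the Slave
                return "emptied slave buffer"
    if master_raw:
        # No Slave buffer is full: reduce one of the Master's own raw events.
        histogram[reduce_event(master_raw.pop(0))] += 1
        return "histogrammed own event"
    return "idle"  # the real loop would check for PDP commands here
```

Only full buffers are emptied in this sketch, so a partly filled buffer stays with the Slave until it fills.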

<u>Slave Processor</u> When the Slave processor is first started, it too will initialize its on-board programmable devices, determine that it is a Slave by testing the appropriate configuration switch, and then wait for the start of data acquisition. When a Slave processor is executing the data acquisition program, its sole function is to calculate the histogram addresses from the data contained in the raw data area, and store these addresses in one of the two histogram address buffers.
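The Slave side of the double-buffer scheme can be sketched as follows. The buffer layout (count in the first word) follows the text; the switching rule, the class name `SlaveAddressBuffers`, and in particular the address calculation in `histogram_address` (a per-detector base plus a time channel) are assumptions for illustration, since the actual IPNS histogramming algorithm is not given here.

```python
class SlaveAddressBuffers:
    """Two address buffers; word 0 of each holds the number of stored addresses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffers = [[0] * (capacity + 1) for _ in range(2)]
        self.active = 0  # index of the buffer currently being filled

    def store(self, address):
        """Append an address; return False if both buffers are still full."""
        buf = self.buffers[self.active]
        if buf[0] == self.capacity:
            other = 1 - self.active
            if self.buffers[other][0] != 0:
                return False  # Master has not emptied the other buffer yet
            self.active = other
            buf = self.buffers[other]
        buf[0] += 1
        buf[buf[0]] = address
        return True


def histogram_address(detector, time_channel, base_table):
    """Hypothetical address calculation: per-detector base plus time channel."""
    return base_table[detector] + time_channel
```

When `store` returns False the Slave would have to wait for the Master to empty a buffer, which in this model is signalled by the Master resetting that buffer's count word to zero.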



Figure 4. Master Processor Data Acquisition Loop.

Figure 3. Camac Raw Data Distribution Flowchart.


#### Performance

The single Zilog Z-8001 data acquisition system could build histograms at a rate of about 4,000 events per second. This can be compared to the results shown in Table 3 for one to four National 32016's.

When running at the maximum data rate with 4 processors, the Multibus is utilized at nearly its maximum bandwidth. If more processors were added to the system, in an attempt to increase the data rate further, it would be necessary to remove some activity from the Multibus. This could be done by causing the DMA transfers to occur to each processor through the BLX expansion bus contained on each processor board. This would involve the design of a special BLX module and a redesign of the Camac interface module. This design is not foreseen at this time.

#### Conclusion

The design goal of increasing the data rates achievable in the IPNS data acquisition system from 4 kHz to at least 30 kHz was achieved by replacing the 4 MHz Z-8001 processor system with a multiprocessor system made up of four 10 MHz NS32016 processor boards. Using the Master-Slave method of operation described, four processor boards are the maximum that can efficiently operate on the Multibus.

The design, software conversion and additions for multiprocessor operation, and minor hardware changes, required about 8-9 man-months of effort. Only minor changes were required in the PDP system's software. Thus a factor of 8 improvement in speed was achieved with only a 10% increment in development and programming effort. The new system has now been installed and is operating reliably on all 10 IPNS instruments.
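As a rough check of these figures, the speed factor follows from the four-processor method "B" rate in Table 3 and the Z-8001 rate of about 4,000 events per second. The effort ratio follows from the 8-9 man-months quoted above against the 7 to 10 man-years (84 to 120 man-months) of original software development; hardware costs are not included in this comparison.

$$
\frac{32{,}500\ \text{events/s}}{4{,}000\ \text{events/s}} \approx 8.1,
\qquad
\frac{8}{120} \approx 7\% \;\le\; \frac{\text{new effort}}{\text{original effort}} \;\le\; \frac{9}{84} \approx 11\%.
$$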

## References

- 1. R.T. Daly, J.R. Haumann, M.R. Kraimer, F.R. Lenkszus, W.P. Lidinsky, C.P. Morgan, L.L. Rutledge, P.E. Rynes and J.W. Tippie "Data Acquisition and Control System for the IPNS Time-of-Flight Neutron Scattering Instruments" IEEE Transactions on Nuclear Science NS-26, 4554 (1979).
- 2. R.K. Crawford, R.T. Daly, J.R. Haumann, R.L. Hitterman, C.B. Morgan, G.E. Ostrowski and T.G. Worlton "The Data Acquisition System for the Neutron Scattering Instruments at the Intense Pulsed Neutron Source" Presented at the Topical Conference on Computerized Data Acquisition in Particle and Nuclear Physics on May 28-30, 1981. IEEE Transactions on Nuclear Science NS-28, No. 5
- 3. J.R. Haumann, R.T. Daly, T.G. Worlton, and R.K. Crawford "IPNS Distributed Processing Data Acquisition System" IEEE Transactions on Nuclear Science Vol. NS-29, No. 1, February 1982.

VAX and PDP are trademarks of Digital Equipment Corporation.

Multibus is a trademark of Intel Corporation.

"Reference to a company name or product does not imply approval or recommendation of the product by Argonne National Laboratory or the U.S. Dept. of Energy to the exclusion of others that may be suitable."

# DISCLAIMER

This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.
