Abstract
[en] Coarse-grained angular domain decomposition of the mesh sweep algorithm has been implemented in ORNL's three-dimensional transport code TORT for Cray's macrotasking environment on platforms running the UNICOS operating system. A performance model constructed earlier is reviewed, and its main result, the identification of the sources of parallelization overhead, motivates the present work. The sources of overhead treated here are: redundant operations in the angular loop across participating tasks; repetitive task creation; and lock utilization to prevent overwriting the flux moment arrays accumulated by the participating tasks. Substantial reduction in the parallelization overhead is demonstrated via sample runs with fixed tuning, i.e. zero CPU hold time. Up to 50% improvement in wall-clock speedup over the previous implementation with autotuning is observed in some test problems.
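The lock overhead named in the abstract arises when several tasks, each handling a subset of the discrete angles, add their contributions into one shared flux moment array. The abstract does not say how this overhead was reduced in TORT. As an illustration only, the sketch below shows one common way to avoid the lock: each task accumulates into a private array, and the private arrays are summed after all tasks finish. The function name `accumulate_scalar_flux`, the round-robin assignment of angles to tasks, and the use of Python threads are all assumptions made for the example and are not taken from the report.

```python
import threading


def accumulate_scalar_flux(weights, psi, n_tasks):
    """Zeroth flux moment: phi[i] = sum over angles m of weights[m] * psi[m][i].

    Hypothetical sketch: angles are dealt round-robin to n_tasks threads.
    Each thread writes only to its own row of `partial`, so no lock is needed.
    """
    n_angles = len(weights)
    n_cells = len(psi[0])
    # One private accumulation array per task (assumed layout, not TORT's).
    partial = [[0.0] * n_cells for _ in range(n_tasks)]

    def worker(task_id):
        # This task handles angles task_id, task_id + n_tasks, ...
        for m in range(task_id, n_angles, n_tasks):
            for i in range(n_cells):
                partial[task_id][i] += weights[m] * psi[m][i]

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Serial reduction over the private arrays once every task has finished.
    return [sum(partial[t][i] for t in range(n_tasks)) for i in range(n_cells)]


if __name__ == "__main__":
    weights = [0.25, 0.25, 0.25, 0.25]                         # quadrature weights
    psi = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]     # psi[angle][cell]
    print(accumulate_scalar_flux(weights, psi, n_tasks=2))
```

The trade-off in this design is extra memory (one array per task) and a final reduction pass in exchange for removing lock contention from the inner loop.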
Primary Subject
Secondary Subject
Source
1996; 12 p; Seminar on 3D deterministic radiation transport computer programs, features, applications and perspectives; Paris (France); 2-3 Dec 1996; CONTRACT AC05-96OR22464; Also available from OSTI as DE97001732; NTIS; US Govt. Printing Office Dep
Record Type
Report
Literature Type
Conference
Report Number
Country of publication
Reference Number
INIS Volume
INIS Issue