From SKA website: “Processing the vast quantities of data produced by the SKA will require very high performance supercomputers that will use millions of processors operating in parallel… To speed up some types of processing, the processors are likely to be assisted by specialised hardware…One of the software challenges for the SKA will be to adapt algorithms to operate on these new types of architectures…The project will also require middleware software and tools for ongoing software development.”
Developing and supporting massively parallel replacements for commonly used middleware is part of the core business of Open Parallel and its partners. The scalability changes required will need to go beyond tweaking the existing libraries, and may well involve moving the code that uses the targeted libraries into more parallel-friendly environments.
These developments have an immediate spillover effect into other areas of science and industry, in New Zealand and globally: Open Parallel plans to take the expertise developed for SKA into other industries and apply it to Big Data, the Internet of Things, and other problems.
This is why we have already prepared a draft proposal for a project that can be initiated now. Context of project: “An area where multicore focus is enough to get New Zealand involved so we have an entry point to grow a multicore community around it. We should focus in most likely area to start work with” John Houlker (NZTE, April 2012)
From Pre-Construction Phase, Stage 1 WBS (Work Breakdown Structure) and SoW of SKA (Square Kilometre Array)
6.10 SKA.TEL.SDP – Science Data Processor (pg. 275)
6.10.4 SKA.TEL.SDP.REQ – Requirements Analysis (pg. 303)
6.10.5 SKA.TEL.SDP.ARCH – Element Architecture (pg. 305)
In particular, WBS element identification – 5 SKA.TEL.SDP.ARCH.SARCH – Software Architecture (pg. 307)
WBS element description – Definition of software architectural requirements building on baseline software architecture. Starting from the reference architecture presented at the Conceptual Design Review (CoDR), perform gap analysis and analyse capabilities.
Inputs – Standard input documents
Interdependencies – SKA.TEL.SDP.ARCH.OPTS – Architectural Options (pg. 307)
Tasks
– Analyse use cases and operating scenarios against reference architecture capabilities
– Perform gap analysis of reference architecture
– Evaluate methods of extending reference software architecture to include data flow
– Analyse scaleability of baseline software architecture through SKA1 to SKA2
– Identify possible common software libraries, including packages requiring changes e.g. for multi-threading.
Outputs / deliverables
Document candidate software architecture descriptions, including common software libraries
The problem:
Several radio astronomy packages (e.g. CASA, ASKAPSoft, LOFAR) make extensive use of a common set of software libraries, including casacore, CFITSIO, FFTW and WCSLIB. To scale to SKA-size problems, these libraries must be adapted for use on parallel computers.
Open Parallel can help reduce the Intensity of Involvement (time spent) by solving problems that can be considered independently, such as this one.
We propose to deliver the task (highlighted above):
“Identify possible common software libraries, including packages requiring changes e.g. for multi-threading”
The project would cover the following:
– Identification of common software libraries for the SKA project (e.g. casacore, CFITSIO, WCSLIB)
– Description
– Status of development
– Scalability
– Interdependencies with applications i.e. Redmine
Tasks
– Identify modules for multithreading (for example, in casacore, analyse these modules: Arrays_module, HDF5_module, Quanta_module and others listed in http://www.atnf.csiro.au/computing/software/casacore/casacore-1.2.0/doc/html/group__casa.html)
– Make an assessment of current status (for example, casacore http://code.google.com/project/casacore is mostly undergoing minor changes now)
– Description of functionality
– Investigate feasibility of porting various modules to multithreaded environment
– Investigate cost of porting packages and modules to multithreaded environment
– Present a recommendation to SKA for inclusion in Stage 1 planning of the PEP and possible execution in Stage 2.
This is a significant opportunity for New Zealand to:
– build capabilities that are specifically required for SKA,
– position ourselves ahead of the RfP with concrete work delivered (this is one of the few pieces of work that can be built during the SKA1 Pre-Construction phase)
– showcase a distinctive track record of work towards SKA
– start to build an ecosystem of talent around Multicore in New Zealand associated with SKA, ready for the work that follows
– start an effective collaboration with the SKA central office and CSIRO right away
Other contribution:
Given the computational needs, it is important to note that Open Parallel has expertise in GPGPU computing. The statement from the SKA website says: “To speed up some types of processing, the processors are likely to be assisted by specialised hardware. One of the software challenges for the SKA will be to adapt algorithms to operate on these new types of architectures”.
This could be a reference to GPUs, or to a totally new type of processor. The latter would be very costly, especially considering the investment already made in a small, GPU-heavy cluster for the preliminary SKA work in Western Australia. Open Parallel proposes the following:
Analysing the data produced by SKA requires a modern approach. The computational needs of SKA exceed, by far, any other radio astronomy project. Extending existing software packages is more likely to increase the cost of computing hardware than provide the necessary scalability.
GPU computing is a necessary, but not sufficient, step for analysing the data from SKA. GPUs provide excellent scalability for the easily parallelisable parts of the data analysis. However, dependencies between data sets inherently limit overall scalability.
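The limit described above can be made concrete with Amdahl's law, a standard model that the proposal itself does not name and that is used here only as an illustration. The sketch below assumes a workload in which a fixed fraction parallelises perfectly and the rest is serialised by dependencies. The function name and the 95% figure are hypothetical, not measurements from any SKA pipeline.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Ideal speedup under Amdahl's law.

    parallel_fraction: share of the run time that parallelises perfectly (0..1).
    workers: number of parallel processing units (e.g. GPU cores).
    Both values are illustrative assumptions, not SKA measurements.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Hypothetical workload: 95% parallelisable, 5% serialised by dependencies.
for n in (10, 1_000, 1_000_000):
    print(n, round(amdahl_speedup(0.95, n), 2))
```

Under this model, a workload with a 5% serial share can never run more than 20 times faster, however many processors are added, which is why faster hardware alone is not sufficient.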
Modern programming paradigms, proven to provide the best scalability for cloud computing, are necessary to ensure cost-efficient performance for SKA. Functional programming can be efficiently and automatically parallelised, thanks to its syntax being mathematical in nature. Merely extending existing software packages is unlikely to provide cost-efficient scalability.
Open Parallel is uniquely positioned in the region to ensure that SKA middleware scales at a low cost. Massively parallel execution of functional programming is part of Open Parallel’s expertise portfolio, as is the parallelisation of legacy code, which may be useful in the early prototype stage.
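The property behind this claim is that a pure function (one whose result depends only on its arguments) can be applied to independent inputs in any order, or concurrently, without changing the result. The minimal sketch below illustrates that property only. It does not show automatic parallelisation by a compiler, and `calibrate`, its gain value and the sample data are hypothetical. A thread pool is used to keep the example short; CPU-bound work in Python would more likely use a process pool or a GPU back end.

```python
from concurrent.futures import ThreadPoolExecutor


def calibrate(sample: float, gain: float = 2.0) -> float:
    """Hypothetical pure function: no shared state, no side effects."""
    return sample * gain + 1.0


def sequential_map(samples):
    """Apply calibrate to each sample, one after another."""
    return [calibrate(s) for s in samples]


def parallel_map(samples, workers: int = 4):
    """Apply calibrate concurrently; pool.map returns results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(calibrate, samples))


samples = [0.5, 1.0, 1.5, 2.0]
# Because calibrate is pure, both strategies give identical results.
print(parallel_map(samples) == sequential_map(samples))
```

Because no call reads or writes state shared with another call, the work can be split across as many workers as there are inputs without adding locks.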
—————
Open Parallel can lead and/or contribute to the effective development, implementation and maintenance of software for SKA, covering:
– Data processing
– Data transport
– Embedded software that operates in different devices and artifacts
– Observation preparation and scheduling, which are independent of the data pipeline and data reduction processing
– Real-time monitoring and control
– Time-critical online data reduction, archiving and storage
——————–
Open Parallel and its partners have relevant experience in distributed work on large-scale projects, the most significant being open source software development.
Open Parallel participates in SKA activities and has been an active member of the NZSKAIC since early 2011. Other activities include several presentations, notably at the Rutherford Innovation Showcase (September 2011) and Multicore World 2012 (March 2012).
In 2012, Open Parallel worked at CSIRO (Sydney, directly with Tim Cornwell and Ben Humphreys), sponsored by NZTE, on the development of the proposal that is part of the EoI for the New Zealand Government.