Program – 6th Multicore World

Draft v4.0 (subject to change) – Updated 16th February 2017

Speakers List is here

Buy Tickets here

Program MW17 v4.0 (PDF)

Panels list is here


Day 1 – Monday 20th February 2017

8:20 – 8:40 Opening Welcome – Setting the Scene

Nicolás Erdödy (Open Parallel)

8:40 – 9:10 Japanese plans for Open High Performance Computing and Big Data / Artificial Intelligence Infrastructure

Prof. Satoshi Matsuoka – Professor, Tokyo Institute of Technology and Fellow, Advanced Institute for Science and Technology, Japan

Abstract – Japanese investment in public, open-science HPC infrastructure for research and academia has a long history, in fact longer than that of the United States. There is now also a focus on Big Data / Artificial Intelligence (BD/AI), with national-level AI centers: in particular, I lead a project to build one of the world’s largest open, public computing infrastructures focused on BD/AI. The performance of the machine is slated to be well above 130 Petaflops for machine learning, with accelerated I/O and other properties desirable for BD/AI workloads.

9:15 – 10:00 Keynote – The Revolution in Experimental and Observational Science: The Convergence of Data-Intensive and Compute-Intensive Infrastructure

Prof. Tony Hey – Chief Data Scientist, Science & Technology Facilities Council (STFC), UK

Abstract – The revolution in Experimental and Observational Science (EOS) is being driven by the new generation of facilities and instruments, and by dramatic advances in detector technology. In addition, the experiments now being performed at large-scale facilities, such as the Diamond Light Source in the UK and the Advanced Photon Source at Argonne in the US, are becoming increasingly complex, often requiring advanced computational modelling to interpret the results. There is also an increasing requirement for the facilities to provide near real-time feedback on the progress of an experiment as the data is being collected. A final complexity comes from the need to understand multi-modal data, which combines data from several different experiments on the same instrument or data from several different instruments. All of these trends require a closer coupling between data and compute resources.

10:00 – 10:30 – Morning Tea

10:30 – 11:15 Keynote – The Quantum Revolution: Computing, past, present and future

Prof. Michelle Y. Simmons – Scientia Professor of Physics, University of New South Wales, Australia. Director, Centre for Quantum Computation & Communication Technology, School of Physics, UNSW. Australian Research Council Laureate Fellow

Abstract – Down-scaling has been the leading paradigm of the semiconductor industry since the invention of the first transistor in 1947, producing faster and smaller computers every year. However, device miniaturization will soon reach the atomic limit set by the discreteness of matter, leading to intensified research into alternative approaches for creating the logic devices of the future. In this talk I will describe the emerging field of quantum information. In particular, I will focus on the development of quantum computing, where Australia is leading an international race to build a large-scale prototype in silicon.

11:20 – 12:00 Panel – From a Multicore World to the Exascale Era – and beyond! What could happen? What NEEDS to happen?

Prof. Satoshi Matsuoka (Japan) – Moderator, Prof. Michelle Y. Simmons (Australia), Pete Beckman (US), Dave Jaggar (NZ), Prof. Michael Kelly (UK-NZ), Prof. Tony Hey (UK)

12:10 – 1:10 Lunch

1:10 – 1:35 HPC/Cloud Hybrids for Efficient Resource Allocation and Throughput

Lev Lafayette – HPC Support & Training Officer, The University of Melbourne, Australia

Abstract – HPC systems running massively parallel jobs need a fairly static software environment running on bare-metal hardware and a high-speed interconnect to reach their full potential, and they offer linear performance scaling for cleverly designed applications. Cloud computing, on the other hand, offers flexible virtual environments and can be used for pleasingly parallel workloads.

Two approaches have been undertaken to make HPC resources available in a dynamically reconfigurable hybrid HPC/Cloud system. Both can be achieved with few modifications to existing HPC/Cloud environments. The first approach, the “Nemo” system at the University of Freiburg, deploys a cloud client on the HPC compute nodes so that the HPC hardware can run cloud workloads to backfill free compute slots. The second approach, the “Spartan” system at the University of Melbourne, generates a consistent compute-node operating system image whose virtual hardware specification can be varied according to need.
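
The talk does not supply code, but a minimal sketch of the “Nemo”-style backfilling idea might look like the following: poll the HPC scheduler for idle nodes and hand those slots to cloud workloads. The Slurm (`sinfo`) and OpenStack CLI commands shown are standard; the image name, flavour and node-naming scheme are hypothetical placeholders, and a production system would drive this through the schedulers themselves rather than a polling script.

```python
# Illustrative sketch only (not from the talk): offer idle HPC compute slots to
# cloud workloads, in the spirit of the "Nemo" backfilling approach.
# The image name and flavour below are hypothetical placeholders.
import subprocess

def idle_nodes():
    """Return hostnames of compute nodes Slurm currently reports as idle."""
    out = subprocess.run(["sinfo", "-h", "-N", "-t", "idle", "-o", "%N"],
                         capture_output=True, text=True, check=True)
    return sorted(set(out.stdout.split()))

def start_cloud_worker(node):
    """Boot a cloud worker VM to soak up the free slot associated with `node`."""
    subprocess.run(["openstack", "server", "create",
                    "--image", "cloud-worker",      # hypothetical image name
                    "--flavor", "m1.large",         # hypothetical flavour
                    f"backfill-{node}"], check=True)

if __name__ == "__main__":
    for node in idle_nodes():
        start_cloud_worker(node)
```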

1:40 – 2:10 The ASKAP Science Data Processor system: A Computing precursor of the SKA

Juan Carlos Guzman – Head of the ATNF Software and Computing Group, Australia

Abstract – The ASKAP Science Data Processor software, named ASKAPsoft, has been in development since the ASKAP project started more than 10 years ago and is now processing and commissioning early science data with a third of the array. The software runs on a dedicated HPC platform located at the Pawsey Supercomputing Centre, processing radio interferometry data at a rate currently sufficient to support the Early Science program. Software development and commissioning are ongoing to reach an order of magnitude better performance for the full ASKAP array.

The software system was originally designed to calibrate and image ASKAP science data, and over the last few months we have also started to look at expanding its use to other instruments such as MWA and SKA1_LOW. This talk describes the unique features of ASKAPsoft for the ASKAP Science projects, shares a few “words of wisdom” gained (and still being gained) during development and commissioning, and describes our development roadmap to support the full ASKAP array and other instruments, in particular the SKA.

2:15 – 3:00 Keynote – Supercomputer Resilience – Then, Now, and Tomorrow

Nathan DeBardeleben – Senior Research Scientist in High Performance Computing Design at Los Alamos National Laboratory and Lead of the Ultrascale Systems Research Center, USA

Abstract – Supercomputer resilience, reliability, and fault-tolerance are increasingly challenging areas of extreme-scale computer research as government agencies and large companies strive to solve the most challenging scientific problems. Today’s supercomputers are so large that failures are a common occurrence and various tolerance and mitigation strategies are required by the system, middleware, and even application software. In this talk we discuss trends in this field and how data analytics and machine learning techniques are being applied to influence the design, procurement, and operation of some of the world’s largest supercomputers.
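
The abstract does not prescribe a particular technique, but as a toy illustration of the kind of analysis it alludes to, the sketch below trains a classifier on synthetic node-health telemetry (temperature, correctable-error counts, node age) to rank which signals are associated with failures. All feature names and data here are invented for illustration; real analyses work from facility logs and RAS data.

```python
# Toy illustration only: relating node failures to synthetic telemetry.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 5000
temperature = rng.normal(55, 8, n)       # degrees C (synthetic)
corrected_errors = rng.poisson(2, n)     # ECC corrections per day (synthetic)
node_age_years = rng.uniform(0, 6, n)

# Synthetic ground truth: hotter, older nodes with many corrected errors fail more.
risk = 0.02 * (temperature - 55) + 0.3 * corrected_errors + 0.2 * node_age_years
failed = (risk + rng.normal(0, 1, n)) > 2.5

X = np.column_stack([temperature, corrected_errors, node_age_years])
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, failed)

# Rank which telemetry signals matter most for the predicted failures.
for name, importance in zip(["temperature", "corrected_errors", "node_age_years"],
                            model.feature_importances_):
    print(f"{name}: {importance:.2f}")
```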

3:00 – 3:30 Afternoon Tea

3:30 – 4:15 Panel – BD / AI / ML / IoT / C@E / Deep Learning, etc, etc… Which Is the Real Technology Behind These Trendy Buzzwords?

Pete Beckman (US) – Moderator, Paul Fenwick (Australia), Prof Michael Kelly (UK-NZ), John Gustafson (Singapore), Paul McKenney (US), Prof Satoshi Matsuoka (Japan)

4:15 – 5:00 Keynote – Parallel Computing at the Edge: Technology for Chicago Street Poles and for Exascale Systems

Pete Beckman – Co-Director, Northwestern-Argonne Institute for Science and Engineering, Argonne National Laboratory, USA. Leads the Argo exascale operating system project and Waggle, sensors for the Array of Things

Abstract – Sensors and embedded computing devices are being woven into buildings, roads, household appliances, and light bulbs. Most are as simple as possible, with low-power microprocessors that just push sensor values up to the cloud. However, another class of powerful, programmable sensor node is emerging. The Waggle (wa8.gl) platform supports parallel computing, machine learning, and computer vision for advanced intelligent sensing applications. Waggle is an open source and open hardware project at Argonne National Laboratory that has developed a novel wireless sensor system to enable a new breed of smart city research and sensor-driven environmental science. Leveraging machine learning tools such as Google’s TensorFlow and Berkeley’s Caffe and computer vision packages such as OpenCV, Waggle sensors can understand their surroundings while also measuring air quality and environmental conditions. Waggle is the core technology for the Chicago ArrayOfThings (AoT) project (arrayofthings.github.io). So how is this work related to Exascale operating systems and extreme-scale platforms for science? Join me as we explore the new world of parallel computing everywhere. The presentation will outline current progress in designing and deploying these new platforms, as well as our research in computer science, including parallel computing, operating system resilience, data aggregation, and HPC modelling and simulation.
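
Waggle itself is open source (wa8.gl); the fragment below is not Waggle code, just a minimal OpenCV sketch of the in-situ idea the abstract describes: analyse each frame on the node and ship only a small derived measurement (here a pedestrian count from OpenCV’s stock HOG detector) instead of raw video.

```python
# Not Waggle code: a minimal edge-processing sketch. The camera feed is analysed
# on the node and only a tiny derived value (a pedestrian count) leaves the device.
import json
import time
import cv2

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

cap = cv2.VideoCapture(0)          # local camera on the sensor node
while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame = cv2.resize(frame, (640, 480))
    boxes, _ = hog.detectMultiScale(frame, winStride=(8, 8))
    report = {"timestamp": time.time(), "pedestrian_count": len(boxes)}
    # In a real deployment this summary would be published to the cloud (e.g. over
    # MQTT); here we simply print it rather than shipping any image data.
    print(json.dumps(report))
    time.sleep(5)                  # report every few seconds, not per frame
```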

5:00 – 5:15 Debate

5:15 – 7:00 Light Dinner


Day 2 – Tuesday 21st February 2017

8:30 – 8:40 – General Information and Recap

Nicolás Erdödy (Open Parallel)

8:45 – 9:25 – Keynote – OpenStack for HPC in Africa

Happy Sithole – Director, Centre for High Performance Computing, South Africa

Abstract – This presentation will cover the South African perspective on advancing HPC technologies in the region and how we plan to contribute to the global space. The focus will be on the technologies, on what is required to build a hub – the opportunities and challenges – and on how the region can attract the attention of the world as a global hub. It will cover HPC developments in the region, the OpenStack initiative and the African Research Cloud perspective.

9:30 – 10:00 – Ministerial Address

The Honourable Paul Goldsmith, New Zealand’s Minister for Science and Innovation, Minister of Tertiary Education, Skills and Employment, and Minister for Regulatory Reform

10:00 – 10:30 Morning Tea

10:30 – 11:15 Keynote – Extreme Scale Multi-tasking using DALiuGE

Prof. Andreas Wicenec – Professor of Data Intensive Research, ICRAR, Perth, Australia. Task leader of the Data Layer for the SKA Science Data Processor

Abstract – The SKA processing will require launching, controlling and terminating up to some 100 million individual but logically related tasks for a single reduction run spanning around 6-12 hours. This translates to about 5,000 tasks/s on average, distributed across potentially thousands of compute nodes. Since the actual reduction components as well as the compute platform are not yet determined and will also change during the operational lifetime of the SKA, both the number of tasks and the number of compute nodes are likely to be quite variable. In addition, the size of a reduction deployment also depends on the actual scientific goals of a given experiment.

DALiuGE is a prototype framework designed and developed on the basis of these requirements. It uses many of the ideas published for other existing frameworks, such as Spark, Swift/T, TensorFlow or Parsec, but unlike those it goes back to first principles in order to tackle the most relevant requirements for the SKA use case first and foremost. For scalability in particular, it follows a very strict architecture, enforcing completely asynchronous task execution as well as hierarchical task deployment and tear-down. DALiuGE also enables the use of existing reduction components without any change, while also allowing the development of more optimised, dedicated components if required. The other unique feature is the complete separation of the logic of the reduction process from its deployment onto hardware at run-time. This allows scientists to concentrate on developing the logic without having to deal with the final deployment issues. In this talk we will present the architectural concepts as well as results from very large scale deployments.
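
DALiuGE has its own graph-based API; the snippet below is not that API, only a toy asyncio sketch of the execution pattern the abstract emphasises: tasks are launched and torn down fully asynchronously through a small hierarchy (a master fanning out to per-node managers), which is what makes thousands of task launches per second plausible.

```python
# Not the DALiuGE API: a toy sketch of fully asynchronous, hierarchical task
# execution (master -> node managers -> tasks), as described in the abstract.
import asyncio
import time

async def task(node, i):
    await asyncio.sleep(0)          # stand-in for a real reduction component
    return (node, i)

async def node_manager(node, n_tasks):
    # Each node manager launches its own tasks without waiting on its siblings.
    return await asyncio.gather(*(task(node, i) for i in range(n_tasks)))

async def master(n_nodes=100, tasks_per_node=1000):
    # The master only talks to node managers, never to individual tasks.
    results = await asyncio.gather(*(node_manager(n, tasks_per_node)
                                     for n in range(n_nodes)))
    return sum(len(r) for r in results)

start = time.perf_counter()
total = asyncio.run(master())
elapsed = time.perf_counter() - start
print(f"{total} tasks in {elapsed:.1f}s ~ {total / elapsed:.0f} tasks/s")
```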

11:15 – 11:40 Update on New Zealand SKA Alliance participation in the SKA project

Andrew Ensor – Director, New Zealand SKA Alliance

Abstract – The Square Kilometre Array is the largest mega-Science project of the next decade and represents numerous firsts for New Zealand. This talk will outline the project, its unprecedented computing challenges, New Zealand’s key involvements, and progress on the computer system design as the construction phase draws closer.

11:45 – 12:15 Panel – Towards the SKA 2018 Tender: Challenges and Opportunities

Simon Rae (NZ) – Moderator, Prof. Tony Hey (UK), Happy Sithole (South Africa), Andrew Ensor (NZ), Juan Carlos Guzman (Australia), Prof Andreas Wicenec (Australia)

12:20 – 1:20 Lunch

1:20 – 1:50 SKA – SDP Middleware: open and collaborative HPC in the SKA

Piers Harding – Senior Consultant, Catalyst IT, New Zealand

Abstract – The aspirations of the Science Data Processor (SDP) have many of the characteristics of a commodity ‘big data’ problem, even though the bulk of the known processing requirements are seemingly narrow, well defined and closed in nature. Everything apart from the processing time frames and the power allocation is big: logging, messaging, storage, network, computation. Even the project structure is enormously collaborative, with many countries and institutions involved.

For the SDP processing, the SKA could have opted for a closed solution for the rendering and delivery of data products, but the project has recognised an opportunity to do things differently by stipulating the guiding principles of using open standards, open source, and commodity computing. This framework will make it possible to bring the widest possible research audience closer to the processing frontier, giving them greater access to the telescope and more fine-grained control of their own observations. This talk walks through some of the opportunities and problems that will be faced by the SDP as the project attempts to realise the ambition of utilising commodity computing technology under stringent high-performance computing and energy requirements, while reflecting on how this can take advantage of major solution architecture trends that will unfold over the coming years.

1:50 – 2:10 IHK/McKernel: A Lightweight Multi-kernel based Operating System for Extreme Scale HPC

Balazs Gerofi – Research Scientist in the System Software Research Team at RIKEN Advanced Institute for Computational Science (AICS), Japan

Abstract – The RIKEN Advanced Institute for Computational Science has been appointed by the Japanese government as the main organization leading the development of Japan’s next-generation flagship supercomputer, the successor to the K computer. Part of this effort is to design and develop a system software stack that suits the needs of future extreme-scale computing. In this talk, we primarily focus on OS research and discuss IHK/McKernel, our multi-kernel based operating system framework. IHK/McKernel runs Linux and a lightweight kernel side by side on compute nodes, with the primary motivation of providing scalable, consistent performance for large-scale HPC simulations while at the same time retaining a fully Linux-compatible execution environment. Lightweight multi-kernels and specialized OS kernels in general have been receiving a great deal of attention recently, not only in HPC but in the context of cloud and embedded computing as well. We provide an update on the project and outline future research directions.

2:15 – 3:00 Keynote – Does RCU Really Work?

Paul McKenney – IBM Distinguished Engineer, IBM Linux Technology Center, USA

Abstract – Bugs will always be with us, and given that there are well over a billion instances of the Linux kernel running around the world, a hypothetical linux-kernel RCU race condition that happens once per million years of runtime will be happening several times per day across the installed base. Yet achieving even this level of robustness in a highly concurrent software artifact poses a severe challenge to the current software engineering state of the art. This presentation will give an overview of the techniques being used to start to meet this challenge, up to and including some exciting advances in formal verification, which have resulted in formal verification being added to the Linux-kernel RCU regression-test suite.
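
The back-of-the-envelope behind the opening claim is worth spelling out; the snippet below simply redoes that arithmetic (roughly a billion running kernels, a hypothetical race with a mean time between failures of a million years of runtime) and lands at a few occurrences per day across the installed base.

```python
# Back-of-the-envelope from the abstract: a once-per-million-years race,
# multiplied across ~10^9 running Linux kernel instances.
instances = 1e9                  # running Linux kernel instances (order of magnitude)
mtbf_years = 1e6                 # hypothetical race: once per million years of runtime

failures_per_instance_per_day = 1 / (mtbf_years * 365)
failures_per_day = instances * failures_per_instance_per_day
print(f"~{failures_per_day:.1f} occurrences per day across the installed base")
# ~2.7 per day, i.e. "several times per day"
```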

3:00 – 3:30 Afternoon Tea

3:30 – 4:15 Panel – Where is New Zealand’s ICT / High-Tech Ecosystem Heading?

Victoria Maclennan (2016 New Zealand ICT Professional of the Year) – Moderator, Clare Curran, MP (Labour Party ICT Spokesperson), Ralph Highnam (CEO, Volpara Technologies), Guy Kloss (Qrious), Mark Moir (Oracle), Dave Jaggar (ex-ARM)

4:15 – 5:00 Keynote – The Future Is Awesome (and what you can do about it)

Paul Fenwick – Public speaker, open source authority, and science educator. Managing Director, Perl Training, Australia.

Abstract – Technologies are advancing faster than society’s expectations and can go from science fiction to consumer-available with very little discussion in between, yet the questions they raise are critically important: What happens when self-driving vehicles cause unemployment, when medical expert systems work on behalf of insurance agencies rather than patients, and when weapon platforms make their own lethal decisions?

5:00 – 5:15 Debate

5:15 – 7:00 Light Dinner


Day 3 – Wednesday 22nd February 2017

8:45 – 8:55 General Information and Recap

8:55 – 9:05 The Exascale Institute – and other projects from and for New Zealand

Nicolás Erdödy (Open Parallel)

9:10 – 10:00 Keynote – FLOPS to BYTES: Accelerating Beyond Moore’s Law

Satoshi Matsuoka – Professor, Tokyo Institute of Technology and Fellow, Advanced Institute for Science and Technology, Japan

Abstract – The so-called “Moore’s Law”, by which the performance of processors increases exponentially by a factor of 4 every 3 years or so, is slated to end within a 10-15 year timeframe as the lithography of VLSIs reaches its limits, combined with other physical factors. This is largely because transistor power is becoming constant; as a result, means to sustain continuous performance increases must be sought other than raising the clock rate or the number of floating-point units on the chips, i.e., increasing the FLOPS. The promising new parameter in place of the transistor count is the anticipated increase in the capacity and bandwidth of storage, driven by device, architectural, and packaging innovations: DRAM-alternative Non-Volatile Memory (NVM) devices, 3-D memory and logic stacking evolving from VIAs to direct silicon stacking, as well as next-generation terabit optics and networks. The overall effect is that the trend of increasing computational intensity, as advocated today, will no longer result in performance increases; exploiting memory capacity and bandwidth will instead be the right methodology. However, such a shift in compute-vs-data tradeoffs would not simply be a return to the old vector days, since other physical factors such as latency will not change when spatial communication in the X-Y directions is involved. Such a conversion of performance metrics from FLOPS to BYTES could lead to disruptive alterations in how computing systems, both hardware and software, evolve in the future.

10:00 – 10:30 Morning Tea

10:30 – 11:15 Panel – Does Big Science necessarily mean Big Budgets?

Prof. Michael Kelly (UK-NZ) – Moderator, Prof. Andreas Wicenec (Australia), Dr. Happy Sithole (South Africa), Dr. Andrew Ensor (New Zealand), Dr. John Gustafson (Singapore), Pete Beckman (US), Prof Satoshi Matsuoka (Japan)

11:15 – 12:00 Keynote – Beating Floats at Their Own Game: Faster Hardware and Better Answers

John Gustafson – Visiting Scientist at A*STAR and Professor in the School of Computing at the National University of Singapore. He is a former Director at Intel Labs and the former Chief Product Architect at AMD.

Abstract – A new data type called a “posit” is designed as a direct drop-in replacement for IEEE Standard 754 floats. Unlike unum arithmetic, posits do not require interval-type mathematics or variable-size operands, and they round if an answer is inexact, much the way floats do. However, they provide compelling advantages over floats, including a simpler hardware implementation that scales from as few as two-bit operands to thousands of bits. For any bit width, they have a larger dynamic range, higher accuracy, better closure under arithmetic operations, and simpler exception handling. For example, posits never overflow to infinity or underflow to zero, and there is no “Not-a-Number” (NaN) value. Posits take up less space to implement in silicon than an IEEE float of the same size, largely because there is no “gradual underflow” or subnormal numbers. With fewer gate delays per operation as well as a lower silicon footprint, the posit operations per second (POPS) supported by a chip can be significantly higher than the FLOPS achievable with similar hardware resources. GPU accelerators, in particular, could do more arithmetic per watt and per dollar yet deliver superior answer quality.

The “Accuracy on a 32-bit budget” benchmark compares how many decimals of accuracy can be produced for a set number of bits-per-value, using various number formats. Low-precision posits provide a better solution than “approximate computing” methods that try to tolerate decreases in answer quality. High-precision posits provide better answers (more correct decimals) than floats of the same size, suggesting that in some cases, a 32-bit posit may do a better job than a 64-bit float. In other words, posits beat floats at their own game.
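
To make the format concrete, here is a small, unofficial decoder for the posit layout described in Gustafson’s papers (sign, run-length-encoded regime, es exponent bits, fraction); it is a sketch for illustration, not a reference implementation.

```python
# Unofficial sketch of posit decoding (sign | regime | es exponent bits | fraction),
# for illustration only; not a reference implementation.
def posit_to_float(p, nbits=16, es=3):
    mask = (1 << nbits) - 1
    p &= mask
    if p == 0:
        return 0.0
    if p == 1 << (nbits - 1):
        return float("inf")                 # the single exception value
    negative = bool(p >> (nbits - 1))
    if negative:
        p = (-p) & mask                     # negative posits use two's complement
    p &= (1 << (nbits - 1)) - 1             # drop the sign bit

    # Regime: a run of identical bits, terminated by the opposite bit (or the end).
    bit_pos = nbits - 2
    first = (p >> bit_pos) & 1
    run = 0
    while bit_pos >= 0 and ((p >> bit_pos) & 1) == first:
        run += 1
        bit_pos -= 1
    k = run - 1 if first else -run
    bit_pos -= 1                            # skip the terminating bit, if any

    # Exponent: up to es bits; missing low-order bits are zero.
    exp = 0
    for _ in range(es):
        exp <<= 1
        if bit_pos >= 0:
            exp |= (p >> bit_pos) & 1
            bit_pos -= 1

    # Fraction: whatever bits remain, with a hidden leading 1.
    frac_bits = bit_pos + 1
    frac = p & ((1 << frac_bits) - 1) if frac_bits > 0 else 0
    fraction = 1.0 + frac / (1 << frac_bits) if frac_bits > 0 else 1.0

    useed = 1 << (1 << es)                  # useed = 2^(2^es)
    value = (useed ** k) * (2 ** exp) * fraction
    return -value if negative else value

# Example: with es = 3, this 16-bit pattern decodes to 477 / 2**27 (about 3.55e-6).
assert abs(posit_to_float(0b0000110111011101) - 477 / 2**27) < 1e-12
```

The run length of the regime is what gives posits their tapered accuracy: values near 1 use short regimes and keep more fraction bits, while very large or very small values trade fraction bits for dynamic range.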

12:00 – 1:00 Lunch

1:00 – 1:30 Why are we still failing to attract and retain women in STEM?

Victoria Maclennan – Managing Director, Optimal BI; 2016 New Zealand ICT Professional of the Year; Co-Chair, NZ Rise

Abstract – Why are we still failing to attract and retain women in Science, Technology, Engineering and Mathematics (STEM) related areas and subjects, and what can every single one of us do to lead the change? In a world where alternative facts have become an excuse for ignorance, we need to face up to the language, culture and environment that turn women off our education and employment systems. We can all become leaders of change. In this short talk Victoria will provide insight into how all of us can make an enduring, sustainable difference for women and diversity in STEM.

1:30 – 2:00 Breast imaging analytics that improve clinical decision-making and the early detection of breast cancer

Ralph Highnam – CEO, Volpara Solutions, New Zealand

Abstract – Dr Highnam will talk about breast cancer screening and some of the challenges and opportunities it presents for advanced computing techniques. Dr Highnam was a major part of the UK eDiamond project and the EU MammoGrid project during his time at the University of Oxford; those projects sought to apply “the Grid” (an early Cloud) to improving breast cancer detection. Roll on 20 years, and Dr Highnam now leads an ASX-listed company based in Wellington which uses Azure to improve breast cancer detection.

2:05 – 2:50 Panel – Enterprise Systems: How big is the gap to reach 21st century performance? How will legacy code and hardware be updated?

Mark Moir (Oracle, NZ-US) – Moderator, Victoria Maclennan (Optimal BI, NZ), Paul McKenney (IBM, US), Paul Fenwick (Perl, Australia), John Gustafson (ex-Intel, AMD, Sun, Singapore), Nathan DeBardeleben (LANL, US)

2:50 – 3:15 Afternoon Tea

3:15 – 4:00 Keynote – How Might the Manufacturability of the Hardware at Device Level Impact on Exascale Computing?

Prof Michael J Kelly – MacDiarmid Institute for Advanced Materials and Nanotechnology, Victoria University of Wellington, New Zealand, and Department of Engineering, University of Cambridge, United Kingdom.

Abstract – The International Technology Roadmap for Semiconductors has been the main guide for scientists developing ongoing solutions to the miniaturisation of devices. Between 1990 and 2010 the main thrust was the continuation of Moore’s law applied to high performance computing. The last decade saw a rapid growth in the number of red boxes in the technology tables, indicating the absence of any solution, let alone a workable solution, to achieving key device parameters needed to keep Moore’s law on track. The introduction of ‘More than Moore’, in recognition of the expanding role of high-speed communications in IT R&D, broadened the scope of the Roadmap and took the immediate pressure off the red boxes. Now the Internet of Things, which envisages vast networks of interacting sensor nodes, is the newer and broader output. In the meantime, the limitations of CMOS and the as-yet unmanufacturable research devices needed to continue miniaturisation are slowing, and will eventually stop, progress. Quantum tunnelling through thin barrier layers offers one route to very high-speed uncooled semiconductor devices, with 1.9 THz speeds demonstrated on a one-off basis. The low-cost, high-volume manufacture of 0.2-0.3 THz devices and circuits is still a challenge. I will focus on some recent progress in this latter space and attempt to draw wider implications from the present status.

4:05 – 4:50 Keynote – The ARM Architecture – From Sunk to Success

Dave Jaggar – former Head of Architecture Design at ARM, New Zealand

Abstract – In the late 1980s Acorn, a British one-hit-wonder computer company, developed its own workstation microprocessor, the Acorn RISC Machine (ARM). By the end of 1990 Acorn was down a very dark financial alley, and the ARM processor, which may well have been the worst microprocessor ever designed, was practically extinct. Acorn’s VLSI design team, minus the two original CPU designers, were cast out to fend for themselves, provisioned with only 18 months of financial rations from Apple. When Dave Jaggar joined the new company in the summer of 1991, with the ink not quite dry on his Master’s thesis, he thought perhaps it was the worst move since Martin Luther journeyed to Rome. However, after twelve months he was given free rein to redefine the processor, mostly because the company couldn’t afford anyone better. The Advanced RISC Machine, as the company was renamed, had a completely new instruction set which sidestepped many of the problems inherited from the original. Over the next eight years Dave systematically defined the entire ARM architecture, enabling it to be a popular embedded controller for the digital revolution, with around 100 billion units shipped. Along the way he worked out a little bit about computer architecture, and then retired back to New Zealand to work out a bit more. Now that ARM is no longer independent, and Britain is about to be, Dave thinks it’s about time he explained his part in ARM’s downfall.

4:55 – 5:10 Conference Wrap-up. Feedback: Towards Multicore World 2018

Nicolás Erdödy (Open Parallel)

5:15 – 6:30 Drinks and Nibbles