Dr. Teranishi will visit our lab on Nov 22

Dr. Teranishi of Sandia National Laboratories will visit on Nov 22 and give a talk.
Dr. Keita Teranishi is a principal member of technical staff at Sandia National Laboratories, California, USA. He received the BS and MS degrees from the University of Tennessee, Knoxville, in 1998 and 2000, respectively, and the PhD degree from The Pennsylvania State University in 2004. His research interests include parallel programming models, fault tolerance, numerical algorithms, and data analytics for high performance computing systems.
The abstract of the talk is as follows.
Abstract: Tensors have found utility in a wide range of applications, such as chemometrics, network traffic analysis, neuroscience, and signal processing. Many of these data science applications have increasingly large amounts of data to process and require high-performance methods to provide a reasonable turnaround time for analysts. Sparse tensor decomposition is a tool that allows analysts to explore a compact representation (low-rank models) of high-dimensional data sets, expose patterns that may not be apparent in the raw data, and extract useful information from the large amount of initial data. In this work, we consider decomposition of sparse count data using CANDECOMP-PARAFAC Alternating Poisson Regression (CP-APR).
Unlike the Alternating Least Squares (ALS) version, the CP-APR algorithm involves non-trivial constrained optimization of a nonlinear and nonconvex function, which contributes to its slow adaptation to high performance computing (HPC) systems. Recent studies by Kolda et al. suggest multiple variants of CP-APR algorithms amenable to both data and task parallelism, but their parallel implementation involves several challenges due to the continuing trend toward a wide variety of HPC system architectures and their programming models.
To this end, we have implemented a production-quality sparse tensor decomposition code, named SparTen, in C++ using Kokkos as a hardware abstraction layer. By using Kokkos, we have been able to develop a single code base and achieve good performance on each architecture. Additionally, SparTen is templated on several data types that allow for the use of mixed precision to allow the user to tune performance and accuracy for specific applications. In this presentation, we will use SparTen as a case study to document the performance gains, performance/accuracy tradeoffs of mixed precision in this application, development effort, and discuss the level of performance portability achieved. Performance profiling results from each of these architectures will be shared to highlight difficulties of efficiently processing sparse, unstructured data. By combining these results with an analysis of each hardware architecture, we will discuss some insights for improved use of the available cache hierarchy, potential costs/benefits of analyzing the underlying sparsity pattern of the input data as a preprocessing step, critical aspects of these hardware architectures that allow for improved performance in sparse tensor applications, and where remaining performance may still have been left on the table due to having single algorithm implementations on diverging hardware architectures.

Dr. Keita Teranishi will visit our lab on Nov 22!

Dr. Keita Teranishi will visit our lab and give a talk on Nov 22.
He is a principal member of technical staff at Sandia National Laboratories, California, USA. He received the BS and MS degrees from the University of Tennessee, Knoxville, in 1998 and 2000, respectively, and the PhD degree from The Pennsylvania State University in 2004. His research interests include parallel programming models, fault tolerance, numerical algorithms, and data analytics for high performance computing systems.
The abstract of his talk is as follows.
Abstract: Tensors have found utility in a wide range of applications, such as chemometrics, network traffic analysis, neuroscience, and signal processing. Many of these data science applications have increasingly large amounts of data to process and require high-performance methods to provide a reasonable turnaround time for analysts. Sparse tensor decomposition is a tool that allows analysts to explore a compact representation (low-rank models) of high-dimensional data sets, expose patterns that may not be apparent in the raw data, and extract useful information from the large amount of initial data. In this work, we consider decomposition of sparse count data using CANDECOMP-PARAFAC Alternating Poisson Regression (CP-APR).
Unlike the Alternating Least Squares (ALS) version, the CP-APR algorithm involves non-trivial constrained optimization of a nonlinear and nonconvex function, which contributes to its slow adaptation to high performance computing (HPC) systems. Recent studies by Kolda et al. suggest multiple variants of CP-APR algorithms amenable to both data and task parallelism, but their parallel implementation involves several challenges due to the continuing trend toward a wide variety of HPC system architectures and their programming models.
To this end, we have implemented a production-quality sparse tensor decomposition code, named SparTen, in C++ using Kokkos as a hardware abstraction layer. By using Kokkos, we have been able to develop a single code base and achieve good performance on each architecture. Additionally, SparTen is templated on several data types that allow for the use of mixed precision to allow the user to tune performance and accuracy for specific applications. In this presentation, we will use SparTen as a case study to document the performance gains, performance/accuracy tradeoffs of mixed precision in this application, development effort, and discuss the level of performance portability achieved. Performance profiling results from each of these architectures will be shared to highlight difficulties of efficiently processing sparse, unstructured data. By combining these results with an analysis of each hardware architecture, we will discuss some insights for improved use of the available cache hierarchy, potential costs/benefits of analyzing the underlying sparsity pattern of the input data as a preprocessing step, critical aspects of these hardware architectures that allow for improved performance in sparse tensor applications, and where remaining performance may still have been left on the table due to having single algorithm implementations on diverging hardware architectures.

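Prof. Takizawa and Assoc. Prof. Egawa presented their research at the 28th WSSP.

Prof. Takizawa and Assoc. Prof. Egawa presented their research at the 28th Workshop on Sustained Simulation Performance (WSSP), held in 2018 at the High Performance Computing Center Stuttgart (HLRS) of the University of Stuttgart, Germany.
The 29th WSSP is scheduled to be held in Sendai over two days, March 19 and 20, 2019!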

M1 student Shiotsuki presented at SWoPP2018

M1 student Shiotsuki gave a presentation at SWoPP2018 (Summer United Workshops on Parallel, Distributed and Cooperative Processing), held at the Kumamoto City International Center (熊本市国際交流会館) from July 30th to August 1st.
SWoPP2018:
https://sites.google.com/site/swoppweb/swopp2018
He made a presentation on “Performance evaluation of inter-process communication of SX-Aurora TSUBASA”.

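First-year master's student Shiotsuki presented at SWoPP2018.

First-year master's student Shiotsuki presented at SWoPP2018 (Summer United Workshops on Parallel, Distributed and Cooperative Processing), held at the Kumamoto City International Center (熊本市国際交流会館) from July 30 to August 1.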
SWoPP2018:
https://sites.google.com/site/swoppweb/swopp2018
The title of his presentation was “Performance evaluation of inter-process communication of SX-Aurora TSUBASA”.

Presentation at SX-Aurora TSUBASA Forum

I gave a talk at the SX-Aurora TSUBASA Forum, which took place at NEC headquarters.
My talk was about a hot topic: the performance and functionality of NEC’s new product, SX-Aurora TSUBASA.
I am glad the audience enjoyed it.
https://jpn.nec.com/event/180727aurora/index.html (in Japanese)

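I gave a talk at the SX-Aurora TSUBASA Forum

This is Takizawa. I gave a talk at the SX-Aurora TSUBASA Forum held at NEC headquarters.
Since the talk evaluated the performance and functionality of a new product, the audience listened with interest.
I was glad to get a fair number of laughs (wry smiles?) at the start of the talk.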
https://jpn.nec.com/event/180727aurora/index.html

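Heatstroke risk is high even before summer: take care when it suddenly gets hot

This is Egawa.
Our joint work on heatstroke risk assessment with Prof. Hirata of Nagoya Institute of Technology and the Japan Weather Association was featured in a newspaper. Hot days are likely to continue from here on, so please take good care to avoid heatstroke.
- "Heatstroke risk is high even before summer: take care when it suddenly gets hot" - Asahi Shimbun (June 3)
You can check your own heatstroke risk on the following site.
Heatstroke Self-Check (Japan Weather Association)
* This system was built using our center's SX-ACE :-)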
 
 

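I presented at iWAPT2018

This is Takizawa. I attended IPDPS2018 in Vancouver, Canada, and presented at the workshop on software automatic tuning (iWAPT2018). I gave the talk on behalf of Kawarabatake, an alumnus of our lab, presenting a paper that summarizes his work. During my stay, I also visited Stanley Park, a park rich in nature. I took many lovely photos, but I am just not good at selfies.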


Presentation at iWAPT2018

I attended IPDPS 2018 in Vancouver and gave a talk at the international workshop on automatic performance tuning, iWAPT 2018, on behalf of Yuki Kawarabatake, who was the first author of the work. During my stay, I also visited Stanley Park, a great park full of nature. It’s a pity I am not good at taking selfies… 🙁
Hiro