Performance gaps between OpenMP and OpenCL for multi-core CPUs J Shen, J Fang, H Sips, AL Varbanescu 2012 41st International Conference on Parallel Processing Workshops, 116-125, 2012 | 73 | 2012 |
Performance traps in OpenCL for CPUs J Shen, J Fang, H Sips, AL Varbanescu 2013 21st Euromicro International Conference on Parallel, Distributed, and …, 2013 | 57 | 2013 |
An application-centric evaluation of OpenCL on multi-core CPUs J Shen, J Fang, H Sips, AL Varbanescu Parallel Computing 39 (12), 834-850, 2013 | 56 | 2013 |
Workload Partitioning for Accelerating Applications on Heterogeneous Platforms J Shen, AL Varbanescu, Y Lu, P Zou, H Sips IEEE Transactions on Parallel and Distributed Systems 27 (9), 2766-2780, 2016 | 50 | 2016 |
Moving from exascale to zettascale computing: challenges and techniques X Liao, K Lu, C Yang, J Li, Y Yuan, M Lai, L Huang, P Lu, J Fang, J Ren, ... Frontiers of Information Technology & Electronic Engineering 19, 1236-1244, 2018 | 42 | 2018 |
Glinda: a framework for accelerating imbalanced applications on heterogeneous platforms J Shen, AL Varbanescu, H Sips, M Arntzen, DG Simons Proceedings of the ACM International Conference on Computing Frontiers, 1-10, 2013 | 40 | 2013 |
Improving performance by matching imbalanced workloads with heterogeneous platforms J Shen, AL Varbanescu, P Zou, Y Lu, H Sips Proceedings of the 28th ACM international conference on Supercomputing, 241-250, 2014 | 30 | 2014 |
Look before you leap: Using the right hardware resources to accelerate applications J Shen, AL Varbanescu, H Sips 2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 …, 2014 | 23 | 2014 |
ELMO: A User-Friendly API to enable local memory in OpenCL kernels J Fang, AL Varbanescu, J Shen, H Sips 2013 21st Euromicro International Conference on Parallel, Distributed, and …, 2013 | 14 | 2013 |
Accelerating cost aggregation for real-time stereo matching J Fang, AL Varbanescu, J Shen, H Sips, G Saygili, L Van Der Maaten 2012 IEEE 18th International Conference on Parallel and Distributed Systems …, 2012 | 13 | 2012 |
A detailed performance analysis of the OpenMP Rodinia benchmark J Shen, AL Varbanescu Proceedings of Technical Report PDS-2011-011, Delft University of Technology …, 2011 | 10 | 2011 |
Matchmaking applications and partitioning strategies for efficient execution on heterogeneous platforms J Shen, AL Varbanescu, X Martorell, H Sips 2015 44th International Conference on Parallel Processing, 560-569, 2015 | 8 | 2015 |
Acoustic ray tracing parallelization M Arntzen, DG Simons, J Shen, AL Varbanescu, H Sips National Aerospace Laboratory NLR, 2015 | 6 | 2015 |
OpenCL vs. OpenMP: A programmability debate J Shen, J Fang, AL Varbanescu, H Sips | 6 | 2012 |
A distributed filesystem framework for transparent accessing heterogeneous storage services Y Lu, H Mao, J Shen 2009 IEEE International Symposium on Parallel & Distributed Processing, 1-8, 2009 | 6 | 2009 |
A study of application kernel structure for data parallel applications J Shen, AL Varbanescu, X Martorell, H Sips PDS Group, Delft University of Technology, Tech. Rep. PDS-2015-001, 2015 | 5 | 2015 |
Efficient high performance computing on heterogeneous platforms J Shen | 4 | 2015 |
Improving application performance by efficiently utilizing heterogeneous many-core platforms J Shen, AL Varbanescu, H Sips 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2015 | 4 | 2015 |
Heterogeneous computing with accelerators: an overview with examples AL Varbanescu, J Shen 2016 Forum on Specification and Design Languages (FDL), 1-8, 2016 | 2 | 2016 |