"This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder."
Conference/Journal Publications
- Judicael A. Zounmevo, Dries Kimpe, Robert Ross, and Ahmad Afsahi, "On the use of MPI in High-Performance Computing Services", EuroMPI 2013, Madrid, Spain, September 15-18, 2013. © ACM
- Xin Zhao, Darius Buntinas, Judicael Zounmevo, James Dinan, David Goodell, Pavan Balaji, Rajeev Thakur, Ahmad Afsahi, and William Gropp, "Towards Generalized, Asynchronous, and MPI-Interoperable Active Messages", 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), Delft, The Netherlands, May 13-16, 2013. © IEEE/ACM (acceptance rate: 22.2%, 57/257)
- Judicael A. Zounmevo and Ahmad Afsahi, "An Efficient MPI Message Queue Mechanism for Large-scale Jobs", 18th IEEE International Conference on Parallel and Distributed Systems (ICPADS), Singapore, December 17-19, 2012. © IEEE (acceptance rate: 29.6%, 87/294)
- Grigori Inozemtsev and Ahmad Afsahi, "Designing an Offloaded Nonblocking MPI_Allgather Collective using CORE-Direct", 14th IEEE International Conference on Cluster Computing (Cluster 2012), Beijing, China, September 24-28, 2012. © IEEE (acceptance rate: 28.9%, 58/201)
- Reza Zamani and Ahmad Afsahi, "A Study of Hardware Performance Monitoring Counter Selection in Power Modeling of Computing Systems", 2nd International Workshop on Power Measurement and Profiling (PMP 2012), San Jose, CA, USA, June 5-8, 2012. © IEEE
- Mohammad J. Rashti and Ahmad Afsahi, "Exploiting Application Buffer Reuse to Improve MPI Small Message Transfer protocols over RDMA-enabled Networks", Cluster Computing, The Journal of Networks, Software Tools and Applications, Volume 14, Number 4, December 2011, pp. 345-356. © Springer
- Judicael A. Zounmevo and Ahmad Afsahi, "Investigating Scenario-conscious Asynchronous Rendezvous over RDMA", poster/short paper, 13th IEEE International Conference on Cluster Computing (Cluster 2011), Austin, Texas, USA, September 26-30, 2011. © IEEE
- Mohammad J. Rashti, Jonathan Green, Pavan Balaji, Ahmad Afsahi and William Gropp, "Multi-core and Network Aware MPI Topology Functions", 18th EuroMPI conference, Recent Advances in the Message Passing Interface (EuroMPI 2011), Santorini, Greece, September 18-21, 2011. © Springer.
- Ying Qian and Ahmad Afsahi, "Process Arrival Pattern Aware Alltoall and Allgather on InfiniBand Clusters", International Journal of Parallel Programming, Volume 39, No. 4, August 2011, pp. 473-493. © Springer. A preprint is available.
- Ryan E. Grant, Mohammad J. Rashti, Pavan Balaji, and Ahmad Afsahi, "RDMA Capable iWARP over Datagrams", 25th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2011), Anchorage, Alaska, USA, May 16-20, 2011. © IEEE (acceptance rate: 19.6%, 112/571)
- Mohammad J. Rashti, Ryan E. Grant, Pavan Balaji, and Ahmad Afsahi, "iWARP Redefined: Scalable Connectionless Communication over High-Speed Ethernet", 17th International Conference on High Performance Computing (HiPC 2010), Goa, India, December 19-22, 2010. © IEEE (acceptance rate: 19.2%, 40/208)
- Reza Zamani and Ahmad Afsahi, "Adaptive Estimation and Prediction of Power and Performance in High Performance Computing", International Conference on Energy-Aware High Performance Computing, September 16-17, 2010, Hamburg, Germany. Special issue paper, Journal of Computer Science - Research and Development, Vol. 25, No. 3-4, pp. 177-186. © Springer
- Ryan E. Grant, Pavan Balaji, and Ahmad Afsahi, "A Study of Hardware Assisted IP over InfiniBand and its Impact on Data Center Performance", 2010 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2010), White Plains, New York, USA, March 28-30, 2010. © IEEE (acceptance rate: 33%, 22/66)
- Mohammad J. Rashti and Ahmad Afsahi, "Modern Interconnects for High-Performance Computing Clusters", Book Chapter, Cluster Computing and Multi-Hop Network Research, Eds: Ciceron Jimenez and Maurice Ortego, 2010, © Nova Science Publishers, Inc.
- Ryan E. Grant and Ahmad Afsahi, "Improving Energy Efficiency of Asymmetric Chip Multithreaded Multiprocessors through Reduced OS Noise Scheduling", Concurrency and Computation: Practice and Experience, Volume 21, Issue 18, pp. 2355-2376, December 25, 2009. © Wiley. A preprint is available.
- Ryan E. Grant, Ahmad Afsahi, and Pavan Balaji, "Evaluation of ConnectX Virtual Protocol Interconnect for Data Centers", 15th International Conference on Parallel and Distributed Systems (ICPADS 2009), Shenzhen, China, December 8-11, 2009. © IEEE (acceptance rate: 29.8%, 91/305)
- Ying Qian and Ahmad Afsahi, "Process Arrival Pattern and Shared Memory Aware Alltoall on InfiniBand", 16th EuroPVM/MPI, Espoo, Finland, September 7-10, 2009, Lecture Notes in Computer Science (LNCS 5759), pp. 250-260. © Springer
- Mohammad J. Rashti and Ahmad Afsahi, "Improving RDMA-based MPI Eager Protocol for Frequently-used Buffers", 9th Workshop on Communication Architecture for Clusters (CAC 2009), in conjunction with the 23rd International Parallel and Distributed Processing Symposium (IPDPS 2009), Rome, Italy, May 25-29, 2009. © IEEE
- Mohammad J. Rashti and Ahmad Afsahi, "A Speculative and Adaptive MPI Rendezvous Protocol over RDMA-enabled Interconnects", International Journal of Parallel Programming, Volume 37, No. 2, April 2009, pp. 223-246. © Springer. A preprint is available.
- Ying Qian and Ahmad Afsahi, "Efficient Shared Memory and RDMA based Collectives on Multi-rail QsNetII SMP Clusters", Cluster Computing, The Journal of Networks, Software Tools and Applications, Volume 11, No. 4, December 2008, pp. 341-354. © Springer. A preprint is available.
- Ying Qian, Mohammad J. Rashti, and Ahmad Afsahi, "Multi-connection and Multi-core Aware All-Gather on InfiniBand Clusters", 20th IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS 2008), Orlando, Florida, USA, November 16-18, 2008. © ACTA Press
- Ryan E. Grant, Mohammad J. Rashti, and Ahmad Afsahi, "An Analysis of QoS Provisioning for Sockets Direct Protocol vs. IPoIB over Modern InfiniBand Networks", International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2), in conjunction with the 37th International Conference on Parallel Processing (ICPP 2008), Portland, Oregon, USA, September 12, 2008. © IEEE (acceptance rate: 45%, 9/20)
- Mohammad J. Rashti and Ahmad Afsahi, "Improving Communication Progress and Overlap in MPI Rendezvous Protocol over RDMA-enabled Interconnects," 22nd International Symposium on High Performance Computing Systems and Applications (HPCS 2008), Quebec City, Quebec, Canada, June 9-11, 2008. © IEEE
- Reza Zamani, Ahmad Afsahi, Ying Qian, and Carl Hamacher, "A Feasibility Analysis of Power-Awareness and Energy Minimization in Modern Interconnects for High-Performance Computing", 9th IEEE International Conference on Cluster Computing (Cluster 2007), Austin, Texas, USA, September 17-20, 2007. © IEEE
- Ryan E. Grant and Ahmad Afsahi, "Improving System Efficiency through Scheduling and Power Management", International Workshop on Green Computing (GreenCom’07), invited paper, work-in-progress session, in conjunction with the 9th IEEE International Conference on Cluster Computing (Cluster 2007), Austin, Texas, USA, September 17, 2007. © IEEE
- Ying Qian and Ahmad Afsahi, "RDMA-based and SMP-aware Multi-port All-gather on Multi-rail QsNetII SMP Clusters", 36th International Conference on Parallel Processing (ICPP 2007), Xi'an, China, September 10-14, 2007. © IEEE
- Mohammad J. Rashti and Ahmad Afsahi, "Assessing the Ability of Computation/Communication Overlap and Communication Progress in Modern Interconnects", 15th Annual IEEE Symposium on High-Performance Interconnects (Hot Interconnects 2007), Palo Alto, California, USA, August 22-24, 2007, pp. 117-124. © IEEE
- Ying Qian and Ahmad Afsahi, "High Performance RDMA-based Multi-port All-gather on Multi-rail QsNetII", 21st International Symposium on High Performance Computing Systems and Applications (HPCS 2007), Saskatoon, Saskatchewan, Canada, May 13-16, 2007. © IEEE
- Mohammad J. Rashti and Ahmad Afsahi, "10-Gigabit iWARP Ethernet: Comparative Performance Analysis with InfiniBand and Myrinet-10G", 7th Workshop on Communication Architecture for Clusters (CAC 2007), in conjunction with the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), Long Beach, California, USA, March 26-30, 2007. © IEEE (acceptance rate: 32%, 10/31)
- Ryan E. Grant and Ahmad Afsahi, "A Comprehensive Analysis of Multithreaded OpenMP Applications on Dual-Core Intel Xeon SMPs", Workshop on Multithreaded Architectures and Applications (MTAAP'07), in conjunction with the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), Long Beach, California, USA, March 26-30, 2007. © IEEE
- Ying Qian, Ahmad Afsahi, Nathan R. Fredrickson, and Reza Zamani, "Performance Evaluation of the Sun Fire Link SMP Clusters", International Journal of High Performance Computing and Networking (IJHPCN), 2006, Volume 4, Nos. 5/6, pp. 209-221. © Inderscience
- Mohammad J. Rashti and Ahmad Afsahi, "NetEffect PCI-Express 10-Gigabit iWARP Ethernet: A Performance Study", White Paper, November, 2006. Also, available at NetEffect, Inc. (www.neteffect.com) and TechOnline (www.techonline.com)
- Ryan E. Grant and Ahmad Afsahi, "Power-Performance Efficiency of Asymmetric Multiprocessors for Multi-threaded Scientific Applications", 2nd Workshop on High-Performance, Power-Aware Computing (HP-PAC 2006), in conjunction with the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), Rhodes Island, Greece, April 25-29, 2006. © IEEE (acceptance rate: 50%, 9/18)
- Ying Qian and Ahmad Afsahi, "Efficient RDMA-based Multi-port Collectives on Multi-rail QsNetII Clusters", 6th Workshop on Communication Architecture for Clusters (CAC 2006), in conjunction with the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), Rhodes Island, Greece, April 25-29, 2006. © IEEE
- Reza Zamani and Ahmad Afsahi, "Communication Characteristics of Message-Passing Scientific and Engineering Applications", 17th IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS 2005), Phoenix, Arizona, USA, November 14-16, 2005, pp. 644-649. © ACTA Press
- Ryan E. Grant and Ahmad Afsahi, "Characterization of Multithreaded Scientific Workloads on Simultaneous Multithreading Intel Processors", Workshop on Interaction between Operating System and Computer Architecture (IOSCA 2005), in conjunction with 2005 IEEE International Symposium on Workload Characterization (IISWC 2005), Austin, Texas, USA, October 6-8, 2005, pp. 13-19.
- Reza Zamani, Ying Qian, and Ahmad Afsahi, "An Evaluation of the Myrinet/GM2 Two-Port Networks", 3rd IEEE Workshop on High-Speed Local Networks (HSLN 2004), In Proceedings of the 29th Annual IEEE Conference on Local Computer Networks (LCN 2004), Tampa, Florida, USA, November 16-18, 2004, pp. 734-742. © IEEE
- Ying Qian, Ahmad Afsahi, and Reza Zamani, "Myrinet Networks: A Performance Study", 3rd IEEE International Symposium on Network Computing and Applications (NCA04), Cambridge, Massachusetts, USA, August 30 - September 1, 2004, pp. 323-328. © IEEE
- Ying Qian, Ahmad Afsahi, Nathan R. Fredrickson, and Reza Zamani, "Performance Evaluation of the Sun Fire Link SMP Clusters", 18th International Symposium on High Performance Computing Systems and Applications (HPCS 2004), Winnipeg, Manitoba, Canada, May 16-19, 2004, pp. 145-156.
- Ahmad Afsahi and Ying Qian, "Remote Shared Memory over Sun Fire Link Interconnect", 15th IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS 2003), Marina del Rey, California, USA, November 3-5, 2003, pp. 381-386. © ACTA Press
- Nathan R. Fredrickson, Ahmad Afsahi, and Ying Qian, "Performance Characteristics of OpenMP Constructs, and Application Benchmarks on a Large Symmetric Multiprocessor", 17th Annual ACM International Conference on Supercomputing (ICS 2003), San Francisco, California, USA, June 23-26, 2003, pp. 140-149. © ACM (acceptance rate: 21.1%, 36/171)
- Ahmad Afsahi and Nikitas J. Dimopoulos, "Efficient Communication Using Message Prediction for Clusters of Multiprocessors", Concurrency and Computation: Practice and Experience (CCPE 2002), Volume 14, Issue 10, 2002, pp. 859-883.
- Ahmad Afsahi and Nikitas J. Dimopoulos, "Analysis of a Latency Hiding Broadcasting Algorithm on a Reconfigurable Optical Interconnect", Parallel Processing Letters (PPL 2002), Volume 12, No. 1, 2002, pp. 41-50.
- Ahmad Afsahi and Nikitas J. Dimopoulos, "Architectural Extensions to Support Efficient Communication Using Message Prediction", 16th Annual International Symposium on High Performance Computing Systems and Applications (HPCS 2002), Moncton, Canada, June, 2002, pp. 18-25. © IEEE.
- Ahmad Afsahi and Nikitas J. Dimopoulos, "Communication Prediction in Message-Passing Multiprocessors", 14th Annual International Symposium on High Performance Computing Systems and Applications (HPCS 2000), Victoria, Canada, June, 2000. High Performance Computing Systems and Applications, 2002, Chapter 18, pp. 253-271. © Kluwer Academic Publishers.
- Ahmad Afsahi and Nikitas J. Dimopoulos, "Efficient Communication Using Message Prediction for Cluster of Multiprocessors", 4th Workshop on Communication, Architecture, and Applications for Network-based Parallel Computing (CANPC 2000), Toulouse, France, held in conjunction with the 6th International Symposium on High-Performance Computer Architecture (HPCA-6), January, 2000, Lecture Notes in Computer Science, Vol. 1797, pp. 162-178. © Springer Verlag.
- Ahmad Afsahi and Nikitas J. Dimopoulos, "Communication Latency Hiding in Reconfigurable Message-Passing Environments: Quantitative Studies", 13th Annual International Symposium on High Performance Computing Systems and Applications (HPCS 99), Kingston, Canada, June, 1999. High Performance Computing Systems and Applications, 2000, Chapter 19, pp. 137-152. © Kluwer Academic Publishers.
- Ahmad Afsahi and Nikitas J. Dimopoulos, "Hiding Communication Latency in Reconfigurable Message-Passing Environments", 2nd Merged IEEE Symposium IPPS/SPDP 99: 13th International Parallel Processing Symposium & 10th Symposium on Parallel and Distributed Processing, San Juan, Puerto Rico, April, 1999, pp. 55-60, © IEEE. (acceptance rate: 43.5%, 113/260)
- Ahmad Afsahi and Nikitas J. Dimopoulos, "Communications Latency Hiding Techniques for a Reconfigurable Optical Interconnect: Benchmark Studies", 4th International Workshop on Applied Parallel Computing, Large Scale Scientific and Industrial Problems (PARA 98), Umeå, Sweden, June, 1998, Lecture Notes in Computer Science, Vol. 1541, pp. 1-6. © Springer Verlag.
- Ahmad Afsahi and Nikitas J. Dimopoulos, "Collective Communications on a Reconfigurable Optical Interconnect", International Conference on Principles of Distributed Systems (OPODIS 97), Chantilly, France, December, 1997, pp. 167-181. © Hermes.
PhD Dissertation/Master's Thesis
- Reza Zamani (Ph.D., 11/2012): Run-time Predictive Modeling of Power and Performance via Time-Series in High Performance Computing
- Ryan E. Grant (Ph.D., 09/2012): Improving High Performance Networking Technologies for Data Center Clusters
- Mohammad J. Rashti (Ph.D., 10/2010): Improving Message-Passing Performance and Scalability in High-Performance Clusters
- Ying Qian (Ph.D., 10/2009): Design and Evaluation of Efficient Collective Communications on Modern Interconnects and Multi-core Clusters
- Ryan E. Grant (M.Sc., 2007): Analysis and Improvement of Performance and Power Consumption of Chip Multi-Threading SMP Architectures
- Reza Zamani (M.Sc., 2005): Communication Characteristics of Message-Passing Applications, and Impact of RDMA on their Performance
- Ying Qian (M.Sc., 2004): High-Performance Interconnects and Computing Systems: Quantitative Studies
Technical Reports
- Performance Characteristics of OpenMP Constructs, and Applications Benchmarks on a Large Symmetric Multiprocessor, Nathan R. Fredrickson, Ahmad Afsahi, and Ying Qian, Technical Report ECE-0302, Parallel Processing Research Laboratory, Department of Electrical and Computer Engineering, Queen's University, February 2003.