Bogdan Nicolae
- Research Professor of Computer Science
- Computer Scientist, Argonne National Laboratory
Education
Ph.D., University of Rennes 1, France
Dipl. Eng., Politehnica University Bucharest, Romania
Research Interests
- High Performance Computing
- Parallel and Distributed Computing
- Large Scale Data Management, Storage and Access
- Resilience and Fault tolerance
- Cloud Computing and Virtualization
Professional Affiliations & Memberships
Senior member, Association for Computing Machinery
Publications
- Maurya, A., Nicolae, B., Rafique, M., Tonellot, T. and Cappello, F. 2021. Towards Efficient I/O Scheduling for Collaborative Multi-Level Checkpointing. MASCOTS’21: The 29th IEEE International Symposium on the Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (Virtual, Portugal, 2021).
- Liu, H., Nicolae, B., Di, S., Cappello, F. and Jog, A. 2021. Accelerating DNN Architecture Search at Scale Using Selective Weight Transfer. CLUSTER’21: The 2021 IEEE International Conference on Cluster Computing (Portland, USA, 2021).
- Marcu, O., Costan, A., Nicolae, B. and Antoniu, G. 2021. Virtual Log-Structured Storage for High-Performance Streaming. CLUSTER’21: The 2021 IEEE International Conference on Cluster Computing (Portland, USA, 2021).
- Bicer, T., Yu, X., Ching, D.J., Chard, R., Cherukara, M.J., Nicolae, B. and Kettimuthu, R. 2021. High-Performance Ptychographic Reconstruction with Federated Facilities. SMC’21: The 2021 Smoky Mountains Computational Sciences and Engineering Conference (Kingsport, United States, 2021).
- Morales, N., Teranishi, K., Nicolae, B., Trott, C. and Cappello, F. 2021. Towards Portable Online Prediction of Network Utilization using MPI-level Monitoring. EuroPar’21: 27th International European Conference on Parallel and Distributed Systems (Lisbon, Portugal, 2021).
- Tseng, S.-M., Nicolae, B., Cappello, F. and Chandramowlishwaran, A. 2021. Demystifying asynchronous I/O Interference in HPC applications. The International Journal of High Performance Computing Applications. 35, 4 (2021), 391–412. DOI:https://doi.org/10.1177/10943420211016511.
- Hobson, T., Yildiz, O., Nicolae, B., Huang, J. and Peterka, T. 2021. Shared-Memory Communication for Containerized Workflows. CCGrid’21: The 21th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (Virtual, Australia, 2021).
- Nicolae, B., Moody, A., Kosinovsky, G., Mohror, K. and Cappello, F. 2021. VELOC: Very Low Overhead Checkpointing in the Age of Exascale. SuperCheck’21: The First International Symposium on Checkpointing for Supercomputing (Virtual Event, 2021).
- Wozniak, J., Yoo, H., Mohd-Yusof, J., Nicolae, B., Collier, N., Ozik, J., Brettin, T. and Stevens, R. 2020. High-bypass Learning: Automated Detection of Tumor Cells That Significantly Impact Drug Response. MLHPC’20: The 2020 IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments (in conjunction with SC’20) (Virtual Event, 2020).
- Nicolae, B. 2020. DataStates: Towards Lightweight Data Models for Deep Learning. SMC’20: The 2020 Smoky Mountains Computational Sciences and Engineering Conference (Nashville, United States, 2020).
- Maurya, A., Nicolae, B., Guliani, I. and Rafique, M.M. 2020. CoSim: A Simulator for Co-Scheduling of Batch and On-Demand Jobs in HPC Datacenters. DS-RT’20: The 24th IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications(Prague, Czech Republic, 2020), 167–174.
- Nicolae, B., Wozniak, J.M., Dorier, M. and Cappello, F. 2020. DeepClone: Lightweight State Replication of Deep Learning Models for Data Parallel Training. CLUSTER’20: The 2020 IEEE International Conference on Cluster Computing (Kobe, Japan, 2020).
- Dey, T., Sato, K., Nicolae, B., Guo, J., Domke, J., Yu, W., Cappello, F. and Mohror, K. 2020. Optimizing Asynchronous Multi-Level Checkpoint/Restart Configurations with Machine Learning. HPS’20: The 2020 IEEE International Workshop on High-Performance Storage (New Orleans, USA, 2020).
- Nicolae, B., Li, J., Wozniak, J., Bosilca, G., Dorier, M. and Cappello, F. 2020. DeepFreeze: Towards Scalable Asynchronous Checkpointing of Deep Learning Models. CGrid’20: 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing(Melbourne, Australia, 2020), 172–181.
- Nicolae, B., Moody, A., Gonsiorowski, E., Mohror, K. and Cappello, F. 2019. VeloC: Towards High Performance Adaptive Asynchronous Checkpointing at Large Scale. IPDPS’19: The 2019 IEEE International Parallel and Distributed Processing Symposium(Rio de Janeiro, Brazil, 2019), 911–920.
- Tseng, S.-M., Nicolae, B., Bosilca, G., Jeannot, E., Chandramowlishwaran, A. and Cappello, F. 2019. Towards Portable Online Prediction of Network Utilization using MPI-level Monitoring. EuroPar’19 : 25th International European Conference on Parallel and Distributed Systems (Goettingen, Germany, 2019), 47–60.
- Liang, X., Di, S., Li, S., Tao, D., Nicolae, B., Chen, Z. and Cappello, F. 2019. Significantly Improving Lossy Compression Quality Based on an Optimized Hybrid Prediction Model. SC ’19: 32nd International Conference for High Performance Computing, Networking, Storage and Analytics (Denver, USA, 2019), 1–26.
- Liang, X., Di, S., Tao, D., Li, S., Nicolae, B., Chen, Z. and Cappello, F. 2019. Improving Performance of Data Dumping with Lossy Compression for Scientific Simulation. CLUSTER’19: IEEE International Conference on Cluster Computing (Albuquerque, USA, 2019), 1–11.
- Nicolae, B., Riteau, P., Zhen, Z. and Keahey, K. 2019. Transparent Throughput Elasticity for Modern Cloud Storage: An Adaptive Block-Level Caching Proposal. Applying Integration Techniques and Methods in Distributed Systems and Technologies. IGI Global. 156–191.
- Caino-Lores, S., Carretero, J., Nicolae, B., Yildiz, O. and Peterka, T. 2019. Toward High-Performance Computing and Big Data Analytics Convergence: The Case of Spark-DIY. IEEE Access. 7, (2019), 156929–156955. DOI:https://doi.org/10.1109/ACCESS.2019.2949836.
- Li, J., Nicolae, B., Wozniak, J. and Bosilca, G. 2019. Understanding Scalability and Fine-Grain Parallelism of Synchronous Data Parallel Training. MLHPC’19: The 2019 IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments (in conjunction with SC’19) (Denver, USA, 2019), 1–8.
- Nicolae, B., Cappello, F., Moody, A., Gonsiorowski, E. and Mohror, K. 2018. VeloC: Very Low Overhead Checkpointing System. SC ’18: 31th International Conference for High Performance Computing, Networking, Storage and Analysis (Dallas, USA, 2018).
- Clemente-Castello, F.J., Nicolae, B., Mayo, R. and Fernandez, J.C. 2018. Performance Model of MapReduce Iterative Applications for Hybrid Cloud Bursting. IEEE Transactions on Parallel and Distributed Systems. 29, 8 (2018), 1794–1807. DOI:https://doi.org/10.1109/TPDS.2018.2802932.
- Caino-Lores, S., Carretero, J., Nicolae, B., Yildiz, O. and Peterka, T. 2018. Spark-DIY: A Framework for Interoperable Spark Operations with High Performance Block-Based Data Models. BDCAT’18: 5th IEEE/ACM International Conference on Big Data Computing Applications and Technologies (Zurich, Switzerland, 2018), 1–10.
- Marcu, O.-C., Costan, A., Antoniu, G., Perez-Hernandez, M.S., Nicolae, B., Tudoran, R. and Bortoli, S. 2018. KerA: Scalable Data Ingestion for Stream Processing. ICDCS’18: 38th IEEE International Conference on Distributed Computing Systems (Vienna, Austria, 2018), 1480–1485.
- Nicolae, B., Costa, C., Misale, C., Katrinis, K. and Park, Y. 2017. Leveraging Adaptive I/O to Optimize Collective Data Shuffling Patterns for Big Data Analytics. IEEE Transactions on Parallel and Distributed Systems. 28, 6 (2017), 1663–1674. DOI:https://doi.org/10.1109/TPDS.2016.2627558.
- Clemente-Castello, F.J., Nicolae, B., Mayo, M.M.R.R. and Fernandez, J.C. 2017. Evaluation of Data Locality Strategies for Hybrid Cloud Bursting of Iterative MapReduce. CCGrid’17 : 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing(Madrid, Spain, 2017), 181–185.
- Marcu, O.-C., Costan, A., Antoniu, G., Perez-Hernandez, M.S., Tudoran, R., Bortoli, S. and Nicolae, B. 2017. Towards a unified storage and ingestion architecture for stream processing. BigData’17: 2017 IEEE International Conference on Big Data (Boston, USA, 2017), 2402–2407.
- Marcu, O.-C., Tudoran, R., Nicolae, B., Costan, A., Antoniu, G. and Perez-Hernandez, M.S. 2017. Exploring Shared State in Key-Value Store for Window-Based Multi-Pattern Streaming Analytics. EBDMA’17: 1st Workshop on the Integration of Extreme Scale Computing and Big Data Management and Analytics (Madrid, Spain, 2017), 1044–1052.