Datasets and Code

Datasets and code for publications of the SwAPP Lab

AutoParLLM 

AutoParLLM leverages graph neural networks to guide in-context learning for generating efficient parallel code with LLMs. This work also proposes OMPScore, a new metric for evaluating OpenMP code.

CodeRosetta 

CodeRosetta is an encoder-decoder transformer model explicitly designed for translating between programming languages and their HPC extensions. It is evaluated on C++ ↔ CUDA and C++ ↔ Fortran translation. It employs a customized learning-based framework with tailored pretraining and training objectives to capture code semantics and parallel structural nuances, allowing for bidirectional code translation.

PerfoGraph 

PerfoGraph is a graph-based program representation that captures numerical and structural program features for improved machine learning-based program analysis and tasks like parallelism discovery and performance optimization. 

MIREncoder

MIREncoder is a lightweight multi-modal IR-based autoencoder designed to learn rich code embeddings for performance optimization tasks in HPC, enabling effective transfer learning. Unlike task-specific models, MIREncoder captures code syntax and semantics while maintaining low overhead, outperforming existing methods.

Numa-Config

This work shows that static Intermediate Representations (IR) can effectively guide NUMA and prefetcher optimizations. A proposed hybrid model matches dynamic methods' performance while reducing overhead.


AutoParLLM

Quazi Mahmud, Ali TehraniJamsaz, Hung Phan, Le Chen, Mihai Capotă, Ted Willke, Nesreen Ahmed, Ali Jannesari: AutoParLLM: GNN-guided Context Generation for Zero-Shot Code Parallelization using LLMs. In Proc. of the Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL 2025), Albuquerque, New Mexico, USA, pages 1–11, April 2025.

PDF    Dataset    Code   

Bibtex: @inproceedings{mahmud2025autoparllmNACCL,
    title         = {Autoparllm: Gnn-guided automatic code parallelization using large language models},
    author     = {Mahmud, Quazi Ishtiaque and TehraniJamsaz, Ali and Phan, Hung D and Ahmed, Nesreen K and Jannesari, Ali},
    year        = 2025,
    month     = {apr},
    booktitle  = {Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)}
}


CodeRosetta

Ali TehraniJamsaz, Arijit Bhattacharjee, Le Chen, Nesreen K. Ahmed, Amir Yazdanbakhsh, Ali Jannesari: CodeRosetta: Pushing the Boundaries of Unsupervised Code Translation for Parallel Programming. In Proc. of the 38th Conference on Neural Information Processing Systems (NeurIPS), Vancouver, Canada, pages 1–11, December 2024. 

PDF    Dataset1    Dataset2   Code 

Bibtex: @inproceedings{tehranijamsaz2024coderosetta,
    title        = {CodeRosetta: Pushing the Boundaries of Unsupervised Code Translation for Parallel Programming},
    author     = {TehraniJamsaz, Ali and Bhattacharjee, Arijit and Chen, Le and Ahmed, Nesreen K and Yazdanbakhsh, Amir and Jannesari, Ali},
    booktitle  = {Proceedings of the 38th International Conference on Neural Information Processing Systems},
    year        = {2024},
    pages      = {1--11},
    url          = {https://openreview.net/forum?id=V6hrg4O9gg}
}


PerfoGraph

Ali TehraniJamsaz, Quazi Mahmud, Le Chen, Nesreen Ahmed, Ali Jannesari: PERFOGRAPH: A Numerical Aware Program Graph Representation for Performance Optimization and Program Analysis. In Proc. of the 37th Conference on Neural Information Processing Systems (NeurIPS), New Orleans, Louisiana, USA, pages 1–11, December 2023.

PDF    Dataset    Code   

Bibtex: @inproceedings{tehranijamsaz2023perfograph,
    title        = {Perfograph: A numerical aware program graph representation for performance optimization and program analysis},
    author   = {TehraniJamsaz, Ali and Mahmud, Quazi Ishtiaque and Chen, Le and Ahmed, Nesreen K and Jannesari, Ali},
    year       = 2023,
    journal    = {Advances in Neural Information Processing Systems},
    volume    = 36,
    pages      = {57783--57794}
}


Numa-Config

Ali TehraniJamsaz, Mihail Popov, Akash Dutta, Emmanuelle Saillard, Ali Jannesari: Learning Intermediate Representations using Graph Neural Networks for NUMA and Prefetchers Optimization. In Proc. of the 36th IEEE International Parallel and Distributed Processing Symposium (IPDPS), Lyon, France, pages 1–11, IEEE Computer Society, May 2022.

PDF    Dataset   

Bibtex: @inproceedings{tehranijamsaz2022learning,
    title                = {Learning intermediate representations using graph neural networks for numa and prefetchers optimization},
    author           = {TehraniJamsaz, Ali and Popov, Mihail and Dutta, Akash and Saillard, Emmanuelle and Jannesari, Ali},
    year              = 2022,
    booktitle        = {2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS)},
    pages           = {1206--1216},
    organization = {IEEE}
}


MIREncoder

Akash Dutta and Ali Jannesari: MIREncoder: Multi-modal IR-based Pretrained Embeddings for Performance Optimizations. In Proc. of the 33rd International Conference on Parallel Architectures and Compilation Techniques (PACT), Long Beach, USA, pages 1–12, October 2024.

PDF    Dataset   

Bibtex: @inproceedings{dutta2024mirencoder,
    title                = {Mirencoder: Multi-modal ir-based pretrained embeddings for performance optimizations},
    author           = {Dutta, Akash and Jannesari, Ali},
    year              = 2024,
    booktitle        = {Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques (PACT)},
    pages           = {156--167},
    organization = {ACM}
}

