Publications

Peer-reviewed publications and preprints


TOPICAL: TOPIC Pages AutomagicaLly, John Giorgi, Amanpreet Singh, Doug Downey, Sergey Feldman, Lucy Lu Wang. NAACL Demo Track (2024) [💻 code | ⚙️ demo]

Large Language Models are Fixated by Red Herrings: Exploring Creative Problem Solving and Einstellung Effect using the Only Connect Wall Dataset, Saeid Naeini, Raeid Saqur, Mozhgan Saeidi, John Giorgi, Babak Taati. NeurIPS Datasets and Benchmarks Track (2023) [💻 code]

WangLab at MEDIQA-Chat 2023: Clinical Note Generation from Doctor-Patient Conversations using Large Language Models, John Giorgi, Augustin Toma, Ronald Xie, Sondra S. Chen, Kevin R. An, Grace X. Zheng, Bo Wang. ClinicalNLP @ ACL (2023) [💻 code]

Towards Multi-Document Summarization in the Open-Domain, John Giorgi, Luca Soldaini, Bo Wang, Gary Bader, Kyle Lo, Lucy Lu Wang, Arman Cohan. EMNLP Findings (2023) [💻 code]

A sequence-to-sequence approach for document-level relation extraction, John Giorgi, Gary Bader, Bo Wang. BioNLP @ ACL (2022) [💻 code | ⚙️ demo | 🎥 video | 🛝 slides]

DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations, John Giorgi, Osvald Nitski, Bo Wang, Gary Bader. ACL-IJCNLP (2021) [💻 code | 🪧 poster | 🎥 video | 🛝 slides]

Author-sourced capture of pathway knowledge in computable form using Biofactoid, Jeffrey V Wong, Max Franz, Metin Can Siper, Dylan Fong, Funda Durupinar, Christian Dallago, Augustin Luna, John Giorgi, Igor Rodchenkov, Özgün Babur, John A Bachman, Benjamin M Gyori, Gary D Bader, Chris Sander. eLife (2021) [💻 web app]

A flexible search system for high-accuracy identification of biological entities and molecules, Max Franz, Jeffrey V. Wong, Metin Can Siper, Christian Dallago, John Giorgi, Emek Demir, Chris Sander, Gary D. Bader. The Journal of Open Source Software (2021) [💻 code]

Towards reliable named entity recognition in the biomedical domain, John Giorgi, Gary Bader. Bioinformatics (2019) [💻 code]

Transfer learning for biomedical named entity recognition with neural networks, John Giorgi, Gary Bader. Bioinformatics (2018) [💻 code]

High intraspecific genome diversity in the model arbuscular mycorrhizal symbiont Rhizophagus irregularis, Eric C. H. Chen, Emmanuelle Morin, Denis Beaudet, Jessica Noel, Gokalp Yildirir, Steve Ndikumana, Philippe Charron, Camille St-Onge, John Giorgi, Manuela Krüger, Timea Marton, Jeanne Ropars, Igor V. Grigoriev, Matthieu Hainaut, Bernard Henrissat, Christophe Roux, Francis Martin, Nicolas Corradi. New Phytologist (2018)

🌐 Large Scale Contributions

Smaller contributions to huge projects


BLOOM: A 176B-Parameter Open-Access Multilingual Language Model, Teven Le Scao et al. arXiv (2022) [💻 code]

BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing, Jason Alan Fries et al. NeurIPS Datasets and Benchmarks (2022) [💻 code]

🪦 Paper Graveyard

Stuff that didn’t make it to publication but I’m still proud of!


CiteNet: A Search and Visualization Tool for Scientific Literature, Duncan Forster, John Giorgi (2020) [💻 code]

End-to-end Named Entity Recognition and Relation Extraction using Pre-trained Language Models, John Giorgi, Xindi Wang, Nicola Sahar, Won Young Shin, Gary D. Bader, Bo Wang. arXiv (2019) [💻 code]