How to Cite Software
A comprehensive guide to making research software citable — covering why it matters, what to cite, which standards to use, and the tools that make it easy.
Why cite software
Software is a fundamental pillar of modern research. From climate models and genome assemblers to statistical packages and visualization libraries, software underpins virtually every scientific result published today. Yet software remains chronically under-cited in academic literature, depriving developers and maintainers of the recognition their work deserves.
Citing software matters for several reasons. First, it provides credit and attribution to the people who create and maintain research tools. Many research software engineers work outside traditional academic career tracks; citations are one of the few metrics that make their contributions visible to funding bodies and hiring committees.
Second, proper software citation improves reproducibility. When a paper cites the exact version of a package used in an analysis, other researchers can reconstruct the computational environment and verify results. Without version-specific citations, reproducing published work becomes guesswork.
Third, citation data creates a dependency graph of the scientific record. Funders, institutions, and the community can see which tools are most impactful, which need sustained investment, and where gaps exist. This visibility helps allocate resources to the infrastructure that science depends on.
The FORCE11 Software Citation Principles (Smith et al., 2016) established that software should be cited on the same basis as any other research output. Citey is an opinionated implementation of those principles — making the right thing easy.
What to cite (and what not to cite)
The rule is simple: if removing the software would change your results or methodology, it should appear in your reference list.
Cite the software itself, not a proxy paper about it. This is the central tension in software citation. Many maintainers ask researchers to cite a companion journal article instead of the software — e.g., a methods paper that describes the algorithm. FORCE11 disagrees, and so does Citey: cite what you actually used. If a paper meaningfully describes the methodology, that's a separate entry in your reference list, not a substitute for citing the artifact.
Cite the specific version you used whenever possible. Software evolves rapidly; behavior, APIs, and outputs can change between releases. A version-pinned citation tells readers exactly which code produced your results. Where the package mints version-specific DOIs (e.g., via Zenodo), use the version DOI rather than the concept DOI, which always resolves to the latest release.
Always include a persistent identifier. A DOI minted through Zenodo, Figshare, or a similar archive ensures the reference remains resolvable even if the hosting platform changes. Repository URLs alone are fragile. Citey's database requires either a DOI or a stable URL on every entry.
For large analyses that depend on many packages, consider citing the most critical tools in the main text and providing a full software environment description (e.g., a requirements.txt or environment.yml) as supplementary material.
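One way to produce such an environment description from a Python analysis is to snapshot the installed packages at the time the results were generated. The sketch below uses only the standard library; the function name pinned_requirements is illustrative and not part of Citey.

```python
from importlib.metadata import distributions


def pinned_requirements() -> list[str]:
    """Return sorted 'name==version' lines for every installed distribution."""
    pins = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        # Skip broken installs whose metadata lacks a name or version.
        if name and version:
            pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)
```

Writing "\n".join(pinned_requirements()) to a requirements.txt file next to the analysis gives readers a version-pinned record to ship as supplementary material.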
The Citey citation model
Citey enforces a single, opinionated shape for every citation in its database. There are no entry-type variations, no special cases for papers vs. software vs. datasets — just one model, applied uniformly.
Required fields: title (the software's name), authors (one or more, with optional ORCID and affiliation), year (a 4-digit year), and at least one of doi or url (a resolvable pointer). Optional fields: version, publisher (e.g., Zenodo, GitHub, JOSS).
What's deliberately excluded: there is no entry type (everything is a software citation), no journal / booktitle / volume / issue / pages (those describe papers), and no “preferred citation” redirect to a companion article. This makes Citey narrower than CFF or codemeta, which both support pointing at a paper instead of the software. Citey ignores those redirects on principle.
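The field rules above can be sketched as a pair of Python dataclasses. This is an illustration of the described shape, not Citey's actual schema: the class names Author and Citation and the exact validation messages are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Author:
    name: str
    orcid: Optional[str] = None        # optional, per the model description
    affiliation: Optional[str] = None  # optional, per the model description


@dataclass(frozen=True)
class Citation:
    # Required fields
    title: str
    authors: tuple[Author, ...]
    year: int
    # At least one of these two must be set
    doi: Optional[str] = None
    url: Optional[str] = None
    # Optional fields
    version: Optional[str] = None
    publisher: Optional[str] = None
    # Deliberately absent: entry type, journal, volume, pages, preferred citation.

    def __post_init__(self) -> None:
        if not self.title.strip():
            raise ValueError("title is required")
        if not self.authors:
            raise ValueError("at least one author is required")
        if not 1000 <= self.year <= 9999:
            raise ValueError("year must be a 4-digit year")
        if not (self.doi or self.url):
            raise ValueError("at least one of doi or url is required")
```

Because there is only one shape, a record either satisfies these checks or is rejected; there is no fallback entry type to fall into.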
From this single model, Citey produces every output format users actually consume: BibTeX as @software{} entries, plain text in an APA-flavored style, and Zotero HTML for clipboard-driven imports. The researcher picks the format in the extension's options; the underlying citation data is the same.
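To make the "one model, several outputs" idea concrete, here is a minimal sketch of two formatters fed by the same fields. The function names, the citation key, the field ordering, and the exact plain-text layout are assumptions for illustration; Citey's real output may differ, and the DOI shown in the usage note is a placeholder.

```python
def to_bibtex(key, title, authors, year, doi=None, url=None,
              version=None, publisher=None):
    """Render an @software entry; fields without a value are omitted."""
    fields = [
        ("author", " and ".join(authors)),
        ("title", "{" + title + "}"),  # extra braces protect capitalization
        ("year", str(year)),
        ("version", version),
        ("publisher", publisher),
        ("doi", doi),
        ("url", url),
    ]
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields if value)
    return f"@software{{{key},\n{body}\n}}"


def to_plain_text(title, authors, year, doi=None, url=None,
                  version=None, publisher=None):
    """Render an APA-flavored plain-text citation from the same fields."""
    text = f"{', '.join(authors)} ({year}). {title}"
    if version:
        text += f" (Version {version})"
    text += " [Computer software]."
    if publisher:
        text += f" {publisher}."
    # Prefer the DOI as the resolvable pointer; fall back to the URL.
    link = f"https://doi.org/{doi}" if doi else url
    if link:
        text += f" {link}"
    return text
```

Calling to_bibtex("exampletool_2024", "ExampleTool", ["Example, Ada"], 2024, doi="10.1234/example", version="1.2.0") and to_plain_text with the same values yields two renderings of identical data, which is the point of keeping a single internal model.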
The practical consequence: if a maintainer's preferred citation points at a methods paper, Citey will still emit a citation for the software itself. Researchers who want to cite the paper as well can add it manually; the software citation doesn't preclude it.
CFF, codemeta, and BibTeX in this workflow
Three standards show up in software citation, each playing a different role in Citey's pipeline.
Citation File Format (CFF) is a YAML file (CITATION.cff) placed at the root of a repository. It's how software authors publish their citation metadata in a machine-readable form. GitHub renders a “Cite this repository” button when one is present. CFF supports authors, version, DOI, license, keywords, and a preferred-citation block. The hub provides a CFF generator: generate the file, then commit it to your repository.
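For readers who want to see what a minimal CITATION.cff contains, the sketch below renders one from Python. It is not the hub's generator; the function name render_cff and its parameters are hypothetical. It emits the four keys CFF 1.2.0 requires (cff-version, message, title, authors) plus optional version and doi.

```python
import json


def _quote(value: str) -> str:
    # A JSON string literal is also a valid YAML double-quoted scalar.
    return json.dumps(value, ensure_ascii=False)


def render_cff(title, authors, version=None, doi=None):
    """Render a minimal CITATION.cff; authors is a list of (family, given) pairs."""
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f"title: {_quote(title)}",
        "authors:",
    ]
    for family, given in authors:
        lines.append(f"  - family-names: {_quote(family)}")
        lines.append(f"    given-names: {_quote(given)}")
    if version:
        lines.append(f"version: {_quote(version)}")
    if doi:
        lines.append(f"doi: {_quote(doi)}")
    return "\n".join(lines) + "\n"
```

Quoting every value avoids YAML surprises such as a version like 1.10 being read as a number.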
CodeMeta is a JSON-LD file (codemeta.json) for richer software metadata: programming language, operating systems, software requirements, funding sources, and more. Where CFF focuses on citation, CodeMeta is what services such as Software Heritage and JOSS consume for indexing. The hub also provides a CodeMeta generator.
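A minimal codemeta.json can be sketched the same way. The function name render_codemeta and the choice of fields are assumptions; the @context URL and property names (name, version, codeRepository, programmingLanguage, author) come from the CodeMeta 2.0 schema.

```python
import json


def render_codemeta(name, version, repository, language, authors):
    """Render a minimal codemeta.json; authors is a list of (given, family) pairs."""
    doc = {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": name,
        "version": version,
        "codeRepository": repository,
        "programmingLanguage": language,
        "author": [
            {"@type": "Person", "givenName": given, "familyName": family}
            for given, family in authors
        ],
    }
    return json.dumps(doc, indent=2, ensure_ascii=False)
```

A real file would usually carry more of the metadata listed above, such as operating systems, requirements, and funding.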
BibTeX is the format researchers consume in LaTeX-based workflows (.bib files). Citey emits BibTeX @software{} entries built from its internal model.
In Citey's pipeline, CFF and codemeta are inputs: when you add a package via /packages, Citey fetches your repo's CFF and codemeta files, extracts the fields it needs, and stores them in the canonical Citey model. BibTeX (and plain text and Zotero HTML) are outputs: what the extension hands a researcher when they look up your software. The internal Citey model is the bridge.
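The input side of that pipeline might look roughly like the following, assuming the CITATION.cff has already been parsed into a Python dict by a YAML library. The function name cff_to_citey, the output keys, and the choice to derive the year from date-released are assumptions, not Citey's documented behavior. The CFF key names (given-names, family-names, repository-code, date-released, preferred-citation) are from the CFF schema.

```python
def cff_to_citey(cff: dict) -> dict:
    """Map parsed CFF metadata onto the single Citey citation shape."""
    authors = []
    for person in cff.get("authors", []):
        given = person.get("given-names")
        family = person.get("family-names")
        # CFF entities (organizations, teams) carry "name" instead of name parts.
        name = " ".join(p for p in (given, family) if p) or person.get("name")
        if name:
            authors.append({
                "name": name,
                "orcid": person.get("orcid"),
                "affiliation": person.get("affiliation"),
            })

    released = cff.get("date-released")  # may be a str or a date object
    version = cff.get("version")
    record = {
        "title": cff.get("title"),
        "authors": authors,
        "year": int(str(released)[:4]) if released else None,
        "doi": cff.get("doi"),
        "url": cff.get("repository-code") or cff.get("url"),
        "version": str(version) if version is not None else None,
    }
    # "preferred-citation" is never read: the software itself is what gets cited.
    has_pointer = record["doi"] or record["url"]
    if not (record["title"] and authors and record["year"] and has_pointer):
        raise ValueError("CFF metadata does not satisfy the required Citey fields")
    return record
```

Even when the input contains a preferred-citation block pointing at an article, the returned record describes only the software, which mirrors the redirect-ignoring behavior described earlier.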
As a software author, the recommended workflow is: (1) generate CITATION.cff using the hub generator and commit it to your repo; (2) generate codemeta.json the same way; (3) mint a DOI via Zenodo or similar so your software has a persistent identifier; (4) add your repo to Citey — one URL, one pull request, citation lookups available to every researcher who installs the extension.
References
Smith, A. M., Katz, D. S., Niemeyer, K. E., & FORCE11 Software Citation Working Group. (2016). Software citation principles. PeerJ Computer Science, 2, e86. https://doi.org/10.7717/peerj-cs.86
Druskat, S., Spaaks, J. H., Chue Hong, N., Haines, R., Baker, J., Bliven, S., ... & Willighagen, E. (2021). Citation File Format (CFF). https://doi.org/10.5281/zenodo.1003149
Jones, M. B., Boettiger, C., Mayes, A. C., Slaughter, P., Niemeyer, K., Gil, Y., ... & Harmon, T. (2017). CodeMeta: an exchange schema for software metadata, version 2.0. https://doi.org/10.5063/schema/codemeta-2.0
Chue Hong, N. P., Katz, D. S., Barker, M., Lamprecht, A.-L., Martinez, C., Psomopoulos, F. E., ... & Wueest, R. O. (2022). FAIR Principles for Research Software (FAIR4RS Principles). Research Data Alliance. https://doi.org/10.15497/RDA00068
Katz, D. S., Chue Hong, N. P., Clark, T., Muench, A., Stall, S., Bouquin, D., ... & Yeston, J. (2021). Recognizing the value of software: a software citation guide. F1000Research, 9, 1257. https://doi.org/10.12688/f1000research.26932.2