
The fully open, persistent identifier-linked global index of scholarly research.

OpenAlex is a massive, open-source bibliographic index of the world’s scholarly research system, launched by the non-profit OurResearch as a successor to the Microsoft Academic Graph (MAG). As of 2026 it is widely used as 'Open Science' infrastructure, indexing over 250 million works, 90 million authors, and 100,000 institutions. Its architecture is built on a persistent identifier (PID) graph that links DOIs, ORCIDs, ROR IDs, and PubMed IDs into a unified schema. OpenAlex uses machine learning models for author disambiguation and automated concept tagging, so researchers and developers can perform complex bibliometric analysis without the licensing costs of proprietary systems like Scopus or Web of Science. It follows a 'linked data' philosophy, providing a REST API and complete data snapshots in JSON Lines format, which supports large-scale data mining, institutional benchmarking, and custom discovery tools. It is also used as a source of scholarly metadata when training Large Language Models (LLMs) on peer-reviewed scientific literature.
Uses a neural network-based model to group works by the same author even when names are identical or inconsistently formatted.
Applies a hierarchy of 65,000+ concepts to every work using a transformer-based classifier trained on the Wikidata taxonomy.
Native integration with the Research Organization Registry (ROR) for precise institutional affiliation tracking.
Citation counts and h-index values are calculated across the entire database and exposed on the corresponding records.
Provides the entire database (~1TB+) as a series of gzip-compressed JSON Lines files on S3.
The entire metadata schema is open and public, allowing for easy ETL pipeline integration.
Operated by a non-profit (OurResearch) funded by grants and premium users.
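The h-index mentioned in the feature list above can be illustrated with a short sketch. This is the textbook definition (the largest h such that h works have at least h citations each), not OpenAlex's internal implementation; the function name h_index and the sample counts are hypothetical.

```python
def h_index(citation_counts):
    """Return the largest h such that h works have at least h citations each."""
    # Rank works from most cited to least cited.
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` works all have >= rank citations
        else:
            break  # counts only decrease from here, so h cannot grow
    return h


if __name__ == "__main__":
    # Hypothetical per-work citation counts for one author.
    print(h_index([25, 8, 5, 3, 3]))
```

In practice the input list would come from the per-work citation counts of a single author's works, however those are retrieved.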
Access the official website or API documentation at docs.openalex.org.
Review the entity schema focusing on Works, Authors, Sources, Institutions, and Concepts.
Test simple queries using the 'Polite Pool' by including your email address in the request, either as a mailto query parameter or in the User-Agent header.
Use the API Explorer tool to build and refine complex filters (e.g., filter by publication year and citation count).
Install client libraries like 'pyalex' for Python or 'openalexR' for R environments.
Configure environment variables for API keys if opting for the Premium tier to access the 'Fast Lane'.
To handle large datasets, download the latest data snapshot from the AWS S3 bucket.
Implement parsing logic for the gzip-compressed JSON Lines files using tools like Apache Spark or pandas.
Schedule periodic updates by monitoring the 'updated_date' field in the API responses.
Set up institutional dashboards by mapping internal researcher IDs to OpenAlex author IDs.
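A few of the steps above can be sketched with the Python standard library alone. The sketch makes three assumptions: that the API accepts a comma-separated filter parameter with a mailto parameter for the Polite Pool, that snapshot files are gzip-compressed JSON Lines, and that records carry an ISO-formatted updated_date field. The helper names (build_works_url, iter_snapshot_records, newer_than) are hypothetical and not part of any OpenAlex client library.

```python
import gzip
import io
import json
from urllib.parse import urlencode

API_BASE = "https://api.openalex.org"  # assumed public API host


def build_works_url(year, min_citations, email, per_page=25):
    """Build a /works URL filtered by publication year and citation count."""
    # Assumed filter syntax: comma-separated attribute:value pairs (logical AND).
    filters = f"publication_year:{year},cited_by_count:>{min_citations}"
    params = {"filter": filters, "per-page": per_page, "mailto": email}
    # Keep ':', ',', '>' and '@' unescaped so the URL stays readable.
    return f"{API_BASE}/works?{urlencode(params, safe=':,>@')}"


def iter_snapshot_records(fileobj):
    """Yield one dict per non-empty line of a gzip-compressed JSON Lines stream."""
    with gzip.open(fileobj, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            if line.strip():
                yield json.loads(line)


def newer_than(records, cutoff_date):
    """Keep records whose updated_date string sorts after cutoff_date."""
    # Plain string comparison works only if both sides use the same ISO format.
    return [r for r in records if r.get("updated_date", "") > cutoff_date]


if __name__ == "__main__":
    print(build_works_url(2023, 100, "you@example.org"))

    # Tiny in-memory stand-in for a snapshot part file (made-up records).
    raw = gzip.compress(
        b'{"id": "W1", "updated_date": "2024-01-02"}\n'
        b'{"id": "W2", "updated_date": "2023-01-01"}\n'
    )
    records = list(iter_snapshot_records(io.BytesIO(raw)))
    print([r["id"] for r in newer_than(records, "2023-06-01")])
```

For real snapshot files, the same iterator can be pointed at a file opened in binary mode; for anything beyond a few part files, Spark or pandas is likely a better fit than a pure-Python loop.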
Verified feedback from other users.
"Highly praised for its transparency and openness. Users love the lack of API keys for the polite pool and the comprehensive nature of the data."