Sourcify
Effortlessly find and manage open-source dependencies for your projects.

Industrial-strength natural language processing in Python.

spaCy is a Python library for advanced Natural Language Processing, designed for building real products and gathering real insights. It is written in Cython, which gives it fast performance for large-scale information extraction. spaCy provides pre-trained pipelines and supports over 75 languages. It features components for named entity recognition, part-of-speech tagging, dependency parsing, and text classification. spaCy's project system facilitates end-to-end workflows from prototype to production, allowing users to manage data transformation, preprocessing, and training steps. The spacy-llm package integrates Large Language Models (LLMs) into structured NLP pipelines. spaCy layout, a plugin that integrates with Docling, brings structured processing of PDFs, Word documents, and other input formats to spaCy pipelines. It outputs clean, structured data in a text-based format and creates spaCy's familiar Doc objects.
spaCy can use transformer models such as BERT for state-of-the-art accuracy on NLP tasks.
Its configuration system for training runs is comprehensive and extensible, which supports reproducibility and makes experiments easier to track.
The project system offers a structured way to manage end-to-end NLP workflows, including data transformation, preprocessing, and training.
The spacy-llm package integrates Large Language Models (LLMs) into spaCy, featuring a modular system for fast prototyping and prompting.
The spaCy layout plugin integrates with Docling to process PDFs and Word documents, extracting structured data and creating spaCy Doc objects.
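To make the configuration system more concrete, here is a minimal sketch that inspects the config spaCy keeps for a pipeline. It assumes spaCy v3 is installed and uses a blank English pipeline with a rule-based `sentencizer`, so no model download is needed. The choice of component is only for illustration.

```python
import spacy

# Build a blank English pipeline and add one built-in component.
nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")

# nlp.config describes the pipeline: language, component order,
# and the factory and settings behind each component.
config = nlp.config
lang = config["nlp"]["lang"]
pipeline = config["nlp"]["pipeline"]
factory = config["components"]["sentencizer"]["factory"]

# Serialize to the same text format used by training config files.
config_text = config.to_str()

print(lang, pipeline, factory)
print(config_text[:200])
```

Because the whole pipeline is described in one serializable object, the same text can be saved to a file and versioned alongside an experiment, which is what makes training runs reproducible.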
Install spaCy via pip: `pip install spacy`
Download a pre-trained model: `python -m spacy download en_core_web_sm`
Load the model in Python: `import spacy; nlp = spacy.load('en_core_web_sm')`
Process text: `doc = nlp('Your text here')`
Access tokens and their attributes: `for token in doc: print(token.text, token.pos_)`
Utilize the project system for end-to-end workflows: `python -m spacy project clone pipelines/tagger_parser_ud`
Explore custom components and attributes to extend spaCy's functionality.
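As one possible starting point for the last step, the sketch below registers a custom pipeline component and two extension attributes. It assumes spaCy v3. The component name `shout_detector` and the attributes `is_shouted` and `shout_count` are hypothetical names invented for this example, and a blank pipeline is used so that no pre-trained model is required.

```python
import spacy
from spacy.language import Language
from spacy.tokens import Doc, Token

# Hypothetical extension attributes; force=True allows re-registration
# if this snippet is run more than once in the same session.
Token.set_extension("is_shouted", default=False, force=True)
Doc.set_extension("shout_count", default=0, force=True)


@Language.component("shout_detector")
def shout_detector(doc: Doc) -> Doc:
    """Flag all-uppercase alphabetic tokens longer than one character."""
    count = 0
    for token in doc:
        if token.is_alpha and token.is_upper and len(token.text) > 1:
            token._.is_shouted = True
            count += 1
    doc._.shout_count = count
    return doc


# A blank pipeline only tokenizes; the custom component runs after that.
nlp = spacy.blank("en")
nlp.add_pipe("shout_detector")

doc = nlp("This is REALLY important, I promise.")
shouted = [token.text for token in doc if token._.is_shouted]
print(shouted, doc._.shout_count)
```

Custom attributes live under the `._.` namespace, which keeps them separate from spaCy's built-in token and document attributes. The same component could be added to a pre-trained pipeline such as `en_core_web_sm` with `nlp.add_pipe("shout_detector")`.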
Verified feedback from other users.
"Highly regarded for speed, accuracy, and ease of use."