Spacy V3 decorator string name #6

Open
rennanvoa2 opened this issue Mar 5, 2021 · 5 comments

@rennanvoa2

Hello guys,
Since the V3 update, running the example code raises this error:

ValueError: [E966] `nlp.add_pipe` now takes the string name of the registered component factory, not a callable component. Expected string, but got <spacy_cld.spacy_cld.LanguageDetector object at 0x7fb8d9051ed0> (name: 'None').

- If you created your component with `nlp.create_pipe('name')`: remove nlp.create_pipe and call `nlp.add_pipe('name')` instead.

- If you passed in a component like `TextCategorizer()`: call `nlp.add_pipe` with the string name instead, e.g. `nlp.add_pipe('textcat')`.

- If you're using a custom component: Add the decorator `@Language.component` (for function components) or `@Language.factory` (for class components / factories) to your custom component and assign it a name, e.g. `@Language.component('your_name')`. You can then run `nlp.add_pipe('your_name')` to add it to the pipeline.

I figured out that we now have to pass the string name to nlp.add_pipe, but which name?

I've tried nlp.add_pipe("langdetect"), nlp.add_pipe("LanguageDetector"), and nlp.add_pipe("languagedetector"), and none of them seem to work.

Can you help me with this?

@Cusard

Cusard commented Mar 12, 2021

Hi,

Since I'm new to spaCy and Python, I'm not sure whether this is the correct way to implement it. With Python 3.9 and spaCy 3.0.3, the following worked for me:

import spacy
from spacy.language import Language
from spacy_langdetect import LanguageDetector

# Add LanguageDetector and assign it a string name
@Language.factory("language_detector")
def create_language_detector(nlp, name):
    return LanguageDetector(language_detection_function=None)

# Use a blank pipeline; a trained model also works, e.g. nlp = spacy.load("en_core_web_sm")
nlp = spacy.blank("en")

# Add sentencizer for longer text
nlp.add_pipe('sentencizer')

# Add components using their string names
nlp.add_pipe("language_detector")

# Run the pipeline on a sample text
text = "This is an English text."
doc = nlp(text)

# Document-level language detection
print(doc._.language)

# Inspect the pipeline components and their attributes
nlp.analyze_pipes(pretty=True)

I got on this track with: Language-specific pipeline

Is this the right way to use it with spaCy 3?

How can I use the result for language-specific processing?
Do I have to load language-specific models, e.g.
nlp_en = spacy.load("en_core_web_sm") and
nlp_de = spacy.load("de_core_news_sm")?

Many thanks and best regards,

Cusard
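On the question of language-specific processing, one possible approach is to keep one pipeline per language and route each text by the detected language. The following is only a minimal sketch, not something confirmed by the maintainers. The helper name `pick_pipeline`, the `min_score` threshold, and the fallback to English are assumptions made for illustration. The detection dict follows the `{'language': ..., 'score': ...}` shape that `doc._.language` prints in this thread. Stand-in callables replace real models so the sketch runs without downloads.

```python
def pick_pipeline(detection, pipelines, default_lang="en", min_score=0.8):
    """Choose a pipeline for a detection dict shaped like doc._.language.

    detection: e.g. {"language": "en", "score": 0.99}
    pipelines: dict mapping a language code to a callable pipeline
    (hypothetical helper; min_score and default_lang are illustrative choices)
    """
    lang = detection.get("language")
    score = detection.get("score", 0.0)
    # Use the language-specific pipeline only when it exists and the score is high enough.
    if lang in pipelines and score >= min_score:
        return lang, pipelines[lang]
    # Otherwise fall back to the default pipeline.
    return default_lang, pipelines[default_lang]


# Stand-in callables so the sketch runs without installed models.
# In real use these would be loaded pipelines, e.g.
# spacy.load("en_core_web_sm") and spacy.load("de_core_news_sm").
pipelines = {
    "en": lambda text: ("en pipeline", text),
    "de": lambda text: ("de pipeline", text),
}

lang, nlp_for_lang = pick_pipeline({"language": "de", "score": 0.97}, pipelines)
result = nlp_for_lang("Das ist ein deutscher Text.")
print(lang, result)
```

With real models, the detector pipeline would run first, and its `doc._.language` value would be passed to `pick_pipeline` before the text is processed again with the selected model.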

@renatojmsantos

same problem

@FelixSiegfriedRiedel

Hello everybody!
Thanks to @Cusard, I got the example code to work with the current spaCy version.

import spacy
from spacy.language import Language
from spacy_langdetect import LanguageDetector

@Language.factory("language_detector")
def create_language_detector(nlp, name):
    return LanguageDetector(language_detection_function=None)

nlp = spacy.load("en_core_web_sm")

nlp.add_pipe('language_detector')
text = 'This is an english text.'
doc = nlp(text)
# document level language detection. Think of it like average language of the document!
print(doc._.language)
# sentence level language detection
for sent in doc.sents:
    print(sent, sent._.language)

The output looks like this:

{'language': 'en', 'score': 0.9999983570159962}
This is an english text. {'language': 'en', 'score': 0.9999956329695125}
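For documents that mix languages, the sentence-level results could be collected per language before further processing. This is a rough sketch only: `group_by_language` is a hypothetical helper, and the sample sentences and scores are made up for illustration rather than real detector output.

```python
from collections import defaultdict


def group_by_language(sentence_detections):
    """Group sentence texts by detected language code.

    sentence_detections: iterable of (text, detection) pairs, where detection
    is a dict shaped like sent._.language, e.g. {"language": "en", "score": 0.99}.
    """
    groups = defaultdict(list)
    for text, detection in sentence_detections:
        groups[detection["language"]].append(text)
    return dict(groups)


# Illustrative input; the scores are invented, not measured.
sample = [
    ("This is an english text.", {"language": "en", "score": 0.99}),
    ("Das ist ein deutscher Text.", {"language": "de", "score": 0.99}),
    ("Another English sentence.", {"language": "en", "score": 0.98}),
]

grouped = group_by_language(sample)
print(grouped)
```

With the pipeline from the example above, the pairs could come from `((sent.text, sent._.language) for sent in doc.sents)`.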

@luis-possatti

Thanks for sharing the solution. It worked for me too.

It would be nice if the example on the project home page were updated: https://spacy.io/universe/project/spacy-langdetect

@benjlis

benjlis commented Jun 27, 2022

The example provided by @FelixSiegfriedRiedel works for me with v3.3.

I've also raised an issue about updating the documentation: explosion/spaCy#11038
