Solr Document Enricher

A bit of Solr background

Solr ships with the tools and features needed for an advanced search solution, including the oft-overlooked update request processors. These operate at the document level, i.e. before individual fields are tokenised, and let you clean, modify and/or enrich incoming documents. Processing options include language identification, duplicate detection and HTML markup handling. Chain several of them together and you have a true document processing pipeline.

What problem does this update processor solve?

Some data sources contain documents with content fields that are dynamically populated by content from other documents. For a query to retrieve these kinds of documents, the external content needs to be retrieved and added during index time.

How does this update processor perform its purpose?

At index time, the processor scans each incoming document for a predefined field (topicRef, for example) that references one or more documents already present in the index. Each referenced document contains either a further reference field or the content that we want to add to the incoming document.

The processor retrieves any referenced documents, traverses a tree of subsequently referenced documents if necessary, and then maps the eventual leaf documents’ specified content fields to additional new fields in the incoming document.
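The traversal described above can be illustrated with a minimal, self-contained sketch. It is not the processor's actual code: an in-memory map stands in for the Solr index, documents are plain maps of field name to value, and the class name `EnricherSketch`, the method names, and the `body` and `enrichedContent` field names used in the usage note are all hypothetical. Only `topicRef` comes from the description above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; a real processor would look documents up in Solr.
class EnricherSketch {

    /** Normalises a single-valued or multi-valued field into a list of strings. */
    static List<String> values(Object fieldValue) {
        List<String> out = new ArrayList<>();
        if (fieldValue instanceof Collection<?>) {
            for (Object v : (Collection<?>) fieldValue) {
                out.add(String.valueOf(v));
            }
        } else if (fieldValue != null) {
            out.add(String.valueOf(fieldValue));
        }
        return out;
    }

    /**
     * Follows refField from the incoming document through the (assumed) index,
     * breadth-first, and gathers contentField values from documents that carry
     * no further reference (the leaves).
     */
    static List<String> collectLeafContent(Map<String, Map<String, Object>> index,
                                           Map<String, Object> incoming,
                                           String refField,
                                           String contentField) {
        List<String> collected = new ArrayList<>();
        Set<String> visited = new HashSet<>();   // guards against reference cycles
        Deque<String> pending = new ArrayDeque<>(values(incoming.get(refField)));

        while (!pending.isEmpty()) {
            String id = pending.poll();
            if (!visited.add(id)) {
                continue;                        // already handled this document
            }
            Map<String, Object> doc = index.get(id);
            if (doc == null) {
                continue;                        // dangling reference: skip it
            }
            List<String> refs = values(doc.get(refField));
            if (refs.isEmpty()) {
                collected.addAll(values(doc.get(contentField)));  // leaf document
            } else {
                pending.addAll(refs);            // intermediate document: keep walking
            }
        }
        return collected;
    }

    /** Adds the gathered leaf content to the incoming document under targetField. */
    static void enrich(Map<String, Map<String, Object>> index,
                       Map<String, Object> incoming,
                       String refField,
                       String contentField,
                       String targetField) {
        List<String> content = collectLeafContent(index, incoming, refField, contentField);
        if (!content.isEmpty()) {
            incoming.put(targetField, content);
        }
    }
}
```

Calling `EnricherSketch.enrich(index, doc, "topicRef", "body", "enrichedContent")` on a mutable document map would copy the `body` values of every reachable leaf document into a new `enrichedContent` field. The visited set is a design choice of this sketch, so that circular references cannot cause an endless loop; whether the real processor guards against cycles is not stated here.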

Documentation

In progress.

Authors and Contributors

@sebnmuller

