10/28/2022
Apache lucene python

The topic of this blog post was the subject of the talk "Searching for the Right Words: Bringing NLP to Apache Solr through ONNX and Apache OpenNLP" at the Linux Foundation's Open Source Summit North America 2022 in Austin, TX.

Introduction

Apache OpenNLP is a machine learning-based library for performing natural language processing (NLP) in Java. It provides common NLP functions such as tokenization, chunking, sentence detection, language detection, document classification, part-of-speech tagging, lemmatization, parsing, and named-entity recognition. Models can be trained with Apache OpenNLP and then used through the library in your Java applications.

Apache Solr integrates with Apache OpenNLP through Apache Lucene's lucene/analysis/opennlp module. With this integration, Apache OpenNLP's capabilities can be used from within Apache Solr to analyze documents during indexing.

Despite the recent explosion in capability and popularity of newer architectures such as transformers, there has not been a way to use these newer models with Apache OpenNLP. This has created a lack of NLP tooling in the Java ecosystem, since nearly all modern NLP work is done in Python. Using state-of-the-art NLP models from a Java application often requires configuring a remote service to provide inference over an API. While this method works, being able to use the newer models directly from our Java applications would be more performant and easier to maintain.

ONNX, or the Open Neural Network Exchange, is a standard now supported by many tools and frameworks. It provides an open format for machine learning models, along with runtimes for many languages, including Java. With the ONNX Runtime, Apache OpenNLP can use state-of-the-art transformer models trained in the Python ecosystem.

To illustrate how this can be accomplished, let's introduce a use-case.
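As a point of reference, the Solr integration described above is wired in through the schema. The fragment below is a minimal sketch, not the configuration used in the talk: the field type name text_opennlp and the model file names en-sent.bin and en-tokenizer.bin are placeholders, and it assumes the OpenNLP analysis module is on Solr's classpath and the models are available to the collection.

```xml
<!-- Hypothetical field type name; choose any name that fits your schema. -->
<fieldType name="text_opennlp" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Sentence detection and tokenization are done by OpenNLP models.
         The .bin file names are placeholders for models you supply. -->
    <tokenizer class="solr.OpenNLPTokenizerFactory"
               sentenceModel="en-sent.bin"
               tokenizerModel="en-tokenizer.bin"/>
  </analyzer>
</fieldType>
```

Any field declared with this type would have its text split into sentences and tokens by OpenNLP at index time, which is the step where additional OpenNLP-based filters can be chained.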