java - using lucene 10 how to load an index from a directory - Stack Overflow


So I am trying to use Lucene in a simple project, and every search returns examples of how to do this on 10+ year old versions of Lucene, where entire packages referenced don't even exist anymore.

I have a simple application and I want to index the content of the database and store the Lucene index on disk. When my application starts it should load the index and support searching.

I have this working... all except loading the index. I am creating an IndexWriter and a SearcherManager, and if I index documents they are in fact immediately available to search and everything works great. But eventually my database is going to be large enough that I can't just reindex everything into Lucene every time the application starts. Also, if I can't load the index from disk... what was the point in storing it in the first place?

    directory = FSDirectory.open(Paths.get(indexLocation));

    Analyzer analyzer = new StandardAnalyzer();
    IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
    indexWriter = new IndexWriter(directory, indexWriterConfig);
    indexWriter.commit();
        
    searcherManager = new SearcherManager(indexWriter,true,true,null);    
    searcherManager.maybeRefresh();

When my application first starts with the above and I run a search using searcherManager:

    ...
    IndexSearcher searcher = searcherManager.acquire();
    ...

It returns nothing... Just zero results. If I reindex using the indexWriter above and run the same search... presto results.

It's infuriating that the Lucene documentation seems to be very thin... and everything seems to be from before version 5.

What am I missing here?

Asked Mar 6 at 0:28 by Seamus
  • Start with the demo, which uses the latest version of Lucene. You can find the demo overview here, on the official Lucene website. This page includes links to the code. All the documentation is for the latest version of Lucene. Yes, it is true that many articles/blogs/questions/answers are for older versions of Lucene, which are no longer compatible with several more recent versions of Lucene. But the official site has a good up-to-date starter demo for you. – andrewJames Commented Mar 6 at 19:54
  • From the main Lucene home page here, I reached the demo by following the link for "Lucene Core (Java)" and then the link for "10.1.0" - and then to the demo page. The 10.1.0 landing page has a lot you can explore. Useful info can be found throughout the JavaDocs - but, yes, it can seem a bit scattered when you first go exploring. – andrewJames Commented Mar 6 at 19:57
  • Specifically regarding this: "If I reindex using the indexWriter above and run the same search... presto results.", take a look at the JavaDoc for IndexWriter. Note this: "The IndexWriterConfig.OpenMode option on IndexWriterConfig.setOpenMode(OpenMode) determines whether a new index is created, or whether an existing index is opened.". – andrewJames Commented Mar 6 at 20:03
  • Thanks for the info, I have been going through it. The OpenMode is interesting; however, it still does not allow the existing index to be searched. I'm sure there is just some nuance I'm missing... The demo code appears to open the IndexWriter the same way I am. I added the OpenMode to my code... however, until I reindex a file it remains unsearchable. – Seamus Commented Mar 7 at 21:34

1 Answer

I took the code in your question and added in a few extra lines of my own to ensure there is a document added to the index, and then to ensure that document is found when I search for it.

Here is that code:

My Maven dependencies (Lucene 10.1.0):

<dependencies>
    <dependency>
        <groupId>org.apache.lucene</groupId>
        <artifactId>lucene-core</artifactId>
        <version>10.1.0</version>
        <type>jar</type>
    </dependency>
    <dependency>
        <groupId>org.apache.lucene</groupId>
        <artifactId>lucene-queryparser</artifactId>
        <version>10.1.0</version>
        <type>jar</type>
    </dependency>
</dependencies>

(You may be using some other dependency management tool, of course.)

My Java imports:

import java.io.IOException;
import java.nio.file.Paths;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.SearcherManager;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

The indexing and searching code:

public static void index() throws IOException, ParseException {
    Directory directory = FSDirectory.open(Paths.get("index_dir"));

    Analyzer analyzer = new StandardAnalyzer();
    IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
    // CREATE_OR_APPEND opens an existing index, or creates one if none exists yet.
    indexWriterConfig.setOpenMode(IndexWriterConfig.OpenMode.CREATE_OR_APPEND);
    try (IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig)) {
        // Add one document and commit it to disk.
        Document doc = new Document();
        String text = "This is the text to be indexed.";
        doc.add(new Field("fieldname", text, TextField.TYPE_STORED));
        indexWriter.addDocument(doc);
        indexWriter.commit();

        // Search through a SearcherManager opened on the writer.
        SearcherManager searcherManager = new SearcherManager(indexWriter, true, true, null);
        searcherManager.maybeRefresh();
        QueryParser parser = new QueryParser("fieldname", new StandardAnalyzer());
        Query query = parser.parse("text");
        IndexSearcher indexSearcher = searcherManager.acquire();
        try {
            ScoreDoc[] hits = indexSearcher.search(query, 10).scoreDocs;
            System.out.println(hits.length);
        } finally {
            // Every acquire() must be paired with a release().
            searcherManager.release(indexSearcher);
        }
    }
}

The above code retrieves one document the first time I run it, then two documents the second time, and so on, because each run appends one more document to the existing index.

If I comment out the code which indexes a document, I still find all documents when I search.

So you will need to look for what is different about your code (something not shown in the question, or the way it is structured) to find out why your search does not retrieve any data when you expect it to.
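One difference that is cheap to rule out is whether the directory opened at startup is really the one that was written to. A relative `indexLocation` is resolved against the process's working directory, which may not be the same for every way of launching the application. Lucene records each commit in a file whose name starts with `segments_`, so a directory containing no such file holds no committed index. The following is only a sketch using plain JDK classes; the class name `IndexDirCheck` and the method `hasCommitPoint` are hypothetical, and `index_dir` is just the path from the code above:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class IndexDirCheck {

    /** Returns true if dir contains at least one Lucene commit file (segments_N). */
    public static boolean hasCommitPoint(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try (Stream<Path> files = Files.list(dir)) {
            return files.anyMatch(p -> p.getFileName().toString().startsWith("segments_"));
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass your own indexLocation as the first argument; "index_dir" is only a placeholder.
        Path dir = Paths.get(args.length > 0 ? args[0] : "index_dir").toAbsolutePath();
        System.out.println("Resolved index path: " + dir);
        System.out.println("Has commit point:    " + hasCommitPoint(dir));
    }
}
```

If the resolved path is not the one you expect, or no commit file is found after indexing, the problem is in where (or whether) the index is being committed rather than in the search code.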


However, I would suggest starting without SearcherManager, using the same approach as the official Lucene demo.

That would be something like this:

import java.io.IOException;
import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

And:

public static void search() throws IOException, ParseException {
    try (Directory directory = FSDirectory.open(Paths.get("index_dir")); 
             DirectoryReader indexReader = DirectoryReader.open(directory)) {
        IndexSearcher indexSearcher = new IndexSearcher(indexReader);
        QueryParser parser = new QueryParser("fieldname", new StandardAnalyzer());
        Query query = parser.parse("text");
        ScoreDoc[] hits = indexSearcher.search(query, 10).scoreDocs;
        System.out.println(hits.length);
    }
}

This just uses an IndexSearcher created directly (not from searcherManager.acquire()).


Both approaches should work, of course (they do for me). So if you find that using the demo's approach also does not work for you, then we definitely need more information to help you to resolve your problem.

Basically, we would need a minimal reproducible example.
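Once the basic search works, SearcherManager can also be opened straight from the Directory, with no IndexWriter involved. This is a minimal sketch, assuming the `index_dir` location and the `fieldname` field used above and the same two Maven dependencies on the classpath; the class name `StartupSearch` and the method `countHits` are made up for illustration. Each `acquire()` is paired with a `release()` in a `finally` block, as the SearcherManager JavaDoc asks.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.SearcherManager;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class StartupSearch {

    /** Opens the existing index at indexPath and returns the number of hits (at most 10). */
    public static int countHits(Path indexPath, String queryText)
            throws IOException, ParseException {
        try (Directory directory = FSDirectory.open(indexPath);
             // Throws IndexNotFoundException if the directory holds no committed index.
             SearcherManager searcherManager = new SearcherManager(directory, null)) {
            Query query = new QueryParser("fieldname", new StandardAnalyzer()).parse(queryText);
            IndexSearcher searcher = searcherManager.acquire();
            try {
                return searcher.search(query, 10).scoreDocs.length;
            } finally {
                // Hand the searcher back so the manager can close old readers.
                searcherManager.release(searcher);
            }
        }
    }

    public static void main(String[] args) throws IOException, ParseException {
        System.out.println(countHits(Paths.get("index_dir"), "text"));
    }
}
```

In a long-running application you would normally keep one SearcherManager open for the lifetime of the process and call `maybeRefresh()` after new commits, rather than opening and closing it per query as this sketch does.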
