So I am trying to use Lucene in a simple project, and every search returns examples of how to do this on 10+ year old versions of Lucene, where entire referenced packages don't even exist anymore.
I have a simple application and I want to index the content of the database and store the Lucene index on disk. When my application starts it should load the index and support searching.
I have this working... all except loading the index. I am creating an IndexWriter and a SearcherManager, and if I index documents they are in fact immediately available to search and everything works great. But eventually my database is going to be large enough that I can't just reindex everything into Lucene every time the application starts. Also, if I can't load the index from disk... what was the point in storing it in the first place?
directory = FSDirectory.open(Paths.get(indexLocation));
Analyzer analyzer = new StandardAnalyzer();
IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
indexWriter = new IndexWriter(directory, indexWriterConfig);
indexWriter.commit();
searcherManager = new SearcherManager(indexWriter,true,true,null);
searcherManager.maybeRefresh();
When my application first starts with the above and I run a search using searcherManager:
...
IndexSearcher searcher = searcherManager.acquire();
...
It returns nothing... Just zero results. If I reindex using the indexWriter above and run the same search... presto, results.
It's infuriating that the Lucene documentation seems to be very thin... and everything seems to be from before version 5.
What am I missing here?
I took the code in your question and added in a few extra lines of my own to ensure there is a document added to the index, and then to ensure that document is found when I search for it.
Here is that code:
My Maven dependencies (Lucene 10.1.0):
<dependencies>
    <dependency>
        <groupId>org.apache.lucene</groupId>
        <artifactId>lucene-core</artifactId>
        <version>10.1.0</version>
        <type>jar</type>
    </dependency>
    <dependency>
        <groupId>org.apache.lucene</groupId>
        <artifactId>lucene-queryparser</artifactId>
        <version>10.1.0</version>
        <type>jar</type>
    </dependency>
</dependencies>
(You may be using some other dependency management tool, of course.)
My Java imports:
import java.io.IOException;
import java.nio.file.Paths;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.SearcherManager;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
The indexing and searching code:
public static void index() throws IOException, ParseException {
    Directory directory = FSDirectory.open(Paths.get("index_dir"));
    Analyzer analyzer = new StandardAnalyzer();
    IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
    // Open the existing index if there is one; otherwise create a new one.
    indexWriterConfig.setOpenMode(IndexWriterConfig.OpenMode.CREATE_OR_APPEND);
    try (IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig)) {
        // Index one document and commit it to disk.
        Document doc = new Document();
        String text = "This is the text to be indexed.";
        doc.add(new Field("fieldname", text, TextField.TYPE_STORED));
        indexWriter.addDocument(doc);
        indexWriter.commit();

        // Search through a SearcherManager backed by the same writer.
        SearcherManager searcherManager = new SearcherManager(indexWriter, true, true, null);
        searcherManager.maybeRefresh();
        QueryParser parser = new QueryParser("fieldname", new StandardAnalyzer());
        Query query = parser.parse("text");
        IndexSearcher indexSearcher = searcherManager.acquire();
        try {
            ScoreDoc[] hits = indexSearcher.search(query, 10).scoreDocs;
            System.out.println(hits.length);
        } finally {
            // Always hand the acquired searcher back to the manager.
            searcherManager.release(indexSearcher);
        }
    }
}
The above code retrieves one document the first time I run it; then 2 documents the second time, and so on.
If I comment out the code which indexes a document, I still find all documents when I search.
So you would need to work out what is different about your code (either in code not shown in the question, or in the way it is structured) to see why your search does not retrieve any data when you expect it to.
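To make the "restart" scenario concrete, here is a minimal sketch, assuming Lucene 10.1.0 on the classpath. The class name `RestartSketch`, the method names `firstRun` and `laterRun`, and the temporary index directory are hypothetical and only serve the illustration. It closes everything after indexing and then opens the same directory again with a fresh `IndexWriter` and `SearcherManager`, without adding any documents:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.SearcherManager;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class RestartSketch {

    // Simulates the first run: index one document, commit it, and shut down.
    static void firstRun(Path indexPath) throws IOException {
        try (Directory directory = FSDirectory.open(indexPath);
             IndexWriter writer = new IndexWriter(directory,
                     new IndexWriterConfig(new StandardAnalyzer()))) {
            Document doc = new Document();
            doc.add(new Field("fieldname", "This is the text to be indexed.",
                    TextField.TYPE_STORED));
            writer.addDocument(doc);
            writer.commit(); // write the new segment to disk
        }
    }

    // Simulates a later start-up: open the existing index and search it
    // without indexing anything. Returns the number of hits.
    static int laterRun(Path indexPath) throws IOException {
        try (Directory directory = FSDirectory.open(indexPath);
             IndexWriter writer = new IndexWriter(directory,
                     new IndexWriterConfig(new StandardAnalyzer()));
             SearcherManager manager = new SearcherManager(writer, true, true, null)) {
            IndexSearcher searcher = manager.acquire();
            try {
                // StandardAnalyzer lowercases tokens, so the term is "text".
                TermQuery query = new TermQuery(new Term("fieldname", "text"));
                return searcher.search(query, 10).scoreDocs.length;
            } finally {
                manager.release(searcher); // pair every acquire() with a release()
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical location; a real application would use a fixed path.
        Path indexPath = Files.createTempDirectory("lucene-restart-sketch");
        firstRun(indexPath);
        System.out.println(laterRun(indexPath));
    }
}
```

If `laterRun` reports zero hits in a setup like this, likely places to look are the index path (is it really the same directory on both runs?) and whether the earlier run committed or closed its writer before exiting.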
However, I would suggest starting without SearcherManager and trying the same approach used in the official Lucene demo.
That would be something like this:
import java.io.IOException;
import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
And:
public static void search() throws IOException, ParseException {
    try (Directory directory = FSDirectory.open(Paths.get("index_dir"));
            DirectoryReader indexReader = DirectoryReader.open(directory)) {
        IndexSearcher indexSearcher = new IndexSearcher(indexReader);
        QueryParser parser = new QueryParser("fieldname", new StandardAnalyzer());
        Query query = parser.parse("text");
        ScoreDoc[] hits = indexSearcher.search(query, 10).scoreDocs;
        System.out.println(hits.length);
    }
}
This just uses an IndexSearcher created directly (not from searcherManager.acquire()).
Both approaches should work, of course (they do for me). So if you find that using the demo's approach also does not work for you, then we definitely need more information to help you to resolve your problem.
Basically, we would need a minimal reproducible example.
On IndexWriter, note this: "The IndexWriterConfig.OpenMode option on IndexWriterConfig.setOpenMode(OpenMode) determines whether a new index is created, or whether an existing index is opened." – andrewJames Commented Mar 6 at 20:03
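A small sketch of what that open mode means in practice, assuming Lucene 10.1.0. The class `OpenModeSketch` and its helper methods are hypothetical, and an in-memory `ByteBuffersDirectory` stands in for the on-disk index purely to keep the example self-contained. Reopening with `CREATE_OR_APPEND` keeps the existing documents, while reopening with `CREATE` replaces the index with an empty one:

```java
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class OpenModeSketch {

    // Opens a writer in the given mode, optionally adds one document,
    // commits, and closes the writer again.
    static void openAndClose(Directory directory, OpenMode mode, boolean addDoc)
            throws IOException {
        IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
        config.setOpenMode(mode);
        try (IndexWriter writer = new IndexWriter(directory, config)) {
            if (addDoc) {
                Document doc = new Document();
                doc.add(new Field("fieldname", "This is the text to be indexed.",
                        TextField.TYPE_STORED));
                writer.addDocument(doc);
            }
            writer.commit();
        }
    }

    // Counts the live documents in the latest commit of the index.
    static int countDocs(Directory directory) throws IOException {
        try (DirectoryReader reader = DirectoryReader.open(directory)) {
            return reader.numDocs();
        }
    }

    public static void main(String[] args) throws IOException {
        try (Directory directory = new ByteBuffersDirectory()) {
            openAndClose(directory, OpenMode.CREATE_OR_APPEND, true);
            // Reopen in append mode without indexing: existing documents survive.
            openAndClose(directory, OpenMode.CREATE_OR_APPEND, false);
            System.out.println("after CREATE_OR_APPEND: " + countDocs(directory));
            // Reopen in create mode: the existing index is overwritten.
            openAndClose(directory, OpenMode.CREATE, false);
            System.out.println("after CREATE: " + countDocs(directory));
        }
    }
}
```

So if an application configures `OpenMode.CREATE` at start-up, every launch begins with an empty index, which would look exactly like "the index does not load".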