Using Lucene to Search Java Source Code
Lucene has four different types of fields, which can be specified for optimal index creation: Keyword, UnIndexed, UnStored, and Text.
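Keyword fields are not analyzed, but are indexed and stored in the index verbatim. JavaSourceCodeIndexer uses this field type to store import declarations.
UnIndexed fields are neither analyzed nor indexed, but their values are stored in the index verbatim. JavaSourceCodeIndexer stores the file name in a field of this type.
UnStored fields are the opposite of UnIndexed fields. Fields of this type are analyzed and indexed, but are not stored in the index. The source code of a method is indexed as an UnStored field, because storing every line of code would require a large amount of space. The source code of a method can be retrieved directly from the original Java file, which keeps the index small.
Text fields are analyzed, indexed, and stored. JavaSourceCodeIndexer uses this field type for names such as class names and method names.
The fields used by JavaSourceCodeIndexer are shown in the following table:
| Field | Type |
|---|---|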
| Class Name | Text |
| Import Declarations | Keyword |
| Method Name | Text |
| Method Block (Code) | UnStored |
| File Name | UnIndexed |
| Method Parameter Type | Text |
| Return Type | Text |
| Comments | UnStored |
| Extends Class | Text |
| Implements | Text |
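The four field types differ only in whether a value is analyzed, indexed, and stored. The following is a minimal sketch of that idea in plain Java, not Lucene's own API: the names FieldKind and FieldTypeSketch are hypothetical and exist only for illustration, and the field-to-type assignments are copied from the table above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the four field types; this is NOT a Lucene class.
enum FieldKind {
    KEYWORD(false, true, true),    // not analyzed, indexed, stored
    UNINDEXED(false, false, true), // not analyzed, not indexed, stored
    UNSTORED(true, true, false),   // analyzed, indexed, not stored
    TEXT(true, true, true);        // analyzed, indexed, stored

    final boolean analyzed;
    final boolean indexed;
    final boolean stored;

    FieldKind(boolean analyzed, boolean indexed, boolean stored) {
        this.analyzed = analyzed;
        this.indexed = indexed;
        this.stored = stored;
    }
}

class FieldTypeSketch {
    // Field-to-type assignments taken from the table above.
    static Map<String, FieldKind> schema() {
        Map<String, FieldKind> m = new LinkedHashMap<>();
        m.put("Class Name", FieldKind.TEXT);
        m.put("Import Declarations", FieldKind.KEYWORD);
        m.put("Method Name", FieldKind.TEXT);
        m.put("Method Block (Code)", FieldKind.UNSTORED);
        m.put("File Name", FieldKind.UNINDEXED);
        m.put("Method Parameter Type", FieldKind.TEXT);
        m.put("Return Type", FieldKind.TEXT);
        m.put("Comments", FieldKind.UNSTORED);
        m.put("Extends Class", FieldKind.TEXT);
        m.put("Implements", FieldKind.TEXT);
        return m;
    }

    public static void main(String[] args) {
        // Print, for each field, whether it can be searched and whether
        // its original value can be read back from the index.
        for (Map.Entry<String, FieldKind> e : schema().entrySet()) {
            FieldKind k = e.getValue();
            System.out.printf("%-22s %-10s searchable=%b retrievable=%b%n",
                    e.getKey(), k, k.indexed, k.stored);
        }
    }
}
```

Read this way, the method block and comments are searchable but cannot be read back from the index, while the file name can be read back but not searched. That combination is what lets a search hit point back to the original Java file without the index holding a copy of the code.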
The indexes created by Lucene can be viewed and modified using Luke, a useful open source tool for understanding indexes. Luke's snapshot of the indexes created by JavaSourceCodeIndexer is shown in Figure 1.

Figure 1. Snapshot of indexes in Luke
As you can see, the import declarations are stored as is, without tokenizing or analyzing. The class names and method names are converted to lower case and stored.
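The difference between the two behaviors can be sketched in plain Java. This is an illustration only: TermSketch is a hypothetical name, and the splitting rule (break on non-alphanumeric characters, then lowercase) is an assumption standing in for whatever analyzer the indexer is actually configured with.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class TermSketch {
    // Keyword-style handling: the whole value becomes a single term, unchanged.
    static List<String> keywordTerms(String value) {
        return Collections.singletonList(value);
    }

    // Text-style handling under an ASSUMED analyzer: split on
    // non-alphanumeric characters and lowercase each token.
    static List<String> textTerms(String value) {
        List<String> terms = new ArrayList<>();
        for (String token : value.split("[^A-Za-z0-9]+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // An import declaration kept as is.
        System.out.println(keywordTerms("org.apache.lucene.document.Document"));
        // A class name lowercased before indexing.
        System.out.println(textTerms("JavaSourceCodeIndexer"));
    }
}
```

Under these assumptions, a search for an import declaration would have to match the full original string, while a class name would match regardless of the case used in the source.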