Class WriteLineDocTask
java.lang.Object
org.apache.lucene.benchmark.byTask.tasks.PerfTask
org.apache.lucene.benchmark.byTask.tasks.WriteLineDocTask
- All Implemented Interfaces:
Cloneable
- Direct Known Subclasses:
WriteEnwikiLineDocTask
A task which writes documents, one line per document. Each line is in the following format: title
<TAB> date <TAB> body. The output of this task can be consumed by
LineDocSource and is intended to save the IO overhead
of opening a file per document to be indexed.
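The line layout described above can be illustrated with a minimal sketch. It is not the task's own code: the class name `LineDocFormatSketch`, the `toLine` helper, and the choice to replace embedded tabs and line breaks with a space are all assumptions made for illustration.

```java
import java.util.StringJoiner;

/** Illustrative only: builds one tab-separated line per document. */
final class LineDocFormatSketch {
    // Assumed separator, matching the <TAB> in the format described above.
    static final char SEP = '\t';

    /** Joins title, date and body into one line: title TAB date TAB body. */
    static String toLine(String title, String date, String body) {
        StringJoiner line = new StringJoiner(String.valueOf(SEP));
        line.add(clean(title)).add(clean(date)).add(clean(body));
        return line.toString();
    }

    /** Assumption of this sketch: tabs and line breaks inside a value become a single space. */
    private static String clean(String value) {
        if (value == null) {
            return "";
        }
        return value.replaceAll("[\\t\\r\\n]+", " ");
    }
}
```

Keeping every document on exactly one line is what lets a reader process the file line by line, so a sketch like this has to neutralise separators that occur inside field values.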
The format of the output is set according to the output file extension. Compression is
recommended when the output file is expected to be large. For the supported file extensions, see StreamUtils.Type.
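As a rough illustration of choosing the output format from the file extension, the sketch below uses only the JDK. It does not use or reproduce StreamUtils: the `ExtensionOutputSketch` class, the `open` helper, and the rule "gzip if the name ends with .gz, plain text otherwise" are assumptions for this example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.zip.GZIPOutputStream;

/** Illustrative only: picks a compressed or plain writer from the file name. */
final class ExtensionOutputSketch {
    /** Hypothetical helper: gzip when the name ends with ".gz", plain text otherwise. */
    static PrintWriter open(Path file) throws IOException {
        // Files.newOutputStream creates the file, or truncates it if it already exists.
        OutputStream out = Files.newOutputStream(file);
        String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
        if (name.endsWith(".gz")) {
            out = new GZIPOutputStream(out);
        }
        return new PrintWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
    }
}
```

Closing the returned writer also finishes the gzip stream, so callers only need a single `close()`.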
Supports the following parameters:
- line.file.out - the name of the file to write the output to. This parameter is mandatory. NOTE: the file is re-created.
- line.fields - which fields should be written in each line. (optional, default: DEFAULT_FIELDS).
- sufficient.fields - comma-separated list of field names; a document is skipped only if all of them are missing. For example, to require that at least one of f1,f2 is not empty, specify: "f1,f2" in this field. To specify that no field is required, i.e. that even empty docs should be emitted, specify ",". (optional, default: DEFAULT_SUFFICIENT_FIELDS).
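The sufficient.fields rule can be sketched as follows. This is a minimal illustration of the rule as described, not the task's implementation: the class `SufficientFieldsSketch`, both helper methods, and the representation of a document as a `Map<String, String>` are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Illustrative only: decides whether a document has enough content to be written. */
final class SufficientFieldsSketch {
    /** Parses the comma-separated value; "," yields an empty list, meaning no field is required. */
    static List<String> parse(String spec) {
        return Arrays.stream(spec.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    /** Returns true if at least one sufficient field is non-empty, or if none is required. */
    static boolean shouldWrite(Map<String, String> doc, List<String> sufficient) {
        if (sufficient.isEmpty()) {
            return true; // "," was specified: even empty docs are emitted
        }
        for (String name : sufficient) {
            String value = doc.get(name);
            if (value != null && !value.isEmpty()) {
                return true;
            }
        }
        return false; // every sufficient field is missing or empty: skip the doc
    }
}
```

With "f1,f2", a document that has only f2 is written, while a document that has neither f1 nor f2 is skipped regardless of its other fields.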
-
Method Summary
Modifier and Type | Method | Description
void | close() |
int | doLogic() | Perform the task once (ignoring repetitions specification). Return number of work items done by this task.
protected String | getLogMessage(int recsCount) |
protected PrintWriter | lineFileOut(Document doc) | Selects output line file by written doc.
void | setParams(String params) | Set the params (docSize only)
boolean | supportsParams() | Sub classes that support parameters must override this method to return true.
protected void | writeHeader(PrintWriter out) | Write header to the lines file - indicating how to read the file later.
Methods inherited from class org.apache.lucene.benchmark.byTask.tasks.PerfTask
clone, getAlgLineNum, getBackgroundDeltaPriority, getDepth, getName, getParams, getRunData, getRunInBackground, isDisableCounting, runAndMaybeStats, setAlgLineNum, setDepth, setDisableCounting, setName, setRunInBackground, setup, shouldNeverLogAtStart, shouldNotRecordStats, stopNow, tearDown, toString
-
Field Details
-
FIELDS_HEADER_INDICATOR
-
SEP
public static final char SEP
-
DEFAULT_FIELDS
Fields to be written by default.
-
DEFAULT_SUFFICIENT_FIELDS
Default fields, at least one of which must be present for the doc not to be skipped.
-
fname
-
-
Constructor Details
-
WriteLineDocTask
- Throws:
Exception
-
-
Method Details
-
writeHeader
Write header to the lines file - indicating how to read the file later.
-
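A header of this kind could look like the sketch below. The page does not state the header layout, so everything here is an assumption: the marker value in `HEADER_INDICATOR`, the tab separator, and the idea that the marker is followed by the written field names. The class `HeaderSketch` and its methods are hypothetical.

```java
import java.io.PrintWriter;
import java.util.List;

/** Illustrative only: writes and recognises a header line naming the fields in each line. */
final class HeaderSketch {
    // Assumed marker value; the real constant is FIELDS_HEADER_INDICATOR, whose value is not shown on this page.
    static final String HEADER_INDICATOR = "FIELDS_HEADER_INDICATOR###";
    static final char SEP = '\t';

    /** Writes the marker followed by the field names, separated by SEP, as one line. */
    static void writeHeader(PrintWriter out, List<String> fields) {
        StringBuilder sb = new StringBuilder(HEADER_INDICATOR);
        for (String field : fields) {
            sb.append(SEP).append(field);
        }
        out.println(sb);
    }

    /** A reader can test the first line to decide whether the file starts with a header. */
    static boolean isHeader(String firstLine) {
        return firstLine.startsWith(HEADER_INDICATOR);
    }
}
```

Recording the field order in the first line lets a consumer map each tab-separated column back to its field name without external configuration.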
getLogMessage
- Overrides:
getLogMessage in class PerfTask
-
doLogic
Description copied from class: PerfTask
Perform the task once (ignoring the repetitions specification). Returns the number of work items done by this task. For indexing, that can be the number of docs added. For warming, that can be the number of scanned items, etc.
-
lineFileOut
Selects output line file by written doc. Default: original output line file.
-
close
-
setParams
Set the params (docSize only).
-
supportsParams
public boolean supportsParams()
Description copied from class: PerfTask
Sub classes that support parameters must override this method to return true.
- Overrides:
supportsParams in class PerfTask
- Returns:
true iff this task supports command line params.
-