Package org.apache.lucene.tests.analysis
Class BaseTokenStreamTestCase
java.lang.Object
org.junit.Assert
org.apache.lucene.tests.util.LuceneTestCase
org.apache.lucene.tests.analysis.BaseTokenStreamTestCase
- Direct Known Subclasses:
BaseTokenStreamFactoryTestCase
Base class for all Lucene unit tests that use TokenStreams.
When writing unit tests for analysis components, it's highly recommended to use the helper
methods here (especially in conjunction with MockAnalyzer or MockTokenizer), as
they contain many assertions and checks to catch bugs.
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static interface
    Attribute that records if it was cleared or not.
static final class
    Attribute that records if it was cleared or not.
Nested classes/interfaces inherited from class org.apache.lucene.tests.util.LuceneTestCase
LuceneTestCase.AwaitsFix, LuceneTestCase.Concurrency, LuceneTestCase.Monster, LuceneTestCase.Nightly, LuceneTestCase.SuppressCodecs, LuceneTestCase.SuppressFileSystems, LuceneTestCase.SuppressFsync, LuceneTestCase.SuppressReproduceLine, LuceneTestCase.SuppressSysoutChecks, LuceneTestCase.SuppressTempFileChecks, LuceneTestCase.ThrowingConsumer<T>, LuceneTestCase.ThrowingRunnable, LuceneTestCase.Weekly -
Field Summary
Fields inherited from class org.apache.lucene.tests.util.LuceneTestCase
assertsAreEnabled, classRules, DEFAULT_LINE_DOCS_FILE, INFOSTREAM, JENKINS_LARGE_LINE_DOCS_FILE, LEAVE_TEMPORARY, MAYBE_CACHE_POLICY, RANDOM_MULTIPLIER, ruleChain, suiteFailureMarker, SYSPROP_AWAITSFIX, SYSPROP_FAILFAST, SYSPROP_MAXFAILURES, SYSPROP_MONSTER, SYSPROP_NIGHTLY, SYSPROP_WEEKLY, TEST_ASSERTS_ENABLED, TEST_AWAITSFIX, TEST_CODEC, TEST_DIRECTORY, TEST_DOCVALUESFORMAT, TEST_LINE_DOCS_FILE, TEST_MONSTER, TEST_NIGHTLY, TEST_POSTINGSFORMAT, TEST_THROTTLING, TEST_WEEKLY, VERBOSE -
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
static void assertAnalyzesTo(Analyzer a, String input, String[] output)
static void assertAnalyzesTo(Analyzer a, String input, String[] output, int[] posIncrements)
static void assertAnalyzesTo(Analyzer a, String input, String[] output, int[] startOffsets, int[] endOffsets)
static void assertAnalyzesTo(Analyzer a, String input, String[] output, int[] startOffsets, int[] endOffsets, int[] posIncrements)
static void assertAnalyzesTo(Analyzer a, String input, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements)
static void assertAnalyzesTo(Analyzer a, String input, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths)
static void assertAnalyzesTo(Analyzer a, String input, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, boolean graphOffsetsAreCorrect)
static void assertAnalyzesTo(Analyzer a, String input, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, boolean graphOffsetsAreCorrect, byte[][] payloads)
static void assertAnalyzesTo(Analyzer a, String input, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, float[] boost)
static void assertAnalyzesTo(Analyzer a, String input, String[] output, String[] types)
static void assertAnalyzesToPositions(Analyzer a, String input, String[] output, int[] posIncrements, int[] posLengths)
static void assertAnalyzesToPositions(Analyzer a, String input, String[] output, String[] types, int[] posIncrements, int[] posLengths)
static void assertGraphStrings(Analyzer analyzer, String text, String... expectedStrings)
    Enumerates all accepted strings in the token graph created by the analyzer on the provided text, and then asserts that they are equal to the expected strings.
static void assertGraphStrings(TokenStream tokenStream, String... expectedStrings)
    Enumerates all accepted strings in the token graph created by the already initialized TokenStream.
static void assertStreamHasNumberOfTokens(TokenStream ts, int expectedCount)
    Asserts that the given stream has the expected number of tokens.
static void assertTokenStreamContents(TokenStream ts, String[] output)
static void assertTokenStreamContents(TokenStream ts, String[] output, int[] posIncrements)
static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets)
static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, int[] posIncrements)
static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, int[] posIncrements, int[] posLengths, Integer finalOffset)
static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, int[] posIncrements, Integer finalOffset)
static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, Integer finalOffset)
static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements)
static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths)
static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, Integer finalOffset)
static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, Integer finalOffset, boolean graphOffsetsAreCorrect)
static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, Integer finalOffset, boolean[] keywordAtts, boolean graphOffsetsAreCorrect)
static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, Integer finalOffset, boolean[] keywordAtts, boolean graphOffsetsAreCorrect, float[] boost)
static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, Integer finalOffset, boolean graphOffsetsAreCorrect, float[] boost)
static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, Integer finalOffset, float[] boost)
static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, Integer finalOffset, Integer finalPosInc, boolean[] keywordAtts, boolean graphOffsetsAreCorrect, byte[][] payloads)
static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, Integer finalOffset, Integer finalPosInc, boolean[] keywordAtts, boolean graphOffsetsAreCorrect, byte[][] payloads, int[] flags)
static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, Integer finalOffset, Integer finalPosInc, boolean[] keywordAtts, boolean graphOffsetsAreCorrect, byte[][] payloads, int[] flags, float[] boost)
static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, Integer finalOffset)
static void assertTokenStreamContents(TokenStream ts, String[] output, String[] types)
static void checkAnalysisConsistency(Random random, Analyzer a, boolean useCharFilter, String text)
static void checkAnalysisConsistency(Random random, Analyzer a, boolean useCharFilter, String text, boolean graphOffsetsAreCorrect)
static void checkOneTerm(Analyzer a, String input, String expected)
static void checkRandomData(Random random, Analyzer a, int iterations)
    Utility method for blasting token streams with data to make sure they don't do anything crazy.
static void checkRandomData(Random random, Analyzer a, int iterations, boolean simple)
    Utility method for blasting token streams with data to make sure they don't do anything crazy.
static void checkRandomData(Random random, Analyzer a, int iterations, int maxWordLength)
    Utility method for blasting token streams with data to make sure they don't do anything crazy.
static void checkRandomData(Random random, Analyzer a, int iterations, int maxWordLength, boolean simple)
static void checkRandomData(Random random, Analyzer a, int iterations, int maxWordLength, boolean simple, boolean graphOffsetsAreCorrect)
static void checkResetException(Analyzer a, String input)
static String
getGraphStrings(Analyzer analyzer, String text)
    Returns all paths accepted by the token stream graph produced by analyzing text with the provided analyzer.
getGraphStrings(TokenStream tokenStream)
    Returns all paths accepted by the token stream graph produced by the already initialized TokenStream.
protected static MockTokenizer keywordMockTokenizer(Reader input)
protected static MockTokenizer keywordMockTokenizer(String input)
static AttributeFactory
    Returns a random AttributeFactory implementation.
static AttributeFactory newAttributeFactory(Random random)
    Returns a random AttributeFactory implementation.
protected String
protected void
static String
    Returns a String summary of the tokens this analyzer produces on this text.
protected static MockTokenizer whitespaceMockTokenizer(Reader input)
protected static MockTokenizer whitespaceMockTokenizer(String input)
Methods inherited from class org.apache.lucene.tests.util.LuceneTestCase
addVirusChecker, assertDeletedDocsEquals, assertDocsAndPositionsEnumEquals, assertDocsEnumEquals, assertDocsSkippingEquals, assertDocValuesEquals, assertDocValuesEquals, assertDoubleUlpEquals, assertFieldInfosEquals, assertFloatUlpEquals, assertNormsEquals, assertPointsEquals, assertPositionsSkippingEquals, assertReaderEquals, assertReaderStatisticsEquals, assertStoredFieldEquals, assertStoredFieldsEquals, assertTermsEnumEquals, assertTermsEquals, assertTermsEquals, assertTermsStatisticsEquals, assertTermStatsEquals, assertTermVectorsEquals, asSet, assumeFalse, assumeNoException, assumeTrue, atLeast, atLeast, callStackContains, callStackContains, callStackContainsAnyOf, closeAfterSuite, closeAfterTest, collate, createTempDir, createTempDir, createTempFile, createTempFile, dumpArray, dumpIterator, ensureSaneIWCOnNightly, expectThrows, expectThrows, expectThrows, expectThrowsAnyOf, expectThrowsAnyOf, getDataInputStream, getDataPath, getJvmForkArguments, getOnlyLeafReader, getTestClass, getTestName, isTestThread, localeForLanguageTag, maybeChangeLiveIndexWriterConfig, maybeWrapReader, newAlcoholicMergePolicy, newAlcoholicMergePolicy, newBytesRef, newBytesRef, newBytesRef, newBytesRef, newBytesRef, newBytesRef, newDirectory, newDirectory, newDirectory, newDirectory, newDirectory, newField, newField, newFSDirectory, newFSDirectory, newIndexWriterConfig, newIndexWriterConfig, newIndexWriterConfig, newIOContext, newIOContext, newLogMergePolicy, newLogMergePolicy, newLogMergePolicy, newLogMergePolicy, newLogMergePolicy, newMaybeVirusCheckingDirectory, newMaybeVirusCheckingFSDirectory, newMergePolicy, newMergePolicy, newMergePolicy, newMockDirectory, newMockDirectory, newMockDirectory, newMockFSDirectory, newMockFSDirectory, newSearcher, newSearcher, newSearcher, newSearcher, newSearcher, newSnapshotIndexWriterConfig, newStringField, newStringField, newStringField, newStringField, newTextField, newTextField, newTieredMergePolicy, newTieredMergePolicy, 
overrideDefaultQueryCache, overrideTestDefaultQueryCache, random, randomLocale, randomTimeZone, randomVectorFormat, rarely, rarely, replaceMaxFailureRule, resetDefaultQueryCache, restoreCPUCoreCount, restoreIndexWriterMaxDocs, runWithRestrictedPermissions, setIndexWriterMaxDocs, setUp, setupCPUCoreCount, setUpExecutorService, shutdownExecutorService, slowFileExists, tearDown, usually, usually, wrapReader
Methods inherited from class org.junit.Assert
assertArrayEquals, assertArrayEquals, assertArrayEquals, assertArrayEquals, assertArrayEquals, assertArrayEquals, assertArrayEquals, assertArrayEquals, assertArrayEquals, assertArrayEquals, assertArrayEquals, assertArrayEquals, assertArrayEquals, assertArrayEquals, assertArrayEquals, assertArrayEquals, assertArrayEquals, assertArrayEquals, assertEquals, assertEquals, assertEquals, assertEquals, assertEquals, assertEquals, assertEquals, assertEquals, assertEquals, assertEquals, assertEquals, assertEquals, assertFalse, assertFalse, assertNotEquals, assertNotEquals, assertNotEquals, assertNotEquals, assertNotEquals, assertNotEquals, assertNotEquals, assertNotEquals, assertNotNull, assertNotNull, assertNotSame, assertNotSame, assertNull, assertNull, assertSame, assertSame, assertThat, assertThat, assertThrows, assertThrows, assertTrue, assertTrue, fail, fail
-
Constructor Details
-
BaseTokenStreamTestCase
public BaseTokenStreamTestCase()
-
-
Method Details
-
assertTokenStreamContents
public static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, Integer finalOffset, Integer finalPosInc, boolean[] keywordAtts, boolean graphOffsetsAreCorrect, byte[][] payloads, int[] flags, float[] boost) throws IOException - Throws:
IOException
-
assertTokenStreamContents
public static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, Integer finalOffset, Integer finalPosInc, boolean[] keywordAtts, boolean graphOffsetsAreCorrect, byte[][] payloads, int[] flags) throws IOException - Throws:
IOException
-
assertTokenStreamContents
public static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, Integer finalOffset, boolean[] keywordAtts, boolean graphOffsetsAreCorrect) throws IOException - Throws:
IOException
-
assertTokenStreamContents
public static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, Integer finalOffset, boolean[] keywordAtts, boolean graphOffsetsAreCorrect, float[] boost) throws IOException - Throws:
IOException
-
assertTokenStreamContents
public static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, Integer finalOffset, Integer finalPosInc, boolean[] keywordAtts, boolean graphOffsetsAreCorrect, byte[][] payloads) throws IOException - Throws:
IOException
-
assertTokenStreamContents
public static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, Integer finalOffset, boolean graphOffsetsAreCorrect, float[] boost) throws IOException - Throws:
IOException
-
assertTokenStreamContents
public static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, Integer finalOffset, boolean graphOffsetsAreCorrect) throws IOException - Throws:
IOException
-
assertTokenStreamContents
public static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, Integer finalOffset) throws IOException - Throws:
IOException
-
assertTokenStreamContents
public static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, Integer finalOffset, float[] boost) throws IOException - Throws:
IOException
-
assertTokenStreamContents
public static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, Integer finalOffset) throws IOException - Throws:
IOException
-
assertTokenStreamContents
public static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements) throws IOException - Throws:
IOException
-
assertTokenStreamContents
public static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths) throws IOException - Throws:
IOException
-
assertTokenStreamContents
- Throws:
IOException
-
assertTokenStreamContents
public static void assertTokenStreamContents(TokenStream ts, String[] output, String[] types) throws IOException - Throws:
IOException
-
assertTokenStreamContents
public static void assertTokenStreamContents(TokenStream ts, String[] output, int[] posIncrements) throws IOException - Throws:
IOException
-
assertTokenStreamContents
public static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets) throws IOException - Throws:
IOException
-
assertTokenStreamContents
public static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, Integer finalOffset) throws IOException - Throws:
IOException
-
assertTokenStreamContents
public static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, int[] posIncrements) throws IOException - Throws:
IOException
-
assertTokenStreamContents
public static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, int[] posIncrements, Integer finalOffset) throws IOException - Throws:
IOException
-
assertTokenStreamContents
public static void assertTokenStreamContents(TokenStream ts, String[] output, int[] startOffsets, int[] endOffsets, int[] posIncrements, int[] posLengths, Integer finalOffset) throws IOException - Throws:
IOException
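The overloads above take parallel arrays: entry i of output, startOffsets, endOffsets, and posIncrements all describes the same token. The sketch below is plain Java and does not call Lucene; ExpectedTokenArrays is a hypothetical helper that derives those arrays for whitespace-separated text. It assumes offsets are character offsets with an inclusive start and exclusive end, that consecutive tokens have a position increment of 1, and that the final offset equals the input length. Check these assumptions against the tokenizer under test.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of Lucene): expected arrays for whitespace-split text. */
final class ExpectedTokenArrays {
  final String[] output;
  final int[] startOffsets;
  final int[] endOffsets;
  final int[] posIncrements;
  final int finalOffset;

  ExpectedTokenArrays(String text) {
    List<String> terms = new ArrayList<>();
    List<Integer> starts = new ArrayList<>();
    List<Integer> ends = new ArrayList<>();
    int i = 0;
    while (i < text.length()) {
      // Skip the whitespace run before the next token.
      while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
        i++;
      }
      int start = i;
      // Consume the token itself.
      while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
        i++;
      }
      if (i > start) {
        terms.add(text.substring(start, i));
        starts.add(start); // assumed: inclusive character offset
        ends.add(i);       // assumed: exclusive character offset
      }
    }
    output = terms.toArray(new String[0]);
    startOffsets = starts.stream().mapToInt(Integer::intValue).toArray();
    endOffsets = ends.stream().mapToInt(Integer::intValue).toArray();
    // Assumed: every token advances the position by exactly one.
    posIncrements = new int[output.length];
    Arrays.fill(posIncrements, 1);
    // Assumed: the offset reported after the last token is the input length.
    finalOffset = text.length();
  }

  public static void main(String[] args) {
    ExpectedTokenArrays e = new ExpectedTokenArrays("fast wifi network");
    System.out.println(Arrays.toString(e.output));        // [fast, wifi, network]
    System.out.println(Arrays.toString(e.startOffsets));  // [0, 5, 10]
    System.out.println(Arrays.toString(e.endOffsets));    // [4, 9, 17]
    System.out.println(Arrays.toString(e.posIncrements)); // [1, 1, 1]
    System.out.println(e.finalOffset);                    // 17
  }
}
```

In a real test, arrays shaped like these would be passed to an overload such as assertTokenStreamContents(ts, output, startOffsets, endOffsets, posIncrements, finalOffset).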
-
assertAnalyzesTo
public static void assertAnalyzesTo(Analyzer a, String input, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements) throws IOException - Throws:
IOException
-
assertAnalyzesTo
public static void assertAnalyzesTo(Analyzer a, String input, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths) throws IOException - Throws:
IOException
-
assertAnalyzesTo
public static void assertAnalyzesTo(Analyzer a, String input, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, float[] boost) throws IOException - Throws:
IOException
-
assertAnalyzesTo
public static void assertAnalyzesTo(Analyzer a, String input, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, boolean graphOffsetsAreCorrect) throws IOException - Throws:
IOException
-
assertAnalyzesTo
public static void assertAnalyzesTo(Analyzer a, String input, String[] output, int[] startOffsets, int[] endOffsets, String[] types, int[] posIncrements, int[] posLengths, boolean graphOffsetsAreCorrect, byte[][] payloads) throws IOException - Throws:
IOException
-
assertAnalyzesTo
- Throws:
IOException
-
assertAnalyzesTo
public static void assertAnalyzesTo(Analyzer a, String input, String[] output, String[] types) throws IOException - Throws:
IOException
-
assertAnalyzesTo
public static void assertAnalyzesTo(Analyzer a, String input, String[] output, int[] posIncrements) throws IOException - Throws:
IOException
-
assertAnalyzesToPositions
public static void assertAnalyzesToPositions(Analyzer a, String input, String[] output, int[] posIncrements, int[] posLengths) throws IOException - Throws:
IOException
-
assertAnalyzesToPositions
public static void assertAnalyzesToPositions(Analyzer a, String input, String[] output, String[] types, int[] posIncrements, int[] posLengths) throws IOException - Throws:
IOException
-
assertAnalyzesTo
public static void assertAnalyzesTo(Analyzer a, String input, String[] output, int[] startOffsets, int[] endOffsets) throws IOException - Throws:
IOException
-
assertAnalyzesTo
public static void assertAnalyzesTo(Analyzer a, String input, String[] output, int[] startOffsets, int[] endOffsets, int[] posIncrements) throws IOException - Throws:
IOException
-
checkResetException
- Throws:
IOException
-
checkOneTerm
- Throws:
IOException
-
checkRandomData
Utility method for blasting token streams with data to make sure they don't do anything crazy.
- Throws:
IOException
-
checkRandomData
public static void checkRandomData(Random random, Analyzer a, int iterations, int maxWordLength) throws IOException
Utility method for blasting token streams with data to make sure they don't do anything crazy.
- Throws:
IOException
-
checkRandomData
public static void checkRandomData(Random random, Analyzer a, int iterations, boolean simple) throws IOException
Utility method for blasting token streams with data to make sure they don't do anything crazy.
- Parameters:
simple - true if only ASCII strings will be used (try to avoid)
- Throws:
IOException
-
assertStreamHasNumberOfTokens
public static void assertStreamHasNumberOfTokens(TokenStream ts, int expectedCount) throws IOException
Asserts that the given stream has the expected number of tokens.
- Throws:
IOException
-
checkRandomData
public static void checkRandomData(Random random, Analyzer a, int iterations, int maxWordLength, boolean simple) throws IOException - Throws:
IOException
-
checkRandomData
public static void checkRandomData(Random random, Analyzer a, int iterations, int maxWordLength, boolean simple, boolean graphOffsetsAreCorrect) throws IOException - Throws:
IOException
-
escape
-
checkAnalysisConsistency
public static void checkAnalysisConsistency(Random random, Analyzer a, boolean useCharFilter, String text) throws IOException - Throws:
IOException
-
checkAnalysisConsistency
public static void checkAnalysisConsistency(Random random, Analyzer a, boolean useCharFilter, String text, boolean graphOffsetsAreCorrect) throws IOException - Throws:
IOException
-
toDot
- Throws:
IOException
-
toDotFile
- Throws:
IOException
-
whitespaceMockTokenizer
- Throws:
IOException
-
whitespaceMockTokenizer
- Throws:
IOException
-
keywordMockTokenizer
- Throws:
IOException
-
keywordMockTokenizer
- Throws:
IOException
-
newAttributeFactory
Returns a random AttributeFactory implementation.
-
newAttributeFactory
Returns a random AttributeFactory implementation.
-
assertGraphStrings
public static void assertGraphStrings(Analyzer analyzer, String text, String... expectedStrings) throws IOException
Enumerates all accepted strings in the token graph created by the analyzer on the provided text, and then asserts that they are equal to the expected strings. Uses TokenStreamToAutomaton to create an automaton. Asserts that the finite strings of the automaton are all and only the given valid strings.
- Parameters:
analyzer - analyzer containing the SynonymFilter under test.
text - text to be analyzed.
expectedStrings - all expected finite strings.
- Throws:
IOException
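To illustrate what "all accepted strings in the token graph" means, here is a minimal sketch in plain Java. It does not use TokenStreamToAutomaton or any Lucene class; it swaps in a simple depth-first walk over position arcs. TokenGraphPaths and Tok are hypothetical names. The sketch assumes the first token has a position increment of 1, that there are no position holes, and that every position length is at least 1.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch (not Lucene code): enumerate every path through a token graph. */
final class TokenGraphPaths {

  /** Hypothetical token: term text, position increment, position length. */
  static final class Tok {
    final String term;
    final int posInc;
    final int posLen;

    Tok(String term, int posInc, int posLen) {
      this.term = term;
      this.posInc = posInc;
      this.posLen = posLen;
    }
  }

  static Set<String> graphStrings(List<Tok> tokens) {
    int n = tokens.size();
    int[] from = new int[n];
    int[] to = new int[n];
    int pos = -1; // so that a first increment of 1 lands on node 0
    int end = 0;
    for (int i = 0; i < n; i++) {
      Tok t = tokens.get(i);
      pos += t.posInc;
      from[i] = pos;          // node the token leaves
      to[i] = pos + t.posLen; // node the token arrives at
      end = Math.max(end, to[i]);
    }
    Set<String> paths = new TreeSet<>();
    walk(tokens, from, to, 0, end, "", paths);
    return paths;
  }

  // Depth-first walk: follow every token that starts at the current node.
  private static void walk(List<Tok> tokens, int[] from, int[] to,
                           int node, int end, String prefix, Set<String> out) {
    if (node == end) {
      out.add(prefix);
      return;
    }
    for (int i = 0; i < from.length; i++) {
      if (from[i] == node) {
        String term = tokens.get(i).term;
        String next = prefix.isEmpty() ? term : prefix + " " + term;
        walk(tokens, from, to, to[i], end, next, out);
      }
    }
  }

  public static void main(String[] args) {
    // "wifi" spans two positions, alongside the two-token alternative "wi fi".
    List<Tok> tokens = Arrays.asList(
        new Tok("wi", 1, 1),
        new Tok("wifi", 0, 2),
        new Tok("fi", 1, 1),
        new Tok("network", 1, 1));
    System.out.println(graphStrings(tokens)); // [wi fi network, wifi network]
  }
}
```

A test for an analyzer producing such a graph would then list both strings as the expectedStrings arguments of assertGraphStrings.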
-
assertGraphStrings
public static void assertGraphStrings(TokenStream tokenStream, String... expectedStrings) throws IOException
Enumerates all accepted strings in the token graph created by the already initialized TokenStream.
- Throws:
IOException
-
getGraphStrings
Returns all paths accepted by the token stream graph produced by analyzing text with the provided analyzer. The tokens' CharTermAttribute values are concatenated and separated by spaces.
- Throws:
IOException
-
getGraphStrings
Returns all paths accepted by the token stream graph produced by the already initialized TokenStream.
- Throws:
IOException
-
toString
Returns a String summary of the tokens this analyzer produces on this text.
- Throws:
IOException
-