public static class InputSampler.RandomSampler<K,V> extends Object implements InputSampler.Sampler<K,V>
| Modifier and Type | Field and Description |
|---|---|
| protected double | freq |
| protected int | maxSplitsSampled |
| protected int | numSamples |
| Constructor and Description |
|---|
| RandomSampler(double freq, int numSamples): Create a new RandomSampler sampling all splits. |
| RandomSampler(double freq, int numSamples, int maxSplitsSampled): Create a new RandomSampler. |
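The two constructors differ only in whether the number of examined splits is capped. The following is a minimal sketch of that relationship, not the Hadoop source: `SamplerConfigSketch` and `splitsToExamine` are hypothetical names, and modeling "sampling all splits" as a cap of `Integer.MAX_VALUE` is an assumption.

```java
// Hypothetical stand-alone class; it only mirrors the three documented fields.
class SamplerConfigSketch {
    protected double freq;
    protected final int numSamples;
    protected final int maxSplitsSampled;

    // "Sampling all splits" is modeled here (an assumption) as an effectively unlimited cap.
    SamplerConfigSketch(double freq, int numSamples) {
        this(freq, numSamples, Integer.MAX_VALUE);
    }

    SamplerConfigSketch(double freq, int numSamples, int maxSplitsSampled) {
        this.freq = freq;
        this.numSamples = numSamples;
        this.maxSplitsSampled = maxSplitsSampled;
    }

    // Hypothetical helper: how many of the available splits would be examined.
    int splitsToExamine(int availableSplits) {
        return Math.min(maxSplitsSampled, availableSplits);
    }
}
```

With 25 available splits, the two-argument form would examine all 25 in this sketch, while a cap of 10 limits the work to 10 splits.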
| Modifier and Type | Method and Description |
|---|---|
| K[] | getSample(InputFormat<K,V> inf, Job job): Randomize the split order, then take the specified number of keys from each split sampled, where each key is selected with the specified probability and possibly replaced by a subsequently selected key when the quota of keys from that split is satisfied. |
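The sampling strategy described for getSample can be sketched in plain Java without any Hadoop dependency. This is a simplified illustration, not the library's implementation: `RandomSamplerSketch` and `sample` are hypothetical names, each inner list stands in for one input split, and `numSamples` is treated as a total across the examined splits, as the parameter description states.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for the sampler; each inner list plays the role of one split.
class RandomSamplerSketch {

    static <K> List<K> sample(List<List<K>> splits, double freq,
                              int numSamples, int maxSplitsSampled, Random rnd) {
        if (numSamples <= 0) {
            throw new IllegalArgumentException("numSamples must be positive");
        }
        // Randomize the split order on a copy so the caller's list is untouched.
        List<List<K>> shuffled = new ArrayList<>(splits);
        Collections.shuffle(shuffled, rnd);

        int splitsToSample = Math.min(maxSplitsSampled, shuffled.size());
        List<K> samples = new ArrayList<>(numSamples);

        for (int i = 0; i < splitsToSample; i++) {
            for (K key : shuffled.get(i)) {
                // Select each key with probability freq.
                if (rnd.nextDouble() <= freq) {
                    if (samples.size() < numSamples) {
                        samples.add(key);
                    } else {
                        // Quota reached: the new key replaces a randomly chosen earlier one.
                        samples.set(rnd.nextInt(numSamples), key);
                    }
                }
            }
        }
        return samples;
    }

    public static void main(String[] args) {
        // Four fake splits of 100 integer keys each.
        List<List<Integer>> splits = new ArrayList<>();
        for (int s = 0; s < 4; s++) {
            List<Integer> split = new ArrayList<>();
            for (int k = 0; k < 100; k++) {
                split.add(s * 100 + k);
            }
            splits.add(split);
        }
        System.out.println(sample(splits, 0.1, 10, 2, new Random(42)));
    }
}
```

Because later keys can overwrite earlier ones once the quota is full, the result never exceeds `numSamples` entries, and with a cap of 2 the sampled keys come from at most two of the four splits.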
protected double freq
protected final int numSamples
protected final int maxSplitsSampled
public RandomSampler(double freq,
int numSamples)
Parameters:
freq - Probability with which a key will be chosen.
numSamples - Total number of samples to obtain from all selected splits.

public RandomSampler(double freq,
int numSamples,
int maxSplitsSampled)
Parameters:
freq - Probability with which a key will be chosen.
numSamples - Total number of samples to obtain from all selected splits.
maxSplitsSampled - The maximum number of splits to examine.

public K[] getSample(InputFormat<K,V> inf, Job job) throws IOException, InterruptedException
Specified by:
getSample in interface InputSampler.Sampler<K,V>
Throws:
IOException
InterruptedException

Copyright © 2008–2022 Apache Software Foundation. All rights reserved.