public class MatrixBlockDictionary extends ADictionary
| Constructor and Description |
|---|
| MatrixBlockDictionary(MatrixBlock data) |

| Modifier and Type | Method and Description |
|---|---|
| void | `addMaxAndMin(double[] ret, int[] colIndexes)` Adds the max and min values contained in the dictionary to the corresponding cells in the ret variable. |
| void | `addToEntry(Dictionary d, int fr, int to, int nCol)` Copies and adds the dictionary entry from this dictionary to the d dictionary. |
| double | `aggregate(double init, Builtin fn)` Aggregate all the contained values; useful in value-only computations where the operation iterates through all values contained in the dictionary. |
| void | `aggregateCols(double[] c, Builtin fn, int[] colIndexes)` Aggregates the columns into the target double array provided. |
| double[] | `aggregateTuples(Builtin fn, int nCol)` Aggregate all entries in the rows. |
| ADictionary | `apply(ScalarOperator op)` Applies the scalar operation on the dictionary. |
| ADictionary | `applyBinaryRowOpLeft(BinaryOperator op, double[] v, boolean sparseSafe, int[] colIndexes)` Apply binary row operation on this dictionary on the left side. |
| ADictionary | `applyBinaryRowOpRight(BinaryOperator op, double[] v, boolean sparseSafe, int[] colIndexes)` Apply binary row operation on this dictionary on the right side. |
| ADictionary | `applyScalarOp(ScalarOperator op, double newVal, int numCols)` Applies the scalar operation on the dictionary. |
| ADictionary | `clone()` Returns a deep clone of the dictionary. |
| ADictionary | `cloneAndExtend(int len)` Clone the dictionary and extend its size by a given length. |
| void | `colSum(double[] c, int[] counts, int[] colIndexes, boolean square)` Get the column sum of the values contained in the dictionary. |
| double[] | `colSum(int[] counts, int nCol)` Get the column sum of this dictionary only. |
| boolean | `containsValue(double pattern)` Detect if the dictionary contains a specific value. |
| MatrixBlockDictionary | `getAsMatrixBlockDictionary(int nCol)` Get this dictionary as a MatrixBlock dictionary. |
| long | `getExactSizeOnDisk()` Calculate the space consumption if the dictionary is stored on disk. |
| long | `getInMemorySize()` Returns the memory usage of the dictionary. |
| static long | `getInMemorySize(int numberValues, int numberColumns, double sparsity)` |
| MatrixBlock | `getMatrixBlock()` |
| long | `getNumberNonZeros(int[] counts, int nCol)` Calculate the number of non zeros in the dictionary. |
| int | `getNumberOfValues(int ncol)` Get the number of distinct tuples given that the column group has n columns. |
| String | `getString(int colIndexes)` Get a string representation of the dictionary that considers the layout of the data. |
| double[] | `getTuple(int index, int nCol)` Get the values contained in a specific tuple of the dictionary. |
| double | `getValue(int i)` Get the specific value contained in the dictionary at the given index. |
| double[] | `getValues()` Get all the values contained in the dictionary as a linearized double array. |
| boolean | `isLossy()` Specify if the dictionary is lossy. |
| void | `preaggValuesFromDense(int numVals, int[] colIndexes, int[] aggregateColumns, double[] b, double[] ret, int cut)` Pre-aggregate values for right matrix multiplication. |
| static MatrixBlockDictionary | `read(DataInput in)` |
| ADictionary | `reExpandColumns(int max)` Returns a new dictionary that has re-expanded all values, based on the entries already contained. |
| ADictionary | `replace(double pattern, double replace, int nCol, boolean safe)` Make a copy of the values, and replace all values that match pattern with the replacement value. |
| ADictionary | `scaleTuples(int[] scaling, int nCol)` Scale all tuples contained in the dictionary by the scaling factor given in the int list. |
| ADictionary | `sliceOutColumnRange(int idxStart, int idxEnd, int previousNumberOfColumns)` Modify the dictionary by removing columns not within the index range. |
| ADictionary | `subtractTuple(double[] tuple)` Allocate a new dictionary where the tuple given is subtracted from all tuples in the previous dictionary. |
| double | `sum(int[] counts, int ncol)` Get the sum of the values contained in the dictionary. |
| double[] | `sumAllRowsToDouble(boolean square, int nrColumns)` Method used as a pre-aggregate of each tuple in the dictionary, to single double values. |
| double | `sumRow(int k, boolean square, int nrColumns)` Sum the values at a specific row. |
| double | `sumsq(int[] counts, int ncol)` Get the square sum of the values contained in the dictionary. |
| String | `toString()` |
| void | `write(DataOutput out)` Write the dictionary to a DataOutput. |
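
Several of the methods summarized above (`sum`, `sumsq`, `colSum`, `getNumberNonZeros`) take a `counts` array alongside the dictionary. The following is a minimal sketch of how such count-weighted aggregation could work. It assumes that `counts[k]` is the number of rows that reference tuple `k`, and that the values are stored as a linearized array with `nCol` values per tuple. The class `CountWeightedSums` and its methods are hypothetical stand-ins for illustration, not the SystemDS implementation.

```java
import java.util.Arrays;

public class CountWeightedSums {

    // Sum of all values, with each tuple weighted by the number of rows that reference it.
    // Assumption: tuple k occupies values[k * nCol .. (k + 1) * nCol - 1].
    static double sum(double[] values, int[] counts, int nCol) {
        double total = 0;
        for (int k = 0; k < counts.length; k++) {
            for (int j = 0; j < nCol; j++) {
                total += counts[k] * values[k * nCol + j];
            }
        }
        return total;
    }

    // Per-column sums, optionally squaring each value before weighting it by its count.
    static double[] colSum(double[] values, int[] counts, int nCol, boolean square) {
        double[] out = new double[nCol];
        for (int k = 0; k < counts.length; k++) {
            for (int j = 0; j < nCol; j++) {
                double v = values[k * nCol + j];
                out[j] += counts[k] * (square ? v * v : v);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 3, 4}; // two tuples of two columns: (1, 2) and (3, 4)
        int[] counts = {3, 1};          // tuple 0 occurs three times, tuple 1 once
        System.out.println(sum(values, counts, 2));                           // prints 16.0
        System.out.println(Arrays.toString(colSum(values, counts, 2, false))); // prints [6.0, 10.0]
    }
}
```

Weighting by counts lets the aggregate be computed from the distinct tuples only, without touching every row of the compressed column group.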

Methods inherited from class ADictionary: applyBinaryRowOp, getMostCommonTuple

public MatrixBlockDictionary(MatrixBlock data)

public MatrixBlock getMatrixBlock()

public double[] getValues()
getValues in class ADictionary

public double getValue(int i)
getValue in class ADictionary
Parameters:
i - The index to extract the value from

public long getInMemorySize()
getInMemorySize in class ADictionary

public static long getInMemorySize(int numberValues, int numberColumns, double sparsity)
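
The summary describes `getValues()` as returning a linearized double array, and `getTuple` and `getNumberOfValues` both need the column count to interpret it. The sketch below illustrates one way that indexing could work. It assumes a row-major layout in which tuple `index` occupies positions `index * nCol` through `(index + 1) * nCol - 1`; the class `LinearizedTuples` and its methods are hypothetical and do not use the SystemDS classes.

```java
import java.util.Arrays;

public class LinearizedTuples {

    // Number of tuples stored, given that each tuple holds nCol values.
    static int numberOfValues(double[] values, int nCol) {
        return values.length / nCol;
    }

    // Copy of the tuple at the given index, assuming row-major layout.
    static double[] tuple(double[] values, int index, int nCol) {
        return Arrays.copyOfRange(values, index * nCol, (index + 1) * nCol);
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 3, 4, 5, 6}; // two tuples of three columns
        System.out.println(numberOfValues(values, 3));            // prints 2
        System.out.println(Arrays.toString(tuple(values, 1, 3))); // prints [4.0, 5.0, 6.0]
    }
}
```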

public double aggregate(double init, Builtin fn)
aggregate in class ADictionary
Parameters:
init - The initial Value, in cases such as Max value, this could be -infinity
fn - The Function to apply to values

public double[] aggregateTuples(Builtin fn, int nCol)
aggregateTuples in class ADictionary
Parameters:
fn - The aggregate function
nCol - The number of columns contained in the dictionary.

public void aggregateCols(double[] c, Builtin fn, int[] colIndexes)
aggregateCols in class ADictionary
Parameters:
c - The target double array, this contains the full number of columns, therefore the colIndexes for this specific dictionary is needed.
fn - The function to apply to individual columns
colIndexes - The mapping to the target columns from the individual columns

public ADictionary apply(ScalarOperator op)
apply in class ADictionary
Parameters:
op - The operator to apply to the dictionary values.

public ADictionary applyScalarOp(ScalarOperator op, double newVal, int numCols)
applyScalarOp in class ADictionary
Parameters:
op - The operator to apply to the dictionary values.
newVal - The value to append to the dictionary.
numCols - The number of columns stored in the dictionary.

public ADictionary applyBinaryRowOpLeft(BinaryOperator op, double[] v, boolean sparseSafe, int[] colIndexes)
applyBinaryRowOpLeft in class ADictionary
Parameters:
op - The operation to apply to this dictionary
v - The values to use on the left hand side.
sparseSafe - boolean specifying if the operation is safe, and therefore does not need to allocate an extended dictionary
colIndexes - The column indexes to consider inside v.

public ADictionary applyBinaryRowOpRight(BinaryOperator op, double[] v, boolean sparseSafe, int[] colIndexes)
applyBinaryRowOpRight in class ADictionary
Parameters:
op - The operation to apply to this dictionary
v - The values to use on the right hand side.
sparseSafe - boolean specifying if the operation is safe, and therefore does not need to allocate an extended dictionary
colIndexes - The column indexes to consider inside v.

public ADictionary clone()
clone in class ADictionary

public ADictionary cloneAndExtend(int len)
cloneAndExtend in class ADictionary
Parameters:
len - The length to extend the dictionary, it is assumed this value is positive.

public boolean isLossy()
isLossy in class ADictionary

public int getNumberOfValues(int ncol)
getNumberOfValues in class ADictionary
Parameters:
ncol - The number of columns in the ColumnGroup.

public double[] sumAllRowsToDouble(boolean square, int nrColumns)
sumAllRowsToDouble in class ADictionary
Parameters:
square - If each entry should be squared.
nrColumns - The number of columns in the ColGroup, needed to know how to get the values from the dictionary.

public double sumRow(int k, boolean square, int nrColumns)
sumRow in class ADictionary
Parameters:
k - The row index to sum
square - If each entry should be squared.
nrColumns - The number of columns

public double[] colSum(int[] counts, int nCol)
colSum in class ADictionary
Parameters:
counts - The counts of the values contained
nCol - The number of columns contained in each tuple.

public void colSum(double[] c, int[] counts, int[] colIndexes, boolean square)
colSum in class ADictionary
Parameters:
c - The output array allocated to contain all column groups output.
counts - The counts of the individual tuples.
colIndexes - The column indexes of the parent column group; these indicate where to put the column sum in the c output.
square - Specify if the values should be squared

public double sum(int[] counts, int ncol)
sum in class ADictionary
Parameters:
counts - The counts of the individual tuples
ncol - The number of columns contained

public double sumsq(int[] counts, int ncol)
sumsq in class ADictionary
Parameters:
counts - The counts of the individual tuples
ncol - The number of columns contained

public String getString(int colIndexes)
getString in class ADictionary
Parameters:
colIndexes - The number of columns in the dictionary.

public void addMaxAndMin(double[] ret, int[] colIndexes)
addMaxAndMin in class ADictionary
Parameters:
ret - The double array that contains all columns min and max.
colIndexes - The column indexes contained in this dictionary.

public ADictionary sliceOutColumnRange(int idxStart, int idxEnd, int previousNumberOfColumns)
sliceOutColumnRange in class ADictionary
Parameters:
idxStart - The column index to start at.
idxEnd - The column index to end at (not inclusive)
previousNumberOfColumns - The number of columns contained in the dictionary.

public ADictionary reExpandColumns(int max)
reExpandColumns in class ADictionary
Parameters:
max - The number of output columns possible.

public boolean containsValue(double pattern)
containsValue in class ADictionary
Parameters:
pattern - The value to search for

public long getNumberNonZeros(int[] counts, int nCol)
getNumberNonZeros in class ADictionary
Parameters:
counts - The counts of each dictionary entry
nCol - The number of columns in this dictionary

public void addToEntry(Dictionary d, int fr, int to, int nCol)
addToEntry in class ADictionary
Parameters:
d - the target dictionary
fr - the from index
to - the to index
nCol - the number of columns

public double[] getTuple(int index, int nCol)
getTuple in class ADictionary
Parameters:
index - The index where the values are located
nCol - The number of columns contained in this dictionary

public ADictionary subtractTuple(double[] tuple)
subtractTuple in class ADictionary
Parameters:
tuple - a double list representing a tuple; it is given that the tuple width is the same as this dictionary's.

public MatrixBlockDictionary getAsMatrixBlockDictionary(int nCol)
getAsMatrixBlockDictionary in class ADictionary
Parameters:
nCol - The number of columns contained in this column group.

public ADictionary scaleTuples(int[] scaling, int nCol)
scaleTuples in class ADictionary
Parameters:
scaling - The amount to multiply the given tuples with
nCol - The number of columns contained in this column group.

public void write(DataOutput out) throws IOException
write in class ADictionary
Parameters:
out - the output sink to write the dictionary to.
Throws:
IOException - if the sink fails.

public static MatrixBlockDictionary read(DataInput in) throws IOException
Throws:
IOException

public long getExactSizeOnDisk()
getExactSizeOnDisk in class ADictionary

public void preaggValuesFromDense(int numVals, int[] colIndexes, int[] aggregateColumns, double[] b, double[] ret, int cut)
preaggValuesFromDense in class ADictionary
Parameters:
numVals - The number of values contained in this dictionary
colIndexes - The column indexes that are associated with the parent column group
aggregateColumns - The columns to aggregate; this is preprocessed to remove empty columns from consideration
b - The values in the right hand side matrix
ret - The double array to put the aggregate in.
cut - The number of columns in b.

public ADictionary replace(double pattern, double replace, int nCol, boolean safe)
replace in class ADictionary
Parameters:
pattern - The value to look for
replace - The value to replace the other value with
nCol - The number of columns contained in the dictionary.
safe - Specify if the operation requires consideration of adding a new tuple. This depends on whether the dictionary has allocated the last zero tuple.

Copyright © 2021 The Apache Software Foundation. All rights reserved.