class SimilarityRenameDetector
extends java.lang.Object
Modifier and Type | Field and Description |
---|---|
private static int | BITS_PER_INDEX: Number of bits we need to express an index into src or dst list. |
private java.util.List<DiffEntry> | dsts: All destinations to consider looking for a rename. |
private static int | INDEX_MASK |
private long[] | matrix: Matrix of all examined file pairs, and their scores. |
private java.util.List<DiffEntry> | out |
private ContentSource.Pair | reader |
private int | renameScore: Score a pair must exceed to be considered a rename. |
private static int | SCORE_SHIFT |
private java.util.List<DiffEntry> | srcs: All sources to consider for copies or renames. |
private boolean | tableOverflow: Set if any SimilarityIndex.TableFullException occurs. |

Constructor and Description |
---|
SimilarityRenameDetector(ContentSource.Pair reader, java.util.List<DiffEntry> srcs, java.util.List<DiffEntry> dsts) |

Modifier and Type | Method and Description |
---|---|
private int | buildMatrix(ProgressMonitor pm) |
private static java.util.List<DiffEntry> | compactDstList(java.util.List<DiffEntry> in) |
private static java.util.List<DiffEntry> | compactSrcList(java.util.List<DiffEntry> in) |
(package private) void | compute(ProgressMonitor pm) |
private static int | decodeFile(int v) |
(package private) static int | dstFile(long value) |
(package private) static long | encode(int score, int srcIdx, int dstIdx) |
private static long | encodeFile(int idx) |
(package private) java.util.List<DiffEntry> | getLeftOverDestinations() |
(package private) java.util.List<DiffEntry> | getLeftOverSources() |
(package private) java.util.List<DiffEntry> | getMatches() |
private SimilarityIndex | hash(DiffEntry.Side side, DiffEntry ent) |
private static boolean | isFile(FileMode mode) |
(package private) boolean | isTableOverflow() |
(package private) static int | nameScore(java.lang.String a, java.lang.String b) |
private static int | score(long value) |
(package private) void | setRenameScore(int score) |
private long | size(DiffEntry.Side side, DiffEntry ent) |
(package private) static int | srcFile(long value) |

private static final int BITS_PER_INDEX
This must be 28, giving us a limit of 2^28 (268,435,456) entries in either list, or 536,870,912 file names across both lists being considered in a single rename pass, which is an insane limit. The other 8 bits are used to store the score, which stays at or below 127 so the long doesn't go negative.
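The bit budget described above can be checked with a short sketch. Only the value 28 for BITS_PER_INDEX comes from this page; the derivations of INDEX_MASK and SCORE_SHIFT, the class name BitLayoutSketch, and the helper worstCase are assumptions made for illustration, not the documented implementation.

```java
final class BitLayoutSketch {
    // Value stated in the field description above.
    static final int BITS_PER_INDEX = 28;

    // Assumed derivations: this page lists the fields but not their values.
    static final int INDEX_MASK = (1 << BITS_PER_INDEX) - 1;
    static final int SCORE_SHIFT = 2 * BITS_PER_INDEX;

    // Hypothetical helper: packs a score above two index fields that have
    // every bit set, which is the largest value a given score can produce.
    static long worstCase(int score) {
        long indexBits = (((long) INDEX_MASK) << BITS_PER_INDEX) | INDEX_MASK;
        return (((long) score) << SCORE_SHIFT) | indexBits;
    }

    public static void main(String[] args) {
        System.out.println(1L << BITS_PER_INDEX); // 268435456 entries per list
        System.out.println(64 - SCORE_SHIFT);     // 8 bits remain for the score
        System.out.println(worstCase(127) > 0);   // true: sign bit stays clear
        System.out.println(worstCase(128) < 0);   // true: a score of 128 sets the sign bit
    }
}
```

Under these assumptions, a score of 127 with both index fields saturated equals Long.MAX_VALUE, which is why the score has to stay below 128.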
private static final int INDEX_MASK
private static final int SCORE_SHIFT
private ContentSource.Pair reader
private java.util.List<DiffEntry> srcs
A source is typically a DiffEntry.ChangeType.DELETE change, but could be another type when trying to perform copy detection concurrently with rename detection.
private java.util.List<DiffEntry> dsts
A destination is typically a DiffEntry.ChangeType.ADD, as the name has just come into existence, and we want to discover where its initial content came from.
private long[] matrix
The upper 8 bits of each long store the score, but the score is bounded to be in the range [0, 128) so that the highest bit is never set, and all entries are therefore positive.
List indexes to an element of srcs and dsts are encoded as the lower two groups of 28 bits, respectively, but the encoding is inverted, so that 0 is expressed as (1 << 28) - 1. This sorts lower list indices later in the matrix, giving precedence to files whose names sort earlier in the tree.
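The encoding described above can be sketched as follows. The method names and signatures mirror the method summary on this page, but the bodies, the constant derivations, and the class name MatrixEncodingSketch are assumptions reconstructed from the description, not the library's confirmed code. The sketch also assumes the sorted matrix is consumed from the end, so that the highest score comes first.

```java
import java.util.Arrays;

final class MatrixEncodingSketch {
    static final int BITS_PER_INDEX = 28;                    // stated on this page
    static final int INDEX_MASK = (1 << BITS_PER_INDEX) - 1; // assumed derivation
    static final int SCORE_SHIFT = 2 * BITS_PER_INDEX;       // assumed derivation

    // Inverted index encoding: 0 becomes (1 << 28) - 1.
    static long encodeFile(int idx) { return INDEX_MASK - idx; }
    static int decodeFile(int v) { return INDEX_MASK - v; }

    // Score in the upper 8 bits, then the src index, then the dst index.
    static long encode(int score, int srcIdx, int dstIdx) {
        return (((long) score) << SCORE_SHIFT)
                | (encodeFile(srcIdx) << BITS_PER_INDEX)
                | encodeFile(dstIdx);
    }

    static int score(long value) { return (int) (value >>> SCORE_SHIFT); }

    static int srcFile(long value) {
        return decodeFile(((int) (value >>> BITS_PER_INDEX)) & INDEX_MASK);
    }

    static int dstFile(long value) {
        return decodeFile(((int) value) & INDEX_MASK);
    }

    public static void main(String[] args) {
        long[] matrix = { encode(60, 2, 0), encode(90, 1, 1), encode(90, 0, 2) };
        Arrays.sort(matrix); // ascending numeric order of the packed longs
        // Walk from the end: best score first, ties broken by lower src index.
        for (int i = matrix.length - 1; i >= 0; i--) {
            long v = matrix[i];
            System.out.println("score=" + score(v) + " src=" + srcFile(v) + " dst=" + dstFile(v));
        }
        // prints:
        // score=90 src=0 dst=2
        // score=90 src=1 dst=1
        // score=60 src=2 dst=0
    }
}
```

Packing everything into one long means a plain numeric sort orders candidate pairs by score and then by list position, with no comparator or per-pair object needed.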
private int renameScore
private boolean tableOverflow
Set if any SimilarityIndex.TableFullException occurs.
private java.util.List<DiffEntry> out
SimilarityRenameDetector(ContentSource.Pair reader, java.util.List<DiffEntry> srcs, java.util.List<DiffEntry> dsts)
void setRenameScore(int score)
void compute(ProgressMonitor pm) throws java.io.IOException, CancelledException
java.util.List<DiffEntry> getMatches()
java.util.List<DiffEntry> getLeftOverSources()
java.util.List<DiffEntry> getLeftOverDestinations()
boolean isTableOverflow()
private static java.util.List<DiffEntry> compactSrcList(java.util.List<DiffEntry> in)
private static java.util.List<DiffEntry> compactDstList(java.util.List<DiffEntry> in)
private int buildMatrix(ProgressMonitor pm) throws java.io.IOException, CancelledException
static int nameScore(java.lang.String a, java.lang.String b)
private SimilarityIndex hash(DiffEntry.Side side, DiffEntry ent) throws java.io.IOException, SimilarityIndex.TableFullException
private long size(DiffEntry.Side side, DiffEntry ent) throws java.io.IOException
private static int score(long value)
static int srcFile(long value)
static int dstFile(long value)
static long encode(int score, int srcIdx, int dstIdx)
private static long encodeFile(int idx)
private static int decodeFile(int v)
private static boolean isFile(FileMode mode)