In this post I will try to describe refactored java code.
Please refer previous post about Suffix Trees - Suffix Trees: Java Ukkonen's Algorithm
Tree structure:
Tree consists of:
Node has next data:
Suffix is important for building tree. It has next data:
All classes are wrapped by SuffixTree class that creates a tree. It has:
Class diagram:
Future plans:
Download Suffix Trees Refactored Java Code.
Several test runs are listed below:
book
bookke
cacao
googol
abababc
Please refer previous post about Suffix Trees - Suffix Trees: Java Ukkonen's Algorithm
Tree structure:
Tree consists of:
- nodes;
- edges.
- reference to start node;
- reference to end node;
- sub string begin index;
- sub string end index.
- split edge by a Suffix and return a new Node
Node has next data:
- reference to suffix node;
- hash map of edge references;
- name.
- add Edge into hash map
- remove Edge from hash map
- find Edge in hash map by character
Suffix is important for building tree. It has next data:
- reference to origin node;
- sub string begin index;
- sub string end index.
- canonize
All classes are wrapped by SuffixTree class that creates a tree. It has:
- input text;
- root Node.
Class diagram:
Future plans:
- suffix trees use cases
- further refactoring
Download Suffix Trees Refactored Java Code.
Several test runs are listed below:
book
Edge Start End Suf First Last String 1 0 1 -1 0 3 book 3 0 3 0 1 1 o 5 0 5 -1 3 3 k 2 3 2 -1 2 3 ok 4 3 4 -1 3 3 k
bookke
Edge Start End Suf First Last String 8 0 8 -1 5 5 e 1 0 1 -1 0 5 bookke 3 0 3 0 1 1 o 6 0 6 0 3 3 k 2 3 2 -1 2 5 okke 4 3 4 -1 3 5 kke 7 6 7 -1 5 5 e 5 6 5 -1 4 5 ke
cacao
Edge Start End Suf First Last String 3 0 3 5 0 1 ca 5 0 5 0 1 1 a 7 0 7 -1 4 4 o 1 3 1 -1 2 4 cao 4 3 4 -1 4 4 o 2 5 2 -1 2 4 cao 6 5 6 -1 4 4 o
googol
Edge Start End Suf First Last String 5 0 5 3 0 1 go 3 0 3 0 1 1 o 8 0 8 -1 5 5 l 1 5 1 -1 2 5 ogol 6 5 6 -1 5 5 l 4 3 4 -1 3 5 gol 2 3 2 -1 2 5 ogol 7 3 7 -1 5 5 l
abababc
Edge Start End Suf First Last String 9 0 9 0 1 1 b 11 0 11 -1 6 6 c 7 0 7 9 0 1 ab 10 9 10 -1 6 6 c 5 9 5 7 2 3 ab 8 7 8 -1 6 6 c 3 7 3 5 2 3 ab 6 5 6 -1 6 6 c 2 5 2 -1 4 6 abc 4 3 4 -1 6 6 c 1 3 1 -1 4 6 abc
5 comments:
see also more updated Suffix Trees: Java Code as Separate Classes
The link to the code is broken. Can you upload it again, please? Thanks!
Fixed, Sorry
The code is broken... or has some bug
Start End Suf First Last String
0 1 -1 0 10 ATATATATATA
0 2 -1 1 10 TATATATATA
Suffix : ATATATATATA
comparing: ATATATATATA to ATATATATATA
Suffix : TATATATATA
comparing: TATATATATA to TATATATATA
Suffix 0 count wrong!
Suffix 1 count wrong!
Suffix 2 count wrong!
Suffix 3 count wrong!
Suffix 4 count wrong!
Suffix 5 count wrong!
Suffix 6 count wrong!
Suffix 7 count wrong!
Suffix 8 count wrong!
Leaf count : 2 Error!
Branch count : 2 OK
Christina,
1) try to add $ to the end of your string;
2) try applet
http://illya-keeplearning.blogspot.com/2009/06/suffix-trees-java-applet.html
Post a Comment