How to Run Mahout MinHash Clustering
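1. Convert the documents in reuters-out into Hadoop SequenceFiles:
   mahout seqdirectory -i /home/venkat/Desktop/minhash4/reuters-out -o /home/venkat/Desktop/minhash4/reuters-out-seqdir -c UTF-8 -chunk 5
2. Turn the SequenceFiles into sparse TF-IDF vectors, dropping terms that appear in more than 85% of the documents and keeping the document names on the vectors:
   mahout seq2sparse -i /home/venkat/Desktop/minhash4/reuters-out-seqdir/ -o /home/venkat/Desktop/minhash4/reuters-out-seqdir-sparse-minhash --maxDFPercent 85 --namedVector
3. Run the MinHash clustering driver on the tfidf-vectors directory produced by step 2, overwriting any previous output:
   mahout org.apache.mahout.clustering.minhash.MinHashDriver -i /home/venkat/Desktop/minhash4/reuters-out-seqdir-sparse-minhash/tfidf-vectors -o /home/venkat/Desktop/minhash4/reuters-minhash --overwrite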
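
If you run these three commands often, it may help to wrap them in a small shell function so the base directory is written only once. The sketch below is just one way to do that. The names BASE, MAHOUT, and run_pipeline are my own and are not part of Mahout; the flags are exactly the ones used in the steps above.

```shell
#!/bin/sh
# Hypothetical wrapper around the three steps above.
# BASE and MAHOUT are illustrative variable names, not Mahout settings.
BASE="${BASE:-/home/venkat/Desktop/minhash4}"
MAHOUT="${MAHOUT:-mahout}"

run_pipeline() {
  # Step 1: documents -> SequenceFiles
  "$MAHOUT" seqdirectory -i "$BASE/reuters-out" -o "$BASE/reuters-out-seqdir" -c UTF-8 -chunk 5 &&
  # Step 2: SequenceFiles -> sparse TF-IDF vectors
  "$MAHOUT" seq2sparse -i "$BASE/reuters-out-seqdir/" -o "$BASE/reuters-out-seqdir-sparse-minhash" --maxDFPercent 85 --namedVector &&
  # Step 3: MinHash clustering on the TF-IDF vectors
  "$MAHOUT" org.apache.mahout.clustering.minhash.MinHashDriver -i "$BASE/reuters-out-seqdir-sparse-minhash/tfidf-vectors" -o "$BASE/reuters-minhash" --overwrite
}
```

After sourcing the file, calling run_pipeline runs the steps in order and stops at the first one that fails. Setting MAHOUT=echo before the call prints the three commands instead of running them, which is a cheap way to check the paths.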