Here is an implementation of cosine similarity for two vectors (vec1, vec2) in Java.
------------------------------------------------------------------------------------------
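public class cosine {

    public static void main(String[] args) {
        int vec1[] = {1,2,5,0,2,3};
        int vec2[] = {2,1,3,2,0,1};
        double cos_sim = cosine_similarity(vec1, vec2);
        System.out.println("Cosine Similarity=" + cos_sim);
    }

    // Cosine similarity = dot product / (product of the two magnitudes).
    private static double cosine_similarity(int[] vec1, int[] vec2) {
        if (vec1.length != vec2.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dp = dot_product(vec1, vec2);
        double magnitudeA = find_magnitude(vec1);
        double magnitudeB = find_magnitude(vec2);
        if (magnitudeA == 0 || magnitudeB == 0) {
            // The similarity is undefined for an all-zero vector;
            // returning 0 here is a convention, and avoids returning NaN.
            return 0;
        }
        return dp / (magnitudeA * magnitudeB);
    }

    // Euclidean length of a vector: square root of the sum of squares.
    private static double find_magnitude(int[] vec) {
        double sum_mag = 0;
        for (int i = 0; i < vec.length; i++) {
            // Cast to double so the multiplication cannot overflow int.
            sum_mag = sum_mag + (double) vec[i] * vec[i];
        }
        return Math.sqrt(sum_mag);
    }

    // Sum of the element-wise products of two equal-length vectors.
    private static double dot_product(int[] vec1, int[] vec2) {
        double sum = 0;
        for (int i = 0; i < vec1.length; i++) {
            sum = sum + (double) vec1[i] * vec2[i];
        }
        return sum;
    }
}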
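For reference, the quantity computed by the code above can be written as follows, with A and B standing for vec1 and vec2 and n for their common length (these symbols are introduced here only for illustration):

```latex
\cos\theta
  = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
  = \frac{\sum_{i=1}^{n} A_i B_i}
         {\sqrt{\sum_{i=1}^{n} A_i^{2}} \; \sqrt{\sum_{i=1}^{n} B_i^{2}}}
```

For the sample vectors, the dot product is 22 and the magnitudes are sqrt(43) and sqrt(19), so the result is 22 / sqrt(817), which is about 0.77.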
This is very nice, but how do I get the word-frequency vectors for any text file?
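One possible way to build such vectors is sketched below. It is not part of the original post. The class and method names (WordFrequencyVectors, countWords, vocabulary, toVector) are hypothetical, and the tokenization rule (lower-case the text, then split on anything that is not a letter or digit) is an assumption that may not suit every text. The idea is to count the words of each file, build one shared sorted vocabulary, and lay out both sets of counts in that same order so the two vectors have equal length.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

public class WordFrequencyVectors {

    // Count how often each word occurs in a piece of text.
    public static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        // Assumed tokenization: lower-case, split on non-letter/non-digit runs.
        for (String word : text.toLowerCase().split("[^\\p{L}\\p{Nd}]+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Sorted union of the words that appear in either document.
    public static SortedSet<String> vocabulary(Map<String, Integer> a,
                                               Map<String, Integer> b) {
        SortedSet<String> words = new TreeSet<>(a.keySet());
        words.addAll(b.keySet());
        return words;
    }

    // Lay the counts out in vocabulary order; words that are absent get 0.
    public static int[] toVector(Map<String, Integer> counts,
                                 SortedSet<String> vocabulary) {
        int[] vec = new int[vocabulary.size()];
        int i = 0;
        for (String word : vocabulary) {
            vec[i++] = counts.getOrDefault(word, 0);
        }
        return vec;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("Usage: java WordFrequencyVectors <fileA> <fileB>");
            return;
        }
        // Read both files as UTF-8 text (the encoding is an assumption).
        String textA = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        String textB = new String(Files.readAllBytes(Paths.get(args[1])), StandardCharsets.UTF_8);

        Map<String, Integer> countsA = countWords(textA);
        Map<String, Integer> countsB = countWords(textB);
        SortedSet<String> vocab = vocabulary(countsA, countsB);

        System.out.println("Vocabulary = " + vocab);
        System.out.println("vec1 = " + Arrays.toString(toVector(countsA, vocab)));
        System.out.println("vec2 = " + Arrays.toString(toVector(countsB, vocab)));
    }
}
```

The two int arrays printed at the end have the same length and the same word order, so they can be passed as vec1 and vec2 to the cosine similarity code in the post.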