Image classification with the bag of visual words (BOVW) method is well known and has been applied in various recognition projects. The method relies on three phases: detection and extraction of features, representation of the image, and classification. SIFT, K-means and SVM are the most widely accepted combination. This article aims to demonstrate that this combination is not always the best choice for every type of dataset. Several training sets of images were created from scratch and are used with the bag of visual words model. In the first phase, detection and extraction, SIFT is used. In the second phase, a dictionary of visual words is created through a clustering process using K-means, EM, or K-means in combination with EM. Finally, for classification, SVM, Gaussian NB, KNN, Decision Tree, Random Forest, Neural Network and AdaBoost are compared in order to determine the performance and accuracy of each method.
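The three phases can be illustrated with a minimal, self-contained sketch. It is not the implementation evaluated in this article. As stated assumptions, synthetic 8-dimensional vectors stand in for SIFT descriptors (the hypothetical helper `make_image`), a plain K-means builds the vocabulary, and a 1-nearest-neighbour rule (KNN with k = 1) serves as the classifier. All function names and parameter values are illustrative only.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means; the k centers act as the visual vocabulary."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        for i, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous center
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers

def bovw_histogram(descriptors, centers):
    """Phase 2: normalized count of descriptors assigned to each visual word."""
    hist = [0.0] * len(centers)
    for d in descriptors:
        hist[nearest(d, centers)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def make_image(rng, label, n=40, dim=8):
    """ASSUMPTION: synthetic descriptors standing in for phase 1 (SIFT).
    Class 0 images contain mostly pattern A, class 1 mostly pattern B."""
    p_a = 0.8 if label == 0 else 0.2
    descs = []
    for _ in range(n):
        mean = 0.0 if rng.random() < p_a else 5.0
        descs.append([rng.gauss(mean, 0.5) for _ in range(dim)])
    return descs

def predict_1nn(hist, train_hists, train_labels):
    """Phase 3: label of the closest training histogram."""
    return train_labels[nearest(hist, train_hists)]

rng = random.Random(42)
train_labels = [0, 1] * 5
train_images = [make_image(rng, y) for y in train_labels]
test_labels = [0, 1] * 3
test_images = [make_image(rng, y) for y in test_labels]

# Build the vocabulary from all training descriptors, then encode each image.
all_descs = [d for img in train_images for d in img]
vocab = kmeans(all_descs, k=4)
train_hists = [bovw_histogram(img, vocab) for img in train_images]
predictions = [predict_1nn(bovw_histogram(img, vocab), train_hists, train_labels)
               for img in test_images]
accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
print(f"accuracy: {accuracy:.2f}")
```

In this sketch, the alternatives named above would replace individual steps: an EM-fitted mixture model would take the place of `kmeans`, and any of the other classifiers would take the place of `predict_1nn`, while the histogram encoding stays the same.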