Welcome to svmloader’s documentation!

svmloader is a very fast Python module (written in Cython) for loading sparse data stored in libsvm format.

It is not fully equivalent to sklearn.datasets.load_svmlight_file: in particular, query_id is not supported and the choice of dtype is restricted.
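For readers unfamiliar with the format, each libsvm line holds a label followed by `index:value` pairs for the non-zero features. The pure-Python sketch below only illustrates that layout; it is not svmloader's Cython parser. The helper name `parse_libsvm_line` and the way one-based indexes are shifted are assumptions made for the illustration, and `qid:` tokens are deliberately not handled.

```python
def parse_libsvm_line(line, zero_based=True):
    """Parse one libsvm-style line into (label, {column: value}).

    Hypothetical helper for illustration only; svmloader itself
    performs this work in Cython and returns a CSR matrix.
    """
    tokens = line.split()
    label = float(tokens[0])
    features = {}
    for token in tokens[1:]:
        index, value = token.split(":")
        # Assumed convention: one-based files are shifted down by one
        # so that the first column is always column 0.
        column = int(index) if zero_based else int(index) - 1
        features[column] = float(value)
    return label, features


label, features = parse_libsvm_line("1 0:0.5 3:2.0")
print(label, features)
```

Only the non-zero entries appear on each line, which is why the result maps naturally onto a sparse matrix.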

Data and label types are set independently. Labels can be int or float (int by default), and data can be parsed as numpy.float64 or numpy.float32 (numpy.float64 by default).

Compressed data in .gz or .bz2 format is supported as well.
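As a rough sketch of how such compressed inputs could be produced, the snippet below writes the same small libsvm sample as `.gz` and `.bz2` files using only the standard library. The file names, the `.svm` infix and the helper `write_compressed_samples` are assumptions for illustration; the resulting paths are what one would then pass to the loader.

```python
import bz2
import gzip
import os
import tempfile

# Two rows: label, then index:value pairs for the non-zero features.
SAMPLE = "1 0:0.5 3:2.0\n0 1:1.5\n"


def write_compressed_samples(directory):
    """Write SAMPLE as gzip and bzip2 files; return both paths.

    Hypothetical helper, not part of svmloader.
    """
    gz_path = os.path.join(directory, "sample.svm.gz")
    bz2_path = os.path.join(directory, "sample.svm.bz2")
    with gzip.open(gz_path, "wt", encoding="ascii") as handle:
        handle.write(SAMPLE)
    with bz2.open(bz2_path, "wt", encoding="ascii") as handle:
        handle.write(SAMPLE)
    return gz_path, bz2_path


workdir = tempfile.mkdtemp()
gz_path, bz2_path = write_compressed_samples(workdir)
print(gz_path, bz2_path)
```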

API

svmloader.load_svmfile(filename, dtype='d', ltype='i', nfeatures=None, zero_based=True, multilabels=False)

Load a sparse CSR matrix from a file in libsvm format.

Files in .gz or .bz2 format will be uncompressed on the fly.

Parameters:
	filename (str) – the file name
	dtype (str) – type of data, must be either ‘d’ (double) or ‘f’ (float)
	ltype (str) – type of labels, must be either ‘i’ (int) or ‘d’ (double)
	nfeatures (int) – the number of columns (inferred from the file if None)
	zero_based (bool) – indicates whether column indexes are zero-based or one-based
	multilabels (bool) – indicates whether the file uses multiple labels per row
Returns:	(labels, sparse_matrix) tuple
Return type:	(`numpy.ndarray`, `scipy.sparse.csr_matrix`)
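A minimal usage sketch, assuming svmloader is installed and that the call behaves as the signature above describes. The sample file name and the helper `write_sample` are made up for the example, and the import is guarded so the snippet still runs where the module is absent.

```python
import os
import tempfile

try:
    import svmloader  # the module documented here; may not be installed
except ImportError:
    svmloader = None


def write_sample(path):
    """Write a two-row libsvm file (hypothetical sample data)."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write("1 0:0.5 3:2.0\n0 1:1.5\n")
    return path


path = write_sample(os.path.join(tempfile.mkdtemp(), "sample.svm"))

if svmloader is not None:
    # 'f' requests float32 data, 'i' integer labels; the column count
    # is left to be inferred from the file (nfeatures=None).
    labels, matrix = svmloader.load_svmfile(
        path, dtype="f", ltype="i", zero_based=True
    )
    print(labels, matrix.shape, matrix.dtype)
```

Passing nfeatures explicitly may be useful when a file does not contain every column, for example when a test set has to match the width of a training set.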

svmloader.load_svmfiles(filenames, dtype='d', ltype='i', zero_based=True, multilabels=False)

Load a list of sparse CSR matrices from a list of files in libsvm format.

Files in .gz or .bz2 format will be uncompressed on the fly.

The number of features is inferred from the maximum index found across all files.

Parameters:
	filenames (list) – the list of file names
	dtype (str) – type of data, must be either ‘d’ (double) or ‘f’ (float)
	ltype (str) – type of labels, must be either ‘i’ (int) or ‘d’ (double)
	zero_based (bool) – indicates whether column indexes are zero-based or one-based
	multilabels (bool) – indicates whether the files use multiple labels per row
Returns:	a flat list [labels_0, matrix_0, ..., labels_n, matrix_n]
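Because the result is a flat list that alternates labels and matrices, it can be convenient to regroup it into one pair per file. The helper `pair_up` below is a hypothetical convenience, not part of svmloader, and the strings stand in for the real arrays and matrices so the sketch runs without the module.

```python
def pair_up(flat):
    """Group [labels_0, matrix_0, ..., labels_n, matrix_n] into pairs.

    Hypothetical helper; returns [(labels_0, matrix_0), ...].
    """
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of items")
    return list(zip(flat[0::2], flat[1::2]))


# Stand-in values. With svmloader installed, `flat` would instead come from
# svmloader.load_svmfiles(["train.svm", "test.svm"])  (file names assumed).
flat = ["labels_train", "matrix_train", "labels_test", "matrix_test"]
(train_labels, train_matrix), (test_labels, test_matrix) = pair_up(flat)
print(train_labels, test_matrix)
```

Since the feature count is taken from all files together, the matrices returned for different files should share the same number of columns.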
