Scikit-learn is a powerful machine learning library for Python. It provides tools for statistical modelling, including classification, regression, and more. The library is written mostly in Python.
It is free and open source, and it builds on NumPy and SciPy.
Benefits of Scikit-learn
- A consistent interface across machine learning models
- Many tuning parameters, each with a sensible default
- Exceptional documentation
- A rich set of functionality for companion tasks
- An active community for development and support
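The consistent interface and sensible defaults listed above can be illustrated with a minimal sketch. The choice of the two classifiers is an assumption made for illustration, and the score is computed on the training data only to keep the example short, so it is not a realistic accuracy estimate.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Load the built-in iris data as a feature matrix and a target vector.
X, y = load_iris(return_X_y=True)

# Two different algorithms, both created with their default parameters.
models = [KNeighborsClassifier(), DecisionTreeClassifier()]

scores = {}
for model in models:
    # Every estimator exposes the same fit / predict / score methods.
    model.fit(X, y)
    predictions = model.predict(X[:5])
    scores[type(model).__name__] = model.score(X, y)
    print(type(model).__name__, predictions, scores[type(model).__name__])
```

Swapping one model for another changes only the line that constructs it; the rest of the code stays the same.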
Type one of the following commands into your command shell to install scikit-learn (the first uses pip, the second uses conda):
pip install -U scikit-learn
conda install scikit-learn
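One quick way to check that the installation worked is to import the package and print its version. This is a small sketch; the variable name installed_version is only illustrative.

```python
import sklearn

# The package is installed as "scikit-learn" but imported as "sklearn".
installed_version = sklearn.__version__
print("scikit-learn version:", installed_version)
```

If the import fails with ModuleNotFoundError, the package was probably installed into a different Python environment than the one running the script.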
Load the Dataset
A dataset is a collection of data with two components:
- Features: also known as predictors. These are the input variables, and together they form the feature matrix.
- Response: also known as the target. Its value depends on the feature variables.
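The two components can be pictured as a pair of NumPy arrays. The sketch below uses made-up values (they are hypothetical, not taken from any real dataset) to show the usual layout: one row per sample and one column per feature, with one response value per row.

```python
import numpy as np

# Hypothetical measurements: 4 samples (rows) x 2 features (columns).
features = np.array([
    [5.1, 3.5],
    [4.9, 3.0],
    [6.2, 2.9],
    [5.9, 3.0],
])

# One response (target) value per sample.
response = np.array([0, 0, 1, 1])

n_samples, n_features = features.shape
print(features.shape)  # (4, 2)
print(response.shape)  # (4,)
```

The number of rows in the feature matrix must match the length of the response vector.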
Loading a Built-in Dataset
    # load the iris dataset
    from sklearn.datasets import load_iris
    iris = load_iris()

    # feature matrix is a and response vector is b
    a = iris.data
    b = iris.target

    # store the feature and target names
    fnames = iris.feature_names
    tnames = iris.target_names

    # print the feature and target names
    print("Feature names:", fnames)
    print("Target names:", tnames)

    # a and b are numpy arrays
    print("type is:", type(a))

    # print the first 10 input rows
    print("10 rows of a", a[:10])
    Feature names: ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
    Target names: ['setosa' 'versicolor' 'virginica']
    type is: <class 'numpy.ndarray'>
    10 rows of a [[5.1 3.5 1.4 0.2]
     [4.9 3.  1.4 0.2]
     [4.7 3.2 1.3 0.2]
     [4.6 3.1 1.5 0.2]
     [5.  3.6 1.4 0.2]
     [5.4 3.9 1.7 0.4]
     [4.6 3.4 1.4 0.3]
     [5.  3.4 1.5 0.2]
     [4.4 2.9 1.4 0.2]
     [4.9 3.1 1.5 0.1]]
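When only the arrays are needed, load_iris can also return them directly through its return_X_y parameter. The sketch below is an alternative to reading the data and target attributes; the names X and y are a common convention, not something the text above uses.

```python
from sklearn.datasets import load_iris

# return_X_y=True returns (feature matrix, response vector) as a tuple.
X, y = load_iris(return_X_y=True)

print(X.shape)  # (150, 4)
print(y.shape)  # (150,)
```

This form skips the feature and target names, so it suits cases where only the numbers matter.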
To evaluate a model on data it has not seen, split the dataset into training and test sets with train_test_split. Here 60% of the rows are held out for testing:

    # load the iris dataset
    from sklearn.datasets import load_iris
    iris = load_iris()

    # feature matrix is a and response vector is b
    a = iris.data
    b = iris.target

    # split a and b into training and test sets
    from sklearn.model_selection import train_test_split
    a_train, a_test, b_train, b_test = train_test_split(a, b, test_size=0.6, random_state=5)

    # print the shapes of the a splits
    print(a_train.shape)
    print(a_test.shape)

    # print the shapes of the b splits
    print(b_train.shape)
    print(b_test.shape)
    (60, 4)
    (90, 4)
    (60,)
    (90,)
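Two parameters of train_test_split are worth a short sketch: random_state, which makes the shuffle repeatable, and stratify, which keeps the class proportions the same in both parts. The stratify argument is not used in the code above; it is added here only as an illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

a, b = load_iris(return_X_y=True)

# The same random_state produces the same split on every call.
first = train_test_split(a, b, test_size=0.6, random_state=5)
second = train_test_split(a, b, test_size=0.6, random_state=5)
same_split = all(np.array_equal(p, q) for p, q in zip(first, second))
print("identical splits:", same_split)

# stratify=b keeps each class equally represented in train and test.
a_train, a_test, b_train, b_test = train_test_split(
    a, b, test_size=0.6, random_state=5, stratify=b
)
train_counts = np.bincount(b_train)
test_counts = np.bincount(b_test)
print("train samples per class:", train_counts)
print("test samples per class:", test_counts)
```

Without stratify, the split is purely random, so the classes may end up unevenly represented in a small training set.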