
How to Train Your First Machine Learning Model

By Hillary

Posted on February 17, 2025

Machine learning (ML) is one of the most exciting fields in technology today, with applications in everything from self-driving cars to personalized recommendations on streaming platforms. If you’re new to ML and eager to train your first model, you’re in the right place! This guide walks you through the basics, step by step.

Understanding Machine Learning Basics

Before training a model, let’s look at what machine learning actually is.

Machine learning is a subset of artificial intelligence (AI) that enables computers to learn from data and make predictions without being explicitly programmed with task-specific rules. Instead of following hand-written rules, an ML model adjusts its internal parameters to fit the patterns it finds in the data.

Types of Machine Learning

Supervised Learning: The model learns from labeled data (e.g., predicting house prices based on past sales data).
Unsupervised Learning: The model finds patterns in data without labels (e.g., grouping customers by purchasing habits).
Reinforcement Learning: The model learns by trial and error, receiving rewards for good actions (e.g., training a robot to walk).
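
To make the difference between the first two types concrete, here is a minimal sketch using scikit-learn. The numbers are made up for illustration, and the choice of LinearRegression and KMeans is just one possible pairing, not the only way to do either task.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

# Supervised: every example comes with a label (here, a made-up price).
sizes = np.array([[1000], [1500], [2000], [2500]])        # feature: square footage
prices = np.array([200_000, 300_000, 400_000, 500_000])   # label: price
regressor = LinearRegression().fit(sizes, prices)
predicted = regressor.predict(np.array([[1800]]))
print(f"Predicted price for 1800 sq ft: {predicted[0]:.0f}")

# Unsupervised: no labels at all, the algorithm only groups similar rows.
spending = np.array([[10], [12], [11], [95], [100], [98]])  # made-up monthly spend
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(spending)
print(f"Cluster assigned to each customer: {clusters}")
```

The regressor needs the prices to learn from, while KMeans only ever sees the spending values and decides on its own which customers belong together.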

For this guide, we’ll focus on supervised learning, as it’s the easiest way to start.

Step 1: Choosing the Right Tools

To train an ML model, you need the right tools. Here are some beginner-friendly options:

Programming Language: Python (widely used in ML)
Libraries:
scikit-learn (for basic ML models)
pandas (for handling data)
numpy (for numerical operations)
matplotlib & seaborn (for visualizing data)

Install them using the following command:

pip install scikit-learn pandas numpy matplotlib seaborn
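
If you want to confirm the installation worked, a small check like the one below can help. It is only a sketch: it looks up each package by its import name (note that scikit-learn is imported as sklearn) and reports whether Python can find it.

```python
import importlib.util

# Import names of the packages installed above (scikit-learn imports as "sklearn").
packages = ["sklearn", "pandas", "numpy", "matplotlib", "seaborn"]

# find_spec returns None when a top-level package cannot be found.
status = {name: importlib.util.find_spec(name) is not None for name in packages}

for name, installed in status.items():
    print(f"{name}: {'installed' if installed else 'MISSING'}")
```

Any package reported as MISSING can be installed again with pip.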

Step 2: Collecting and Preparing Data

Machine learning models are only as good as the data they learn from. You can find datasets on platforms like Kaggle or the UCI Machine Learning Repository.

For example, let’s use a dataset of house prices. The data might include:

Square footage
Number of bedrooms
Location
House price (label to predict)

Loading the Data

Here’s how to load a dataset using pandas:

import pandas as pd

data = pd.read_csv('house_prices.csv')
print(data.head())
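
After loading, it usually pays to take a quick look at the size, column types, and missing values. The sketch below uses a few made-up rows read from an in-memory string instead of house_prices.csv, so it runs without any file; the column names are assumed to match the ones used in this guide.

```python
import io

import pandas as pd

# Made-up rows standing in for house_prices.csv; a real dataset will differ.
csv_text = """SquareFootage,Bedrooms,Location,Price
1500,3,2,300000
2000,4,1,420000
,2,3,180000
1200,2,2,
"""

data = pd.read_csv(io.StringIO(csv_text))

print(data.shape)    # (number of rows, number of columns)
print(data.dtypes)   # data type pandas inferred for each column

# Count the missing values in each column.
missing_per_column = data.isna().sum()
print(missing_per_column)
```

Columns with many missing values, or with an unexpected type, are worth investigating before you go any further.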

Cleaning the Data

Before training, clean your data by keeping only the columns you need and handling missing values. Selecting the columns first avoids throwing away rows because of gaps in columns you won’t use. For example:

data = data[['SquareFootage', 'Bedrooms', 'Location', 'Price']]  # Keep relevant columns
data = data.dropna()  # Remove rows with missing values
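
Dropping every row that has a gap can discard a lot of data. One possible alternative, sketched below on made-up rows, is to drop only the rows that have no price (they can’t be used for supervised training) and fill missing feature values with the column median. Whether the median is a sensible fill value depends on your dataset, so treat this as an assumption to check.

```python
import numpy as np
import pandas as pd

# Made-up rows with one missing feature value and one missing label.
data = pd.DataFrame({
    "SquareFootage": [1500, 2000, np.nan, 1200],
    "Bedrooms": [3, 4, 2, 2],
    "Location": [2, 1, 3, 2],
    "Price": [300000, 420000, 180000, np.nan],
})

# Rows without a label cannot be used for training, so drop those.
data = data.dropna(subset=["Price"])

# Fill the remaining gap in a feature column with that column's median.
median_sqft = data["SquareFootage"].median()
data = data.assign(SquareFootage=data["SquareFootage"].fillna(median_sqft))

print(data)
```

Here three of the four rows survive, whereas a plain dropna() would have kept only two.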

Splitting the Data

We split the dataset into two parts:

Training set (80%): Used to train the model.
Test set (20%): Used to evaluate the model’s performance.

from sklearn.model_selection import train_test_split

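# Features: Location is assumed to be stored as a numeric code (as in Step 6);
# text values would need to be converted to numbers before training.
X = data[['SquareFootage', 'Bedrooms', 'Location']]
# Label: the value we want to predict
y = data['Price']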

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
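
To see what the split actually produces, you can check the shapes of the results. This sketch uses 100 randomly generated rows as a stand-in for the real dataset; only the train_test_split call mirrors the code above.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# 100 made-up rows, generated only to illustrate the split.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "SquareFootage": rng.integers(800, 3000, size=100),
    "Bedrooms": rng.integers(1, 6, size=100),
})
y = pd.Series(X["SquareFootage"] * 150, name="Price")

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(X_train.shape, X_test.shape)  # (80, 2) (20, 2)

# No row should appear in both sets.
overlap = set(X_train.index) & set(X_test.index)
print(f"Rows shared between train and test: {len(overlap)}")
```

Setting random_state makes the shuffle reproducible, so you get the same split every time you run the code.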

Step 3: Choosing and Training a Model

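For beginners, a Linear Regression model is a great choice: it predicts a number (here, the price) as a weighted sum of the input features, which makes it fast to train and easy to interpret.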

Training the Model

from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(X_train, y_train)

Calling fit is the training step: the model finds the weight for each feature, plus an intercept, that best fits the training data.
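
You can inspect what a trained linear model learned through its coef_ and intercept_ attributes. In the sketch below the prices are generated from a known formula (an assumption made purely for illustration), so you can see the model recover the numbers that produced the data.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up data where Price = 100 * SquareFootage + 5000 * Bedrooms + 20000 exactly.
X_train = pd.DataFrame({
    "SquareFootage": [1000, 1500, 2000, 2500, 1800],
    "Bedrooms": [2, 3, 3, 4, 2],
})
y_train = 100 * X_train["SquareFootage"] + 5000 * X_train["Bedrooms"] + 20000

model = LinearRegression()
model.fit(X_train, y_train)

# One learned weight per feature, in the same order as the columns.
for name, coef in zip(X_train.columns, model.coef_):
    print(f"{name}: {coef:.2f}")
print(f"Intercept: {model.intercept_:.2f}")
```

Real data is noisy, so the learned weights won’t be this tidy, but they can still be read the same way: the change in predicted price for a one-unit change in that feature.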

Step 4: Evaluating the Model

Once trained, we test the model on unseen data (X_test) to see how well it predicts.

y_pred = model.predict(X_test)

To measure performance, we use Mean Absolute Error (MAE):

from sklearn.metrics import mean_absolute_error

mae = mean_absolute_error(y_test, y_pred)
print(f'Mean Absolute Error: {mae}')

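MAE is the average absolute difference between the predicted and actual prices, expressed in the same units as the price. The lower the MAE, the better the model’s predictions.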
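
To see what the metric computes, here is a small worked example with three made-up houses. It calculates the MAE by hand, as the mean of |actual - predicted|, and compares the result with scikit-learn’s mean_absolute_error.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Made-up actual and predicted prices for three houses.
y_true = np.array([200_000, 300_000, 400_000])
y_hat = np.array([210_000, 280_000, 430_000])

# Absolute errors are 10,000, 20,000 and 30,000, so their mean is 20,000.
manual_mae = np.mean(np.abs(y_true - y_hat))
sklearn_mae = mean_absolute_error(y_true, y_hat)

print(manual_mae, sklearn_mae)  # 20000.0 20000.0
```

An MAE of 20,000 here means the predictions are off by 20,000 on average, whether they are too high or too low.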

Step 5: Improving the Model

If your model isn’t accurate enough, here’s how you can improve it:

Collect More Data – More data can improve accuracy.
Feature Engineering – Create new useful features (e.g., convert ‘Location’ into numerical values).
Try Different Models – Test models like Decision Trees or Random Forests.
Hyperparameter Tuning – Adjust settings such as a decision tree’s maximum depth or the number of trees in a random forest to optimize performance.
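
Several of these ideas can be combined in a few lines. The sketch below is one possible approach, not the only one: it one-hot encodes a text Location column, swaps in a Random Forest, and tries a handful of settings with cross-validation. The data is randomly generated, and the location names and price formula are invented for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Made-up data with a text Location column.
rng = np.random.default_rng(0)
n = 120
X = pd.DataFrame({
    "SquareFootage": rng.integers(800, 3000, size=n),
    "Bedrooms": rng.integers(1, 6, size=n),
    "Location": rng.choice(["Downtown", "Suburb", "Rural"], size=n),
})
location_premium = X["Location"].map({"Downtown": 80_000, "Suburb": 40_000, "Rural": 0})
y = 120 * X["SquareFootage"] + 10_000 * X["Bedrooms"] + location_premium

# Feature engineering: turn Location into one 0/1 column per category,
# and pass the numeric columns through unchanged.
preprocess = ColumnTransformer(
    [("location", OneHotEncoder(handle_unknown="ignore"), ["Location"])],
    remainder="passthrough",
)

# A different model: a Random Forest instead of Linear Regression.
pipeline = Pipeline([
    ("preprocess", preprocess),
    ("forest", RandomForestRegressor(random_state=42)),
])

# Hyperparameter tuning: try each combination with 3-fold cross-validation.
search = GridSearchCV(
    pipeline,
    param_grid={"forest__n_estimators": [50, 100], "forest__max_depth": [3, None]},
    scoring="neg_mean_absolute_error",
    cv=3,
)
search.fit(X, y)

print(search.best_params_)
print(f"Cross-validated MAE: {-search.best_score_:.0f}")
```

scikit-learn reports MAE as a negative score so that higher is always better, which is why the sign is flipped before printing.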

Step 6: Making Predictions on New Data

Once satisfied, you can use your model to predict prices for new houses.

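# Example input: 1500 sq ft, 3 bedrooms, Location 2
# Use the same column names as the training data so the features line up.
new_house = pd.DataFrame([[1500, 3, 2]], columns=['SquareFootage', 'Bedrooms', 'Location'])
predicted_price = model.predict(new_house)
print(f'Predicted Price: {predicted_price[0]}')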

Conclusion

Congratulations! You just trained your first machine learning model. The journey doesn’t stop here: keep exploring by trying different datasets and models. Machine learning is a powerful skill, and with practice, you can build amazing applications.