AI Foundations: AI vs. ML and Your First scikit-learn Model
To become an AI Engineer, it is essential to understand both the vision of Artificial Intelligence and the practical mechanics of Machine Learning.
What is Artificial Intelligence (AI)?
Artificial Intelligence is the broad field of computer science focused on building systems that appear intelligent. These systems can reason, learn, plan, perceive, and make decisions. AI includes everything from simple rule-based automation to advanced neural networks that generate text, images, and code.
Think of AI as the destination:
Creating machines that can think and act intelligently.
What is Machine Learning (ML)?
Machine Learning is a subset of AI and the engine that powers most modern AI systems. Instead of manually writing rules like:
If the house is big and in a good neighborhood, then the price is high…
we provide the machine with data and allow it to learn the pattern on its own.
At its core, Machine Learning is about discovering relationships between inputs and outputs.
In its simplest form, ML tries to learn this equation:
```text
y = mx + c
```
Where:
x = input (e.g., square footage of a house)
y = output (e.g., house price)
m = learned weight (how strongly x affects y)
c = bias (base value)
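The learning process finds the values of m and c that fit the data best, meaning the line's predictions land as close as possible to the observed outputs (for linear regression, this is usually measured by squared error).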
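To make "finding the best m and c" concrete, here is a minimal sketch of an ordinary least-squares fit in plain Python. The helper name `fit_line` and the sample numbers are made up for illustration; this is one common way to fit a line, not necessarily what any particular library does internally.

```python
def fit_line(xs, ys):
    """Return (m, c) that minimize the sum of squared errors for y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    m = cov_xy / var_x
    # Intercept: the fitted line passes through the point (mean_x, mean_y).
    c = mean_y - m * mean_x
    return m, c


# Hypothetical data for illustration only (not a real dataset).
square_feet = [1000, 1500, 2000, 2500]
prices = [210000, 290000, 410000, 490000]

m, c = fit_line(square_feet, prices)
print(f"m = {m:.2f}, c = {c:.2f}")
```

The data here is deliberately not a perfect line, so the fitted m and c are a compromise across all four points rather than an exact match for any one of them.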
Hands-on: Implementing Linear & Logistic Regression
We will use the scikit-learn library in Python to build two models using simple CSV datasets.
1. Linear Regression (Predicting Continuous Values)
Used when you want to predict a number, such as a house price.
Input Data (real_estate_data.csv):
```text
square_feet,price
1500,300000
2000,400000
2500,500000
3000,600000
3500,700000
```
This dataset represents a simple linear relationship:
As the size of the house increases, the price increases.
The model’s job is to learn the best-fitting line:
```text
price = m * square_feet + c
```
The Code:
```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Load the data
df = pd.read_csv('real_estate_data.csv')
X = df[['square_feet']]  # features: a 2D table with one column
y = df['price']          # target: the value to predict

# Train the model
model = LinearRegression()
model.fit(X, y)

# Make a prediction. Passing a DataFrame with the same column name
# used in training avoids scikit-learn's feature-name warning.
new_house = pd.DataFrame({'square_feet': [2500]})
predicted_price = model.predict(new_house)
print(f"Predicted price for 2500 sqft: ${predicted_price[0]:,.2f}")
```
The Output:
```text
Predicted price for 2500 sqft: $500,000.00
```
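After training, you can read the learned m and c straight off the model. The sketch below rebuilds the same rows as real_estate_data.csv inline (an assumption made so the snippet runs without the file) and uses scikit-learn's `coef_` and `intercept_` attributes.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Same rows as real_estate_data.csv, built inline so no file is needed.
df = pd.DataFrame({
    "square_feet": [1500, 2000, 2500, 3000, 3500],
    "price": [300000, 400000, 500000, 600000, 700000],
})

model = LinearRegression()
model.fit(df[["square_feet"]], df["price"])

m = model.coef_[0]    # learned weight: price change per extra square foot
c = model.intercept_  # learned bias: predicted price at 0 square feet
print(f"m = {m:.2f}, c = {c:.2f}")
```

Because every row in this dataset satisfies price = 200 * square_feet exactly, m should come out very close to 200 and c very close to 0, which is why the 2500 sqft prediction lands on 500,000.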
2. Logistic Regression (Classification)
Used when you want to predict a category, such as whether an email is spam or not.
Input Data (email_data.csv):
```text
word_count,is_spam
10,0
50,1
15,0
100,1
20,0
```
Here:
0 = Not Spam
1 = Spam
The Code:
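```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Load the data
df = pd.read_csv('email_data.csv')
X = df[['word_count']]  # features: a 2D table with one column
y = df['is_spam']       # target: 0 = Not Spam, 1 = Spam

# Train the model
clf = LogisticRegression()
clf.fit(X, y)

# Predict. Passing a DataFrame with the same column name
# used in training avoids scikit-learn's feature-name warning.
new_email = pd.DataFrame({'word_count': [45]})
is_spam = clf.predict(new_email)
print(f"Is it spam? {'Yes' if is_spam[0] == 1 else 'No'}")
```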
The Output:
```text
Is it spam? Yes
```
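The Yes/No answer hides a probability. As a rough sketch, the snippet below rebuilds the same rows as email_data.csv inline (an assumption, so it runs without the file) and calls `predict_proba` to see how confident the classifier is. The three test word counts are made up for illustration, and with only five training rows the numbers should be read as a demonstration, not a real spam filter.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Same rows as email_data.csv, built inline so no file is needed.
df = pd.DataFrame({
    "word_count": [10, 50, 15, 100, 20],
    "is_spam": [0, 1, 0, 1, 0],
})

clf = LogisticRegression()
clf.fit(df[["word_count"]], df["is_spam"])

# Hypothetical new emails to score.
new_emails = pd.DataFrame({"word_count": [5, 45, 200]})

# predict_proba returns one column per class; column 1 is P(is_spam == 1).
probabilities = clf.predict_proba(new_emails)[:, 1]
for word_count, p in zip(new_emails["word_count"], probabilities):
    print(f"{word_count} words -> P(spam) = {p:.2f}")
```

`predict` simply reports the class whose probability is highest, so for two classes it answers "spam" when this probability is above 0.5.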
Why This Matters for AI Engineering
These models are the building blocks of AI. Much of modern AI, including neural networks, rests on the same pattern: input, weight, computation, output. Understanding Linear and Logistic Regression gives you a concrete picture of how machines learn from data.
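The shared pattern can be sketched in a few lines of plain Python. Both models compute the same weighted sum; logistic regression then passes it through a sigmoid to turn it into a probability. The function names and the weights below are hypothetical values chosen for illustration, not learned from data.

```python
import math


def weighted_sum(x, m, c):
    """Input times weight, plus bias: the computation both models share."""
    return m * x + c


def linear_output(x, m, c):
    """Linear regression: the weighted sum is the prediction itself."""
    return weighted_sum(x, m, c)


def logistic_output(x, m, c):
    """Logistic regression: squash the weighted sum into the range (0, 1)."""
    z = weighted_sum(x, m, c)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights, picked by hand for this example.
price = linear_output(2500, m=200.0, c=0.0)
spam_probability = logistic_output(45, m=0.1, c=-3.5)
print(f"price = {price:,.2f}")
print(f"P(spam) = {spam_probability:.2f}")
```

Training is the step that replaces these hand-picked values of m and c with ones fitted to data.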

