make_mtcars_check_data: MtCars Dataset

The make_mtcars_check_data function generates a synthetic dataset modeled on the Motor Trend Car Road Tests (MtCars) data. The original dataset is a standard example for teaching and testing linear regression, typically for predicting fuel efficiency (mpg) from vehicle characteristics.


Overview

This utility provides a reproducible, MtCars-shaped dataset:

  • Structure: 32 observations (cars) with 11 variables.
  • Relationships: Realistic correlations, such as mpg decreasing as hp (horsepower) and wt (weight) increase.
  • Purpose: Suited to testing regression models, feature selection, and data visualization techniques.
  • Reproducibility: Generation is controlled by the seed parameter (default 42), so the same seed yields the same data.
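
The relationship described above (mpg falling as hp and wt rise) can be spot-checked with a correlation coefficient. The following is a minimal sketch, not part of the library: column_correlation is a hypothetical helper, and the small stand-in array holds illustrative values so the snippet runs on its own. With the library installed, pass the data and column_names returned by make_mtcars_check_data instead.

```python
import numpy as np

def column_correlation(data, column_names, a, b):
    """Pearson correlation between two named columns of a 2-D array."""
    i, j = column_names.index(a), column_names.index(b)
    return float(np.corrcoef(data[:, i], data[:, j])[0, 1])

# Stand-in rows (illustrative values, NOT output of make_mtcars_check_data).
demo_cols = ["mpg", "hp", "wt"]
demo = np.array([
    [30.0,  65.0, 1.8],
    [24.0,  95.0, 2.4],
    [21.0, 110.0, 2.8],
    [17.0, 160.0, 3.5],
    [13.0, 230.0, 4.5],
])

r_hp = column_correlation(demo, demo_cols, "mpg", "hp")
r_wt = column_correlation(demo, demo_cols, "mpg", "wt")

# Both values should be negative when mpg decreases as hp and wt increase.
print(f"corr(mpg, hp) = {r_hp:.2f}, corr(mpg, wt) = {r_wt:.2f}")
```

Reproducibility can be checked in the same spirit: assuming the generator is deterministic for a given seed, two calls with the same seed should produce arrays for which np.array_equal returns True.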

Parameters

Parameter   Type   Description                        Default
seed        int    Random seed for reproducibility.   42

Returns

Return         Type            Description
data           numpy.ndarray   Array of shape (32, 11) containing vehicle specifications.
column_names   list[str]       Column names: ['mpg', 'cyl', 'disp', 'hp', 'drat', 'wt', 'qsec', 'vs', 'am', 'gear', 'carb'].
car_names      list[str]       Placeholder car names (e.g., 'Car 1', 'Car 2'), one per row.
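
Because the function returns a plain array plus a separate list of names, columns are easier to address by name than by hard-coded index. The sketch below assumes only the return shapes documented above. select_columns is a hypothetical helper, not a library function, and the zero-filled stand-in array merely mimics the documented shape so the snippet runs without the library.

```python
import numpy as np

def select_columns(data, column_names, wanted):
    """Return the sub-array holding the columns listed in `wanted`, in that order."""
    missing = [name for name in wanted if name not in column_names]
    if missing:
        raise KeyError(f"unknown column(s): {missing}")
    idx = [column_names.index(name) for name in wanted]
    return data[:, idx]

# Stand-in with the documented shape and column order (zeros, not real values).
column_names = ['mpg', 'cyl', 'disp', 'hp', 'drat', 'wt',
                'qsec', 'vs', 'am', 'gear', 'carb']
data = np.zeros((32, 11))
car_names = [f"Car {i + 1}" for i in range(32)]

X = select_columns(data, column_names, ["hp", "wt"])
print(X.shape)  # (32, 2)

# Map each placeholder car name to its row for lookups by name.
row_of = dict(zip(car_names, data))
```

Looking columns up by name keeps downstream code correct even if the column order were to change.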

Columns Description

Column   Description
mpg      Miles per (US) gallon
cyl      Number of cylinders
disp     Displacement (cu. in.)
hp       Gross horsepower
drat     Rear axle ratio
wt       Weight (1000 lbs)
qsec     1/4 mile time (seconds)
vs       Engine (0 = V-shaped, 1 = straight)
am       Transmission (0 = automatic, 1 = manual)
gear     Number of forward gears
carb     Number of carburetors

Example Usage

from machinegnostics.data import make_mtcars_check_data

# Generate MtCars data
data, cols, names = make_mtcars_check_data()

print(f"Shape: {data.shape}")
# Output: Shape: (32, 11)

print(f"Key Features: {cols[3]} (HP), {cols[5]} (Weight)")
# Output: Key Features: hp (HP), wt (Weight)

# Select features and target for predicting MPG (column 0)
X = data[:, [3, 5]]  # hp, wt
y = data[:, 0]       # mpg
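
The example above stops at selecting X and y. One way to continue is an ordinary least-squares fit with NumPy, sketched below. fit_ols and r_squared are hypothetical helpers, not part of machinegnostics, and the stand-in data uses arbitrary coefficients so the snippet runs on its own. With the library installed, pass the X and y built above.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit of y ~ intercept + X @ coefs; returns (intercept, coefs)."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(beta[0]), beta[1:]

def r_squared(X, y, intercept, coefs):
    """Coefficient of determination for the fitted linear model."""
    resid = y - (intercept + X @ coefs)
    return float(1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2))

# Stand-in data (arbitrary coefficients, NOT from make_mtcars_check_data).
rng = np.random.default_rng(0)
hp = rng.uniform(50, 300, size=32)
wt = rng.uniform(1.5, 5.5, size=32)
X_demo = np.column_stack([hp, wt])
y_demo = 37.0 - 0.03 * hp - 3.9 * wt  # exact linear relationship, no noise

intercept, coefs = fit_ols(X_demo, y_demo)
score = r_squared(X_demo, y_demo, intercept, coefs)
print(f"intercept={intercept:.3f}, coefs={coefs}, R^2={score:.3f}")
```

On the generated dataset, negative coefficients for hp and wt would be consistent with the relationships described in the overview.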