Bulldozer Price Regression Analysis

Hello! I'm JayaPrakash, and this is a milestone project on my journey in learning machine learning. It is a regression problem: the goal is to predict the sale price of a bulldozer as accurately as possible. The data comes from Kaggle (https://www.kaggle.com/c/bluebook-for-bulldozers).

1. Initialize

Import the libraries used in this notebook, and define all helper functions up front to keep the notebook organized.

2. Exploratory Data Analysis

Import the data and clean it into a format suitable for fitting the models. Look for ways to reduce the number of features iteratively, so that it stays easy to experiment while tuning the model.
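
The cleaning step could be sketched roughly as below. This is a minimal illustration on a tiny made-up table, not the notebook's actual code: the `preprocess` helper is hypothetical, the column names are assumed to mirror the Kaggle CSV, and the choices of median filling and integer category codes are assumptions.

```python
import pandas as pd

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: turn a raw table into an all-numeric one."""
    df = df.copy()

    # Split the sale date into numeric parts a tree model can use.
    df["saledate"] = pd.to_datetime(df["saledate"])
    df["saleYear"] = df["saledate"].dt.year
    df["saleMonth"] = df["saledate"].dt.month
    df = df.drop(columns="saledate")

    for col in list(df.columns):
        if pd.api.types.is_numeric_dtype(df[col]):
            if df[col].isna().any():
                # Keep a flag for missing values, then fill with the median.
                df[col + "_is_missing"] = df[col].isna()
                df[col] = df[col].fillna(df[col].median())
        else:
            # Encode text columns as integer codes; missing values become 0.
            df[col + "_is_missing"] = df[col].isna()
            df[col] = pd.Categorical(df[col]).codes + 1
    return df

# Made-up sample rows (assumed column names).
sample = pd.DataFrame({
    "saledate": ["2006-11-16", "2004-03-26", "2011-05-19"],
    "YearMade": [1996.0, None, 2001.0],
    "UsageBand": ["Low", None, "High"],
    "SalePrice": [66000, 57000, 10000],
})
clean = preprocess(sample)
print(clean)
```

The `_is_missing` flag columns are also natural candidates to drop later when experimenting with a smaller feature set.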

3. Training and Testing the Models

Fit the model to the training data and obtain a preliminary evaluation of its performance.
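
A preliminary fit and evaluation might look like the sketch below. It uses scikit-learn's `RandomForestRegressor` on synthetic data, since the real dataset is not reproduced here; the features, the target formula, the 400/100 split, and the metrics shown are all assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score

# Synthetic stand-in for the cleaned bulldozer data (assumption).
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(500, 4))
y = 1000 + 300 * X[:, 0] + 50 * X[:, 1] ** 2 + rng.normal(0, 20, size=500)

# Hold out the last 100 rows as a validation set.
X_train, X_valid = X[:400], X[400:]
y_train, y_valid = y[:400], y[400:]

# Fit with mostly default settings to get a baseline.
model = RandomForestRegressor(n_estimators=50, random_state=42)
model.fit(X_train, y_train)

# Score on data the model has not seen.
preds = model.predict(X_valid)
valid_mae = mean_absolute_error(y_valid, preds)
valid_r2 = r2_score(y_valid, preds)
print(f"Validation MAE: {valid_mae:.1f}, R^2: {valid_r2:.3f}")
```

With time-stamped sales data, splitting by date rather than at random is one common way to keep the validation score honest.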

4. Hyperparameter Tuning

Define a grid search, scored on model performance, to determine the best hyperparameters for the model. I was unable to complete this step because my system can't tune the model efficiently. :(
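
For reference, a grid search of this kind could be set up with scikit-learn's `GridSearchCV` roughly as follows. The synthetic data, the parameter values, and the 3-fold cross-validation are assumptions chosen so the sketch runs quickly; they are not the notebook's settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Small synthetic sample with strictly positive targets (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(300, 4))
y = 1000 + 300 * X[:, 0] + 50 * X[:, 1] ** 2

# A deliberately small grid: 2 * 2 * 2 = 8 combinations.
param_grid = {
    "n_estimators": [10, 30],
    "max_depth": [None, 5],
    "min_samples_leaf": [1, 5],
}

search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid,
    cv=3,
    # Negated mean squared log error, so higher is better.
    scoring="neg_mean_squared_log_error",
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best CV score:", search.best_score_)
```

Searching on a subsample of rows with a small grid, as here, is one way to keep tuning tractable on a modest machine, at the cost of a noisier estimate of the best parameters.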

The final result, a Root Mean Squared Log Error (RMSLE) of 0.39 from the Random Forest Regressor, places around 90th on the Kaggle leaderboard. Bronze medal obtained!
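
For readers unfamiliar with the metric, the Root Mean Squared Log Error can be computed as in the sketch below. The `rmsle` function name and the example prices are hypothetical; the formula itself is the standard one, the square root of the mean squared difference between log(1 + prediction) and log(1 + actual).

```python
import math

def rmsle(y_true, y_pred):
    """Root mean squared log error for non-negative values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_log_errors = [
        (math.log1p(pred) - math.log1p(true)) ** 2
        for true, pred in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))

# Made-up prices: a perfect prediction scores 0.
perfect = rmsle([66000, 57000], [66000, 57000])
# Made-up prices with some error.
imperfect = rmsle([66000, 57000, 10000], [60000, 70000, 15000])
print(perfect, imperfect)
```

Because the error is measured on a log scale, it reflects the ratio between predicted and actual price rather than the absolute difference, so misses on cheap and expensive machines are weighted comparably.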