The goal of this project is to predict the price of a diamond from its features. It uses the Diamonds dataset from Kaggle (Playground Series S3E8), which contains 193572 rows and 11 columns: id, carat, cut, color, clarity, depth, table, x, y, z, and price. This is a regression problem, and the target variable is price.
- Python
- Pandas
- Seaborn
- Scikit-learn
- Jupyter Notebook
The dataset
The goal is to predict the price of a given diamond.
There are 10 independent variables (including id):
- id: Unique identifier of each diamond.
- carat: Carat (ct.) is the unit of weight used exclusively for gemstones and diamonds.
- cut: Quality of the diamond's cut.
- color: Color of the diamond.
- clarity: A measure of the purity and rarity of the stone, graded by the visibility of its characteristics under 10-power magnification.
- depth: The height of the diamond (in millimeters), measured from the culet (bottom tip) to the table (flat, top surface).
- table: The facet that can be seen when the stone is viewed face up.
- x: Diamond X dimension.
- y: Diamond Y dimension.
- z: Diamond Z dimension.
Target variable:
- price: Price of the given diamond.
Dataset Source Link : https://www.kaggle.com/competitions/playground-series-s3e8/data?select=train.csv
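As a minimal sketch of how the columns above map onto a regression setup, the snippet below separates the independent variables from the target with pandas. The three rows are made-up placeholder values (not records from the Kaggle file), `split_features_target` is a hypothetical helper that is not part of this repository, and dropping `id` is an assumption made for the example. In practice the frame would come from something like `pd.read_csv("train.csv")`.

```python
import pandas as pd

# Column names as listed above. Dropping `id` is an assumption of this sketch:
# it is treated as a row identifier rather than a predictive feature.
FEATURES = ["carat", "cut", "color", "clarity", "depth", "table", "x", "y", "z"]
TARGET = "price"


def split_features_target(df: pd.DataFrame):
    """Return (X, y): the nine feature columns and the price column."""
    X = df[FEATURES].copy()
    y = df[TARGET].copy()
    return X, y


# Placeholder rows for illustration only (not real records from train.csv).
sample = pd.DataFrame(
    {
        "id": [0, 1, 2],
        "carat": [0.5, 1.0, 1.5],
        "cut": ["Ideal", "Premium", "Good"],
        "color": ["E", "G", "J"],
        "clarity": ["VS1", "SI1", "SI2"],
        "depth": [61.5, 62.0, 63.1],
        "table": [55.0, 58.0, 57.0],
        "x": [5.1, 6.4, 7.3],
        "y": [5.1, 6.4, 7.3],
        "z": [3.1, 4.0, 4.6],
        "price": [1500, 5000, 8000],
    }
)

X, y = split_features_target(sample)
print(X.shape, y.shape)
```

Keeping the feature list in one constant makes it easy to reuse the same column selection at training time and at prediction time.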
Clone this repository to your local machine.
git clone https://github.com/AJAmit17/DiamondPricePrediction.git
Change the directory.
cd DiamondPricePrediction
Install all the dependencies.
pip install -r requirements.txt
Run the Flask application.
python application.py
After the models are trained, the user can enter a diamond's features and the model will predict its price.
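To illustrate the train-then-predict flow described above, here is a minimal sketch using scikit-learn. The choice of `LinearRegression`, the ordinal encoding of `cut`, `color`, and `clarity`, and the category orderings are all assumptions made for this example; the repository's actual models and preprocessing may differ. The training rows are made-up placeholders, not records from the dataset.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder

CATEGORICAL = ["cut", "color", "clarity"]
NUMERIC = ["carat", "depth", "table", "x", "y", "z"]
COLUMNS = CATEGORICAL + NUMERIC

# Assumed worst-to-best orderings for the graded columns.
CUT_ORDER = ["Fair", "Good", "Very Good", "Premium", "Ideal"]
COLOR_ORDER = ["J", "I", "H", "G", "F", "E", "D"]
CLARITY_ORDER = ["I1", "SI2", "SI1", "VS2", "VS1", "VVS2", "VVS1", "IF"]

preprocess = ColumnTransformer(
    transformers=[
        # Map each grade to its rank in the assumed ordering.
        ("cat", OrdinalEncoder(categories=[CUT_ORDER, COLOR_ORDER, CLARITY_ORDER]), CATEGORICAL),
        # Numeric columns are passed through unchanged.
        ("num", "passthrough", NUMERIC),
    ]
)

model = Pipeline([("preprocess", preprocess), ("regressor", LinearRegression())])

# Placeholder training rows for illustration only.
train = pd.DataFrame(
    {
        "carat": [0.3, 0.7, 1.0, 1.5],
        "cut": ["Good", "Very Good", "Premium", "Ideal"],
        "color": ["J", "H", "G", "E"],
        "clarity": ["SI2", "SI1", "VS2", "VS1"],
        "depth": [63.0, 62.4, 61.8, 61.5],
        "table": [58.0, 57.0, 58.0, 56.0],
        "x": [4.3, 5.7, 6.4, 7.3],
        "y": [4.3, 5.7, 6.4, 7.3],
        "z": [2.7, 3.5, 4.0, 4.5],
        "price": [500, 2500, 5000, 12000],
    }
)
model.fit(train[COLUMNS], train["price"])

# Features as a user might enter them in the application's form.
user_input = {
    "carat": 1.0, "cut": "Ideal", "color": "G", "clarity": "VS2",
    "depth": 61.8, "table": 57.0, "x": 6.4, "y": 6.4, "z": 4.0,
}
prediction = model.predict(pd.DataFrame([user_input])[COLUMNS])
predicted_price = float(prediction[0])
print(f"Predicted price: {predicted_price:.2f}")
```

Wrapping the encoder and the regressor in a single `Pipeline` means the same preprocessing is applied to the user's input as to the training data, which avoids mismatches between the two steps.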