Purpose of the work. StatsModels using the housing data csv


Date: 24.01.2023
4-lab MIT


Azizbek, student of group 211
Purpose of the work. Using the StatsModels library, build a linear regression on the given dataset, Housing_data.csv.

  1. Import the required libraries and load the data

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
df = pd.read_csv("housing_data.csv")
df.head()


  1. Remove the outliers from the data and drop the null values

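# Outlier bounds: the 4th and 96th percentiles of size and price
maxSize = df["size"].quantile(0.96)
minSize = df["size"].quantile(0.04)
maxPrice = df["price"].quantile(0.96)
minPrice = df["price"].quantile(0.04)
# Keep only rows strictly inside both ranges (rows with a missing size or price fail the comparisons and are dropped too)
data = df[(df["size"] < maxSize) & (df["size"] > minSize) & (df["price"] < maxPrice) & (df["price"] > minPrice)]
# Drop columns that contain no non-null values at all
data = data.dropna(axis=1, thresh=1)
# Count the remaining null values per column
data.isnull().sum()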
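
The quantile filter above can be tried on made-up numbers to see what it keeps. The following is only a sketch: the `demo` frame, its values, and the helper `trim_quantiles` are hypothetical and are not part of the lab's dataset or code.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for housing_data.csv (values are invented for illustration)
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "size": rng.uniform(50, 250, 200),
    "price": rng.uniform(100_000, 500_000, 200),
})
demo.loc[0, "price"] = np.nan  # one deliberately missing value


def trim_quantiles(frame, columns, lower=0.04, upper=0.96):
    """Keep rows lying strictly between the two quantiles in every listed column."""
    mask = pd.Series(True, index=frame.index)
    for col in columns:
        lo, hi = frame[col].quantile([lower, upper])
        # Comparisons with NaN are False, so rows with missing values drop out here
        mask &= (frame[col] > lo) & (frame[col] < hi)
    return frame[mask]


trimmed = trim_quantiles(demo, ["size", "price"])
n_missing = int(trimmed.isnull().sum().sum())
```

In this sketch `n_missing` is zero without any call to `dropna`, because the row with the missing price never satisfies the comparisons.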



  1. Plot the data

x1 = data['size']
y = data['price']
plt.scatter(x1, y)
plt.xlabel("Size of the house")
plt.ylabel("Price of the house")
plt.show()


  1. Fit the regression and print the required information

x = sm.add_constant(x1)
results = sm.OLS(y,x).fit()
results.summary()




  1. Draw the regression line on the plot

plt.scatter(x1, y)
# Coefficients entered by hand, presumably from the results.summary() table
yhat = 115.5314*x1 + 5.175e+04
fig = plt.plot(x1, yhat, lw=4, c='orange', label='regression line')
plt.xlabel('size', fontsize=20)
plt.ylabel('price', fontsize=20)
plt.legend()
plt.show()



Conclusion


While carrying out this lab, I studied how closely the values in a dataset lie to one another and how they are related. Plots make it possible to draw general conclusions from the data. A regression line gives the conclusion we need: here it shows the relationship between price and size.