From the URL https://github.com/ageron/handson-ml2 I am fetching the data for *Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition* from GitHub.
I copied the book's code from section 2.3.2 (downloading the data) into Jupyter, exactly as printed, and got:
FileNotFoundError: [Errno 2] No such file or directory: 'datasets\housing\housing.csv'
I have checked the code and found no mistakes, and the automatic download itself succeeds.
The second function needs to call the first one, so that the data is downloaded and unpacked before it is read and processed. Change the code as follows:
import os
import tarfile
from six.moves import urllib

DOWNLOAD_ROOT = "https://raw.githubusercontent.com/ageron/handson-ml2/master/"
HOUSING_PATH = os.path.join("datasets", "housing")
HOUSING_URL = DOWNLOAD_ROOT + "datasets/housing/housing.tgz"

def fetch_housing_data(housing_url=HOUSING_URL, housing_path=HOUSING_PATH):
    # Create the target directory, download housing.tgz, and unpack it
    if not os.path.isdir(housing_path):
        os.makedirs(housing_path)
    tgz_path = os.path.join(housing_path, "housing.tgz")
    urllib.request.urlretrieve(housing_url, tgz_path)
    housing_tgz = tarfile.open(tgz_path)
    housing_tgz.extractall(path=housing_path)
    housing_tgz.close()

import pandas as pd

def load_housing_data(housing_path=HOUSING_PATH):
    fetch_housing_data()  # download and unpack first, then read the CSV
    csv_path = os.path.join(housing_path, "housing.csv")
    return pd.read_csv(csv_path)

housing = load_housing_data()
print(housing.head())
longitude latitude housing_median_age total_rooms total_bedrooms population households median_income median_house_value ocean_proximity
0 -122.23 37.88 41.0 880.0 129.0 322.0 126.0 8.3252 452600.0 NEAR BAY
1 -122.22 37.86 21.0 7099.0 1106.0 2401.0 1138.0 8.3014 358500.0 NEAR BAY
...
Hope this helps; please mark it as accepted if it does.
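One refinement to the fix above, since calling fetch_housing_data() inside load_housing_data() re-downloads the archive on every load: skip the fetch when housing.csv is already on disk. A minimal sketch (fetch_if_missing is a hypothetical helper, not from the book; the fetch callable is injected so the example runs without network access):

```python
import os
import tempfile

def fetch_if_missing(housing_path, fetch):
    """Run fetch() only when housing.csv is not already present.

    `fetch` stands in for fetch_housing_data above; passing it in
    lets this sketch run offline.
    """
    csv_path = os.path.join(housing_path, "housing.csv")
    if os.path.isfile(csv_path):
        return False  # cached copy reused, nothing downloaded
    fetch()
    return True       # a download was triggered

# Demo with a throwaway directory instead of a real download.
path = tempfile.mkdtemp()
downloads = []
print(fetch_if_missing(path, fetch=lambda: downloads.append("hit")))  # True

open(os.path.join(path, "housing.csv"), "w").close()
print(fetch_if_missing(path, fetch=lambda: downloads.append("hit")))  # False
```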
No such file or directory: 'datasets\housing\housing.csv'
The file does not exist, or the path is wrong. housing.csv needs to sit in the datasets\housing directory under the current project.
In other words: housing.csv is missing or its path is wrong; put housing.csv in the housing directory under datasets in the current working directory.
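The answers above all come down to the same point: pd.read_csv resolves 'datasets\housing\housing.csv' against the notebook's current working directory. A quick way to print the exact absolute path being looked up (os.path.join handles the \ versus / difference between Windows and other platforms):

```python
import os

HOUSING_PATH = os.path.join("datasets", "housing")
csv_path = os.path.join(HOUSING_PATH, "housing.csv")

# This is the exact file the FileNotFoundError traceback refers to,
# resolved against the current working directory.
print(os.path.abspath(csv_path))
print("exists:", os.path.isfile(csv_path))
```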
I originally tried to paste all my code into the question, but it would not post.
Actually the datasets folder lives on GitHub; this code is meant to establish a link that automatically fetches the continuously updated files, not just do a one-off download.
https://github.com/ageron/handson-ml2
import os
import tarfile
from six.moves import urllib

DOWNLOAD_ROOT = "https://raw.githubusercontent.com/ageron/handson-ml2/master/"
HOUSING_PATH = os.path.join("datasets", "housing")
HOUSING_URL = DOWNLOAD_ROOT + "datasets/housing/housing.tgz"

def fetch_housing_data(housing_url=HOUSING_URL, housing_path=HOUSING_PATH):
    if not os.path.isdir(housing_path):
        os.makedirs(housing_path)
    tgz_path = os.path.join(housing_path, "housing.tgz")
    urllib.request.urlretrieve(housing_url, tgz_path)
    housing_tgz = tarfile.open(tgz_path)
    housing_tgz.extractall(path=housing_path)
    housing_tgz.close()

import pandas as pd

def load_housing_data(housing_path=HOUSING_PATH):
    # Note: fetch_housing_data() is never called here, so housing.csv
    # does not exist yet when pd.read_csv runs -- hence the error.
    csv_path = os.path.join(housing_path, "housing.csv")
    return pd.read_csv(csv_path)

housing = load_housing_data()
housing.head()
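On the comment above about the code "establishing a link": there is no persistent link. Each call to urllib.request.urlretrieve simply downloads whatever the URL serves at that moment, so re-running fetch_housing_data() picks up the latest file on GitHub. A small offline sketch of that behaviour, using a local file:// URL to stand in for HOUSING_URL:

```python
import os
import tempfile
import urllib.request
from pathlib import Path

# A local file plays the role of the remote housing archive on GitHub.
remote = os.path.join(tempfile.mkdtemp(), "housing.csv")
local = os.path.join(tempfile.mkdtemp(), "housing.csv")
url = Path(remote).as_uri()  # file:// URL standing in for HOUSING_URL

with open(remote, "w") as f:
    f.write("version 1")
urllib.request.urlretrieve(url, local)
print(open(local).read())    # version 1

# The "remote" file changes; fetching again picks up the new contents.
with open(remote, "w") as f:
    f.write("version 2")
urllib.request.urlretrieve(url, local)
print(open(local).read())    # version 2
```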