In the 02-dataframe.ipynb notebook, you have the following:
df = dd.read_csv("data/yellow_tripdata_2019-.csv")
df
and
df = dd.read_csv("data/yellow_tripdata_2019-.csv",
                 dtype={'RatecodeID': 'float64',
                        'VendorID': 'float64',
                        'passenger_count': 'float64',
                        'payment_type': 'float64'})
However, both checkpoint solutions use df_dask, which is never defined in the notebook:
Solution 1
std_tip = df_dask.groupby("passenger_count").tip_amount.std().compute()
Solution 2
mean_total = df_dask.total_amount.mean()
std_total = df_dask.total_amount.mean()
dask.compute(mean_total, std_total)
I recommend naming the dataframe df_dask throughout the Dask dataframe section and updating the existing code to match.
A few places in the Dask dataframe walkthrough still use df.xxx, e.g.:
%%time
mean_tip_amount = df.groupby("passenger_count").tip_amount.mean()
mean_tip_amount
Also, if you switch to df_dask, I would change Solution 1 to the following so that it displays the output (second line added):
std_tip = df_dask.groupby("passenger_count").tip_amount.std().compute()
std_tip
Thanks for putting this course together and the notebooks! Very much enjoyed it!