The following project shows and compares machine learning between Pandas DataFrames and Dask Dataframes.
Dask is a flexible library for parallel computing in Python. A Dask DataFrame is a large parallel DataFrame composed of many smaller Pandas DataFrames, split along the index. These Pandas DataFrames may live on disk for larger-than-memory computing on a single machine, or on many different machines in a cluster. One Dask DataFrame operation triggers many operations on the constituent Pandas DataFrames.
Click here for code.