silly-jewelry-58239
07/19/2022, 3:56 PM
straight-shampoo-11124
07/19/2022, 6:12 PM
straight-shampoo-11124
07/19/2022, 6:12 PM
straight-shampoo-11124
07/19/2022, 6:14 PM
download() function doesn't actually store anything on disk; it moves data directly from S3 to memory, which is very fast, especially on a large cloud instance
straight-shampoo-11124
07/19/2022, 6:15 PM
pyarrow to load data, which is very efficient and which Pandas also uses behind the scenes in many cases
straight-shampoo-11124
07/19/2022, 6:15 PM
metaflow.S3 can be faster than the native S3 loading in Pandas, since it is more heavily parallelized
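
To illustrate the parallelization point, here is a rough sketch of fetching several objects concurrently into memory buffers. It is not metaflow.S3's actual implementation: FAKE_BUCKET, fetch_object and fetch_all are hypothetical names, and a plain dict stands in for S3 so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor
from io import BytesIO

# Hypothetical in-memory "bucket" standing in for S3 (assumption, for illustration only).
FAKE_BUCKET = {
    "data/part-0.csv": b"id,value\n1,10\n2,20\n",
    "data/part-1.csv": b"id,value\n3,30\n",
}


def fetch_object(key: str) -> BytesIO:
    """Stand-in for a single S3 GET: return the object's bytes in a memory buffer."""
    return BytesIO(FAKE_BUCKET[key])


def fetch_all(keys, max_workers=8):
    """Fetch many objects concurrently; results come back in the order of `keys`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_object, keys))


# Nothing is written to disk: every object ends up in a BytesIO buffer.
buffers = fetch_all(sorted(FAKE_BUCKET))
total_bytes = sum(len(buf.getvalue()) for buf in buffers)
```

With a real object store, each fetch_object call would be a network request, so running many of them at once is where the speedup over fetching one object at a time would come from. The resulting buffers can then be handed to whichever in-memory reader you use.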