In the previous article, we talked about how FireDucks' lazy execution can cache intermediate results to avoid recomputing expensive operations. In today's article, we will focus on the efficient data-flow optimization performed by its JIT compiler. We will first look at some best practices for large-scale data analysis in pandas, and then discuss how those can be taken care of automatically by FireDucks' lazy execution model.
We will explore the pitfalls of using the `%%time` magic command in Jupyter and other IPython Notebooks to measure the execution time of FireDucks processes.
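The core of the pitfall can be sketched without a notebook at all. Below is a minimal toy (not FireDucks' actual API): building an expression is cheap, and the real work happens only at materialization, so timing only the build step, as `%%time` on a single cell would, measures almost nothing:

```python
import time

class LazyFrame:
    """Toy stand-in for a lazily evaluated frame: building the expression
    is cheap; the computation happens only when the result is materialized."""
    def __init__(self, op=None):
        self._op = op

    def heavy_op(self):
        # Only records what to do; does not compute anything yet.
        return LazyFrame(op=lambda: sum(i * i for i in range(2_000_000)))

    def materialize(self):
        return self._op()

df = LazyFrame()

t0 = time.perf_counter()
res = df.heavy_op()          # returns immediately: nothing computed yet
build_time = time.perf_counter() - t0

t1 = time.perf_counter()
value = res.materialize()    # the actual computation happens here
eval_time = time.perf_counter() - t1

print(build_time < eval_time)  # the %%time pitfall in miniature
```

Timing only the cell that builds `res` would report a misleadingly small number; the expensive step is the materialization.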
Research suggests that data scientists spend about 45% of their time on data preparation tasks, including loading (19%) and cleaning (26%) the data. Pandas is one of the most popular Python libraries for tabular data processing because of its diverse utilities and large community support. However, due to its performance issues with large-scale data processing, there is a strong need in the community for high-performance dataframe libraries. Although there are many alternatives available at t...
FireDucks has a trace function that records how long each operation, such as read_csv, groupby, or sort, takes. This article introduces how to use the trace function.

How to output and display trace files

To use the trace function, you do not need to modify your program. Simply set the environment variable as shown below and run the program:

$ FIREDUCKS_FLAGS="--trace=3" python -mfireducks.pandas your_program.py

After setting the environment variables and executin...
We are currently developing a GPU version of FireDucks. FireDucks is built on an architecture that translates programs into an intermediate representation (IR) at runtime, optimizes the IR, and then compiles and executes it on a backend. The currently released version of FireDucks has a CPU backend; for the GPU version, the backend is replaced with a GPU one. This allows us to use the translation to and optimiz...
As described here, FireDucks uses a lazy execution model with define-by-run IR generation. Since FireDucks uses the MLIR compiler framework to optimize and execute the IR, the first step of execution is creating an MLIR function that holds the operations to be evaluated. This article describes how important this function-creation step is for optimization, and thus for performance. In the simple example below, execution of the IR is kicked off by the print statement, which calls df2.__repr__(). df0 = pd.
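The role of `__repr__` as the execution trigger can be illustrated with a small define-by-run toy (a sketch of the idea, not FireDucks internals): each method only records an operation in an IR list, and the recorded program runs only when the object is printed:

```python
class DeferredFrame:
    """Toy define-by-run sketch (not FireDucks internals): each method
    appends an operation to an IR list; nothing runs until __repr__ is
    called, e.g. by print()."""
    def __init__(self, data, ops=()):
        self._data = list(data)
        self._ops = list(ops)

    def add(self, n):
        return DeferredFrame(self._data, self._ops + [("add", n)])

    def mul(self, n):
        return DeferredFrame(self._data, self._ops + [("mul", n)])

    def _evaluate(self):
        # "Compile and run" the recorded IR over the data.
        out = self._data
        for op, n in self._ops:
            out = [x + n if op == "add" else x * n for x in out]
        return out

    def __repr__(self):
        return repr(self._evaluate())

df0 = DeferredFrame([1, 2, 3])
df2 = df0.add(1).mul(10)  # only IR is recorded; nothing is computed yet
print(df2)                # printing calls __repr__, which triggers execution
```

Here the print statement produces `[20, 30, 40]`; until that point, `df2` is just a recorded program, which is exactly what gives the compiler room to optimize the whole chain at once.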
In the previous article, we talked about how FireDucks takes care of projection-pushdown optimization for read_parquet(), read_csv(), etc. In today's article, we will focus on the efficient caching mechanism provided by its JIT compiler. Let's consider the sample query below on the same data used in the previous article:

import pandas as pd
df = pd.read_parquet("sample_data.parquet")
f_df = df.loc[df["a"] > 3, ["x", "y", "z"]]
r1 = f_df.groupby("x")["z"].sum()
print(r1)

When executing th...
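To see where cached intermediates matter, here is a hypothetical extension of the query above, with a small synthetic frame standing in for sample_data.parquet (column names assumed from the query) and plain pandas used so the sketch is runnable as-is. Both aggregations share f_df, so a lazy engine that caches the filtered intermediate needs to compute the filter only once:

```python
import pandas as pd

# Synthetic stand-in for sample_data.parquet (columns assumed from the query).
df = pd.DataFrame({
    "a": [1, 4, 5, 2, 6],
    "x": ["p", "q", "p", "q", "p"],
    "y": [10, 20, 30, 40, 50],
    "z": [1.0, 2.0, 3.0, 4.0, 5.0],
})

f_df = df.loc[df["a"] > 3, ["x", "y", "z"]]  # shared intermediate
r1 = f_df.groupby("x")["z"].sum()            # first use of f_df
r2 = f_df.groupby("x")["y"].mean()           # second use: a caching JIT can
                                             # reuse f_df instead of re-filtering
print(r1)
print(r2)
```

With eager pandas, f_df is computed once by construction; the interesting case is a lazy engine, where without caching both r1 and r2 would each re-run the filter.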
The availability of runtime memory is often a challenge when processing larger-than-memory datasets with pandas. To solve the problem, one can either move to a system with larger memory capacity or switch to alternative libraries that support distributed data processing, such as Dask or PySpark. But did you know that when working with data stored in columnar formats like CSV or Parquet, and only part of the data is to be processed, manual optimization is possible e...
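A minimal sketch of that manual optimization with plain pandas (data inlined for illustration): ask the reader for only the columns you need, so unused ones are never loaded into memory:

```python
import io
import pandas as pd

# An illustrative 3-column CSV; suppose only "x" and "z" are needed.
csv_data = "x,y,z\n1,10,100\n2,20,200\n"

# Manual projection pushdown for CSV: usecols skips column "y" at read time.
df = pd.read_csv(io.StringIO(csv_data), usecols=["x", "z"])
print(list(df.columns))  # → ['x', 'z']

# The Parquet equivalent (path illustrative) would be:
# df = pd.read_parquet("sample_data.parquet", columns=["x", "z"])
```

For Parquet this is especially effective, since the columnar layout means skipped columns are never even read from disk.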
Recently we updated the results of the polars-tpch benchmark on a 4th-generation Xeon processor. The latest results can be found here and are also shown below in this article, along with how to reproduce them. For reproducibility, we used AWS EC2 for this evaluation: an m7i.8xlarge instance with an Ubuntu 24.04 image and a 128 GB EBS SSD. This instance includes:

- 4th-generation Xeon processor: Intel(R) Xeon(R) Platinum 8488C (32 cores)
- 128 GB memory

Benchmark Result

The graph shown bel...
Thank you for your interest in FireDucks. This article describes possible causes of, and remedies for, slow programs using FireDucks. When a pandas program run with FireDucks is slow, the reason may be one of the following:

1. Using 'apply' or loops.
2. Using a pandas API not implemented in FireDucks.

In case 1, rewriting the pandas program may make it faster. For example:

sum_val = 0
for i in range(len(df)):
    if df["A"][i] > 2:
        sum_val += df["B"][i]

A program using 'loop'...
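For reference, a row-by-row loop like the one above can usually be replaced by a vectorized filter-and-sum, which both pandas and FireDucks execute far more efficiently (sample data assumed for illustration):

```python
import pandas as pd

# Illustrative data; in practice df comes from your own program.
df = pd.DataFrame({"A": [1, 3, 2, 5], "B": [10, 20, 30, 40]})

# Vectorized equivalent of the row-by-row loop: select rows where A > 2,
# then sum column B in a single operation.
sum_val = df.loc[df["A"] > 2, "B"].sum()
print(sum_val)  # → 60
```

Besides being faster in plain pandas, the vectorized form is expressed entirely in dataframe operations, which is what FireDucks' JIT compiler can trace and optimize.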