The rest of the queries? #39

Open
marsupialtail opened this issue Mar 8, 2023 · 5 comments

Comments

@marsupialtail

Polars can run them for sure. Do you want a contribution?

@ritchie46
Member

That would be great!

@marsupialtail
Author

@ritchie46 I have started but ran into a problem. Here is how I wrote query 13:

import polars

ref_customer = polars.read_csv("/home/ziheng/tpc-h/customer.tbl", sep="|")
ref_orders = polars.read_csv("/home/ziheng/tpc-h/orders.tbl", sep="|").filter(
    ~(polars.col("o_comment").str.contains("special") & polars.col("o_comment").str.contains("requests"))
)
ref = (
    ref_customer.join(ref_orders, left_on="c_custkey", right_on="o_custkey", how="left")
    .with_column(polars.col("o_orderkey").is_not_null().alias("o_orderkey_1"))
    .groupby("c_custkey").agg([polars.col("o_orderkey_1").sum()])
    .groupby("o_orderkey_1").count()
    .sort("count", reverse=True)
    # .sort("o_orderkey_1", reverse=True)
)

However, this gives wrong results. Any suggestions?

@marsupialtail
Author

NVM, I know what the problem is. I need to make sure "special" comes before "requests", so I have to use a regex.
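
For anyone hitting the same issue: the TPC-H query 13 predicate is `o_comment not like '%special%requests%'`, which only matches comments where "special" appears before "requests". Checking for the two words independently also drops comments that contain them in the opposite order. Below is a minimal plain-Python sketch of the difference, using the standard `re` module rather than Polars; the sample comments are made up for illustration and are not TPC-H data. In Polars the corresponding filter would presumably be `~polars.col("o_comment").str.contains("special.*requests")`, since `str.contains` treats its pattern as a regex by default.

```python
import re

# Hypothetical sample comments, invented for illustration (not TPC-H data).
comments = [
    "special packages wake; requests haggle",     # "special" before "requests"
    "requests sleep about the special deposits",  # both words, opposite order
    "furiously special accounts",                 # only "special"
]

# Ordered pattern corresponding to SQL LIKE '%special%requests%'.
ORDERED = re.compile(r"special.*requests")


def both_words(text: str) -> bool:
    """Predicate from the first attempt: both substrings present, in any order."""
    return "special" in text and "requests" in text


def ordered_match(text: str) -> bool:
    """Regex predicate: "special" followed somewhere later by "requests"."""
    return ORDERED.search(text) is not None


# Q13 keeps the orders whose comment does NOT match the predicate.
kept_by_both_words = [c for c in comments if not both_words(c)]
kept_by_regex = [c for c in comments if not ordered_match(c)]

print(kept_by_both_words)
print(kept_by_regex)
```

The second sample comment is the one that differs: the two-substring check filters it out, while the ordered regex keeps it, which changes the per-customer order counts downstream.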

@ghuls

ghuls commented Mar 23, 2023

A pandas implementation of the 22 queries:
https://gist.github.com/UranusSeven/55817bf0f304cc24f5eb63b2f1c3e2cd

@stinodego
Member

Polars / pyspark / DuckDB have full query coverage. We should still include the pandas queries. Perhaps the link above could help.


4 participants