Commit

chore: update docs
Dat Nguyen committed Jan 31, 2024
1 parent 5747778 commit 651184a
Showing 4 changed files with 24 additions and 15 deletions.
11 changes: 9 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
@@ -4,14 +4,17 @@
 <img align="right" width="150" height="150" src="./docs/assets/img/il-logo.png">

 [![dbt-hub](https://img.shields.io/badge/Visit-dbt--hub%20↗️-FF694B?logo=dbt&logoColor=FF694B)](https://hub.getdbt.com/infinitelambda/dbt-data-diff)
-[![support-snowflake](https://img.shields.io/badge/support-Snowflake-7faecd?logo=snowflake&logoColor=7faecd)](https://docs.snowflake.com/)
-[![support-dbt](https://img.shields.io/badge/support-dbt%20v1.6+-FF694B?logo=dbt&logoColor=FF694B)](https://docs.getdbt.com/)
+[![support-snowflake](https://img.shields.io/badge/support-Snowflake-7faecd?logo=snowflake&logoColor=7faecd)](https://docs.snowflake.com?ref=infinitelambda)
+[![support-dbt](https://img.shields.io/badge/support-dbt%20v1.6+-FF694B?logo=dbt&logoColor=FF694B)](https://docs.getdbt.com?ref=infinitelambda)
+[![built-in-sis](https://img.shields.io/badge/built--in-SiS-BD4042?logo=streamlit&logoColor=FF694B)](https://www.snowflake.com/en/data-cloud/overview/streamlit-in-snowflake?ref=infinitelambda)

 Data-diff solution for dbt-ers with Snowflake ❄️ 🚀

 > [!TIP]
 > 📖 For more details, please help to visit [the documentation site](https://data-diff.iflambda.com/latest/) (or go to the [docs/index.md](./docs/index.md)) for more details
+<img src="./docs/assets/img/data-diff.jpeg" alt="Sample diffing">

 ## Installation

 - Add to `packages.yml` file:
@@ -51,6 +54,10 @@ dbt run -s data_diff --vars '{data_diff__on_migration: true}'

 Let's jump to the [Quick Start](https://data-diff.iflambda.com/latest/#quick-start) section and the next [demo one](https://data-diff.iflambda.com/latest/#demo) 🏃

+📊 Here is the sample Streamlit in Snowflake application based on the result produced by the package:
+
+<img src="./docs/assets/img/sis_ui.png" alt="Sample SiS">
+
 ## How to Contribute

 `dbt-data-diff` is an open-source dbt package. Whether you are a seasoned open-source contributor or a first-time committer, we welcome and encourage you to contribute code, documentation, ideas, or problem statements to this project.
Binary file modified docs/assets/img/sis_ui.png
6 changes: 4 additions & 2 deletions docs/index.md
@@ -3,8 +3,10 @@

 <img align="right" width="150" height="150" src="./assets/img/il-logo.png">

-[![dbt-hub](https://img.shields.io/badge/Visit-dbt--hub%20↗️-FF694B?logo=dbt&logoColor=FF694B)](https://hub.getdbt.com/infinitelambda/dbt-data-diff)[![support-snowflake](https://img.shields.io/badge/support-Snowflake-7faecd?logo=snowflake&logoColor=7faecd)](https://docs.snowflake.com/)
-[![support-dbt](https://img.shields.io/badge/support-dbt%20v1.6+-FF694B?logo=dbt&logoColor=FF694B)](https://docs.getdbt.com/)
+[![dbt-hub](https://img.shields.io/badge/Visit-dbt--hub%20↗️-FF694B?logo=dbt&logoColor=FF694B)](https://hub.getdbt.com/infinitelambda/dbt-data-diff)
+[![support-snowflake](https://img.shields.io/badge/support-Snowflake-7faecd?logo=snowflake&logoColor=7faecd)](https://docs.snowflake.com?ref=infinitelambda)
+[![support-dbt](https://img.shields.io/badge/support-dbt%20v1.6+-FF694B?logo=dbt&logoColor=FF694B)](https://docs.getdbt.com?ref=infinitelambda)
+[![built-in-sis](https://img.shields.io/badge/built--in-SiS-BD4042?logo=streamlit&logoColor=FF694B)](https://www.snowflake.com/en/data-cloud/overview/streamlit-in-snowflake?ref=infinitelambda)

 Data-diff solution for dbt-ers with Snowflake ❄️ 🌟

22 changes: 11 additions & 11 deletions macros/resources/stored-procedures/create__check_data_diff.sql
@@ -77,32 +77,32 @@
 qualify row_number() over(
 partition by src_db, src_schema, src_table, trg_db, trg_schema, trg_table, column_name, pipe_name
 order by last_data_diff_timestamp desc
-) = 1
+) = 1 --get last schema diff result

 ),

 base as (

 select t.*
-,listagg(column_name, ',') as col_list
+,listagg(v.column_name, ',') as col_list
 ,'cast(md5_binary(concat_ws(''||'','
-|| listagg('ifnull(nullif(upper(trim(cast(' || column_name || ' as varchar))), ''''), ''^^'')', ',' )
+|| listagg('ifnull(nullif(upper(trim(cast(' || v.column_name || ' as varchar))), ''''), ''^^'')', ',' )
 || ' )) as binary(16)) as hashdiff' as hash_calc
-,listagg('ifnull(nullif(upper(trim(cast(src.'|| column_name ||' as varchar))),''''),''^^'')= ifnull(nullif(upper(trim(cast(trg.'|| column_name ||' as varchar))),''''),''^^'') as '|| column_name || '_is_equal', ',' ) as is_equal
-,listagg('sum(case when '|| column_name ||'_is_equal then 1 else 0 end) as '|| column_name || '_diff', ',' ) as diff_calc
-,listagg(column_name ||'_diff / cnt as '|| column_name, ',') as result_calc
+,listagg('ifnull(nullif(upper(trim(cast(src.'|| v.column_name ||' as varchar))),''''),''^^'')= ifnull(nullif(upper(trim(cast(trg.'|| v.column_name ||' as varchar))),''''),''^^'') as '|| v.column_name || '_is_equal', ',' ) as is_equal
+,listagg('sum(case when '|| v.column_name ||'_is_equal then 1 else 0 end) as '|| v.column_name || '_diff', ',' ) as diff_calc
+,listagg(v.column_name ||'_diff / cnt as '|| v.column_name, ',') as result_calc

-from {{ configured_table_model }}_final t
-join schema_validation v
+from {{ configured_table_model }}_final as t
+join schema_validation as v
 on t.src_schema = v.src_schema
 and t.src_table = v.src_table
 where true
 --excluded columns i.e always changing column, added or removed column
-and (not array_contains(upper(column_name)::variant, t.exclude_columns))
+and (not array_contains(upper(v.column_name)::variant, t.exclude_columns))
 and (
 case
-when array_size(include_columns) > 0
-then array_contains(column_name::variant, t.include_columns)
+when array_size(t.include_columns) > 0
+then array_contains(v.column_name::variant, t.include_columns)
 else true
 end
 )
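To make the `base` CTE easier to follow, here is a minimal Python sketch of the string it appears to assemble for `hash_calc`, together with the include/exclude column filter from the `where` clause. The function names (`select_columns`, `null_safe`, `hash_calc`) and the sample column names are hypothetical and are not part of the package; the sketch only mirrors the string concatenation visible in the hunk and does not run anything against Snowflake.

```python
def select_columns(columns, exclude_columns, include_columns):
    """Hypothetical helper mirroring the where-clause column filter."""
    kept = []
    for col in columns:
        # not array_contains(upper(v.column_name)::variant, t.exclude_columns)
        if col.upper() in exclude_columns:
            continue
        # when array_size(t.include_columns) > 0
        #   then array_contains(v.column_name::variant, t.include_columns)
        if include_columns and col not in include_columns:
            continue
        kept.append(col)
    return kept


def null_safe(column):
    """Per-column fragment: cast to varchar, trim, upper-case, map ''/NULL to '^^'."""
    return f"ifnull(nullif(upper(trim(cast({column} as varchar))), ''), '^^')"


def hash_calc(columns):
    """Join the per-column fragments the way listagg(..., ',') does in the CTE."""
    parts = ",".join(null_safe(c) for c in columns)
    return f"cast(md5_binary(concat_ws('||',{parts} )) as binary(16)) as hashdiff"


if __name__ == "__main__":
    # Assumed sample input: UPDATED_AT is treated as an always-changing column.
    cols = select_columns(
        ["ID", "NAME", "UPDATED_AT"],
        exclude_columns=["UPDATED_AT"],
        include_columns=[],
    )
    print(hash_calc(cols))
```

Note that, as in the SQL, the exclude check compares the upper-cased column name while the include check compares the name as stored, so the two lists are not case-handled identically.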