check_that throws cryptic errors when run after a pipeline step fails #153

Open
mrkaye97 opened this issue Aug 13, 2021 · 5 comments

@mrkaye97

Sorry in advance if this has already been asked -- I haven't seen anything about it. Pasting a reprex that does a better job explaining what's going on than I can:

suppressPackageStartupMessages({
  library(dplyr)
  library(validate)
})

iris %>%
  filter(
    foobar > 2
  )
#> Error: Problem with `filter()` input `..1`.
#> ℹ Input `..1` is `foobar > 2`.
#> x object 'foobar' not found

iris %>%
  filter(
    foo > 3
  ) %>%
  mutate(
    bar = Sepal.Length + 1
  )
#> Error: Problem with `filter()` input `..1`.
#> ℹ Input `..1` is `foo > 3`.
#> x object 'foo' not found

## problem:
iris %>%
  filter(
    foo > 3
  ) %>%
  check_that(
    Sepal.Length < 1000
  )
#> Error in (function (cond) : error in evaluating the argument 'dat' in selecting a method for function 'confront': Problem with `filter()` input `..1`.
#> ℹ Input `..1` is `foo > 3`.
#> x object 'foo' not found

## big problem:
iris %>%
  filter(
    foo > 3
  ) %>%
  check_that(
    Sepal.Length < 1000
  ) %>%
  filter(
    foo > 3
  ) %>%
  check_that(
    Sepal.Length < 1000
  ) %>%
  filter(
    foo > 3
  ) %>%
  check_that(
    Sepal.Length < 1000
  ) %>%
  filter(
    foo > 3
  ) %>%
  check_that(
    Sepal.Length < 1000
  ) %>%
  filter(
    foo > 3
  ) %>%
  check_that(
    Sepal.Length < 1000
  )
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'dat' in selecting a method for function 'confront': error in evaluating the argument 'dat' in selecting a method for function 'confront': error in evaluating the argument 'dat' in selecting a method for function 'confront': error in evaluating the argument 'dat' in selecting a method for function 'confront': error in evaluating the argument 'dat' in selecting a method for function 'confront': Problem with `filter()` input `..1`.
#> ℹ Input `..1` is `foo > 3`.
#> x object 'foo' not found

Created on 2021-08-13 by the reprex package (v2.0.0)

Basically, filter fails, and check_that then prepends its own "error in evaluating the argument 'dat'" text, once per check_that in the chain, before the actual error is printed. This gets worse when you chain 100 steps together, because the repeated prefix becomes too long to print or parse, so it's hard to figure out what actually went wrong.

Is this a known issue / conscious choice? And if it is, what's the best way to handle this behavior?

Thanks!
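One possible stopgap, sketched under assumptions rather than taken from validate: catch the error and strip the repeated prefix before re-raising it. The helper names `strip_dispatch_prefix` and `with_clean_errors` are hypothetical, and the pattern only assumes the prefix wording shown in the reprex above.

```r
# Hypothetical helper (not part of validate): drop every repeated
# "error in evaluating the argument ..." prefix from an error message.
strip_dispatch_prefix <- function(msg) {
  prefix <- "error in evaluating the argument '[^']+' in selecting a method for function '[^']+': "
  gsub(prefix, "", msg)
}

# Hypothetical wrapper: run an expression and re-raise any error with the
# cleaned message, so only the original upstream error text remains.
with_clean_errors <- function(expr) {
  tryCatch(
    expr,
    error = function(e) stop(strip_dispatch_prefix(conditionMessage(e)), call. = FALSE)
  )
}

# A message shaped like the one in the reprex, with the prefix repeated five times.
wrapped <- paste0(
  strrep("error in evaluating the argument 'dat' in selecting a method for function 'confront': ", 5),
  "object 'foo' not found"
)
cleaned <- strip_dispatch_prefix(wrapped)
print(cleaned)
```

In a pipeline this would be used as `with_clean_errors(iris %>% filter(foo > 3) %>% check_that(Sepal.Length < 1000))`. Assigning the filtered data to a variable before calling check_that avoids the prefix altogether, because the error is then raised before check_that is ever called.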

@markvanderloo
Member

The first error message I see comes from filter. This is not a validate function. I'd go after that first. Maybe use dplyr::filter, and similarly for the other functions? (I'm not near a computer now so I can't test.)

@mrkaye97
Author

@markvanderloo Oh yeah, I know the error is coming from dplyr::filter(). The point here was that the checks run fine when filter() works (and it does, I just told it to filter by foo which doesn't exist), but when filter() bombs, it seems like I just get this long, unhelpful error message. Here's a reprex to show filter working:

suppressPackageStartupMessages({
  library(dplyr)
  library(validate)
})

iris <- tibble::as_tibble(iris)

## Filtering works fine
iris %>%
  filter(
    Sepal.Length > 5
  )
#> # A tibble: 118 x 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>  
#>  1          5.1         3.5          1.4         0.2 setosa 
#>  2          5.4         3.9          1.7         0.4 setosa 
#>  3          5.4         3.7          1.5         0.2 setosa 
#>  4          5.8         4            1.2         0.2 setosa 
#>  5          5.7         4.4          1.5         0.4 setosa 
#>  6          5.4         3.9          1.3         0.4 setosa 
#>  7          5.1         3.5          1.4         0.3 setosa 
#>  8          5.7         3.8          1.7         0.3 setosa 
#>  9          5.1         3.8          1.5         0.3 setosa 
#> 10          5.4         3.4          1.7         0.2 setosa 
#> # … with 108 more rows

## Filtering and then piping into check_that also works fine
iris %>%
  filter(
    Sepal.Length > 5
  ) %>%
  check_that(
    Sepal.Length > 5
  )
#> Object of class 'validation'
#> Call:
#>     check_that(., Sepal.Length > 5)
#> 
#> Rules confronted: 1
#>    With fails   : 0
#>    With missings: 0
#>    Threw warning: 0
#>    Threw error  : 0


## problem: When `filter` fails, `check_that` throws a gibberish error
iris %>%
  filter(
    foo > 3
  ) %>%
  check_that(
    Sepal.Length < 1000
  )
#> Error in (function (cond) : error in evaluating the argument 'dat' in selecting a method for function 'confront': Problem with `filter()` input `..1`.
#> ℹ Input `..1` is `foo > 3`.
#> x object 'foo' not found

## big problem: Multiple `check_that`s means multiple repetitions of this error
## arbitrarily many of them, as the chain gets bigger
iris %>%
  filter(
    foo > 3
  ) %>%
  check_that(
    Sepal.Length < 1000
  ) %>%
  filter(
    foo > 3
  ) %>%
  check_that(
    Sepal.Length < 1000
  ) %>%
  filter(
    foo > 3
  ) %>%
  check_that(
    Sepal.Length < 1000
  ) %>%
  filter(
    foo > 3
  ) %>%
  check_that(
    Sepal.Length < 1000
  ) %>%
  filter(
    foo > 3
  ) %>%
  check_that(
    Sepal.Length < 1000
  )
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'dat' in selecting a method for function 'confront': error in evaluating the argument 'dat' in selecting a method for function 'confront': error in evaluating the argument 'dat' in selecting a method for function 'confront': error in evaluating the argument 'dat' in selecting a method for function 'confront': error in evaluating the argument 'dat' in selecting a method for function 'confront': Problem with `filter()` input `..1`.
#> ℹ Input `..1` is `foo > 3`.
#> x object 'foo' not found

## When filter fails and pipes into mutate, however, we still
##  get the same informative error we'd expect
iris %>%
  filter(
    foo > 3
  ) %>%
  mutate(
    bar = Sepal.Length + 1
  )
#> Error: Problem with `filter()` input `..1`.
#> ℹ Input `..1` is `foo > 3`.
#> x object 'foo' not found

Created on 2021-08-13 by the reprex package (v2.0.0)

I'm not an expert on the codebase, but my suspicion from this error is that there's an S3 method for confront that's getting NULL or something similar, is trying to call confront.null, and is getting confused. Again, not an expert, but that's what seems like might be happening.
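For what it's worth, the wording "error in evaluating the argument ... in selecting a method for function ..." is what R's S4 dispatch produces when an argument it dispatches on fails to evaluate, which suggests confront is an S4 generic. A minimal sketch without validate, where `confront_demo` and `error_message` are hypothetical stand-ins and not validate's code:

```r
library(methods)

# Hypothetical S4 generic standing in for validate's confront(); this is
# not validate's code, only the same dispatch mechanism.
setGeneric("confront_demo", function(dat, ...) standardGeneric("confront_demo"))
setMethod("confront_demo", "data.frame", function(dat, ...) dat)

# Capture the error message instead of stopping the script.
error_message <- function(expr) {
  tryCatch({ expr; NA_character_ }, error = function(e) conditionMessage(e))
}

# One layer of dispatch around a failing argument.
one_layer <- error_message(confront_demo(subset(iris, foo > 2)))

# Two nested layers: each dispatch wraps the message it receives.
two_layers <- error_message(confront_demo(confront_demo(subset(iris, foo > 2))))

print(one_layer)
print(two_layers)
```

Each nested call adds one more copy of the prefix, which matches the growth seen in the longer chain above.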

@mrkaye97
Author

Also, FWIW, my mental model for what should happen here is that check_that shouldn't run if filter fails (just like mutate doesn't, or at least doesn't seem to). It seems to me like that's the real issue.
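A small sketch of why a verb like mutate appears not to run: with an S3 generic (one that dispatches via `UseMethod`), the failing argument is still evaluated during dispatch, but the error propagates unchanged. `transform_demo` below is a hypothetical S3 generic used only for illustration, not dplyr's code.

```r
# Hypothetical S3 generic, standing in for a verb such as mutate().
transform_demo <- function(dat, ...) UseMethod("transform_demo")
transform_demo.data.frame <- function(dat, ...) dat

# The argument fails while S3 dispatch evaluates it; the message is
# passed through without any added prefix.
msg_s3 <- tryCatch(
  transform_demo(subset(iris, foo > 3)),
  error = function(e) conditionMessage(e)
)
print(msg_s3)
```

So in both cases the downstream function is entered and forces its input; the difference is only whether the dispatch mechanism wraps the resulting error.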

@markvanderloo
Member

markvanderloo commented Aug 15, 2021

Ok, so I now understand your question better. The error:

#> Error: Problem with `filter()` input `..1`.
#> ℹ Input `..1` is `foo > 3`.
#> x object 'foo' not found

is not thrown by check_that(). One clue is that it uses colorized output, which we use nowhere in validate. So it must ultimately come from dplyr or magrittr. If I use the R pipe, I get the same message so it must be dplyr or one of its dependencies.

> iris |> filter(foo>3) |> check_that(Sepal.Length>0)
Error in (function (cond)  : 
  error in evaluating the argument 'dat' in selecting a method for function 'confront': Problem with `filter()` input `..1`.Input `..1` is `foo > 3`.object 'foo' not found

@mrkaye97
Author

mrkaye97 commented Aug 15, 2021

Thanks @markvanderloo. I think we're still getting our wires crossed. I know the error

#> Error: Problem with `filter()` input `..1`.
#> ℹ Input `..1` is `foo > 3`.
#> x object 'foo' not found

is coming from dplyr, but that isn't the issue I'm talking about. What I'm talking about is this piece of the error:

Error in (function (cond)  : 
  error in evaluating the argument 'dat' in selecting a method for function 'confront'

which is clearly coming from validate, not dplyr or magrittr. I think the fact that you get the same behavior with the base R pipe is good evidence of that, but here's a reprex using no dplyr that shows the same issue:

library(validate)
library(magrittr)

## Broken, but no dplyr
iris %>%
    subset(
        foo > 2
    ) %>%
    check_that(
        Sepal.Length < 100
    )
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'dat' in selecting a method for function 'confront': object 'foo' not found

## Also broken, but no dplyr and no pipe
check_that(
    subset(
        iris,
        foo > 2
    ),
    Sepal.Length < 100
)
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'dat' in selecting a method for function 'confront': object 'foo' not found

## Works fine
check_that(
    subset(
        iris,
        Sepal.Length < 6
    ),
    Sepal.Length < 100
)
#> Object of class 'validation'
#> Call:
#>     check_that(subset(iris, Sepal.Length < 6), Sepal.Length < 100)
#> 
#> Rules confronted: 1
#>    With fails   : 0
#>    With missings: 0
#>    Threw warning: 0
#>    Threw error  : 0

Created on 2021-08-15 by the reprex package (v2.0.1)

What I'm saying is that I think that validate::check_that doesn't know what to do when something that gets passed into its dat argument throws an error, which seems to me to be a bug. Let me know if I can be more helpful than this or if it's still not clear what I'm getting at. I think this has nothing to do with dplyr or magrittr though. It really seems to me like it might be an S3-related error in validate or confront, but I'm not 100% sure.

To reiterate from before: My mental model for what should happen in check_that when dat throws an error (like here) is that check_that should throw the same error without adding this other simpleError stuff:

Error in (function (cond)  : 
  error in evaluating the argument 'dat' in selecting a method for function 'confront'

I would expect check_that to error out without adding that message, and just do what dplyr does and throw the first error. Here's a base R reprex showing what nested subset calls do when the inner call errors:

subset(
    subset(
        iris,
        foo > 2
    ),
    Sepal.Length > 3
)
#> Error in eval(e, x, parent.frame()): object 'foo' not found

Created on 2021-08-15 by the reprex package (v2.0.1)

That's the kind of behavior I'd expect from check_that, too.

Let me know if this doesn't clear up what I'm thinking
