Unexpected/impractical behaviour of rowwise() in combination with mutate_all() #6890

Closed
martijnvanattekum opened this issue Jul 24, 2023 · 4 comments

Comments

@martijnvanattekum

When trying to apply a function using mutate_all() to each row of a data frame using rowwise(), it seems the function only considers individual values, rather than the complete row.

Comparing column-wise with rowwise operations:

# Columnwise - considers the complete column when applying the function
library(dplyr)
head(mtcars) %>% mutate_all(rank)

>                   mpg cyl disp hp drat wt qsec vs am gear carb
> Mazda RX4         3.5 3.5  2.5  4  5.5  2  1.0  2  5    5  5.5
> Mazda RX4 Wag     3.5 3.5  2.5  4  5.5  3  2.5  2  5    5  5.5
> Datsun 710        6.0 1.0  1.0  1  4.0  1  4.0  5  5    5  2.0
> Hornet 4 Drive    5.0 3.5  5.0  4  2.0  4  5.0  5  2    2  2.0
> Hornet Sportabout 2.0 6.0  6.0  6  3.0  5  2.5  2  2    2  4.0
> Valiant           1.0 3.5  4.0  2  1.0  6  6.0  5  2    2  2.0

# Rowwise - considers only the individual values within the row when applying the function, hence always returning rank 1
library(dplyr)
head(mtcars) %>% rowwise() %>% mutate_all(rank)

> # A tibble: 6 × 11
> # Rowwise: 
>     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
> 1     1     1     1     1     1     1     1     1     1     1     1
> 2     1     1     1     1     1     1     1     1     1     1     1
> 3     1     1     1     1     1     1     1     1     1     1     1
> 4     1     1     1     1     1     1     1     1     1     1     1
> 5     1     1     1     1     1     1     1     1     1     1     1
> 6     1     1     1     1     1     1     1     1     1     1     1

# Expected result
# Can be obtained running head(mtcars) %>% apply(1, \(row) rank(row)) %>% t() %>% data.frame()

>                   mpg cyl disp hp drat wt qsec  vs  am gear carb
> Mazda RX4           9 7.0   11 10    4  3    8 1.0 2.0  5.5  5.5
> Mazda RX4 Wag       9 7.0   11 10    4  3    8 1.0 2.0  5.5  5.5
> Datsun 710          9 6.5   11 10    5  4    8 2.0 2.0  6.5  2.0
> Hornet 4 Drive      9 7.0   11 10    5  6    8 2.5 1.0  4.0  2.5
> Hornet Sportabout   9 7.0   11 10    5  6    8 1.5 1.5  4.0  3.0
> Valiant             8 7.0   11 10    4  6    9 2.5 1.0  5.0  2.5

Would it be possible to obtain the expected result with rowwise()?

@DavisVaughan
Member

I think this is a more elegant way to do what you want

head(mtcars) |>
  tibble::rownames_to_column() |>
  tidyr::pivot_longer(-rowname) |>
  dplyr::mutate(rank = rank(value), .by = rowname) |>
  tidyr::pivot_wider(id_cols = rowname, names_from = name, values_from = rank)
#> # A tibble: 6 × 12
#>   rowname        mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>   <chr>        <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Mazda RX4        9   7      11    10     4     3     8   1     2     5.5   5.5
#> 2 Mazda RX4 W…     9   7      11    10     4     3     8   1     2     5.5   5.5
#> 3 Datsun 710       9   6.5    11    10     5     4     8   2     2     6.5   2  
#> 4 Hornet 4 Dr…     9   7      11    10     5     6     8   2.5   1     4     2.5
#> 5 Hornet Spor…     9   7      11    10     5     6     8   1.5   1.5   4     3  
#> 6 Valiant          8   7      11    10     4     6     9   2.5   1     5     2.5

Created on 2023-11-03 with reprex v2.0.2

@martijnvanattekum
Author

Thank you for your suggestion. Your code shows a more "tidyversy" way of achieving my expected result. However, I supplied both my example and the expected results primarily to show that rowwise() does not consider the whole row when it is applied, which is counterintuitive compared to the standard column-wise operations. This unexpected behavior will persist as long as rowwise()'s implementation is not changed.

@DavisVaughan
Member

I think you have misunderstood how rowwise() is intended to work. It essentially applies a group_by() in which every row is its own group. So when you call mutate_all() on a rowwise data frame, the function is applied rows * cols times, once for each element of the data frame.

library(dplyr, warn.conflicts = FALSE)

df <- data.frame(
  x = c(1, 2),
  y = c("a", "b")
)

df %>% 
  rowwise() %>% 
  mutate_all(function(x) {
    cat("element:", x, "\n")
    x
  })
#> element: 1 
#> element: 2 
#> element: a 
#> element: b
#> # A tibble: 2 × 2
#> # Rowwise: 
#>       x y    
#>   <dbl> <chr>
#> 1     1 a    
#> 2     2 b

Created on 2023-11-06 with reprex v2.0.2
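
For readers who still want the whole row available inside a rowwise() pipeline, one possible approach is dplyr's c_across(), which combines the values of the current row into a single vector. The following is only a sketch under that assumption, not something proposed in this thread; the names `row_ranks` and `result` are illustrative.

```r
library(dplyr, warn.conflicts = FALSE)

df <- head(mtcars)

row_ranks <- df %>%
  rowwise() %>%
  # c_across() gathers the current row's values into one vector,
  # so rank() sees the whole row rather than a single element.
  summarise(ranks = list(rank(c_across(everything())))) %>%
  pull(ranks)

# Reassemble the per-row rank vectors into a data frame with the
# original column and row names (rowwise() drops the row names).
result <- as.data.frame(do.call(rbind, row_ranks))
names(result) <- names(df)
rownames(result) <- rownames(df)

result
```

This only makes sense when all selected columns share a compatible type, as they do in mtcars; for mixed-type data, the pivot_longer() approach shown earlier in the thread is likely the safer choice.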

@martijnvanattekum
Author

Thank you for taking the time to explain. Indeed, I misinterpreted rowwise() as the counterpart of column-wise operations, as described in the vignette (i.e. functions to be applied to each row). I see now that the rowwise grouping operation is intended for other use cases, so the issue is correctly closed.
