Peter’s -R character(0) in lists
My last story ended with two columns, each containing lists.
If both columns contain character(0), the row needs no further attention. I want to identify the rows in which at least one of the two columns holds a difference.
Identify rows with differences
This seems to be a simple task for the filter() function from dplyr, but there are two problems:
- how to test for character(0)
- how to test on a list
Test for character(0)
I can solve the first problem with the function is_empty(), which comes from the rlang package. The purrr package, which we need for the next step anyway, makes is_empty() available as well, so there is no need to attach rlang separately.
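To make the test concrete, here is a minimal sketch of why a plain comparison fails and what is_empty() returns instead. The variable names and the placeholder codes are made up for illustration; they are not taken from the original data.

```r
library(purrr)  # provides is_empty(); rlang::is_empty() behaves the same way

empty_value  <- character(0)
filled_value <- c("code_a", "code_b")  # hypothetical placeholder codes

# Comparing with == does not help: the result has length zero,
# so there is no TRUE or FALSE to filter on
comparison <- empty_value == character(0)
length(comparison)

# is_empty() always returns a single logical value
empty_check  <- is_empty(empty_value)   # TRUE
filled_check <- is_empty(filled_value)  # FALSE

# Base R equivalent without any package
base_check <- length(empty_value) == 0  # TRUE
```

The base R variant shows what the test boils down to: an object is empty when its length is zero.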
Extract character(0) from list
The best library for manipulating lists is purrr. I am still at the beginning with it, but I will use it more in the future. Its principal functions are the map*() family, whose members differ in their suffix. In this case we need map_lgl(): the suffix refers to the expected output type, here a logical vector.
The function takes two arguments: map_lgl(.x, .f). The argument .x is the list that the function .f is applied to, element by element. Here, .x is the column containing the lists and .f is is_empty.
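As a small sketch of how map_lgl() behaves outside a data frame, the example below applies is_empty to a hand-made list. The list and its placeholder values are assumptions for illustration only.

```r
library(purrr)

# A hypothetical list, shaped like the content of one list column
codes <- list(character(0), c("code_a", "code_b"), character(0), "code_c")

# Apply is_empty() to every element; map_lgl() returns a logical vector
empty_flags <- map_lgl(.x = codes, .f = is_empty)
empty_flags
# TRUE FALSE TRUE FALSE

# Base R gives the same answer by checking the length of each element
base_flags <- lengths(codes) == 0
```

Because the result is a plain logical vector with one value per element, it fits into a data frame column and can be used directly in filter().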
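library(dplyr)
library(purrr)
ops_diff <- df_ops %>%
  # flag list elements that are empty, i.e. character(0)
  mutate(an = map_lgl(.x = alt_neu, .f = is_empty)) %>%
  mutate(na = map_lgl(.x = neu_alt, .f = is_empty)) %>%
  # keep rows where at least one of the two columns is not empty
  filter(na == FALSE | an == FALSE)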
The filter extracts the rows in which at least one of the two columns is not empty. See the result here:
Result
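For readers who want to try the filter without the original data, here is a sketch on a made-up tibble. Only the column names alt_neu and neu_alt come from the story; df_toy, the id column, and the placeholder codes are hypothetical.

```r
library(dplyr)
library(purrr)
library(tibble)

# Made-up stand-in for df_ops with two list columns
df_toy <- tibble(
  id      = 1:3,
  alt_neu = list(character(0), "code_a", character(0)),
  neu_alt = list(character(0), character(0), c("code_b", "code_c"))
)

toy_diff <- df_toy %>%
  # TRUE where the list element is character(0)
  mutate(an = map_lgl(alt_neu, is_empty),
         na = map_lgl(neu_alt, is_empty)) %>%
  # drop rows in which both columns are empty
  filter(!an | !na)

toy_diff$id
# 2 3
```

Row 1 disappears because both columns are empty; rows 2 and 3 stay because one of the two columns still holds a code.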
With the two functions presented in this and the last story, I can filter the rows that show differences in the ICPM codes before and after the revision of the coding.
What’s next
In my next story, I will stick to the subject of text and summarize some great resources for learning more about text mining. Stay tuned.