A substitute for plyr’s mapvalues

The mapvalues() function in plyr was immensely useful, but with plyr’s retirement, it’s time to move on and find something better.

The R package plyr was retired in 2023. In any case, it was redundant because dplyr had been created to take its place, although some of the functions differed. One of the functions to be lost was mapvalues(), which I liked and used quite a bit. Thankfully, there is an alternative approach available.

A Reproducible Example

Let’s start with a tiny sample of data from this paper:

df <- structure(list(barcode = c("B2020001c14r10n176", "B2020001c14r08n136", 
                                 "B2020001c02r07n129", "B2020001c13r10n177", "B2020001c07r08n142", 
                                 "B2020001c08r06n106", "B2020001c12r05n086", "B2020001c03r07n116", 
                                 "B2020001c19r08n143", "B2020001c13r07n131", "B2020001c08r10n180"
), line = c("14", "14", "2", "13", "7", "8", "12", "3", "19", "13", "8"), 
filename = c("B2019004t0g076n125_2019-07-29 17-52-46_B2019004_90-deg_46497501_2410_0.png", 
             "B2019004t1g110n001_2019-07-31 00-03-59_B2019004_0-deg_46733801_1065_0.png", 
             "B2019004t1g142n470_2019-08-14 08-40-04_B2019004_90-deg_48091701_2241_0.png", 
             "B2019004t1g004n153_2019-09-04 02-51-08_B2019004_0-deg_49923101_1078_0.png", 
             "B2019004t1g182n471_2019-08-25 08-40-59_B2019004_90-deg_48992601_2245_0.png", 
             "B2019004t1g031n079_2019-08-21 01-29-31_B2019004_0-deg_48532901_2206_0.png", 
             "B2019004t1g155n627_2019-08-11 11-39-30_B2019004_0-deg_47796701_1772_0.png", 
             "B2019004t0g179n347_2019-08-18 06-24-41_B2019004_90-deg_48408201_1754_0.png", 
             "B2019004t1g143n688_2019-09-01 12-44-45_B2019004_90-deg_49741701_1528_0.png", 
             "B2019004t0g262n880_2019-08-25 16-19-18_B2019004_90-deg_49115301_1004_0.png", 
             "B2019004t0g190n444_2019-08-14 08-10-22_B2019004_0-deg_48083801_2402_0.png"
), count_unique_branches = c(2L, 1L, 1L, 2L, 4L, 2L, 2L, 
                             4L, 3L, 5L, 3L)), class = "data.frame", row.names = c(NA, 
                                                                                   -11L))

Here, I have some data about branch numbers for individual lentil plants, each of which has a barcode. I need to identify which variety each plant belongs to, based on the line number. Thankfully, I have that data in another data frame:

cultivars <- structure(list(variety = c("Aldinga", "CDC Ruby", "CIPAL0717", 
"Cobber", "Commondo", "Cumra", "Digger", "Eston", "ILL2024", 
"ILL7537", "Indianhead", "Matilda", "Nipper", "Northfield", "PBA Bolt", 
"SP1333", "PBA Hallmark XT", "PBA Jumbo2", "PBA Greenfield"), 
    line_number = c("1", "2", "3", "4", "5", "6", "7", "8", "9", 
    "10", "11", "12", "13", "14", "15", "16", "17", "18", "19"
    )), class = "data.frame", row.names = c(NA, -19L))

So what I need to do is pull out the variety data by matching line_number from cultivars with line from df.

plyr

The plyr method to achieve this was easy. First create a vector (which I’ll call genotype) with the mapped values:

library(plyr)
genotype <- plyr::mapvalues(df$line, from=cultivars$line_number, to=cultivars$variety)

Then add it to the original data frame (df):

df$genotype <- genotype

dplyr: A better way

The better method is to use dplyr’s left_join() function as follows:

library(tidyverse)
join <- left_join(df, cultivars, join_by(line == line_number))
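To end up with the same genotype column that the plyr approach produced, the variety column from the join can be renamed. The following is a minimal sketch rather than a definitive recipe: df_small and cultivars_small are cut-down, three-row copies of the data frames above (names introduced here for illustration), and join_by() assumes dplyr 1.1.0 or later.

```r
library(dplyr)

# Cut-down copies of df and cultivars (three rows each, values taken from above)
df_small <- data.frame(
  line = c("14", "2", "13"),
  count_unique_branches = c(2L, 1L, 2L)
)
cultivars_small <- data.frame(
  variety = c("CDC Ruby", "Nipper", "Northfield"),
  line_number = c("2", "13", "14")
)

# Join on line == line_number, then rename variety to match the plyr result
joined <- left_join(df_small, cultivars_small, join_by(line == line_number)) %>%
  rename(genotype = variety)

joined
```

The rows stay in the order of df_small, and only the key column from the first data frame (line) is kept in the result.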

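A left join keeps all observations in x, the first data frame. So in this case, we’re getting dplyr to keep every row of df and add the matching variety from cultivars wherever line equals line_number.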

Make sure that the columns being matched are of the same type: you can’t match an integer with a character string, for instance.
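As an illustration of the type issue, here is a small sketch in which the lookup table’s key has been read in as an integer. The data frame cultivars_int is a hypothetical variant of cultivars, invented for this example; one way to fix the mismatch is to convert the key with as.character() before joining.

```r
library(dplyr)

df_small <- data.frame(line = c("14", "2", "13"))

# Hypothetical variant of cultivars where line_number is integer, not character
cultivars_int <- data.frame(
  variety = c("CDC Ruby", "Nipper", "Northfield"),
  line_number = c(2L, 13L, 14L)
)

# Joining a character key to an integer key raises an error,
# so capture it here instead of stopping the script
attempt <- tryCatch(
  left_join(df_small, cultivars_int, join_by(line == line_number)),
  error = function(e) "join failed"
)

# Convert the key to character first, then the join goes through
cultivars_chr <- mutate(cultivars_int, line_number = as.character(line_number))
fixed <- left_join(df_small, cultivars_chr, join_by(line == line_number))

fixed
```

Which side to convert is a judgment call. Converting to character is the safer default when the key is really a label rather than a quantity.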

Note that an inner_join() would also work here, but it could lose data, because rows without a match in the other data frame are dropped from the result.
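The difference only shows up when a key has no match, so the sketch below adds a made-up line value ("99") that does not appear in the lookup table. The data frames are small illustrative stand-ins, not the real data. An anti_join() is one way to list the unmatched rows before deciding which join to use.

```r
library(dplyr)

# Line "99" is a made-up value with no entry in the lookup table
df_small <- data.frame(line = c("14", "2", "99"))
cultivars_small <- data.frame(
  variety = c("CDC Ruby", "Northfield"),
  line_number = c("2", "14")
)

left  <- left_join(df_small, cultivars_small, join_by(line == line_number))
inner <- inner_join(df_small, cultivars_small, join_by(line == line_number))

nrow(left)   # every row of df_small is kept; variety is NA for line "99"
nrow(inner)  # the row with line "99" is dropped

# Rows of df_small that have no match in the lookup table
unmatched <- anti_join(df_small, cultivars_small, join_by(line == line_number))
unmatched
```

One behavioural difference to keep in mind: mapvalues() left unmatched values unchanged, whereas left_join() fills them with NA.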

   
