I have a data frame, and I want to add a new value to it from another data frame, based on its num_value column.

Here's an example. First, create df1:

First.Name <- c("John", "Edgar", "Walt", "Jane")
Second.Name <- c("Doe", "Poe", "Whitman", "Austen")
num_value <- runif(4,0,1.2)
df1 <- data.frame(First.Name, Second.Name, num_value)

which has this output

  First.Name Second.Name  num_value
1       John         Doe 0.08137931
2      Edgar         Poe 0.30245512
3       Walt     Whitman 0.62542554
4       Jane      Austen 0.40573224

df2 is defined as

upper_boundary <- seq(0,1.6,0.2)
class_value <- c(1:9)
df2 <- data.frame(upper_boundary, class_value)

and its output is

  upper_boundary class_value
1            0.0           1
2            0.2           2
3            0.4           3
4            0.6           4
5            0.8           5
6            1.0           6
7            1.2           7
8            1.4           8
9            1.6           9

What I'd like to do is add the matching class_value from df2 as a new column at the end of df1. The output should look something like this:

      First.Name Second.Name  num_value class_value
    1       John         Doe 0.08137931           2
    2      Edgar         Poe 0.30245512           3
    3       Walt     Whitman 0.62542554           5
    4       Jane      Austen 0.40573224           4

Thanks in advance

zx8754
  • Read about function `cut`. – zx8754 Sep 23 '20 at 14:34
  • Possible duplicate of https://stackoverflow.com/q/40380112/680068 – zx8754 Sep 23 '20 at 14:37
  • FYI ArminCehajic, marking as a duplicate is mostly to allow a previously perfectly-acceptable answer to continue to garner attention, not to take away from your question or the answers here. I believe you can still [accept](https://stackoverflow.com/help/someone-answers) one of these answers if you choose. The "Duplicate" banner at the top of the question indicates to follow-on readers that this is not the first/only solution; in fact, the answers here are likely personalized for this, but the other (linked) questions often have much more complete/detailed answers. – r2evans Sep 23 '20 at 14:47

2 Answers

Both `cut` and `findInterval` will work here. I'll use the latter:

df1$class_value <- df2$class_value[ findInterval(df1$num_value, df2$upper_boundary) ]
df1
#   First.Name Second.Name num_value class_value
# 1       John         Doe 0.1908379           1
# 2      Edgar         Poe 0.5045381           3
# 3       Walt     Whitman 0.1925997           1
# 4       Jane      Austen 0.2674465           2
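
Note that `findInterval` returns the index of the boundary at or just below each value, so the line above assigns the class of the lower boundary. The expected output in the question appears to use the class of the boundary just above instead. A minimal sketch of that variant follows. It assumes the `num_value`s printed in the question (hard-coded here so the result is reproducible), and `idx` is a name introduced only for illustration.

```r
# Values copied from the question's printed df1 (hard-coded for reproducibility).
df1 <- data.frame(
  First.Name  = c("John", "Edgar", "Walt", "Jane"),
  Second.Name = c("Doe", "Poe", "Whitman", "Austen"),
  num_value   = c(0.08137931, 0.30245512, 0.62542554, 0.40573224)
)
df2 <- data.frame(upper_boundary = seq(0, 1.6, 0.2), class_value = 1:9)

# findInterval() gives i such that upper_boundary[i] <= x < upper_boundary[i + 1];
# adding 1 moves to the boundary just above x.
idx <- findInterval(df1$num_value, df2$upper_boundary) + 1
df1$class_value <- df2$class_value[idx]
df1$class_value
# 2 3 5 4
```

With this shift, a value at or above the largest boundary indexes past the end of `df2` and yields `NA`. If values that sit exactly on a boundary should belong to the lower interval, `findInterval` also accepts `left.open = TRUE`.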
r2evans
  • Most likely a dupe, please search before answering. – zx8754 Sep 23 '20 at 14:39
  • @zx8754, answering a dupe is not always evil. https://meta.stackexchange.com/questions/50358/what-is-with-people-who-answer-questions-that-are-known-to-be-dupes – r2evans Sep 23 '20 at 14:42

Try `cut`, as below:

df1$class_value <- df2$class_value[as.integer(cut(df1$num_value,df2$upper_boundary))+1]
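
To see why the `+1` is needed, it may help to split the one-liner into steps. This is only a sketch using the values printed in the question; `bins` and `k` are names introduced here for illustration.

```r
upper_boundary <- seq(0, 1.6, 0.2)
class_value <- 1:9
# Values copied from the question's printed df1 (hard-coded for reproducibility).
num_value <- c(0.08137931, 0.30245512, 0.62542554, 0.40573224)

# With the default right = TRUE, cut() builds the intervals
# (0, 0.2], (0.2, 0.4], ... and returns a factor with one level per interval.
bins <- cut(num_value, upper_boundary)

# The integer code k of each factor level is the interval number;
# the upper end of interval k is upper_boundary[k + 1].
k <- as.integer(bins)
class_value[k + 1]
# 2 3 5 4
```

Because the first interval is open on the left, a `num_value` of exactly 0 would come out as `NA`. Passing `include.lowest = TRUE` to `cut` is one way to include it.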
ThomasIsCoding