[R] Regex: Combining sub/grepl with ifelse

David Winsemius dwinsemius at comcast.net
Sat Oct 10 00:51:17 CEST 2015


On Oct 9, 2015, at 2:48 PM, Omar André Gonzáles Díaz wrote:

> Thank you, David. You put me in the right direction. 
> 
> At the end, I've used a lot of lines, to my taste, for this task.
> 
> Is there a more elegant way, of doing this?

There are conditional capture-classes in rexex in addition to the variable span {.,.} operators, but when I suggested using more general patterns, you turned them down, with a desire for the more incremental approach.

his pattern extracts a larger number of cases:

 sub("(.*)([a-z]{0,3}[-]{0,1}[0-9]{2}[a-z]{1,2}[0-9]{2,4}[a-z]{0,1})(.*)", "\\2",
ripley.tv$producto,
ignore.case = T)

If I take out the leading space in the capture group then I get all cases extracted but it trims down the size of many of them. This with a trailing character class that has space or "\n" as an separator seems to be very successfull in one pass:

 sub("(.*[ \n])([a-z]{0,3}[-]{0,1}[0-9]{0,2}[a-z]{1,2}[0-9]{2,4}[a-z]{0,1})(.*)", "\\2",
 ripley.tv$producto,
 ignore.case = T)

 [1] "48J6400"     "40J5300"     "TC-40CS600L" "LE28F6600"   "LE40K5000N" 
 [6] "LE32B7000"   "LE32K5000N"  "LE55B8000"   "LE40B8000"   "LE24B8000"  
[11] "TC-42AS610"  "LE50K5000N"  "40JU6500"    "48JU6500"    "50JU6500"   
[16] "55JS9000"    "55JU6500"    "55JU6700"    "55JU7500"    "65JS9000"   
[21] "65JU6500"    "65JU7500"    "75JU6500"    "40LF6350"    "42LF6400"   
[26] "42LF6450"    "49LF6450"    "LF6400"      "43UF6750"    "49UF6750"   
[31] "UF6900"      "49UF7700"    "49UF8500"    "55UF7700"    "65UF7700"   
[36] "55UF8500"    "TC-55CX640W" "TC-50CX640W" "70UF7700"    "UG8700"     
[41] "LF6350"      "KDL-50FA95C" "KDL50W805C"  "KDL-40R354B" "40J5500"    
[46] "50J5500"     "32JH4005"    "50J5300"     "48J5300"     "40J6400"    
[51] "KDL-32R505C" "KDL-40R555C" "55J6400"     "40JH5005"    "43LF5410"   
[56] "32LF585B"    "49LF5900"    "KDL-65W855C" "UN48J6500"   "LE40F1551"  
[61] "TC-32AS600L" "KDL-32R304B" "55EC9300"    "LE32W454F"   "58UF8300"   
[66] "KDL-55W805C" "XBR-49X835C" "XBR-55X855C" "XBR-65X905C" "XBR-75X945C"
[71] "XBR-55X905C" "LC60UE30U"   "LC70UE30U"   "LC80UE30U"   "48J5500"    
[76] "79UG8800"    "65UF9500"    "65UF8500"    "55UF9500"    "32J4300"    
[81] "KDL-48R555C" "55UG8700"    "60UF8500"    "55LF6500"    "32LF550B"   
[86] "47LB5610"    "TC-50AS600L" "XBR-55X855B" "LC70SQ17U"   "XBR-79X905B"
[91] "TC-40A400L"  "XBR-70X855B" "55HU8700"    "LE40D3142"   "TC-42AS650L"
[96] "LC70LE660"   "LE58D3140"  
> 



-- 
David
> 
> 
> ripley.tv$id <- sub("(.*)( [0-9]{2}[a-z]{1}[0-9]{4})(.*)", "\\2",
>                     ripley.tv$producto,
>                     ignore.case = T)
> 
> ripley.tv$id <- sub("(.*)( [0-9]{2}[a-z]{2}[0-9]{4} )(.*)", "\\2",
>                     ripley.tv$id,
>                     ignore.case = T)
> 
> ripley.tv$id <- sub("(.*)( [a-z]{2}[0-9]{2}[a-z]{1}[0-9]{4} )(.*)", "\\2",
>                     ripley.tv$id, ignore.case = T)
> 
> 
> 
> ripley.tv$id <- sub("(.*)([A-Z]{2}\\-[0-9]{2}[A-Z]{2}[0-9]{3}[A-Z]{1})(.*)", "\\2",
>                     ripley.tv$id, ignore.case = T)
> 
> 
> ripley.tv$id <- sub("(.*)([A-Z]{2}[0-9]{2}[A-Z]{1}[0-9]{4}[A-Z]{1})(.*)", "\\2",
>                     ripley.tv$id, ignore.case = T)
> 
> 
> ripley.tv$id <- sub("(.*)([A-Z]{2}[0-9]{2}[A-Z]{1}[0-9]{4})(.*)", "\\2",
>                     ripley.tv$id, ignore.case = T)
> 
> 
> ripley.tv$id <- sub("(.*)([A-Z]{2}\\-[0-9]{2}[A-Z]{2}[0-9]{3})(.*)", "\\2",
>                     ripley.tv$id, ignore.case = T)
> 
> 
> ripley.tv$id <- sub("(.*)([0-9]{2}[A-Z]{2}[0-9]{4})(.*)", "\\2",
>                     ripley.tv$id, ignore.case = T)
> 
> 
> ripley.tv$id <- sub("(.*)([A-Z]{2}[0-9]{4})(.*)", "\\2",
>                     ripley.tv$id, ignore.case = T)
> 
> 
> ripley.tv$id <- sub("(.*)([A-Z]{3}\\-[0-9]{2}[A-Z]{2}[0-9]{2}[A-Z]{1})(.*)", "\\2",
>                     ripley.tv$id, ignore.case = T)
> 
> 
> ripley.tv$id <- sub("(.*)([A-Z]{3}[0-9]{2}[A-Z]{1}[0-9]{3}[A-Z]{1})(.*)", "\\2",
>                     ripley.tv$id, ignore.case = T)
> 
> 
> ripley.tv$id <- sub("(.*)([A-Z]{3}\\-[0-9]{2}[A-Z]{1}[0-9]{3}[A-Z]{1})(.*)", "\\2",
>                     ripley.tv$id, ignore.case = T)
> 
> 
> ripley.tv$id <- sub("(.*)([0-9]{2}[A-Z]{2}[0-9]{3}[A-Z]{1})(.*)", "\\2",
>                     ripley.tv$id, ignore.case = T)
> 
> 
> 
> ripley.tv$id <- sub("(.*)([A-Z]{2}[0-9]{2}[A-Z]{2}[0-9]{2}[A-Z]{1})(.*)", "\\2",
>                     ripley.tv$id, ignore.case = T)
> 
> 
> 
> ripley.tv$id <- sub("(.*)([A-Z]{2}\\-[0-9]{2}[A-Z]{1}[0-9]{3}[A-Z]{1})(.*)", "\\2",
>                     ripley.tv$id, ignore.case = T)
> 
> 
> 
> ripley.tv$id <- sub("(.*)([A-Z]{2}[0-9]{2}[A-Z]{1}[0-9]{3}[A-Z]{1})(.*)", "\\2",
>                     ripley.tv$id, ignore.case = T)
> 
> 
> 
> ripley.tv$id <- sub("(.*)([A-Z]{2}[0-9]{2}[A-Z]{2}[0-9]{3})(.*)", "\\2",
>                     ripley.tv$id, ignore.case = T)
> 
> 
> 2015-10-09 16:16 GMT-05:00 David Winsemius <dwinsemius at comcast.net>:
> 
> On Oct 9, 2015, at 1:50 PM, Omar André Gonzáles Díaz wrote:
> 
> > David,
> >
> > this is a working case. I know that all cases for ID are not covered with my current code.
> >
> > The question is:
> >
> > ID stars as NAs.
> >
> > 1.- How to extract 1 type of ID, and keep the rest of entries as they are.
> >
> > 2.- Then keep the first extraction, and search for second type of ID.
> 
> If you want to use that workflow then you can replace $id with teh output of sub on $procuto the first time and then contiue workign on the results using $id as the input. The sub function returns the original values when there is no match.
> 
> ripley.tv$id <- sub("(.*)( [0-9]{2}[a-z]{1}[0-9]{4})(.*)", "\\2",
> ripley.tv$producto,
> ignore.case = T)
> 
> ripley.tv$id <- sub("(.*)( [0-9]{2}[a-z]{2}[0-9]{4} )(.*)", "\\2",
> ripley.tv$id,
> ignore.case = T)
> 
> ripley.tv$id <- sub("(.*)( [a-z]{2}[0-9]{2}[a-z]{1}[0-9]{4} )(.*)", "\\2",
> ripley.tv$id, ignore.case = T)
> 
> > head(ripley.tv$id, 20)
>  [1] "SMART TV LED FHD 48\" 3D\n48J6400"   " 40J5300"
>  [3] "TV LED FULL HD 40'' TC-40CS600L"     " LE28F6600 "
>  [5] "SMART TV 40\" HD LE40K5000N"         "TV LED HD 32'' LE32B7000"
>  [7] "SMART TV  32'' LE32K5000N"           "TV LED FHD 55\" -\nLE55B8000"
>  [9] " LE40B8000 "                         " LE24B8000 "
> [11] "TV LED FULL HD 42'' TC-42AS610"      "TELEVISOR LED LE50K5000N 50\""
> [13] "SMART TV LED UHD 40\" 40JU6500"      "SMART TV ULTRA HD 48'' 48JU6500"
> [15] " 50JU6500 "                          "SMART TV ULTRA HD 55'' 3D\n55JS9000"
> [17] "SMART TV LED UHD 55\" 55JU6500"      "SMART TV ULTRA HD 55'' 55JU6700"
> [19] " 55JU7500 "                          "SMART TV ULTRA HD 65''\n3D 65JS9000"
> 
> >
> 
> That shows all your failures. You can incremetally fix them I suppose.
> 
> --
> David.
> > 3.- An so on with all types of IDs in product column.
> >
> > I think it can be achieve with some ifelse construction I think... that's why my initial lines of code.
> >
> > Any hint is welcome.
> >
> >
> >
> >
> >
> >
> >
> > 2015-10-09 15:37 GMT-05:00 David Winsemius <dwinsemius at comcast.net>:
> >
> > On Oct 9, 2015, at 12:59 PM, Omar André Gonzáles Díaz wrote:
> >
> > > I need to extract an ID from the product column of my df.
> > >
> > > I was able to extract the ids for some scenearios, but when applying my
> > > code for the next type of ids (there are some other combinations), the
> > > results of my first line of code got NAs.
> > >
> > >
> > > ripley.tv$id <- sub("(.*)( [0-9]{2}[a-z]{1}[0-9]{4})(.*)", "\\2",
> > > ripley.tv$producto,
> > > ignore.case = T)
> >
> > I think you need to examine the implicit logic of your pattern argument against the realities of your data:
> >
> > > as.matrix(head(ripley.tv$producto, 10))
> >       [,1]
> >  [1,] "SMART TV LED FHD 48\" 3D\n48J6400"
> >  [2,] "SMART TV LED FHD 40\" 40J5300"
> >  [3,] "TV LED FULL HD 40'' TC-40CS600L"
> >  [4,] "TELEVISOR LED LE28F6600 28\""
> >  [5,] "SMART TV 40\" HD LE40K5000N"
> >  [6,] "TV LED HD 32'' LE32B7000"
> >  [7,] "SMART TV  32'' LE32K5000N"
> >  [8,] "TV LED FHD 55\" -\nLE55B8000"
> >  [9,] "TV LED LE40B8000 FULL HD 40\""
> > [10,] "TV LE24B8000 LED HD 24\" - NEGRO"
> >
> > So the first value and most of the other values do not have a space before the segment of characters that you want to match, The third item has two character inside the flanking numbers.
> >
> > I tried fixing that by removing the space and putting in optional length values and discovered that some of you id's have only 3 trailing digit-characters:
> >
> >  head( sub("(.*)([0-9]{2}[a-z]{1,2}[0-9]{4})(.*)", "\\2",
> >  ripley.tv$producto,
> >  ignore.case = T), 20
> >   )
> > #-----------
> >  [1] "48J6400"                         "40J5300"
> >  [3] "TV LED FULL HD 40'' TC-40CS600L" "28F6600"
> >  [5] "40K5000"                         "32B7000"
> >  [7] "32K5000"                         "55B8000"
> >  [9] "40B8000"                         "24B8000"
> > [11] "TV LED FULL HD 42'' TC-42AS610"  "50K5000"
> > [13] "40JU6500"                        "48JU6500"
> > [15] "50JU6500"                        "55JS9000"
> > [17] "55JU6500"                        "55JU6700"
> > [19] "55JU7500"                        "65JS9000"
> >
> >  head( sub("(.*)([0-9]{2}[a-z]{1,2}[0-9]{3,4})(.*)", "\\2",
> >  ripley.tv$producto,
> >  ignore.case = T), 20
> >   )
> > #--------------
> >  [1] "48J6400"  "40J5300"  "40CS600"  "28F6600"  "40K5000"  "32B7000"
> >  [7] "32K5000"  "55B8000"  "40B8000"  "24B8000"  "42AS610"  "50K5000"
> > [13] "40JU6500" "48JU6500" "50JU6500" "55JS9000" "55JU6500" "55JU6700"
> > [19] "55JU7500" "65JS9000"
> >
> > >
> > > ripley.tv$id <- sub("(.*)( [0-9]{2}[a-z]{2}[0-9]{4} )(.*)", "\\2",
> > > ripley.tv$producto,
> > > ignore.case = T)
> > >
> > > ripley.tv$id <- sub("(.*)( [a-z]{2}[0-9]{2}[a-z]{1}[0-9]{4} )(.*)", "\\2",
> > > ripley.tv$producto, ignore.case = T)
> > >
> > >
> > > Also I've tried to use the ifelse function, but got "/2" as result.
> >
> > I think you will find that `ifelse` delivers strange results when used with factors as values. It strips attributes.
> >
> >
> >
> > --
> > david.
> > >
> > >
> > > ripley.tv <- ripley.tv %>%
> > >        mutate(id = NA,
> > >              id = ifelse(grepl("(.*)([0-9]{2}[A-Z]{1}[0-9]{4})(.*)",
> > > ripley.tv$producto) == T, "\\2",id),
> > >              id = ifelse(grepl("(.*)([0-9]{2}[A-Z]{2}[0-9]{4})(.*)",
> > > ripley.tv$producto) == T, "\\2",id),
> > >              id =
> > > ifelse(grepl("(.*)([A-Z]{2}[0-9]{2}[A-Z]{1}[0-9]{4})(.*)", ripley.tv$producto)
> > > == T, "\\2",id))
> > >
> > >
> > >
> > > And also str_extract:
> > >
> > >
> > > ripley.tv$id <- str_extract(ripley.tv$producto, "[0-9]{2}[A-Z]{1}[0-9]{4}")
> > > ripley.tv$id <- str_extract(ripley.tv$producto, "[0-9]{2}[A-Z]{2}[0-9]{4}")
> > > ripley.tv$id <- str_extract(ripley.tv$producto,
> > > "[A-Z]{2}[0-9]{2}[A-Z]{1}[0-9]{4}")
> > >
> > > This is my data:
> > >
> > >
> > > ripley.tv <- structure(list(id = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
> > > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
> > > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
> > > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
> > > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
> > > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
> > > NA, NA, NA, NA, NA, NA, NA), marca = c("SAMSUNG", "SAMSUNG",
> > > "PANASONIC", "HAIER", "HAIER", "HAIER", "HAIER", "HAIER", "HAIER",
> > > "HAIER", "PANASONIC", "HAIER", "SAMSUNG", "SAMSUNG", "SAMSUNG",
> > > "SAMSUNG", "SAMSUNG", "SAMSUNG", "SAMSUNG", "SAMSUNG", "SAMSUNG",
> > > "SAMSUNG", "SAMSUNG", "LG", "LG", "LG", "LG", "LG", "LG", "LG",
> > > "LG", "LG", "LG", "LG", "LG", "LG", "PANASONIC", "PANASONIC",
> > > "LG", "LG", "LG", "SONY", "SONY", "SONY", "SAMSUNG", "SAMSUNG",
> > > "SAMSUNG", "SAMSUNG", "SAMSUNG", "SAMSUNG", "SONY", "SONY", "SAMSUNG",
> > > "SAMSUNG", "LG", "LG", "LG", "SONY", "SAMSUNG", "AOC", "PANASONIC",
> > > "SONY", "LG", "AOC", "LG", "SONY", "SONY", "SONY", "SONY", "SONY",
> > > "SONY", "SHARP", "SHARP", "SHARP", "SAMSUNG", "LG", "LG", "LG",
> > > "LG", "SAMSUNG", "SONY", "LG", "LG", "LG", "LG", "LG", "PANASONIC",
> > > "SONY", "SHARP", "SONY", "PANASONIC", "SONY", "SAMSUNG", "AOC",
> > > "PANASONIC", "SHARP", "AOC"), producto = c("SMART TV LED FHD 48\" 3D
> > > 48J6400",
> > > "SMART TV LED FHD 40\" 40J5300", "TV LED FULL HD 40'' TC-40CS600L",
> > > "TELEVISOR LED LE28F6600 28\"", "SMART TV 40\" HD LE40K5000N",
> > > "TV LED HD 32'' LE32B7000", "SMART TV  32'' LE32K5000N", "TV LED FHD 55\" -
> > > LE55B8000",
> > > "TV LED LE40B8000 FULL HD 40\"", "TV LE24B8000 LED HD 24\" - NEGRO",
> > > "TV LED FULL HD 42'' TC-42AS610", "TELEVISOR LED LE50K5000N 50\"",
> > > "SMART TV LED UHD 40\" 40JU6500", "SMART TV ULTRA HD 48'' 48JU6500",
> > > "SMART TV 50JU6500 LED UHD 50\" - NEGRO", "SMART TV ULTRA HD 55'' 3D
> > > 55JS9000",
> > > "SMART TV LED UHD 55\" 55JU6500", "SMART TV ULTRA HD 55'' 55JU6700",
> > > "SMART TV CURVO 55JU7500 LED UHD 55\" 3D - NEGRO", "SMART TV ULTRA HD 65''
> > > 3D 65JS9000",
> > > "SMART TV 65JU6500 LED UHD 65\"", "SMART TV ULTRA HD 65'' 65JU7500",
> > > "SMART TV LED UHD 75\" 75JU6500", "SMART TV WEB OS 40\" FULL HD 40LF6350",
> > > "SMART TV 3D 42\" FULL HD 42LF6400", "TV LED 42\" FULL HD CINEMA 3D
> > > 42LF6450",
> > > "TV LED 49\" FULL HD CINEMA 3D 49LF6450", "SMART TV LF6400 49\" FULL HD
> > > 3D",
> > > "TV 43UF6750 43\" ULTRA HD 4K", "TV 49\" ULTRA HD 4K 49UF6750",
> > > "TV LED 49\" ULTRA HD SMART UF6900", "SMART TV 49UF7700 49\" ULTRA HD 4K",
> > > "SMART TV 49UF8500 49\" ULTRA HD 4K 3D", "TV LED 55\" CINEMA 3D SMART TV
> > > 55UF7700",
> > > "SMART TV 65UF7700 65\" ULTRA HD 4K", "SMART TV 55UF8500 55\" ULTRA HD 4K
> > > 3D",
> > > "TV LED 55\" ULTRA HD 4K SMART TC-55CX640W", "TV LED 50\" ULTRA HD 4K SMART
> > > TC-50CX640W",
> > > "SMART TV 70UF7700 3D ULTRA HD 70\"", "TV LED CURVO 65\" ULTRA HD 4K CINEMA
> > > SMART UG8700",
> > > "TV LED 60\" FULL HD SMART LF6350", "SMART TV KDL-50FA95C 50\" FULL HD 3D",
> > > "SMART TV KDL50W805C 50\" FULL HD 3D", "TV LED 40\" FULL HD KDL-40R354B",
> > > "SMART TV LED FULL HD 40'' 40J5500", "SMART TV LED FULL HD 50'' 50J5500",
> > > "TV LED HD 32'' 32JH4005", "SMART TV LED FULL HD 50\" 50J5300",
> > > "SMART TV LED 48\" FULL HD 48J5300", "SMART TV FULL HD 40'' 3D 40J6400",
> > > "TV LED 32\" HD SMART KDL-32R505C", "TV LED 40\" SMART FULL HD KDL-40R555C
> > > - NEGRO",
> > > "SMART TV LED FHD 55\" 3D 55J6400", "TV 40JH5005 LED FHD 40\" - NEGRO",
> > > "TV 43\" FULL HD 43LF5410", "SMART TV 32LF585B LED HD 32\" - BLANCO",
> > > "TV LED 49\" FULL HD SMART 49LF5900", "SMART TV 65\" FULL HD 3D
> > > KDL-65W855C",
> > > "SMART TV LED FHD 48\" UN48J6500", "TV LED 40\" FULL HD LE40F1551",
> > > "TV LED 32'' SMART HD TC-32AS600L", "TV LED 32'' HD KDL-32R304B",
> > > "TV OLED 55\" SMART 3D FULL HD 55EC9300 PLATEADO", "TV LED HD 32''
> > > LE32W454F",
> > > "TV LED 58\" ULTRA HD SMART 58UF8300", "TV LED 55\" FULL HD SMART 3D
> > > KDL-55W805C",
> > > "TV LED 49\" ULTRA HD 4K XBR-49X835C", "TV LED 55\" ULTRA HD 4K
> > > XBR-55X855C",
> > > "TV LED ULTRA DELGADO 55\" ULTRA HD 4K XBR-65X905C", "TV LED 75\" ULTRA HD
> > > 4K 3D XBR-75X945C",
> > > "TV LED ULTRA DELGADO 55\" ULTRA HD 4K XBR-55X905C", "SMART TV LED 60''
> > > ULTRA HD 4K LC60UE30U",
> > > "SMART TV LED 70'' ULTRA HD 4K LC70UE30U", "SMART TV LED 80'' ULTRA HD 4K
> > > LC80UE30U",
> > > "SMART TV LED FULL HD 48'' 48J5500", "SMART TV CURVO 79UG8800 79\" ULTRA HD
> > > 4K 3D",
> > > "SMART TV 65UF9500 65\" ULTRA HD 4K 3D", "SMART TV 65UF8500 65\" ULTRA HD
> > > 4K 3D",
> > > "SMART TV 55UF9500 55\" ULTRA HD 4K 3D", "SMART TV LED HD 32\" 32J4300",
> > > "TV LED 48\" SMART FULL HD KDL-48R555C - NEGRO", "SMART TV 55UG8700 55\"
> > > ULTRA HD 4K 3D",
> > > "SMART TV 60UF8500 60\" ULTRA HD 4K 3D", "SMART TV 55LF6500 55\" FULL HD
> > > 3D",
> > > "TV 32LF550B 32\" HD", "TV LED 47\" FULL HD 47LB5610", "TV LED FULL HD 50''
> > > TC-50AS600L",
> > > "TV SMART LED 55\" UHD 3D XBR-55X855B", "TV LED FULL HD 4K LC70SQ17U 70''",
> > > "TV LED SMART UHD 79\" XBR-79X905B", "TV LED FULL HD 40'' TC-40A400L",
> > > "TV LED SMART UHD 70\" XBR-70X855B", "SMART TV UHD 55'' 3D CURVO 55HU8700",
> > > "TV FULL HD LE40D3142 40\" - NEGRO", "TELEVISOR LED 42\" TC-42AS650L",
> > > "SMART TV LCD FHD 70\" LC70LE660", "TV LED FULL HD 58'' LE58D3140"
> > > ), pulgadas = c(48L, 40L, 40L, 28L, 40L, 32L, 32L, 55L, 40L,
> > > 24L, 42L, 50L, 40L, 48L, 50L, 55L, 55L, 55L, 55L, 65L, 65L, 65L,
> > > 75L, 40L, 42L, 42L, 49L, 49L, 43L, 49L, 49L, 49L, 49L, 55L, 65L,
> > > 55L, 55L, 50L, 70L, 65L, 60L, 50L, 50L, 40L, 40L, 50L, 32L, 50L,
> > > 48L, 40L, 32L, 40L, 55L, 40L, 43L, 32L, 49L, 65L, 48L, 40L, 32L,
> > > 32L, 55L, 32L, 58L, 55L, 49L, 55L, 55L, 75L, 55L, 60L, 70L, 80L,
> > > 48L, 79L, 65L, 65L, 55L, 32L, 48L, 55L, 60L, 55L, 32L, 47L, 50L,
> > > 55L, 70L, 79L, 40L, 70L, 55L, 40L, 42L, 70L, 58L), precio.antes = c(2799L,
> > > 1799L, 1699L, 599L, 1299L, 699L, 999L, 1999L, 999L, 499L, 1899L,
> > > 1799L, 2499L, 3999L, 3699L, 10999L, 4299L, 5499L, 6999L, 14999L,
> > > 8999L, 9999L, 14599L, 1999L, 2299L, 2299L, 2899L, 2999L, 2299L,
> > > 23992L, 3599L, 3799L, 4799L, 4999L, 8499L, 5999L, 4999L, 3999L,
> > > 11999L, 10999L, 4399L, 4499L, 3799L, 1399L, 2299L, 2799L, 999L,
> > > 2199L, 2299L, 2299L, 1299L, 1699L, 3499L, 1399L, 1549L, 1299L,
> > > 2399L, 6499L, 2999L, 999L, 1249L, 999L, 14999L, 799L, 5999L,
> > > 4499L, 4999L, 6499L, 12999L, 24999L, 8999L, 5999L, 7599L, 14999L,
> > > 2499L, 29999L, 13999L, 9999L, 9699L, 1299L, 2399L, 6999L, 7999L,
> > > 3699L, 999L, 1899L, 2999L, 7999L, 8499L, 24999L, 1399L, 13999L,
> > > 8499L, 999L, 2599L, 5799L, 2399L), precio.nuevo = c(2299, 1399,
> > > 1299, 549, 1099, 629, 799, 1699, 849, 439, 1499, 1549, 1759.2,
> > > 2099.3, 2309.3, 7699.3, 2799.3, 3639.3, 4899.3, 10499.3, 5109.3,
> > > 6999.3, 10219.3, 1399, 1599, 1599, 2199, 2199, 1299, 23992, 2299,
> > > 2299, 2899, 2999, 5999, 3899, 4999, 3999, 8999, 6999, 4099, 3999,
> > > 3499, 1299, 1799, 2399, 799, 2199, 1799, 1999, 1199, 1599, 2999,
> > > 1199, 1399, 1099, 1999, 5999, 2799, 999, 1199, 949, 7999, 799,
> > > 5299, 4299, 3999, 5999, 11999, 23999, 7999, 5699, 7599, 14499,
> > > 2399, 29999, 11999, 8999, 7499, 1099, 2199, 6599, 7099, 3599,
> > > 899, 1599, 2199, 4999, 6499, 19999, 1399, 9999, 5999, 999, 2599,
> > > 5699, 2399), dif.precios = c(500, 400, 400, 50, 200, 70, 200,
> > > 300, 150, 60, 400, 250, 739.8, 1899.7, 1389.7, 3299.7, 1499.7,
> > > 1859.7, 2099.7, 4499.7, 3889.7, 2999.7, 4379.7, 600, 700, 700,
> > > 700, 800, 1000, 0, 1300, 1500, 1900, 2000, 2500, 2100, 0, 0,
> > > 3000, 4000, 300, 500, 300, 100, 500, 400, 200, 0, 500, 300, 100,
> > > 100, 500, 200, 150, 200, 400, 500, 200, 0, 50, 50, 7000, 0, 700,
> > > 200, 1000, 500, 1000, 1000, 1000, 300, 0, 500, 100, 0, 2000,
> > > 1000, 2200, 200, 200, 400, 900, 100, 100, 300, 800, 3000, 2000,
> > > 5000, 0, 4000, 2500, 0, 0, 100, 0), dif.porcentual = c(17.86,
> > > 22.23, 23.54, 8.35, 15.4, 10.01, 20.02, 15.01, 15.02, 12.02,
> > > 21.06, 13.9, 29.6, 47.5, 37.57, 30, 34.88, 33.82, 30, 30, 43.22,
> > > 30, 30, 30.02, 30.45, 30.45, 24.15, 26.68, 43.5, 0, 36.12, 39.48,
> > > 39.59, 40.01, 29.42, 35.01, 0, 0, 25, 36.37, 6.82, 11.11, 7.9,
> > > 7.15, 21.75, 14.29, 20.02, 0, 21.75, 13.05, 7.7, 5.89, 14.29,
> > > 14.3, 9.68, 15.4, 16.67, 7.69, 6.67, 0, 4, 5.01, 46.67, 0, 11.67,
> > > 4.45, 20, 7.69, 7.69, 4, 11.11, 5, 0, 3.33, 4, 0, 14.29, 10,
> > > 22.68, 15.4, 8.34, 5.72, 11.25, 2.7, 10.01, 15.8, 26.68, 37.5,
> > > 23.53, 20, 0, 28.57, 29.42, 0, 0, 1.72, 0), rangos = c("S/.1500 - S/.2500",
> > > "S/.500 - S/.1500", "S/.500 - S/.1500", "S/.500 - S/.1500", "S/.500 -
> > > S/.1500",
> > > "S/.500 - S/.1500", "S/.500 - S/.1500", "S/.1500 - S/.2500",
> > > "S/.500 - S/.1500", "< S/.500", "S/.500 - S/.1500", "S/.1500 - S/.2500",
> > > "S/.1500 - S/.2500", "S/.1500 - S/.2500", "S/.1500 - S/.2500",
> > > "> S/.4,500", "S/.2500 - S/.3500", "S/.3500 - S/.4500", "> S/.4,500",
> > > "> S/.4,500", "> S/.4,500", "> S/.4,500", "> S/.4,500", "S/.500 - S/.1500",
> > > "S/.1500 - S/.2500", "S/.1500 - S/.2500", "S/.1500 - S/.2500",
> > > "S/.1500 - S/.2500", "S/.500 - S/.1500", "> S/.4,500", "S/.1500 - S/.2500",
> > > "S/.1500 - S/.2500", "S/.2500 - S/.3500", "S/.2500 - S/.3500",
> > > "> S/.4,500", "S/.3500 - S/.4500", "> S/.4,500", "S/.3500 - S/.4500",
> > > "> S/.4,500", "> S/.4,500", "S/.3500 - S/.4500", "S/.3500 - S/.4500",
> > > "S/.2500 - S/.3500", "S/.500 - S/.1500", "S/.1500 - S/.2500",
> > > "S/.1500 - S/.2500", "S/.500 - S/.1500", "S/.1500 - S/.2500",
> > > "S/.1500 - S/.2500", "S/.1500 - S/.2500", "S/.500 - S/.1500",
> > > "S/.1500 - S/.2500", "S/.2500 - S/.3500", "S/.500 - S/.1500",
> > > "S/.500 - S/.1500", "S/.500 - S/.1500", "S/.1500 - S/.2500",
> > > "> S/.4,500", "S/.2500 - S/.3500", "S/.500 - S/.1500", "S/.500 - S/.1500",
> > > "S/.500 - S/.1500", "> S/.4,500", "S/.500 - S/.1500", "> S/.4,500",
> > > "S/.3500 - S/.4500", "S/.3500 - S/.4500", "> S/.4,500", "> S/.4,500",
> > > "> S/.4,500", "> S/.4,500", "> S/.4,500", "> S/.4,500", "> S/.4,500",
> > > "S/.1500 - S/.2500", "> S/.4,500", "> S/.4,500", "> S/.4,500",
> > > "> S/.4,500", "S/.500 - S/.1500", "S/.1500 - S/.2500", "> S/.4,500",
> > > "> S/.4,500", "S/.3500 - S/.4500", "S/.500 - S/.1500", "S/.1500 - S/.2500",
> > > "S/.1500 - S/.2500", "> S/.4,500", "> S/.4,500", "> S/.4,500",
> > > "S/.500 - S/.1500", "> S/.4,500", "> S/.4,500", "S/.500 - S/.1500",
> > > "S/.2500 - S/.3500", "> S/.4,500", "S/.1500 - S/.2500")), .Names = c("id",
> > > "marca", "producto", "pulgadas", "precio.antes", "precio.nuevo",
> > > "dif.precios", "dif.porcentual", "rangos"), class = "data.frame", row.names
> > > = c(NA,
> > > -97L))
> > >
> > >       [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> > David Winsemius
> > Alameda, CA, USA
> >
> >
> 
> David Winsemius
> Alameda, CA, USA
> 
> 

David Winsemius
Alameda, CA, USA



More information about the R-help mailing list