You are being too specific about the structure of the SKU. In the absence
of better information about the SKU, you need to focus on identifying the
delimiters and the position of the SKU within the string. One way might be:

ecommerce$sku  <- sub( "^(.*)[ \n]+([^ \n]+)$", "\\2", ecommerce$producto )
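
A self-contained sketch of the same idea, using the two example strings from
the question. It assumes the SKU is always the last whitespace-delimited token
in producto, which holds for these two rows but may not hold for the real data.
The stringsAsFactors = FALSE argument and the sku_alt variable are additions
made here for illustration, not part of the original post.

```r
# Test data from the question; stringsAsFactors = FALSE (an addition here)
# keeps producto as a character vector rather than a factor.
ecommerce <- data.frame(
  a = c(1, 2),
  producto = c("SMART TV UHD 49'' CURVO 49MU6300",
               "SMART TV HD 32'' LE32S5970"),
  stringsAsFactors = FALSE
)

# Keep only the last run of characters that are neither space nor newline.
ecommerce$sku <- sub("^(.*)[ \n]+([^ \n]+)$", "\\2", ecommerce$producto)

# Alternative under the same assumption: split on the delimiters and
# take the last token of each string.
tokens <- strsplit(ecommerce$producto, "[ \n]+")
sku_alt <- vapply(tokens, function(x) x[length(x)], character(1))

print(ecommerce$sku)  # the two values are 49MU6300 and LE32S5970
```

Neither version checks what the last token looks like, so a row with no SKU
at the end would silently return its last word instead.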

Please learn to post using plain text format, as HTML formatting corrupts
posted code on this mailing list. The option exists in your email client
(including the GMail Web interface if that is what you use).
-- 
Sent from my phone. Please excuse my brevity.

On August 27, 2017 9:18:52 AM PDT, "Omar André Gonzáles Díaz" 
<oma.gonza...@gmail.com> wrote:
>Hello, I need some help with regex.
>
>I have these two sentences. I need to extract both "49MU6300" and
>"LE32S5970"
>and put them in a new column "SKU".
>
>A) SMART TV UHD 49'' CURVO 49MU6300
>B) SMART TV HD 32'' LE32S5970
>
>DataFrame for testing:
>
>ecommerce <- data.frame(a = c(1,2),
>                        producto = c("SMART TV UHD 49'' CURVO 49MU6300",
>                                     "SMART TV HD 32'' LE32S5970"))
>
>
>I'm using gsub like this:
>
>1.- This would capture A as intended but only "32S5970" from B (missing
>"LE").
>
>ecommerce$sku <- gsub("(.*)([0-9]{2}[a-zA-Z]{1,2}[0-9]{2,4})(.*)", "\\2",
>                      ecommerce$producto)
>
>
>2.- This would capture "LE32S5970" but not "49MU6300".
>
>ecommerce$sku <-
>gsub("(.*)([a-zA-Z]{2}[0-9]{2}[a-zA-Z]{1,2}[0-9]{2,4})(.*)", "\\2",
>ecommerce$producto)
>
>
>3.- If I make the 2 first letter optional with:
>
>ecommerce$sku <-
>gsub("(.*)([a-zA-Z]?{2}[0-9]{2}[a-zA-Z]{1,2}[0-9]{2,4})(.*)", "\\2",
>ecommerce$producto)
>
>
>"49MU6300" is captured, but again only "32S5970" from B (missing "LE").
>
>
>What should I do? How would you approach it?
>
>       [[alternative HTML version deleted]]
>
>______________________________________________
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
