Dear John, thanks a lot for the detailed answer. Yes, I am not an expert in the R language; when a problem comes up, I google it or post it on these forums. (I have only a little experience with ML in R.)

On Thu, Feb 17, 2022 at 8:21 PM John Fox <j...@mcmaster.ca> wrote:

> Dear Neha Gupta,
>
> On 2022-02-17 1:54 p.m., Neha gupta wrote:
> > Hello everyone
> >
> > I have a dataset with output variable "bug" having the following values (at
> > the bottom of this email). My advisor asked me to provide data distribution
> > of bugs with 0 values and bugs with more than 0 values.
> >
> > data = readARFF("synapse.arff")
> > data2 = readARFF("synapse.arff")
> > data$bug
> > library(tidyverse)
> > data %>%
> >   filter(bug == 0)
> > data2 %>%
> >   filter(bug >= 1)
> > boxplot(data2$bug, data$bug, range=0)
> >
> > But both the graphs are exactly the same, how is it possible? Where I am
> > doing wrong?
>
> As it turns out, you're doing several things wrong.
>
> First, you're not using pipes and filter() correctly. That is, you don't
> do anything with the filtered versions of the data sets. You're
> apparently under the incorrect impression that filtering modifies the
> original data set.
>
> Second, you're greatly complicating a simple problem. You don't need to
> read the data twice and keep two versions of the data set. As well,
> processing the data with pipes and filter() is entirely unnecessary. The
> following code works:
>
> with(data, boxplot(bug[bug == 0], bug[bug >= 1], range=0))
>
> Third, and most fundamentally, the parallel boxplots you're apparently
> trying to construct don't really make sense. The first "boxplot" is just
> a horizontal line at 0 and so conveys no information. Why not just plot
> the nonzero values if that's what you're interested in?
>
> Fourth, you didn't share your data in a convenient form.
> I was able to reconstruct them via
>
> bug <- scan()
> 0 1 0 0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 0 0 0 0 1 0 0 0 0 0 0 4 1 0
> 0 1 0 0 0 0 0 0 1 0 3 2 0 0 0 0 3 0 0 0 0 2 0 0 0 1 0 0 0 0 1 1 1 0 0 0 0 0 0
> 1 1 2 1 0 1 0 0 0 2 2 1 1 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 5 0 0 0 0 0 0 7 0 0 1
> 0 1 1 0 2 0 3 0 1 0 0 1 0 0 0 0 0 1 1 0 0 0 0 1 0 3 2 1 1 0 0 0 0 0 0 0 1 0 0
> 0 0 0 0 0 0 0 0 0 1 0 1 0 0 3 0 0 1 0 1 3 0 0 0 0 0 0 0 0 1 0 4 1 1 0 0 0 0 1
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 1 0 0 0 0 0
>
> data <- data.frame(bug)
>
> Finally, it's better to post to the list in plain-text email, rather
> than html (as the posting guide suggests).
>
> I hope this helps,
> John
>
> > data$bug
> >   [1] 0 1 0 0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 0 0 0 0 1 0 0 0 0 0 0 4 1 0
> >  [40] 0 1 0 0 0 0 0 0 1 0 3 2 0 0 0 0 3 0 0 0 0 2 0 0 0 1 0 0 0 0 1 1 1 0 0 0 0 0 0
> >  [79] 1 1 2 1 0 1 0 0 0 2 2 1 1 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 5 0 0 0 0 0 0 7 0 0 1
> > [118] 0 1 1 0 2 0 3 0 1 0 0 1 0 0 0 0 0 1 1 0 0 0 0 1 0 3 2 1 1 0 0 0 0 0 0 0 1 0 0
> > [157] 0 0 0 0 0 0 0 0 0 1 0 1 0 0 3 0 0 1 0 1 3 0 0 0 0 0 0 0 0 1 0 4 1 1 0 0 0 0 1
> > [196] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 1 0 0 0 0 0
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> --
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> web: https://socialsciences.mcmaster.ca/jfox/
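
The first and third points in the quoted reply (filtering returns a new object rather than changing the original, and a boxplot of only zeros carries no information) can be illustrated with a minimal sketch. The data below are a small made-up vector, not the values from synapse.arff, and the names zero_bugs, nonzero_bugs and split_counts are hypothetical. Base R's subset() is used so the example needs no packages; the commented line shows what is assumed to be the equivalent dplyr form.

```r
# Hypothetical toy data; the real values come from synapse.arff.
data <- data.frame(bug = c(0, 1, 0, 0, 2, 0, 3, 1, 0, 0))

# Subsetting returns a new data frame and leaves `data` untouched,
# so the result must be assigned if it is to be used later.
zero_bugs    <- subset(data, bug == 0)
nonzero_bugs <- subset(data, bug >= 1)
# Assumed dplyr equivalent: nonzero_bugs <- data %>% filter(bug >= 1)

nrow(data)          # 10: the original is unchanged
nrow(zero_bugs)     # 6
nrow(nonzero_bugs)  # 4

# Counts of zero vs. nonzero values describe the split directly.
split_counts <- table(ifelse(data$bug == 0, "zero", "nonzero"))
print(split_counts)

# Plot only the nonzero values; a boxplot of all zeros is a flat line.
boxplot(nonzero_bugs$bug, range = 0, ylab = "bug (nonzero only)")
```

With the real data, the same table() call would report how many modules have no bugs versus at least one, which may answer the advisor's question more directly than parallel boxplots.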