Many thanks to all of you for your responses.

So, I will try to be clearer with a larger example. The end of my mail is the 
most important part for understanding what I am trying to do. I am trying to 
simulate data to fit a linear mixed model (nested, not crossed). More precisely, 
I would like to end up with a table (.txt) with rows and columns. 
The rows will be the 2000 pupils (numbered in column 1) and the other columns 
the variables: Column 2 = class; Column 3 = teacher; Column 4 = school; 
Column 5 = gender (boy or girl); Column 6 = mark in French.

Pupils are nested in classes, classes are nested in schools. The teachers are 
part of the structure too.

I want to simulate a dataset with n=2000 pupils, 100 classes, 50 teachers and 
10 schools.
- Pupils n°1 to pupils n°2000 (p1, p2, p3, p4, ..., p2000)
- Classes n°1 to classes n°100 (c1, c2, c3, c4,..., c100)
- Teachers n°1 to teacher n°50 (t1, t2, t3, t4, ..., t50)
- Schools n°1 to school n°10 (s1, s2, s3, s4, ..., s10)

The nested structure is as follows:

-- School 1 with teacher 1 to teacher 5 (t1, t2, t3, t4 and t5) with classes 1 
to classes 10 (c1, c2, c3, c4, c5, c6, c7, c8,c9,c10), pupils n°1 to pupils 
n°200 (p1, p2, p3, p4,..., p200).

-- School 2 with teacher 6 to teacher 10, with classes 11 to classes 20, pupils 
n°201 to pupils n°400

-- and so on
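
The "and so on" follows a regular pattern, so each pupil's class, teacher and 
school can be derived from the pupil number alone. A minimal sketch in R, 
assuming 20 pupils per class, 2 classes per teacher and 10 classes per school 
(which is what the counts above imply); the function name `locate_pupil` is 
made up for illustration:

```r
# Hypothetical helper: map pupil numbers (1..2000) to class, teacher and
# school, assuming 20 pupils per class, 2 classes per teacher and
# 10 classes per school.
locate_pupil <- function(p) {
  class_id   <- (p - 1) %/% 20 + 1         # 20 consecutive pupils per class
  teacher_id <- (class_id - 1) %/% 2 + 1   # 2 consecutive classes per teacher
  school_id  <- (class_id - 1) %/% 10 + 1  # 10 consecutive classes per school
  data.frame(Pupil   = paste0("p", p),
             Class   = paste0("c", class_id),
             Teacher = paste0("t", teacher_id),
             School  = paste0("s", school_id),
             stringsAsFactors = FALSE)
}

# Boundary cases: last pupil of school 1, first pupil of school 2, last pupil.
locate_pupil(c(1, 200, 201, 2000))
```

Integer division keeps the nesting consistent by construction: a class can 
never end up in two schools, and a teacher can never end up in two schools.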

The table (.txt) I would love to get at the end is the following :

      Class   Teacher   School   gender   Mark
1     c1      t1        s1       boy      5
2     c1      t1        s1       boy      5.5
3     c1      t1        s1       girl     4.5
4     c1      t1        s1       girl     6
5     c1      t1        s1       boy      3.5
6     ...     ...       ...      ...      ...

The first 20 rows with c1, with t1, with s1, gender (randomly selected) and mark 
(randomly selected) from 1 to 6
The rows 21 to 40 with c2 with t1 with s1
The rows 41 to 60 with c3 with t2 with s1
The rows 61 to 80 with c4 with t2 with s1
The rows 81 to 100 with c5 with t3 with s1
The rows 101 to 120 with c6 with t3 with s1
The rows 121 to 140 with c7 with t4 with s1
The rows 141 to 160 with c8 with t4 with s1
The rows 161 to 180 with c9 with t5 with s1
The rows 181 to 200 with c10 with t5 with s1

The rows 201 to 220 with c11 with t6 with s2
The rows 221 to 240 with c12 with t6 with s2

And so on...

Is it possible to do that? Or am I dreaming?
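
A minimal sketch of one way such a table might be built in base R, assuming 
20 pupils per class, 2 classes per teacher and 10 classes per school as in the 
row listing above. The sampling choices (boys and girls equally likely, marks 
uniform from 1 to 6 in steps of 0.5) and the output file name are assumptions 
for illustration only. Note that purely random marks carry no class or school 
effect, so a mixed model fitted to them should estimate variance components 
close to zero.

```r
set.seed(123)  # reproducible draws

n_pupils <- 2000

sim <- data.frame(
  # rep(..., each = k) repeats each label k times in a row, which gives
  # the block layout: rows 1-20 are c1, rows 21-40 are c2, and so on.
  Class   = paste0("c", rep(1:100, each = 20)),
  Teacher = paste0("t", rep(1:50,  each = 40)),   # 2 classes x 20 pupils
  School  = paste0("s", rep(1:10,  each = 200)),  # 10 classes x 20 pupils
  # Assumption: gender is drawn with equal probability, independent of class.
  gender  = sample(c("boy", "girl"), n_pupils, replace = TRUE),
  # Assumption: marks are uniform on 1, 1.5, ..., 6, independent of everything.
  Mark    = sample(seq(1, 6, by = 0.5), n_pupils, replace = TRUE),
  stringsAsFactors = FALSE
)

head(sim)

# Write a tab-separated text file; the row names 1..2000 act as pupil numbers.
out_file <- file.path(tempdir(), "simulated_pupils.txt")  # hypothetical name
write.table(sim, out_file, sep = "\t", quote = FALSE)
```

The file can be read back with `read.table(out_file, header = TRUE)`.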


On Sunday, 19 May 2019 at 10:45:43 UTC+2, Linus Chen <linus.l.c...@gmail.com> 
wrote:





Dear varin sacha,

I think it will help us help you, if you give a clearer description of
what exactly you want.

I assume the situation is that you know what data structure you
want, but do not know
how to conveniently create such a structure.
And that is where others can help you.
So, please, describe the wanted data structure more thoroughly,
ideally with example.

Thanks,
Lei

On Sat, May 18, 2019 at 10:04 PM varin sacha via R-help
<r-help@r-project.org> wrote:
>
> Dear Boris,
>
> Yes, top-down, no problem. Many thanks, but in your code did you not forget 
> "teacher" ? As a reminder teacher has to be nested with classes. I mean the 
> 50 pupils belonging to C1 must be with (teacher 1) T1, the 50 pupils 
> belonging to C2 with T2, the 50 pupils belonging to C3 with T3 and so on.
>
> Best,
>
>
> On Saturday, 18 May 2019 at 16:52:48 UTC+2, Boris Steipe 
> <boris.ste...@utoronto.ca> wrote:
>
>
>
>
>
> Can you build your data top-down?
>
>
>
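> # Build the hierarchy top-down: schools, then classes, then pupils.
> schools <- paste("s", 1:6, sep="")
>
> # Five classes per school, labelled e.g. "s1.c1".
> classes <- character()
> for (school in schools) {
>  classes <- c(classes, paste(school, paste("c", 1:5, sep=""), sep = "."))
> }
>
> # Ten pupils per class, labelled e.g. "s1.c1.p1".
> pupils <- character()
> for (class in classes) {
>  pupils <- c(pupils, paste(class, paste("p", 1:10, sep=""), sep = "."))
> }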
>
>
>
> B.
>
>
>
> > On 2019-05-18, at 09:57, varin sacha via R-help <r-help@r-project.org> 
> > wrote:
> >
> > Dear R-Experts,
> >
> > In a data simulation, I would like a balanced distribution with a nested 
> > structure for classroom and teacher (not for school). I mean 50 pupils 
> > belonging to C1, 50 other pupils belonging to C2, 50 other pupils belonging 
> > to C3 and so on. Then I want the 50 pupils belonging to C1 with T1, the 50 
> > pupils belonging to C2 with T2, the 50 pupils belonging to C3 with T3 and 
> > so on. The schools don’t have to be nested, I just want a balanced 
> > distribution, I mean 60 pupils in S1, 60 other pupils in S2 and so on.
> > Here below the reproducible example.
> > Many thanks for your help.
> >
> > ##############
> > set.seed(123)
> > # Random generation of the columns
> > pupils<-1:300
> > classroom<-sample(c("C1","C2","C3","C4","C5","C6"),300,replace=TRUE)
> > teacher<-sample(c("T1","T2","T3","T4","T5","T6"),300,replace=TRUE)
> > school<-sample(c("S1","S2","S3","S4","S5"),300,replace=TRUE)
>
> > ##############
> >
> > ______________________________________________
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.


