On Mon, May 25, 2020 at 9:32 PM Gedare Bloom <ged...@rtems.org> wrote:
> On Mon, May 25, 2020 at 5:39 AM Utkarsh Rai <utkarsh.ra...@gmail.com> wrote:
> >
> > On Fri, May 22, 2020 at 10:59 AM Gedare Bloom <ged...@rtems.org> wrote:
> >>
> >> > This means that our low-level design for providing thread stack protection may look something like this:
> >> >
> >> > 1. For MPU-based processors, the number of protected stacks will depend on the number of protection domains, i.e. for MPUs with 8 protection domains we can have 7 protected stacks (1 of the regions will be assigned to global data). For MMU-based systems we will have a section (a page of size 1MB) for global data, and the task address space will be divided into smaller pages; page sizes will be decided by keeping in mind the number of TLB entries, in the manner I have described above in the thread.
> >> >
> >> There is value to defining a few of the global regions. I'll assume R/W/X permissions. Then code (.text) should be R/X. Read-only data sections should be grouped together and made R. Data sections should be RW. And then stacks should be added to the end. The linker scripts should be used to group the related sections together. I think some ARM BSPs do some of this already. That seems like a minimally useful configuration for most users that would care; they also want protection of code from accidental overwrite, and probably data too, and non-executable data in general. You may also have to consider a few more permission complications (shared/cacheable) depending on the hardware.
> >
> > The low-level MMU implementation for ARMv7 BSPs has 'ARMV7_CP15_START_DEFAULT_SECTIONS', which lists out various regions with appropriate permissions that are then grouped by a linker script. This should be the standard way of handling the placement of statically allocated regions.
> >
> >> > 2. The protection, size, page table, and sharing attributes of each created thread will be tracked.
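[As an aside, the global-region permission layout described above (R/X code, R read-only data, RW data, stacks appended at the end) could be modeled with a small table like the following minimal C sketch. All names here (`REGION_R`, `global_region`, etc.) are hypothetical, not an RTEMS or ARMv7 API; the real encodings come from the hardware, e.g. the CP15 section attributes mentioned above.]

```c
#include <stdint.h>

/* Hypothetical permission flags; real encodings are hardware-specific
 * (e.g. ARMv7 short-descriptor section attributes). */
#define REGION_R 0x1u
#define REGION_W 0x2u
#define REGION_X 0x4u

typedef struct {
  const char *section; /* section group placed by the linker script */
  uint32_t    perms;   /* R/W/X permission bits                     */
} global_region;

/* Global regions grouped by the linker script, per the discussion:
 * code R/X, read-only data R, writable data RW; stacks come later. */
static const global_region global_regions[] = {
  { ".text",   REGION_R | REGION_X },
  { ".rodata", REGION_R            },
  { ".data",   REGION_R | REGION_W },
  { ".bss",    REGION_R | REGION_W },
};
```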
> >> >
> >> I'd rather we not be calling this a page table. MPU-based systems don't have a notion of a page table. But maybe it is OK as long as we understand that you mean the data structure responsible for mapping out the address space. I'm not sure what you mean by size, unless you refer to that thread's stack.
> >>
> >> > 3. At every context switch, these attributes will be updated; the static-global regions will be assigned a global ASID and will not change during the switch, only the protected regions will be updated.
> >> >
> >> Yes, assuming the hardware supports ASIDs and a global attribute.
> >>
> >> I don't know if you will be able to pin the global entries in hardware. You'll want to keep an eye out for that. If not, you might need to do something in software to ensure they don't get evicted (e.g., touch them all before finishing a context switch, assuming LRU replacement).
> >>
> >> > 4. Whenever we share stacks, the page table entries of the shared stack, with the access bits as specified by the mmap/shm high-level APIs, will be installed for the current thread. This is different from simply providing the page table base address of the shared thread-stack (what if the user wants to make the shared stack only readable from another thread while the 'original' thread is r/w enabled?). We will also have to update the TLB by installing the shared regions while the global regions remain untouched.
> >> >
> >> Correct. I think we need to make a design decision whether a stack can exceed one page. It will simplify things if we can assume that, but it may limit applications unnecessarily. Have to think on that.
> >
> > If we go with the above assumption, we will need to increase the size of the page, i.e. pages of 16KiB or 64KiB. Most applications won't require stacks of this size, and this will result in wasted memory for each thread.
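[Point 3 above, swapping only the protected (stack) regions while the global entries stay in place, can be sketched as a software model like the one below. This is illustrative only; `mmu_entry` and `switch_protected_entries` are hypothetical names, not RTEMS code, and a real implementation would program MPU/MMU registers rather than a C array.]

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread attribute record (point 2) and its
 * context-switch update (point 3). */
typedef struct {
  uintptr_t base;    /* region start address                          */
  size_t    size;    /* region size                                   */
  uint32_t  perms;   /* access permissions                            */
  bool      global;  /* global-ASID region: survives context switches */
} mmu_entry;

enum { NUM_ENTRIES = 8 }; /* e.g. an 8-region MPU */

/* Install the incoming thread's protected regions, reusing only the
 * non-global slots; global text/data entries are left untouched. */
static void switch_protected_entries(mmu_entry table[NUM_ENTRIES],
                                     const mmu_entry *incoming,
                                     size_t count)
{
  size_t next = 0;
  for (size_t i = 0; i < NUM_ENTRIES && next < count; ++i) {
    if (!table[i].global)
      table[i] = incoming[next++];
  }
}
```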
> > I think it would be better if we have multiple pages, as most applications will have stacks that fit in a single 4KiB page anyway.
>
> I mistyped. I meant I think we can assume stacks fit in one page. It would be impossible to deal with otherwise.
>
> >> The "page table base address" points to the entire structure that maps out a thread's address space, so you'd have to walk it to find the entry/entries for its stack. So, definitely not something you'd want to do.
> >>
> >> The shm/mmap should convey the privileges to the requesting thread asking to share. This will result in adding the shared entry/entries to that thread's address space, with the appropriately set permissions. So, if the entry is created with read-only permission, then that is how the thread will be sharing. The original thread's entry should not be modified by the addition of an entry in another thread for the same memory region.
> >>
> >> I lean toward thinking it is better to always pay for the TLB miss at the context switch, which might mean synthesizing accesses to the entries that might have been evicted, in case hardware restricts the ability of software to install/manipulate TLB entries directly. That is something worth looking at more, though. There is definitely a tradeoff between predictable costs and throughput performance. It might be worth implementing both approaches.
> >>
> >> Gedare
> >
> > We also need to consider the cases where stack sharing would be necessary:
> >
> > - We can have explicit cases where an application gets the attributes of a thread by pthread_attr_getstack() and then accesses that stack from another thread.
> >
> > - An implicit case would be when a thread places the address of an object from its stack onto a message queue and other threads access it; in general, all blocking reads (sockets, files, etc.) will share stacks.
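[The read-only sharing case discussed above, where the sharer receives a restricted copy of the entry while the owner's entry is never modified, might look like this hypothetical sketch. `stack_entry`, `make_shared_entry`, and the `PERM_*` bits are illustrative names only, standing in for whatever the mmap/shm layer actually grants.]

```c
#include <stdint.h>

/* Hypothetical permission bits and entry layout; illustrative only. */
#define PERM_R 0x1u
#define PERM_W 0x2u

typedef struct {
  uintptr_t base;  /* stack page base address */
  uint32_t  perms; /* access permissions      */
} stack_entry;

/* Build the entry installed into the sharing thread's address space.
 * The owner's entry is copied, never modified: the sharer's copy keeps
 * only the permissions granted through the mmap/shm request. */
static stack_entry make_shared_entry(stack_entry owner, uint32_t granted)
{
  stack_entry shared = owner;
  shared.perms &= granted; /* e.g. owner R/W, sharer R only */
  return shared;
}
```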
> >
> > This will be documented so that the user first shares the required stacks and then performs the above operations.
>
> Yes. It may also be worth thinking whether we can/should "relocate" stacks when they get shared and spare TLB entries are low. This would be a dynamic way to consolidate regions, while a static way would rely on some configuration method to declare ahead of time which stacks may be shared, or to require the stack allocator (hook) to manage that kind of complexity.

Sorry, but I am not sure I clearly understand what you are suggesting. Does relocating stacks mean moving them to the same virtual address as the thread-stack they are being shared with, but with a different ASID?
_______________________________________________
devel mailing list
devel@rtems.org
http://lists.rtems.org/mailman/listinfo/devel