Hello Joel,

On Friday 22 August 2014 17:25:24 Joel Sherrill wrote:
> Pushed.
>
> Followups can just be subsequent patches.
thanks, you are faster than light ...

As for the RTEMS timekeeping code, I can imagine how it could look better. I do not like Clock_driver_nanoseconds_since_last_tick. I am not even sure whether it is really used by TOD (in the ticker test it seems to print rounded values on our board).

In fact, I would like to see RTEMS work completely tickless on hardware with a modern free-running timebase and easily updated compare-event hardware. That would allow all POSIX time-related functions to be implemented with a resolution limited only by the hardware. The scheduler is an open question. When more than one task of the same priority is ready to run, a tick is the easiest solution, but even in that case the slice time can be computed and only a single timer event armed for its expiration. All of that is a huge amount of work, though, so I would start with the easier side now.

It is necessary to have a reliable timebase. Consider a 64-bit value running at some clock-source speed. It is really hard to get that reliable on PC hardware: the common 8254 can be used for it, but access is horribly slow, and all the other mechanisms (HPET, TSC) are problematic - they need probing and checking that they are correct and synchronous between cores, that they do not change across sleep modes, etc. It is a really difficult task, solved by thousands of lines of code in the Linux kernel. ARM and PowerPC based systems, on the other hand, usually provide a reasonable timer source register which is synchronized across all cores. Unfortunately, the ARM ones usually provide only a 32-bit wide register.

I have solved the problem of extending such a 32-bit counter to 64 bits for a friend of mine who worked at BlackBerry. Their phone platform uses Cortex-A and QNX. The design constraints were given by the use case - userspace event timestamping in the QML profiler. This means the code can be called on multiple cores concurrently, using a mutex would degrade performance horribly, privileged instructions cannot be used, and the value available from the core was only 32 bits wide.
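(As an aside on the tickless idea above: with a free-running timebase and a writable compare register, the scheduler would arm one event at the computed slice end instead of taking a periodic tick. A minimal sketch, where timebase_read() and compare_write() are hypothetical stand-ins for whatever timer registers a real BSP has:)

```c
#include <stdint.h>

/* Hypothetical stand-ins for the hardware registers; a real BSP would
   read the free-running timebase and write the compare/match register. */
static uint64_t fake_timebase;
static uint64_t fake_compare;

static uint64_t timebase_read(void)           { return fake_timebase; }
static void     compare_write(uint64_t value) { fake_compare = value; }

/* Tickless slice handling: compute when the current time slice expires
   and arm a single compare event for exactly that moment, instead of
   fielding a periodic tick interrupt. Returns the armed deadline in
   timebase cycles. */
static uint64_t arm_slice_end(uint64_t slice_ns, uint64_t ns_per_cycle)
{
    uint64_t deadline = timebase_read() + slice_ns / ns_per_cycle;
    compare_write(deadline);
    return deadline;
}
```

Nothing fires between deadlines, so the resolution of sleeps and timeouts is limited only by the timer hardware, which is the point of going tickless.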
I designed the attached code fragments for him, and he wrote some Qt-derived code which was used in Q10 phone debugging builds. The main idea is to extend the counter to more than 60 bits without locking, using GCC's builtin atomic support to ensure that a counter overflow results in only a single increment of the high part of the value. The only requirement for correct operation is that clockCyclesExt() is called at least once per half of the counter overflow period and that its execution is not interrupted for longer than the equivalent time. The code even minimizes cache write contention. What do you think about using this approach in RTEMS?

The next step is to base timing on values which are not derived from ticks. I have seen the discussion about the NTP time format (integer seconds + 1/2^32 fractions). The other option is 64-bit nanoseconds, which is better with regard to the 2038 overflow problem. The priority queue for fine-grained timer ordering is a tough task. It would be worth having all operations take an additional parameter specifying the required precision for each interval/time event, etc. But that is a longer discussion and an incremental solution, and I cannot devote my full time to such enhancements anyway. It could be a nice project if funding is found.

I have a friend who has grants from ESA to develop theory for precise time-source fusion (atomic clocks etc.) and who works on real hardware for satellite-based clock synchronization too. We spoke about Linux kernel NTP time synchronization and the PLL loop a long time ago, and we both came to the same conclusion about how it should be done the right way. It would be interesting to have this solution in RTEMS as well, but to do it right would require some agency/company funded project. We even have networking cards with full IEEE-1588 HW support for Intel there, and some articles about our findings on the time-synchronization problem, where the most problematic part is the latencies between the Ethernet card hardware and the CPU core.
They are even more problematic than precise time over a local Ethernet LAN ... So I think there are enough competent people to come up with something useful, but most of them cannot afford to work on it only for their pleasure.

OK, that was some dump of my ideas. I need to switch to other HW testing now to keep our company and university project above sea level.

Best wishes,

Pavel
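(On the two time formats mentioned above - NTP-style integer seconds + 1/2^32 fractions versus plain 64-bit nanoseconds - a minimal conversion sketch. The helper names are chosen for illustration only; the discussion does not fix any concrete API:)

```c
#include <stdint.h>

/* NTP-style timestamp: integer seconds in the high 32 bits,
   1/2^32 fractions of a second in the low 32 bits. */
typedef uint64_t ntp_time_t;

#define NS_PER_SEC 1000000000ULL

/* Convert 64-bit nanoseconds to the NTP fixed-point format. */
static inline ntp_time_t ns_to_ntp(uint64_t ns)
{
    uint64_t sec  = ns / NS_PER_SEC;
    uint64_t frac = ((ns % NS_PER_SEC) << 32) / NS_PER_SEC;
    return (sec << 32) | frac;
}

/* Convert back; the floor in each direction means a round trip
   loses at most one nanosecond. */
static inline uint64_t ntp_to_ns(ntp_time_t t)
{
    uint64_t sec  = t >> 32;
    uint64_t frac = t & 0xffffffffULL;
    return sec * NS_PER_SEC + ((frac * NS_PER_SEC) >> 32);
}
```

Note the trade-off: the NTP form gives sub-nanosecond fraction resolution but only 32 bits of seconds, while the 64-bit nanosecond form runs for roughly 584 years before wrapping, which sidesteps the 2038-style overflow.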
/* gcc -Wall atomic-extend-cc.c */
/***************************************************************************
 * Copyright (c) 2014, Pavel Pisa <p...@cmp.felk.cvut.cz>
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are
 * met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer
 *    - or license is changed to one of following standard licenses
 *      BSD, GPL (even with linking exception), LGPL, MPL
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the
 *    distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
 * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
 * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 ***************************************************************************/

#include <stdint.h>

/* for GCC 4.8 when C11 is implemented */
/* #include <stdatomic.h> */

uint32_t hw_cnt;
uint64_t extended_cnt;

/* Simulated hardware counter readings used to exercise the wrap logic. */
uint32_t hw_cnt_test_seq[] = {
  0x10000000,
  0x20000000,
  0x20000000,
  0x20000001,
  0x20000002,
  0xC0000000,
  0xF0000000,
};

uint64_t ClockCycles(void)
{
  static int seq_pos = 0;

  if (seq_pos >= sizeof(hw_cnt_test_seq) / sizeof(*hw_cnt_test_seq))
    seq_pos = 0;
  hw_cnt = hw_cnt_test_seq[seq_pos++];
  return (uint64_t)hw_cnt << 32;
}

#if 0
/* Naive mutex-protected variant, kept for comparison. */
uint64_t clockCyclesExt(void)
{
  static uint32_t wrap_count = 0;
  static uint32_t recent_cc = 0;

  mutex.lock();
  uint32_t cc = (uint32_t)(ClockCycles() >> 32);
  if (cc < recent_cc)
    wrap_count++;
  recent_cc = cc;
  uint64_t ret = wrap_count;
  mutex.unlock();
  ret <<= 32;
  ret += cc;
  return ret;
}
#else
/* Has to be one or more bits; additional bits result in a relaxed
   requirement on the relative timing of calls, i.e. on the minimal
   required frequency of the calls. The minimal frequency for 1 bit is
   given by CC/2; for more bits the allowed jitter period is almost
   equal to CC. */
#define CC_HI_MASK 0xC0000000

uint64_t clockCyclesExt(void)
{
  static uint32_t cc_wrap_and_hi = 0;
  uint32_t wahi, expected;
  uint32_t cc;

#if (__GNUC__ * 1000 + __GNUC_MINOR__) >= 4007
  wahi = __atomic_load_n(&cc_wrap_and_hi, __ATOMIC_SEQ_CST);
#else /* OLD GCC */
  wahi = *(volatile uint32_t *)&cc_wrap_and_hi;
  __sync_synchronize();
#endif /* OLD GCC */

  cc = (uint32_t)(ClockCycles() >> 32);

  if ((cc ^ wahi) & CC_HI_MASK) {
    /* Slow path: the high bits of the counter changed since the last
       stored snapshot, so update the wrap count and high-bit cache. */
    expected = wahi;
    if (cc < wahi)
      wahi++;
    wahi &= ~CC_HI_MASK;
    wahi |= cc & CC_HI_MASK;
#if (__GNUC__ * 1000 + __GNUC_MINOR__) >= 4007
    /* __atomic_compare_exchange_n(type *ptr, type *expected, type desired,
         bool weak, int success_memmodel, int failure_memmodel) */
    __atomic_compare_exchange_n(&cc_wrap_and_hi, &expected, wahi,
                                0, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
#else /* OLD GCC */
    /* bool __sync_bool_compare_and_swap(type *ptr, type oldval,
         type newval, ...) */
    __sync_bool_compare_and_swap(&cc_wrap_and_hi, expected, wahi);
#endif /* OLD GCC */
  }

  uint64_t ret = wahi & ~CC_HI_MASK;
  ret <<= 32;
  ret |= cc;
  return ret;
}
#endif

#include <stdio.h>
#include <inttypes.h>

int main(int argc, char *argv[])
{
  int i;

  for (i = 0; i < 100; i++) {
    extended_cnt = clockCyclesExt();
    printf("%016"PRIx64"\n", extended_cnt);
  }
  return 0;
}
/* g++ -Wall -std=c++11 atomic-extend-cc.cpp */
/***************************************************************************
 * Copyright (c) 2014, Pavel Pisa <p...@cmp.felk.cvut.cz>
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are
 * met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer
 *    - or license is changed to one of following standard licenses
 *      BSD, GPL (even with linking exception), LGPL, MPL
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the
 *    distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
 * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
 * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 ***************************************************************************/

#include <atomic>
#include <cstdint>

uint32_t hw_cnt;
uint64_t extended_cnt;

uint64_t ClockCycles(void)
{
  return (uint64_t)hw_cnt << 32;
}

#if 0
/* Naive mutex-protected variant, kept for comparison. */
uint64_t clockCyclesExt(void)
{
  static uint32_t wrap_count = 0;
  static uint32_t recent_cc = 0;

  mutex.lock();
  uint32_t cc = (uint32_t)(ClockCycles() >> 32);
  if (cc < recent_cc)
    wrap_count++;
  recent_cc = cc;
  uint64_t ret = wrap_count;
  mutex.unlock();
  ret <<= 32;
  ret += cc;
  return ret;
}
#else
/* Has to be one or more bits; additional bits result in a relaxed
   requirement on the relative timing of calls, i.e. on the minimal
   required frequency of the calls. The minimal frequency for 1 bit is
   given by CC/2; for more bits the allowed jitter period is almost
   equal to CC. */
#define CC_HI_MASK 0xC0000000

uint64_t clockCyclesExt(void)
{
  static std::atomic<uint32_t> cc_wrap_and_hi = ATOMIC_VAR_INIT(0);
  uint32_t wahi, expected;

  wahi = cc_wrap_and_hi.load(std::memory_order_seq_cst);

  uint32_t cc = (uint32_t)(ClockCycles() >> 32);

  if ((cc ^ wahi) & CC_HI_MASK) {
    /* Slow path: the high bits of the counter changed since the last
       stored snapshot, so update the wrap count and high-bit cache. */
    expected = wahi;
    if (cc < wahi)
      wahi++;
    wahi &= ~CC_HI_MASK;
    wahi |= cc & CC_HI_MASK;
    /* Maps to __atomic_compare_exchange_n(type *ptr, type *expected,
         type desired, bool weak, int success_memmodel,
         int failure_memmodel) */
    cc_wrap_and_hi.compare_exchange_strong(expected, wahi,
                                           std::memory_order_acq_rel);
  }

  uint64_t ret = wahi & ~CC_HI_MASK;
  ret <<= 32;
  ret |= cc;
  return ret;
}
#endif

int main(int argc, char *argv[])
{
  extended_cnt = clockCyclesExt();
  return 0;
}
_______________________________________________
devel mailing list
devel@rtems.org
http://lists.rtems.org/mailman/listinfo/devel