On 20/12/2018 01:46, Chris Johns wrote:
I have read the ticket and not the patch.
On 20/12/2018 08:00, Joel Sherrill wrote:
How does this relate.to <http://relate.to> the existing infrastructure for the
capture engine? It seems duplicative.
Yes it is confusing. With this change we would have 3 possible trace systems,
the capture engine, rtems-tld and this. I would like just one way to trace a
live system.
This patch just adds per-processor ring buffers to record events. The
capture engine does the same and in addition has support for triggers
and filters. The rtems-tld generates trace events, it could use the
stuff from this patch to record them. This was actually the original use
case for this work which I started some months ago for the tracing GSoC
project.
I am OK with the capture engine being replaced with something new and better
that provides all the functionality of the capture engine, ie triggers, floors
etc. The ticket does not state if this is what will happen. It would be good if
this can be clarified. Filtering at runtime is an important performance
consideration.
Event recording is a separate problem from filtering and triggering
events. This patch set only addresses the recording. It tries to do this
as efficient as possible to be able to record high frequency events. It
intentionally does not deal with filtering and triggering. There is a
user extension to generate thread events, but this is just a special
user of this event recoding.
It is also not clear to me why the capture engine cannot be improved to have the
capture method presented here integrated. Sure, currently it does not have the
low level performance because of the locks used but it is not clear to me why
this code cannot be changed?
The locks could be removed, the real problem are the record items.
/**
* @brief The 32-bit format record item.
*/
typedef struct {
uint32_t event;
uint32_t data;
} rtems_record_item_32;
/**
* @brief A capture timestamp.
*
* This is a nanosecond capture timestamp
*/
typedef uint64_t rtems_capture_time;
/*
* @brief Capture record.
*
* This is a record that is written into
* the buffer. The events includes the priority of the task
* at the time of the context switch.
*/
typedef struct rtems_capture_record
{
size_t size;
uint32_t events;
rtems_id task_id;
rtems_capture_time time;
} RTEMS_PACKED rtems_capture_record;
So, this is 8 bytes vs. 20 bytes (this is 250% more and a big deal for
high frequency events). This capture engine ring buffer is more
complicated due to the support for variable length records. I uses
memcpy() to copy the data into the record item. Please compare this with
the rtems_record_produce() assembly code in the ticket.
Getting the timestamps in the capture engine is expensive. In this patch
the CPU counter is used in combination with periodic uptime timestamps
for synchronization. A 22-bit field is used for the timestamp. This is
enough to a support 4GHz CPU counter with a 1000Hz clock tick, the T4240
has a CPU counter frequency of 1.5GHz.
Can we please establish the requirements and how this sits with what we have
sorted out before reviewing patches and details?
My use case for this is a nasty bug. We have about 50 nodes running an
RTEMS application connected via Ethernet. The nodes operate 24/7.
Approximately every 60 days one of the nodes crashes and this leads to a
20 seconds service interruption. Nobody was harmed so far, but it is
still an issue which needs to be fixed. I need a fast (faster than the
capture engine) event recording mechanism for post-mortem analysis.
The basic requirement is recording of high frequency events on high end
embedded SMP machines, e.g. 1.5GHz T4240 with 24 processors and 10Gbit/s
Ethernet.
An important part of any trace work is host integration and the tools to manage
this. Getting data off via TCP, UDP, file etc is only part of the problem so it
would good to know what the integration path on the host is.
The patch set includes a TCP client and server, see test case record01.
Attached is a simple command line tool which I used on Linux to get the
event records (32-bit big endian RTEMS machine, 64-bit little endian
development machine). Some files of this patch are intended to be
compiled on the development machine.
Sorry but I have no time to review this and consider it until next year.
No problem, take your time. I work on this since April this year from
time to time, so it can wait a couple of more weeks.
--
Sebastian Huber, embedded brains GmbH
Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.
Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.
/*
* SPDX-License-Identifier: BSD-2-Clause
*
* Copyright (C) 2018 embedded brains GmbH
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
#include <rtems/recorddata.h>
#include <rtems/recordclient.h>
#include <sys/socket.h>
#include <assert.h>
#include <getopt.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <inttypes.h>
#include <netinet/in.h>
#include <arpa/inet.h>
static const struct option longopts[] = {
{ "help", 0, NULL, 'h' },
{ "host", 1, NULL, 'H' },
{ "port", 1, NULL, 'p' },
{ NULL, 0, NULL, 0 }
};
static const char *host = "127.0.0.1";
static uint16_t port = 1234;
static void usage( char **argv )
{
printf(
"%s --host=HOST --port=PORT\n"
"\n"
"Mandatory arguments to long options are mandatory for short options too.\n"
" -h, --help print this help text\n"
" -H, --host=HOST the host IPv4 address of the record server\n"
" -p, --port=PORT the TCP port of the record server\n",
argv[ 0 ]
);
}
static int connect_client( void )
{
struct sockaddr_in in_addr;
int fd;
int rv;
fd = socket( PF_INET, SOCK_STREAM, 0 );
assert( fd >= 0 );
memset( &in_addr, 0, sizeof( in_addr ) );
in_addr.sin_family = AF_INET;
in_addr.sin_port = htons( port );
in_addr.sin_addr.s_addr = inet_addr( host );
rv = connect( fd, (struct sockaddr *) &in_addr, sizeof( in_addr ) );
assert( rv == 0 );
return fd;
}
static rtems_record_client_status handler(
uint32_t seconds,
uint32_t nanoseconds,
uint32_t cpu,
rtems_record_event event,
uint64_t data,
void *arg
)
{
const char *event_text;
(void) arg;
if (
event == RTEMS_RECORD_UPTIME_LOW
|| event == RTEMS_RECORD_UPTIME_HIGH
) {
return RTEMS_RECORD_CLIENT_SUCCESS;
}
if ( seconds != 0 && nanoseconds != 0 ) {
printf( "%" PRIu32 ".%09" PRIu32 ":", seconds, nanoseconds );
} else {
printf( "*:" );
}
event_text = rtems_record_event_text( event );
if ( event_text != NULL ) {
printf( "%" PRIu32 ":%s:%" PRIx64 "\n", cpu, event_text, data );
} else {
printf( "%" PRIu32 ":%i:%" PRIx64 "\n", cpu, event, data );
}
return RTEMS_RECORD_CLIENT_SUCCESS;
}
int main( int argc, char **argv )
{
rtems_record_client_context ctx;
int fd;
int rv;
int opt;
int longindex;
while (
(opt = getopt_long(argc, argv, "hH:p:", &longopts[0], &longindex)) != -1
) {
switch (opt) {
case 'h':
usage(argv);
exit(EXIT_SUCCESS);
break;
case 'H':
host = optarg;
break;
case 'p':
port = (uint16_t) strtoul(optarg, NULL, 10);
break;
default:
exit(EXIT_FAILURE);
break;
}
}
fd = connect_client();
rtems_record_client_init( &ctx, handler, NULL );
while ( true ) {
int buf[ 1024 ];
ssize_t n;
n = recv( fd, buf, sizeof( buf ), 0 );
if ( n >= 0 ) {
rtems_record_client_run( &ctx, buf, (size_t) n );
} else {
break;
}
}
rv = close( fd );
assert( rv == 0 );
return 0;
}
_______________________________________________
devel mailing list
devel@rtems.org
http://lists.rtems.org/mailman/listinfo/devel