Re: [PATCH] tests: Explicitly unset DEBUGINFOD_URLS.

2020-02-24 Thread Mark Wielaard
On Fri, 2020-02-21 at 13:49 +0100, Mark Wielaard wrote:
> If DEBUGINFOD_URLS is set various tests will try to query the debuginfod
> server which can stall the tests a bit. If other evironment variables
> like DEBUGINFOD_PROGRESS are set it will make various tests fail because
> the expected output doesn't match. Tests should PASS without needing a
> debuginfod server, unless they test (and set) one themselves.

Pushed to master.


patch to commit soon: PR25375 debuginfod prefetching

2020-02-24 Thread Frank Ch. Eigler
Hi -

This patch has been baking on my public servers awhile and can make a
huge difference in performance.  It's not something immediately
obvious how or whether to test, as it's a pure performance improvement.
Planning to push shortly.

commit ae8b89716116a6df124f8700f77460b5e97c12c4 (origin/fche/pr25375)
Author: Frank Ch. Eigler 
Date:   Sat Jan 25 18:43:07 2020 -0500

PR25375: fdcache prefetching to reduce repeated archive decompression

diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
index 0acd70e4a916..7f432d7d9c41 100644
--- a/debuginfod/debuginfod.cxx
+++ b/debuginfod/debuginfod.cxx
@@ -355,6 +355,8 @@ static const struct argp_option options[] =
{ "fdcache-fds", ARGP_KEY_FDCACHE_FDS, "NUM", 0, "Maximum number of archive 
files to keep in fdcache.", 0 },
 #define ARGP_KEY_FDCACHE_MBS 0x1002
{ "fdcache-mbs", ARGP_KEY_FDCACHE_MBS, "MB", 0, "Maximum total size of 
archive file fdcache.", 0 },
+#define ARGP_KEY_FDCACHE_PREFETCH 0x1003
+   { "fdcache-prefetch", ARGP_KEY_FDCACHE_PREFETCH, "NUM", 0, "Number of 
archive files to prefetch into fdcache.", 0 },
{ NULL, 0, NULL, 0, NULL, 0 }
   };
 
@@ -394,6 +396,7 @@ static regex_t file_exclude_regex;
 static bool traverse_logical;
 static long fdcache_fds;
 static long fdcache_mbs;
+static long fdcache_prefetch;
 static string tmpdir;
 
 static void set_metric(const string& key, int64_t value);
@@ -476,6 +479,9 @@ parse_opt (int key, char *arg,
 case ARGP_KEY_FDCACHE_MBS:
   fdcache_mbs = atol (arg);
   break;
+case ARGP_KEY_FDCACHE_PREFETCH:
+  fdcache_prefetch = atol (arg);
+  break;
 case ARGP_KEY_ARG:
   source_paths.insert(string(arg));
   break;
@@ -975,14 +981,14 @@ class libarchive_fdcache
 string archive;
 string entry;
 string fd;
-long fd_size_mb; // rounded up megabytes
+double fd_size_mb; // slightly rounded up megabytes
   };
   deque lru; // @head: most recently used
   long max_fds;
   long max_mbs;
 
 public:
-  void intern(const string& a, const string& b, string fd, off_t sz)
+  void intern(const string& a, const string& b, string fd, off_t sz, bool 
front_p)
   {
 {
   unique_lock lock(fdcache_lock);
@@ -995,11 +1001,15 @@ class libarchive_fdcache
   break; // must not continue iterating
 }
 }
-  long mb = ((sz+1023)/1024+1023)/1024;
+  double mb = (sz+65535)/1048576.0; // round up to 64K block
   fdcache_entry n = { a, b, fd, mb };
-  lru.push_front(n);
+  if (front_p)
+lru.push_front(n);
+  else
+lru.push_back(n);
 if (verbose > 3)
-  obatched(clog) << "fdcache interned a=" << a << " b=" << b << " fd=" << 
fd << " mb=" << mb << endl;
+  obatched(clog) << "fdcache interned a=" << a << " b=" << b
+ << " fd=" << fd << " mb=" << mb << " front=" << front_p 
<< endl;
 }
 
 this->limit(max_fds, max_mbs); // age cache if required
@@ -1022,6 +1032,17 @@ class libarchive_fdcache
 return -1;
   }
 
+  int probe(const string& a, const string& b) // just a cache residency check 
- don't modify LRU state, don't open
+  {
+unique_lock lock(fdcache_lock);
+for (auto i = lru.begin(); i < lru.end(); i++)
+  {
+if (i->archive == a && i->entry == b)
+  return true;
+  }
+return false;
+  }
+
   void clear(const string& a, const string& b)
   {
 unique_lock lock(fdcache_lock);
@@ -1047,7 +1068,7 @@ class libarchive_fdcache
 this->max_mbs = maxmbs;
 
 long total_fd = 0;
-long total_mb = 0;
+double total_mb = 0.0;
 for (auto i = lru.begin(); i < lru.end(); i++)
   {
 // accumulate totals from most recently used one going backward
@@ -1117,6 +1138,7 @@ handle_buildid_r_match (int64_t b_mtime,
   return 0;
 }
 
+  // check for a match in the fdcache first
   int fd = fdcache.lookup(b_source0, b_source1);
   while (fd >= 0) // got one!; NB: this is really an if() with a possible 
branch out to the end
 {
@@ -1152,6 +1174,7 @@ handle_buildid_r_match (int64_t b_mtime,
   // NB: see, we never go around the 'loop' more than once
 }
 
+  // no match ... grumble, must process the archive
   string archive_decoder = "/dev/null";
   string archive_extension = "";
   for (auto&& arch : scan_archives)
@@ -1196,8 +1219,19 @@ handle_buildid_r_match (int64_t b_mtime,
   if (rc != ARCHIVE_OK)
 throw archive_exception(a, "cannot open archive from pipe");
 
-  while(1) // parse cpio archive entries
+  // archive traversal is in three stages:
+  // 1) skip entries whose names do not match the requested one
+  // 2) extract the matching entry name (set r = result)
+  // 3) extract some number of prefetched entries (just into fdcache)
+  // 4) abort any further processing
+  struct MHD_Response* r = 0; // will set in stage 2
+  unsigned prefetch_count = fdcache_prefetch; // will decrement in stage 3
+
+  while(r == 0 || prefetch_count > 0) // stage 1, 2, or 3

[Bug debuginfod/25369] make DEBUGINFOD_PROGRESS prettier

2020-02-24 Thread fche at redhat dot com
https://sourceware.org/bugzilla/show_bug.cgi?id=25369

Frank Ch. Eigler  changed:

   What|Removed |Added

   Assignee|unassigned at sourceware dot org   |fche at redhat dot com

--- Comment #1 from Frank Ch. Eigler  ---
drafting extensions to the client side api to make this usable

-- 
You are receiving this mail because:
You are on the CC list for the bug.

PR25369 rfc: debuginfod client api extension for progressfn prettying etc.

2020-02-24 Thread Frank Ch. Eigler
Hi -

As a part of PR25369, I propose this small set of client api
extensions, requested by gdb developers and needed by personal
experience.  (I plan to roll it out on my debuginfod servers shortly
to let it soak.)  An end-user visible immediate difference is in the
DEBUGINFOD_PROGRESS=1 format message, which now looks like this:

Downloading from debuginfod /
Downloading from debuginfod -
Downloading from debuginfod \
Downloading from 
http://very:8222/buildid/817dcbd2ce0ecce19f22cbe5e136b6d9196428aa/executable 
433130328/433130328

This latter is a bit long and should probably be abbreviated, wdyt?

I'd start reviewing this from the debuginfod.h header file outward.


commit 4ace515d6b9ce92ea4a808eba4d608851ee9b56d (fche/pr25369)
Author: Frank Ch. Eigler 
Date:   Mon Feb 24 22:23:55 2020 -0500

PR25369: client api extensions, progress prettying

This batch of changes extends the client api with a group of optional
functions that allow:
- debuginfod to pass along User-Agent and appropriate XFF to
  others, important for tracking in federated mode
- client applications to query URLs, get/set a void* parameter
- let the DEBUGINFOD_PROGRESS=1 progress handler show pretty URLs

diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c
index 186aa90a53bd..8d321aead2de 100644
--- a/debuginfod/debuginfod-client.c
+++ b/debuginfod/debuginfod-client.c
@@ -74,11 +74,22 @@
   #include 
 #endif
 
+
 struct debuginfod_client
 {
   /* Progress/interrupt callback function. */
   debuginfod_progressfn_t progressfn;
 
+  /* Stores user data. */
+  void* user_data;
+
+  /* Accumulates outgoing http header names/values. */
+  int user_agent_set_p; /* affects add_default_headers */
+  struct curl_slist *headers;
+
+  /* The 'winner' CURL handle for downloading, if any. */
+  CURL *target_handle;
+
   /* Can contain all other context, like cache_path, server_urls,
  timeout or other info gotten from environment variables, the
  handle data, etc. So those don't have to be reparsed and
@@ -291,8 +302,11 @@ debuginfod_clean_cache(debuginfod_client *c,
 
 
 static void
-add_extra_headers(CURL *handle)
+add_default_headers(debuginfod_client *client)
 {
+  if (client->user_agent_set_p)
+return;
+
   /* Compute a User-Agent: string to send.  The more accurately this
  describes this host, the likelier that the debuginfod servers
  might be able to locate debuginfo for us. */
@@ -352,7 +366,7 @@ add_extra_headers(CURL *handle)
 }
 
   char *ua = NULL;
-  rc = asprintf(& ua, "%s/%s,%s,%s/%s",
+  rc = asprintf(& ua, "User-Agent: %s/%s,%s,%s/%s",
 PACKAGE_NAME, PACKAGE_VERSION,
 utspart ?: "",
 id ?: "",
@@ -361,7 +375,7 @@ add_extra_headers(CURL *handle)
 ua = NULL;
 
   if (ua)
-curl_easy_setopt(handle, CURLOPT_USERAGENT, (void*) ua); /* implicit 
strdup */
+(void) debuginfod_add_http_header (client, ua);
 
   free (ua);
   free (id);
@@ -532,9 +546,6 @@ debuginfod_query_server (debuginfod_client *c,
 && (i == 0 || server_urls[i - 1] == url_delim_char))
   num_urls++;
 
-  /* Tracks which handle should write to fd. Set to the first
- handle that is ready to write the target file to the cache.  */
-  CURL *target_handle = NULL;
   struct handle_data *data = malloc(sizeof(struct handle_data) * num_urls);
 
   /* Initalize handle_data with default values. */
@@ -556,10 +567,11 @@ debuginfod_query_server (debuginfod_client *c,
   char *server_url = strtok_r(server_urls, url_delim, &strtok_saveptr);
 
   /* Initialize each handle.  */
-  for (int i = 0; i < num_urls && server_url != NULL; i++)
+  c->target_handle = NULL; /* reset for this lookup */
+ for (int i = 0; i < num_urls && server_url != NULL; i++)
 {
   data[i].fd = fd;
-  data[i].target_handle = &target_handle;
+  data[i].target_handle = &c->target_handle;
   data[i].handle = curl_easy_init();
 
   if (data[i].handle == NULL)
@@ -603,7 +615,7 @@ debuginfod_query_server (debuginfod_client *c,
   curl_easy_setopt(data[i].handle, CURLOPT_NOSIGNAL, (long) 1);
   curl_easy_setopt(data[i].handle, CURLOPT_AUTOREFERER, (long) 1);
   curl_easy_setopt(data[i].handle, CURLOPT_ACCEPT_ENCODING, "");
-  add_extra_headers(data[i].handle);
+  curl_easy_setopt(data[i].handle, CURLOPT_HTTPHEADER, c->headers);
 
   curl_multi_add_handle(curlm, data[i].handle);
   server_url = strtok_r(NULL, url_delim, &strtok_saveptr);
@@ -618,9 +630,9 @@ debuginfod_query_server (debuginfod_client *c,
   curl_multi_wait(curlm, NULL, 0, 1000, NULL);
 
   /* If the target file has been found, abort the other queries.  */
-  if (target_handle != NULL)
+  if (c->target_handle != NULL)
 for (int i = 0; i < num_urls; i++)
-  if (data[i].handle != target_handle)
+  if (data[i].handle != c->target_handle)
 curl_multi_remove_handle(curlm, data[i].handle);
 
   CURLMcod

Re: PR25369 rfc: debuginfod client api extension for progressfn prettying etc.

2020-02-24 Thread Simon Marchi
On 2020-02-24 10:35 p.m., Frank Ch. Eigler wrote:
> Hi -
> 
> As a part of PR25369, I propose this small set of client api
> extensions, requested by gdb developers and needed by personal
> experience.  (I plan to roll it out on my debuginfod servers shortly
> to let it soak.)  An end-user visible immediate difference is in the
> DEBUGINFOD_PROGRESS=1 format message, which now looks like this:
> 
> Downloading from debuginfod /
> Downloading from debuginfod -
> Downloading from debuginfod \
> Downloading from 
> http://very:8222/buildid/817dcbd2ce0ecce19f22cbe5e136b6d9196428aa/executable 
> 433130328/433130328
> 
> This latter is a bit long and should probably be abbreviated, wdyt?
> 
> I'd start reviewing this from the debuginfod.h header file outward.
> 
> 
> commit 4ace515d6b9ce92ea4a808eba4d608851ee9b56d (fche/pr25369)
> Author: Frank Ch. Eigler 
> Date:   Mon Feb 24 22:23:55 2020 -0500
> 
> PR25369: client api extensions, progress prettying
> 
> This batch of changes extends the client api with a group of optional
> functions that allow:
> - debuginfod to pass along User-Agent and appropriate XFF to
>   others, important for tracking in federated mode
> - client applications to query URLs, get/set a void* parameter
> - let the DEBUGINFOD_PROGRESS=1 progress handler show pretty URLs

Hi Frank,

As far as I am concerned (with my GDB hat), this looks good.

The "URL" section of the man page could maybe talk about the lifetime of the
returned string, specifically say that it's only guaranteed to be valid during
the execution of the callback.  If you need it for longer, then you need to copy
it.

Thanks!

Simon