[PATCH] D141230: [clang-format-diff.py] give clang-format-diff a job pool (10x speed)

2023-01-08 Thread Sean Maher via Phabricator via cfe-commits
seanptmaher created this revision.
seanptmaher added a reviewer: MyDeveloperDay.
Herald added a project: All.
seanptmaher requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

This patch changes the implementation of clang-format-diff.py to 
start up many clang-format processes in parallel in order to speed
up clang-format-diff.py by several orders of magnitude on large
patches (less than a minute instead of several tens of minutes).


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D141230

Files:
  clang/tools/clang-format/clang-format-diff.py

Index: clang/tools/clang-format/clang-format-diff.py
===
--- clang/tools/clang-format/clang-format-diff.py
+++ clang/tools/clang-format/clang-format-diff.py
@@ -25,6 +25,7 @@
 
 import argparse
 import difflib
+import os
 import re
 import subprocess
 import sys
@@ -99,46 +100,63 @@
   ['-lines', str(start_line) + ':' + str(end_line)])
 
   # Reformat files containing changes in place.
-  for filename, lines in lines_by_file.items():
-if args.i and args.verbose:
-  print('Formatting {}'.format(filename))
-command = [args.binary, filename]
-if args.i:
-  command.append('-i')
-if args.sort_includes:
-  command.append('-sort-includes')
-command.extend(lines)
-if args.style:
-  command.extend(['-style', args.style])
-if args.fallback_style:
-  command.extend(['-fallback-style', args.fallback_style])
-
+  lbf = list(lines_by_file.items())
+  procs = None
+  try:
+procs = [None for i in range(len(os.sched_getaffinity(0)) * 8)]
+  except AttributeError as e:
+# os.sched_getaffinity isn't defined on all platforms.
+import multiprocessing
 try:
-  p = subprocess.Popen(command,
-   stdout=subprocess.PIPE,
-   stderr=None,
-   stdin=subprocess.PIPE,
-   universal_newlines=True)
-except OSError as e:
-  # Give the user more context when clang-format isn't
-  # found/isn't executable, etc.
-  raise RuntimeError(
-'Failed to run "%s" - %s"' % (" ".join(command), e.strerror))
-
-stdout, stderr = p.communicate()
-if p.returncode != 0:
-  sys.exit(p.returncode)
+  procs = [None for i in range(multiprocessing.cpu_count())]
+except NotImplementedError as e:
+  # Fallback to 8 concurrent processes in the case that CPU count cannot be
+  # determined.
+  procs = [None for i in range(8)]
 
-if not args.i:
-  with open(filename) as f:
-code = f.readlines()
-  formatted_code = StringIO(stdout).readlines()
-  diff = difflib.unified_diff(code, formatted_code,
-  filename, filename,
-  '(before formatting)', '(after formatting)')
-  diff_string = ''.join(diff)
-  if len(diff_string) > 0:
-sys.stdout.write(diff_string)
+  while lbf:
+for i, proc in enumerate(procs):
+  if not lbf:
+break
+  if proc is None or proc.poll() is not None:
+if proc is not None:
+  stdout, stderr = proc.communicate()
+  if proc.returncode != 0:
+sys.exit(proc.returncode)
+  if not args.i:
+with open(filename) as f:
+  code = f.readlines()
+formatted_code = StringIO(stdout).readlines()
+diff = difflib.unified_diff(code, formatted_code,
+filename, filename,
+'(before formatting)', '(after formatting)')
+diff_string = ''.join(diff)
+if len(diff_string) > 0:
+  sys.stdout.write(diff_string)
+filename, lines = lbf.pop()
+if args.i and args.verbose:
+  print('Formatting {}'.format(filename))
+command = [args.binary, filename]
+if args.i:
+  command.append('-i')
+if args.sort_includes:
+  command.append('-sort-includes')
+command.extend(lines)
+if args.style:
+  command.extend(['-style', args.style])
+if args.fallback_style:
+  command.extend(['-fallback-style', args.fallback_style])
+try:
+  procs[i] = (subprocess.Popen(command,
+stdout=subprocess.PIPE,
+stderr=None,
+stdin=subprocess.PIPE,
+universal_newlines=True))
+except OSError as e:
+  # Give the user more context when clang-format isn't
+  # found/isn't executable, etc.
+  raise RuntimeError(
+  'Failed to run "%s" - %s"' % (" ".join(command), e.strerror))
 
 if __name__ == '__main__':
   main()
___
cfe-commits mailing list

[PATCH] D141230: [clang-format-diff.py] give clang-format-diff a job pool (10x speed)

2023-01-08 Thread Sean Maher via Phabricator via cfe-commits
seanptmaher updated this revision to Diff 487243.
seanptmaher added a comment.

Update python formatting


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141230/new/

https://reviews.llvm.org/D141230

Files:
  clang/tools/clang-format/clang-format-diff.py


Index: clang/tools/clang-format/clang-format-diff.py
===
--- clang/tools/clang-format/clang-format-diff.py
+++ clang/tools/clang-format/clang-format-diff.py
@@ -147,11 +147,11 @@
 if args.fallback_style:
   command.extend(['-fallback-style', args.fallback_style])
 try:
-  procs[i] = (subprocess.Popen(command,
-stdout=subprocess.PIPE,
-stderr=None,
-stdin=subprocess.PIPE,
-universal_newlines=True))
+  procs[i] = subprocess.Popen(command,
+  stdout=subprocess.PIPE,
+  stderr=None,
+  stdin=subprocess.PIPE,
+  universal_newlines=True)
 except OSError as e:
   # Give the user more context when clang-format isn't
   # found/isn't executable, etc.


Index: clang/tools/clang-format/clang-format-diff.py
===
--- clang/tools/clang-format/clang-format-diff.py
+++ clang/tools/clang-format/clang-format-diff.py
@@ -147,11 +147,11 @@
 if args.fallback_style:
   command.extend(['-fallback-style', args.fallback_style])
 try:
-  procs[i] = (subprocess.Popen(command,
-stdout=subprocess.PIPE,
-stderr=None,
-stdin=subprocess.PIPE,
-universal_newlines=True))
+  procs[i] = subprocess.Popen(command,
+  stdout=subprocess.PIPE,
+  stderr=None,
+  stdin=subprocess.PIPE,
+  universal_newlines=True)
 except OSError as e:
   # Give the user more context when clang-format isn't
   # found/isn't executable, etc.
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D141230: [clang-format-diff.py] give clang-format-diff a job pool (10x speed)

2023-01-09 Thread Sean Maher via Phabricator via cfe-commits
seanptmaher updated this revision to Diff 487407.
seanptmaher added a comment.

Possibly fix the patchset? Sorry I've never used phabricator berfore.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141230/new/

https://reviews.llvm.org/D141230

Files:
  clang/tools/clang-format/clang-format-diff.py

Index: clang/tools/clang-format/clang-format-diff.py
===
--- clang/tools/clang-format/clang-format-diff.py
+++ clang/tools/clang-format/clang-format-diff.py
@@ -25,6 +25,7 @@
 
 import argparse
 import difflib
+import os
 import re
 import subprocess
 import sys
@@ -99,46 +100,63 @@
   ['-lines', str(start_line) + ':' + str(end_line)])
 
   # Reformat files containing changes in place.
-  for filename, lines in lines_by_file.items():
-if args.i and args.verbose:
-  print('Formatting {}'.format(filename))
-command = [args.binary, filename]
-if args.i:
-  command.append('-i')
-if args.sort_includes:
-  command.append('-sort-includes')
-command.extend(lines)
-if args.style:
-  command.extend(['-style', args.style])
-if args.fallback_style:
-  command.extend(['-fallback-style', args.fallback_style])
-
+  lbf = list(lines_by_file.items())
+  procs = None
+  try:
+procs = [None for i in range(len(os.sched_getaffinity(0)) * 8)]
+  except AttributeError as e:
+# os.sched_getaffinity isn't defined on all platforms.
+import multiprocessing
 try:
-  p = subprocess.Popen(command,
-   stdout=subprocess.PIPE,
-   stderr=None,
-   stdin=subprocess.PIPE,
-   universal_newlines=True)
-except OSError as e:
-  # Give the user more context when clang-format isn't
-  # found/isn't executable, etc.
-  raise RuntimeError(
-'Failed to run "%s" - %s"' % (" ".join(command), e.strerror))
-
-stdout, stderr = p.communicate()
-if p.returncode != 0:
-  sys.exit(p.returncode)
+  procs = [None for i in range(multiprocessing.cpu_count())]
+except NotImplementedError as e:
+  # Fallback to 8 concurrent processes in the case that CPU count cannot be
+  # determined.
+  procs = [None for i in range(8)]
 
-if not args.i:
-  with open(filename) as f:
-code = f.readlines()
-  formatted_code = StringIO(stdout).readlines()
-  diff = difflib.unified_diff(code, formatted_code,
-  filename, filename,
-  '(before formatting)', '(after formatting)')
-  diff_string = ''.join(diff)
-  if len(diff_string) > 0:
-sys.stdout.write(diff_string)
+  while lbf:
+for i, proc in enumerate(procs):
+  if not lbf:
+break
+  if proc is None or proc.poll() is not None:
+if proc is not None:
+  stdout, stderr = proc.communicate()
+  if proc.returncode != 0:
+sys.exit(proc.returncode)
+  if not args.i:
+with open(filename) as f:
+  code = f.readlines()
+formatted_code = StringIO(stdout).readlines()
+diff = difflib.unified_diff(code, formatted_code,
+filename, filename,
+'(before formatting)', '(after formatting)')
+diff_string = ''.join(diff)
+if len(diff_string) > 0:
+  sys.stdout.write(diff_string)
+filename, lines = lbf.pop()
+if args.i and args.verbose:
+  print('Formatting {}'.format(filename))
+command = [args.binary, filename]
+if args.i:
+  command.append('-i')
+if args.sort_includes:
+  command.append('-sort-includes')
+command.extend(lines)
+if args.style:
+  command.extend(['-style', args.style])
+if args.fallback_style:
+  command.extend(['-fallback-style', args.fallback_style])
+try:
+  procs[i] = subprocess.Popen(command,
+  stdout=subprocess.PIPE,
+  stderr=None,
+  stdin=subprocess.PIPE,
+  universal_newlines=True)
+except OSError as e:
+  # Give the user more context when clang-format isn't
+  # found/isn't executable, etc.
+  raise RuntimeError(
+  'Failed to run "%s" - %s"' % (" ".join(command), e.strerror))
 
 if __name__ == '__main__':
   main()
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D141230: [clang-format-diff.py] give clang-format-diff a job pool (10x speed)

2023-01-12 Thread Sean Maher via Phabricator via cfe-commits
seanptmaher updated this revision to Diff 488706.
seanptmaher added a comment.

Add jobs flag


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141230/new/

https://reviews.llvm.org/D141230

Files:
  clang/tools/clang-format/clang-format-diff.py

Index: clang/tools/clang-format/clang-format-diff.py
===
--- clang/tools/clang-format/clang-format-diff.py
+++ clang/tools/clang-format/clang-format-diff.py
@@ -25,6 +25,7 @@
 
 import argparse
 import difflib
+import os
 import re
 import subprocess
 import sys
@@ -34,7 +35,6 @@
 else:
 from io import BytesIO as StringIO
 
-
 def main():
   parser = argparse.ArgumentParser(description=__doc__,
formatter_class=
@@ -65,6 +65,9 @@
   'file to use.')
   parser.add_argument('-binary', default='clang-format',
   help='location of binary to use for clang-format')
+  parser.add_argument('-j', default=1, type=int, metavar='N',
+  help='number of concurrent clang-format processes to spawn in '
+  'parallel')
   args = parser.parse_args()
 
   # Extract changed lines for each file.
@@ -99,46 +102,59 @@
   ['-lines', str(start_line) + ':' + str(end_line)])
 
   # Reformat files containing changes in place.
-  for filename, lines in lines_by_file.items():
-if args.i and args.verbose:
-  print('Formatting {}'.format(filename))
-command = [args.binary, filename]
-if args.i:
-  command.append('-i')
-if args.sort_includes:
-  command.append('-sort-includes')
-command.extend(lines)
-if args.style:
-  command.extend(['-style', args.style])
-if args.fallback_style:
-  command.extend(['-fallback-style', args.fallback_style])
-
-try:
-  p = subprocess.Popen(command,
-   stdout=subprocess.PIPE,
-   stderr=None,
-   stdin=subprocess.PIPE,
-   universal_newlines=True)
-except OSError as e:
-  # Give the user more context when clang-format isn't
-  # found/isn't executable, etc.
-  raise RuntimeError(
-'Failed to run "%s" - %s"' % (" ".join(command), e.strerror))
-
-stdout, stderr = p.communicate()
-if p.returncode != 0:
-  sys.exit(p.returncode)
-
-if not args.i:
-  with open(filename) as f:
-code = f.readlines()
-  formatted_code = StringIO(stdout).readlines()
-  diff = difflib.unified_diff(code, formatted_code,
-  filename, filename,
-  '(before formatting)', '(after formatting)')
-  diff_string = ''.join(diff)
-  if len(diff_string) > 0:
-sys.stdout.write(diff_string)
+  lbf = list(lines_by_file.items())
+  procs = [None for i in range(args.j)]
+  while lbf:
+spawned_one = False
+for i, proc in enumerate(procs):
+  if not lbf:
+break
+  if proc is None or proc.poll() is not None:
+if proc is not None:
+  stdout, stderr = proc.communicate()
+  if proc.returncode != 0:
+sys.exit(proc.returncode)
+  if not args.i:
+with open(filename) as f:
+  code = f.readlines()
+formatted_code = StringIO(stdout).readlines()
+diff = difflib.unified_diff(code, formatted_code,
+filename, filename,
+'(before formatting)',
+'(after formatting)')
+diff_string = ''.join(diff)
+if len(diff_string) > 0:
+  sys.stdout.write(diff_string)
+filename, lines = lbf.pop()
+spawned_one = True
+if args.i and args.verbose:
+  print('Formatting {}'.format(filename))
+command = [args.binary, filename]
+if args.i:
+  command.append('-i')
+if args.sort_includes:
+  command.append('-sort-includes')
+command.extend(lines)
+if args.style:
+  command.extend(['-style', args.style])
+if args.fallback_style:
+  command.extend(['-fallback-style', args.fallback_style])
+try:
+  procs[i] = subprocess.Popen(command,
+  stdout=subprocess.PIPE,
+  stderr=None,
+  stdin=subprocess.PIPE,
+  universal_newlines=True)
+except OSError as e:
+  # Give the user more context when clang-format isn't
+  # found/isn't executable, etc.
+  raise RuntimeError(
+  'Failed to run "%s" - %s"' % (" ".join(command), e.strerror))
+# If we didn't spawn a single process after iterating through the whole
+# list, wait on one of them to finish until we iterat

[PATCH] D141230: [clang-format-diff.py] give clang-format-diff a job pool (10x speed)

2023-01-12 Thread Sean Maher via Phabricator via cfe-commits
seanptmaher added a comment.

Yeah, the jobs flag is a much better implementation. Thanks for the comments. 
I've implemented that.  I was indeed making the assumption that this is all 
that's running, etc. as you mentioned.

The use case I was using this with was running diff formatting of very large 
diffs (automated mass refactors changing thousands of files) on the chromium 
codebase -- I'm not sure how to "include" such a use case as you mentioned. I 
could write a benchmark if you'd like?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141230/new/

https://reviews.llvm.org/D141230

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D141230: [clang-format-diff.py] give clang-format-diff a job pool (10x speed)

2023-01-12 Thread Sean Maher via Phabricator via cfe-commits
seanptmaher updated this revision to Diff 488708.
seanptmaher added a comment.

fix the formatting


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141230/new/

https://reviews.llvm.org/D141230

Files:
  clang/tools/clang-format/clang-format-diff.py

Index: clang/tools/clang-format/clang-format-diff.py
===
--- clang/tools/clang-format/clang-format-diff.py
+++ clang/tools/clang-format/clang-format-diff.py
@@ -25,6 +25,7 @@
 
 import argparse
 import difflib
+import os
 import re
 import subprocess
 import sys
@@ -34,7 +35,6 @@
 else:
 from io import BytesIO as StringIO
 
-
 def main():
   parser = argparse.ArgumentParser(description=__doc__,
formatter_class=
@@ -65,6 +65,9 @@
   'file to use.')
   parser.add_argument('-binary', default='clang-format',
   help='location of binary to use for clang-format')
+  parser.add_argument('-j', default=1, type=int, metavar='N',
+  help='number of concurrent clang-format processes to spawn in '
+  'parallel')
   args = parser.parse_args()
 
   # Extract changed lines for each file.
@@ -99,46 +102,59 @@
   ['-lines', str(start_line) + ':' + str(end_line)])
 
   # Reformat files containing changes in place.
-  for filename, lines in lines_by_file.items():
-if args.i and args.verbose:
-  print('Formatting {}'.format(filename))
-command = [args.binary, filename]
-if args.i:
-  command.append('-i')
-if args.sort_includes:
-  command.append('-sort-includes')
-command.extend(lines)
-if args.style:
-  command.extend(['-style', args.style])
-if args.fallback_style:
-  command.extend(['-fallback-style', args.fallback_style])
-
-try:
-  p = subprocess.Popen(command,
-   stdout=subprocess.PIPE,
-   stderr=None,
-   stdin=subprocess.PIPE,
-   universal_newlines=True)
-except OSError as e:
-  # Give the user more context when clang-format isn't
-  # found/isn't executable, etc.
-  raise RuntimeError(
-'Failed to run "%s" - %s"' % (" ".join(command), e.strerror))
-
-stdout, stderr = p.communicate()
-if p.returncode != 0:
-  sys.exit(p.returncode)
-
-if not args.i:
-  with open(filename) as f:
-code = f.readlines()
-  formatted_code = StringIO(stdout).readlines()
-  diff = difflib.unified_diff(code, formatted_code,
-  filename, filename,
-  '(before formatting)', '(after formatting)')
-  diff_string = ''.join(diff)
-  if len(diff_string) > 0:
-sys.stdout.write(diff_string)
+  lbf = list(lines_by_file.items())
+  procs = [None for i in range(args.j)]
+  while lbf:
+spawned_one = False
+for i, proc in enumerate(procs):
+  if not lbf:
+break
+  if proc is None or proc.poll() is not None:
+if proc is not None:
+  stdout, stderr = proc.communicate()
+  if proc.returncode != 0:
+sys.exit(proc.returncode)
+  if not args.i:
+with open(filename) as f:
+  code = f.readlines()
+formatted_code = StringIO(stdout).readlines()
+diff = difflib.unified_diff(code, formatted_code,
+filename, filename,
+'(before formatting)',
+'(after formatting)')
+diff_string = ''.join(diff)
+if len(diff_string) > 0:
+  sys.stdout.write(diff_string)
+filename, lines = lbf.pop()
+spawned_one = True
+if args.i and args.verbose:
+  print('Formatting {}'.format(filename))
+command = [args.binary, filename]
+if args.i:
+  command.append('-i')
+if args.sort_includes:
+  command.append('-sort-includes')
+command.extend(lines)
+if args.style:
+  command.extend(['-style', args.style])
+if args.fallback_style:
+  command.extend(['-fallback-style', args.fallback_style])
+try:
+  procs[i] = subprocess.Popen(command,
+  stdout=subprocess.PIPE,
+  stderr=None,
+  stdin=subprocess.PIPE,
+  universal_newlines=True)
+except OSError as e:
+  # Give the user more context when clang-format isn't
+  # found/isn't executable, etc.
+  raise RuntimeError(
+  'Failed to run "%s" - %s"' % (" ".join(command), e.strerror))
+# If we didn't spawn a single process after iterating through the whole
+# list, wait on one of them to finish until we i

[PATCH] D141230: [clang-format-diff.py] give clang-format-diff a job pool (10x speed)

2023-01-13 Thread Sean Maher via Phabricator via cfe-commits
seanptmaher updated this revision to Diff 489084.
seanptmaher added a comment.

Fix bug in -j implementation where proper waiting wasn't happening


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141230/new/

https://reviews.llvm.org/D141230

Files:
  clang/tools/clang-format/clang-format-diff.py

Index: clang/tools/clang-format/clang-format-diff.py
===
--- clang/tools/clang-format/clang-format-diff.py
+++ clang/tools/clang-format/clang-format-diff.py
@@ -25,6 +25,7 @@
 
 import argparse
 import difflib
+import os
 import re
 import subprocess
 import sys
@@ -34,6 +35,21 @@
 else:
 from io import BytesIO as StringIO
 
+def process_subprocess_result(proc, args):
+  stdout, stderr = proc.communicate()
+  if proc.returncode != 0:
+sys.exit(proc.returncode)
+  if not args.i:
+with open(filename) as f:
+  code = f.readlines()
+formatted_code = StringIO(stdout).readlines()
+diff = difflib.unified_diff(code, formatted_code,
+filename, filename,
+'(before formatting)',
+'(after formatting)')
+diff_string = ''.join(diff)
+if len(diff_string) > 0:
+  sys.stdout.write(diff_string)
 
 def main():
   parser = argparse.ArgumentParser(description=__doc__,
@@ -65,6 +81,9 @@
   'file to use.')
   parser.add_argument('-binary', default='clang-format',
   help='location of binary to use for clang-format')
+  parser.add_argument('-j', default=1, type=int, metavar='N',
+  help='number of concurrent clang-format processes to spawn in '
+  'parallel')
   args = parser.parse_args()
 
   # Extract changed lines for each file.
@@ -99,46 +118,54 @@
   ['-lines', str(start_line) + ':' + str(end_line)])
 
   # Reformat files containing changes in place.
-  for filename, lines in lines_by_file.items():
-if args.i and args.verbose:
-  print('Formatting {}'.format(filename))
-command = [args.binary, filename]
-if args.i:
-  command.append('-i')
-if args.sort_includes:
-  command.append('-sort-includes')
-command.extend(lines)
-if args.style:
-  command.extend(['-style', args.style])
-if args.fallback_style:
-  command.extend(['-fallback-style', args.fallback_style])
-
-try:
-  p = subprocess.Popen(command,
-   stdout=subprocess.PIPE,
-   stderr=None,
-   stdin=subprocess.PIPE,
-   universal_newlines=True)
-except OSError as e:
-  # Give the user more context when clang-format isn't
-  # found/isn't executable, etc.
-  raise RuntimeError(
-'Failed to run "%s" - %s"' % (" ".join(command), e.strerror))
-
-stdout, stderr = p.communicate()
-if p.returncode != 0:
-  sys.exit(p.returncode)
-
-if not args.i:
-  with open(filename) as f:
-code = f.readlines()
-  formatted_code = StringIO(stdout).readlines()
-  diff = difflib.unified_diff(code, formatted_code,
-  filename, filename,
-  '(before formatting)', '(after formatting)')
-  diff_string = ''.join(diff)
-  if len(diff_string) > 0:
-sys.stdout.write(diff_string)
+  lbf = list(lines_by_file.items())
+  procs = [None for i in range(args.j)]
+  while lbf:
+spawned_one = False
+for i, proc in enumerate(procs):
+  if not lbf:
+break
+  if proc is not None and proc.poll() is not None:
+process_subprocess_result(proc, args)
+# Set to None to flag the slot as free to start a new process
+procs[i] = None
+proc = None
+  if proc is None:
+filename, lines = lbf.pop()
+spawned_one = True
+if args.i and args.verbose:
+  print('Formatting {}'.format(filename))
+command = [args.binary, filename]
+if args.i:
+  command.append('-i')
+if args.sort_includes:
+  command.append('-sort-includes')
+command.extend(lines)
+if args.style:
+  command.extend(['-style', args.style])
+if args.fallback_style:
+  command.extend(['-fallback-style', args.fallback_style])
+try:
+  procs[i] = subprocess.Popen(command,
+  stdout=subprocess.PIPE,
+  stderr=None,
+  stdin=subprocess.PIPE,
+  universal_newlines=True)
+except OSError as e:
+  # Give the user more context when clang-format isn't
+  # found/isn't executable, etc.
+  raise RuntimeError(
+  'Failed to run "%s" - %s"' % (" ".join(command), e.strerror))
+# If we didn't spawn a single proc

[PATCH] D141230: [clang-format-diff.py] give clang-format-diff a job pool (10x speed)

2023-01-13 Thread Sean Maher via Phabricator via cfe-commits
seanptmaher added a comment.

It's slightly hard to test this because it involves making a huge diff of a git 
repo. If you're willing to jump through the hoops of cloning chromium 
(honestly, this is a pain in the ass. It's too much hard drive space and you've 
got to install depot_tools. If you don't already have google code on your 
machine, you're probably going to want this to be a temporary thing) and 
running a few commands, I think I've figured out how you could test this.

First, clone the repo using the instructions you find here: 
https://www.chromium.org/developers/how-tos/get-the-code/

Then, checkout commit hash 0e9d17d1b6414621228115ee535c7becf67e1c62.

Then, apply the patch file that I've just attached. It is a 6000-file change, 
where I've changed a bunch of includes.
F26167714: 0001-remove-ALL-INCLUDES.patch 

Then, finally, to format that patch using the  clang-format-diff.py file, run 
the following:

  $ time git diff -U0 --no-color --relative HEAD^ | 
buildtools/clang_format/script/clang-format-diff.py -p1 -i

On my machine, that gives:

  
  Executed in   16.56 minsfish   external
 usr time   15.01 mins1.02 millis   15.01 mins
 sys time1.70 mins0.71 millis1.70 mins

This will take quite some time. After it finishes, save a diff of the 
filesystem with `git add . && git diff --cached > ../slow-diff`, (important not 
to save it in your working directory or else you'd mess up the next diff!)
Then `git reset . && git restore .` to start over.

Then, run the same command but with the new clang-format-diff.py, and pass it 
as many jobs as your computer could handle.

  $ time git diff -U0 --no-color --relative HEAD^ | 
path/to/the/new/clang-format-diff.py -p1 -i -j{16,128}

On my machine, that runs in just above 20 seconds.

  
  Executed in   22.24 secsfish   external
 usr time   16.71 mins0.00 millis   16.71 mins
 sys time2.37 mins2.05 millis2.37 mins

Finally, save a diff of this again, like something like so: `git add . && git 
diff --cached > ../fast-diff`

Then finally you can diff the two diffs:

  spvw@this ~/Documents/code/chromium/src ((eb2c565c…))> diff ../slow-diff 
../fast-diff
  spvw@this ~/Documents/code/chromium/src ((eb2c565c…))>

Showing that the two are identical.

Are there other ways in which you'd want to test this?

i'm not sure how to write unit-test-like tests for this, so if you'd like me 
to, I'd appreciate a pointer...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141230/new/

https://reviews.llvm.org/D141230

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D141230: [clang-format-diff.py] give clang-format-diff a job pool (10x speed)

2023-01-13 Thread Sean Maher via Phabricator via cfe-commits
seanptmaher updated this revision to Diff 489086.
seanptmaher added a comment.

No need to include os anymore, I'm not checking number of cores


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141230/new/

https://reviews.llvm.org/D141230

Files:
  clang/tools/clang-format/clang-format-diff.py

Index: clang/tools/clang-format/clang-format-diff.py
===
--- clang/tools/clang-format/clang-format-diff.py
+++ clang/tools/clang-format/clang-format-diff.py
@@ -34,6 +34,21 @@
 else:
 from io import BytesIO as StringIO
 
+def process_subprocess_result(proc, args):
+  stdout, stderr = proc.communicate()
+  if proc.returncode != 0:
+sys.exit(proc.returncode)
+  if not args.i:
+with open(filename) as f:
+  code = f.readlines()
+formatted_code = StringIO(stdout).readlines()
+diff = difflib.unified_diff(code, formatted_code,
+filename, filename,
+'(before formatting)',
+'(after formatting)')
+diff_string = ''.join(diff)
+if len(diff_string) > 0:
+  sys.stdout.write(diff_string)
 
 def main():
   parser = argparse.ArgumentParser(description=__doc__,
@@ -65,6 +80,9 @@
   'file to use.')
   parser.add_argument('-binary', default='clang-format',
   help='location of binary to use for clang-format')
+  parser.add_argument('-j', default=1, type=int, metavar='N',
+  help='number of concurrent clang-format processes to spawn in '
+  'parallel')
   args = parser.parse_args()
 
   # Extract changed lines for each file.
@@ -99,46 +117,54 @@
   ['-lines', str(start_line) + ':' + str(end_line)])
 
   # Reformat files containing changes in place.
-  for filename, lines in lines_by_file.items():
-if args.i and args.verbose:
-  print('Formatting {}'.format(filename))
-command = [args.binary, filename]
-if args.i:
-  command.append('-i')
-if args.sort_includes:
-  command.append('-sort-includes')
-command.extend(lines)
-if args.style:
-  command.extend(['-style', args.style])
-if args.fallback_style:
-  command.extend(['-fallback-style', args.fallback_style])
-
-try:
-  p = subprocess.Popen(command,
-   stdout=subprocess.PIPE,
-   stderr=None,
-   stdin=subprocess.PIPE,
-   universal_newlines=True)
-except OSError as e:
-  # Give the user more context when clang-format isn't
-  # found/isn't executable, etc.
-  raise RuntimeError(
-'Failed to run "%s" - %s"' % (" ".join(command), e.strerror))
-
-stdout, stderr = p.communicate()
-if p.returncode != 0:
-  sys.exit(p.returncode)
-
-if not args.i:
-  with open(filename) as f:
-code = f.readlines()
-  formatted_code = StringIO(stdout).readlines()
-  diff = difflib.unified_diff(code, formatted_code,
-  filename, filename,
-  '(before formatting)', '(after formatting)')
-  diff_string = ''.join(diff)
-  if len(diff_string) > 0:
-sys.stdout.write(diff_string)
+  lbf = list(lines_by_file.items())
+  procs = [None for i in range(args.j)]
+  while lbf:
+spawned_one = False
+for i, proc in enumerate(procs):
+  if not lbf:
+break
+  if proc is not None and proc.poll() is not None:
+process_subprocess_result(proc, args)
+# Set to None to flag the slot as free to start a new process
+procs[i] = None
+proc = None
+  if proc is None:
+filename, lines = lbf.pop()
+spawned_one = True
+if args.i and args.verbose:
+  print('Formatting {}'.format(filename))
+command = [args.binary, filename]
+if args.i:
+  command.append('-i')
+if args.sort_includes:
+  command.append('-sort-includes')
+command.extend(lines)
+if args.style:
+  command.extend(['-style', args.style])
+if args.fallback_style:
+  command.extend(['-fallback-style', args.fallback_style])
+try:
+  procs[i] = subprocess.Popen(command,
+  stdout=subprocess.PIPE,
+  stderr=None,
+  stdin=subprocess.PIPE,
+  universal_newlines=True)
+except OSError as e:
+  # Give the user more context when clang-format isn't
+  # found/isn't executable, etc.
+  raise RuntimeError(
+  'Failed to run "%s" - %s"' % (" ".join(command), e.strerror))
+# If we didn't spawn a single process after iterating through the whole
+# list, wait on one of them to finish until we iterate through aga

[PATCH] D141230: [clang-format-diff.py] give clang-format-diff a job pool (10x speed)

2023-01-13 Thread Sean Maher via Phabricator via cfe-commits
seanptmaher added a comment.

Thanks a lot for the review, by the way.

I realize the patch wasn't up to snuff initially, but glad to get it in order 
with your help.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141230/new/

https://reviews.llvm.org/D141230

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D141230: [clang-format-diff.py] give clang-format-diff a job pool (10x speed)

2023-01-15 Thread Sean Maher via Phabricator via cfe-commits
seanptmaher added a comment.

I can't actually commit this -- could one of you? Or has it already been done? 
I remember reading that someone had to manually push the patch upstream.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141230/new/

https://reviews.llvm.org/D141230

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D141230: [clang-format-diff.py] give clang-format-diff a job pool (10x speed)

2023-03-02 Thread Sean Maher via Phabricator via cfe-commits
seanptmaher added a comment.

Hey,

Sure. The name is Sean Maher and the email is s...@chromium.org

Thanks for your help.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141230/new/

https://reviews.llvm.org/D141230

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D141230: [clang-format-diff.py] give clang-format-diff a job pool (10x speed)

2023-02-16 Thread Sean Maher via Phabricator via cfe-commits
seanptmaher added a comment.

Hey, bump on this. Still waiting on the commit.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141230/new/

https://reviews.llvm.org/D141230

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits