Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-11-17 Thread Chris Jerdonek
On Sat, Nov 17, 2012 at 10:55 AM, Chris Angelico wrote: > On Sun, Nov 18, 2012 at 5:47 AM, Chris Jerdonek > wrote: >> On Thu, Oct 4, 2012 at 2:46 PM, wrote: >>> I really fail to see what problem people have with large source files. >>> What is it that you want to do that can be done easier if i

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-11-17 Thread Chris Angelico
On Sun, Nov 18, 2012 at 5:47 AM, Chris Jerdonek wrote: > On Thu, Oct 4, 2012 at 2:46 PM, wrote: >> I really fail to see what problem people have with large source files. >> What is it that you want to do that can be done easier if it's multiple >> files? > > One thing is browse or link to such c

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-11-17 Thread Chris Jerdonek
[Apologies for resurrecting a few-weeks-old thread.] On Thu, Oct 4, 2012 at 2:46 PM, wrote: > > Quoting Victor Stinner: > >> I only see one argument against such refactoring: it will be harder to >> backport/forwardport bugfixes. > > I'm opposed for a different reason: I think it will be *har

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Stephen J. Turnbull
Antoine Pitrou writes: > Well, "tangled monolithic mess" is quite true about unicodeobject.c, > IMO. s/object.c// and your point remains valid. Just reading the table of contents for UTR#17 (http://www.unicode.org/reports/tr17/) should convince you that it's not going to be easy to produce an

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Antoine Pitrou
On Thu, 25 Oct 2012 08:13:53 -0700 Larry Hastings wrote: > > I'm all for good software engineering practice. But can you cite > objective reasons why large source files are provably bad? Not "tangled > monolithic messes", not poorly-factored code. I agree that those are > bad--but so far no

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Larry Hastings
On 10/24/2012 03:15 PM, Nick Coghlan wrote: Breaking such files up into separately compiled modules serves two purposes: 1. It proves that the code *isn't* a tangled monolithic mess; 2. It enlists the compilation toolchain's assistance in ensuring that remains the case in the future. Eith

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Antoine Pitrou
On 25/10/2012 00:15, Nick Coghlan wrote: However, -1 on the "faux modularity" idea of breaking up the files on disk, but still exposing them to the compiler and linker as a monolithic block, though. That would be completely missing the point of why large source files are bad. I disagree wit

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Antoine Pitrou
On 25/10/2012 02:03, Nick Coghlan wrote: speed.python.org is also making progress, and once that is up and running (which will happen well before any Python 3.4 release) it will be possible to compare the numbers between 3.3 and trunk to help determine the validity of any concerns regarding o

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Nick Coghlan
On Thu, Oct 25, 2012 at 8:07 PM, Maciej Fijalkowski wrote: >> >> I think you misunderstood. What I described is the reason for having >> the base codecs in unicodeobject.c. >> >> I think we all agree that inlining has a positive effect on >> performance. The scale of the effect depends on the used

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Serhiy Storchaka
On 25.10.12 12:49, M.-A. Lemburg wrote: I think you misunderstood. What I described is the reason for having the base codecs in unicodeobject.c. For example PyUnicode_FromStringAndSize and PyUnicode_FromString are thin wrappers around PyUnicode_DecodeUTF8Stateful. I think this is a reason to
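The thin-wrapper argument above can be illustrated with a minimal sketch. It is not CPython code: the names `decode_utf8_count`, `count_from_string_and_size` and `count_from_string` are hypothetical, and the "decoder" merely counts code points in input assumed to be valid UTF-8. The point is only that when the wrappers and the static core function live in the same translation unit, the compiler is free to inline the core into each wrapper.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a core decoder. It counts code points in
 * (assumed valid) UTF-8 by skipping continuation bytes of the form
 * 10xxxxxx. Being static, it is visible only in this file, so the
 * compiler may inline it into the wrappers below. */
static size_t
decode_utf8_count(const char *s, size_t size)
{
    size_t count = 0;
    for (size_t i = 0; i < size; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            count++;
    }
    return count;
}

/* Thin wrapper: explicit length (hypothetical name). */
size_t
count_from_string_and_size(const char *s, size_t size)
{
    return decode_utf8_count(s, size);
}

/* Thin wrapper: NUL-terminated input (hypothetical name). */
size_t
count_from_string(const char *s)
{
    return decode_utf8_count(s, strlen(s));
}
```

If the core function were moved to a separately compiled file, it would have to lose `static`, and inlining across the file boundary would then depend on the toolchain, for example on link-time optimization being enabled.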

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Maciej Fijalkowski
> > I think you misunderstood. What I described is the reason for having > the base codecs in unicodeobject.c. > > I think we all agree that inlining has a positive effect on > performance. The scale of the effect depends on the used compiler > and platform. > Well. Inlining can have positive or n

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Serhiy Storchaka
On 25.10.12 12:18, Maciej Fijalkowski wrote: I challenge you to find a benchmark that is being significantly affected (>15%) with the split proposed by Victor. It does not even have to be a real-world one, although that would definitely buy it more credibility. I see 10% slowdown for UTF-8 deco

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread M.-A. Lemburg
On 25.10.2012 11:18, Maciej Fijalkowski wrote: > On Thu, Oct 25, 2012 at 8:57 AM, M.-A. Lemburg wrote: >> On 25.10.2012 08:42, Nick Coghlan wrote: Why are any of these codecs here in unicodeobjectland in the first place? Sure, they're needed so that Python can find its own stuff, b

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread Maciej Fijalkowski
On Thu, Oct 25, 2012 at 8:57 AM, M.-A. Lemburg wrote: > On 25.10.2012 08:42, Nick Coghlan wrote: >>> Why are any of these codecs here in unicodeobjectland in the first >>> place? Sure, they're needed so that Python can find its own stuff, >>> but in principle *any* codec could be needed. Is it j

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread M.-A. Lemburg
On 25.10.2012 08:42, Nick Coghlan wrote: > unicodeobject.c is too big, and should be restructured to make any > natural modularity explicit, and provide an easier path for users that > want to understand how the unicode implementation works. You can also achieve that goal by structuring the code i

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-24 Thread M.-A. Lemburg
On 25.10.2012 08:42, Nick Coghlan wrote: >> Why are any of these codecs here in unicodeobjectland in the first >> place? Sure, they're needed so that Python can find its own stuff, >> but in principle *any* codec could be needed. Is it just an heuristic >> that the codecs needed for 99% of the wo

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-24 Thread Nick Coghlan
On Thu, Oct 25, 2012 at 2:22 PM, Stephen J. Turnbull wrote: > Nick Coghlan writes: > > > OK, I need to weigh in after seeing this kind of reply. Large source files > > are discouraged in general because they're a code smell that points > > strongly towards a *lack of modularity* within a *compl

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-24 Thread Stephen J. Turnbull
Nick Coghlan writes: > OK, I need to weigh in after seeing this kind of reply. Large source files > are discouraged in general because they're a code smell that points > strongly towards a *lack of modularity* within a *complex piece of > functionality*. Sure, but large numbers of tiny source

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-24 Thread Nick Coghlan
On Thu, Oct 25, 2012 at 8:37 AM, Barry Warsaw wrote: > On Oct 25, 2012, at 08:15 AM, Nick Coghlan wrote: > >>OK, I need to weigh in after seeing this kind of reply. Large source files >>are discouraged in general because they're a code smell that points >>strongly towards a *lack of modularity* wi

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-24 Thread Barry Warsaw
On Oct 25, 2012, at 08:15 AM, Nick Coghlan wrote: >OK, I need to weigh in after seeing this kind of reply. Large source files >are discouraged in general because they're a code smell that points >strongly towards a *lack of modularity* within a *complex piece of >functionality*. Modularity is goo

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-24 Thread Nick Coghlan
On Oct 25, 2012 2:06 AM, "Larry Hastings" wrote: > > On 10/23/2012 09:29 AM, Georg Brandl wrote: >> >> Especially since you're suggesting a huge number of new files, I question the >> argument of better navigability. > > > FWIW I'm -1 on it too. I don't see what the big deal is with "large" sourc

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-24 Thread Larry Hastings
On 10/23/2012 09:29 AM, Georg Brandl wrote: Especially since you're suggesting a huge number of new files, I question the argument of better navigability. FWIW I'm -1 on it too. I don't see what the big deal is with "large" source files. If you have difficulty finding your way around unicod

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-23 Thread Georg Brandl
On 10/23/2012 10:22 AM, Benjamin Peterson wrote: > 2012/10/22 Victor Stinner : >> Hi, >> >> I forked the CPython repository to work on my "split unicodeobject.c" project: >> http://hg.python.org/sandbox/split-unicodeobject.c >> >> The result is 10 files (including the existing unicodeobject.c): >> >>

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-23 Thread Amaury Forgeot d'Arc
2012/10/23 Antoine Pitrou : > I agree with Marc-André, there's no point in compiling those files > separately. #include'ing them in the master unicodeobject.c file is fine. I also find the unicodeobject.c difficult to navigate. Even if we don't split the file, I'd advocate a better presentation of

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-23 Thread Antoine Pitrou
On 23/10/2012 12:05, Victor Stinner wrote: Such a restructuring should not result in compilers no longer being able to optimize code by inlining functions in one of the most important basic types we have in Python 3. I agree that performance is important. But I'm not convinced that moving

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-23 Thread Victor Stinner
> Such a restructuring should not result in compilers > no longer being able to optimize code by inlining functions > in one of the most important basic types we have in Python 3. I agree that performance is important. But I'm not convinced that moving functions has a real impact on performance

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-23 Thread M.-A. Lemburg
On 23.10.2012 10:22, Benjamin Peterson wrote: > 2012/10/22 Victor Stinner : >> Hi, >> >> I forked the CPython repository to work on my "split unicodeobject.c" project: >> http://hg.python.org/sandbox/split-unicodeobject.c >> >> The result is 10 files (including the existing unicodeobject.c): >> >> 117

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-23 Thread Benjamin Peterson
2012/10/22 Victor Stinner : > Hi, > > I forked the CPython repository to work on my "split unicodeobject.c" project: > http://hg.python.org/sandbox/split-unicodeobject.c > > The result is 10 files (including the existing unicodeobject.c): > > 1176 Objects/unicodecharmap.c > 1678 Objects/unicodecodec

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-07 Thread martin
Quoting Victor Stinner: The amount of code will not be reduced, but now you also need to guess what file some piece of functionality may be in. How do you search for a piece of code? I type / in vim, or Ctrl-s (incremental search) in Emacs. If you search for a function by its name, it does

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-07 Thread Chris Angelico
On Mon, Oct 8, 2012 at 8:17 AM, Victor Stinner wrote: > Another problem with huge files is to handle "dependencies" with > static functions. If the function A calls the function B which calls > the function C, you have to order A, B and C "correctly" if these > functions are private and not declar

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-07 Thread Benjamin Peterson
2012/10/7 Victor Stinner : > Another problem with huge files is to handle "dependencies" with > static functions. If the function A calls the function B which calls > the function C, you have to order A, B and C "correctly" if these > functions are private and not declared at the top of the file.
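The ordering constraint described in the quoted message can be sketched as follows. This is an illustration only: the function names are hypothetical and not taken from unicodeobject.c. In C, a function must be declared before it is called, so private (`static`) helpers either have to be defined ahead of their callers or be forward-declared near the top of the file.

```c
/* Forward declarations (prototypes) for the private helpers. With these
 * in place, the definitions below may appear in any order. The names
 * are hypothetical. */
static int helper_c(int x);
static int helper_b(int x);

/* The public entry point is defined first, although it depends on
 * helper_b, which in turn depends on helper_c. */
int
entry_a(int x)
{
    return helper_b(x) + 1;
}

static int
helper_b(int x)
{
    return helper_c(x) * 2;
}

static int
helper_c(int x)
{
    return x + 3;
}
```

Without the two prototypes, the same file would only compile cleanly if `helper_c` were defined first, then `helper_b`, then `entry_a`. Whether keeping prototypes at the top of one large file is less of a burden than splitting the file is the judgment call this thread is arguing about.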

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-07 Thread Victor Stinner
> The amount of code will not be reduced, but now you also need to guess what > file some piece of functionality may be in. How do you search for a piece of code? If you search for a function by its name, it does not matter in which file it is defined if you use an IDE or vim/emacs with a correct configur

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-05 Thread Chris Jerdonek
On Thu, Oct 4, 2012 at 6:49 PM, Stephen J. Turnbull wrote: > Chris Jerdonek writes: > > > You can create multiple files this way. I just verified it. But the > > problem happens with merging. You will create merge conflicts in the > > deleted portions of every split file on every merge. The

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-05 Thread M.-A. Lemburg
Victor Stinner wrote: > Hi, > > I would like to split the huge unicodeobject.c file into smaller > files. It's just the longest C file of CPython: 14,849 lines. > > I don't know exactly how to split it, but first I would like to know > if you would agree with the idea. > > Example: > - Objects/

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Benjamin Peterson
2012/10/4 Antoine Pitrou : > On Thu, 04 Oct 2012 23:46:57 +0200 > mar...@v.loewis.de wrote: >> >> Quoting Victor Stinner: >> >> > I only see one argument against such refactoring: it will be harder to >> > backport/forwardport bugfixes. >> >> I'm opposed for a different reason: I think it will b

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Stephen J. Turnbull
Chris Jerdonek writes: > You can create multiple files this way. I just verified it. But the > problem happens with merging. You will create merge conflicts in the > deleted portions of every split file on every merge. There may be a > way to avoid this that I don't know about though (i.e.

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Antoine Pitrou
On Thu, 04 Oct 2012 23:46:57 +0200 mar...@v.loewis.de wrote: > > Quoting Victor Stinner: > > > I only see one argument against such refactoring: it will be harder to > > backport/forwardport bugfixes. > > I'm opposed for a different reason: I think it will be *harder* to maintain. > The amoun

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Eric V. Smith
On 10/4/2012 4:30 PM, Victor Stinner wrote: > Hi, > > I would like to split the huge unicodeobject.c file into smaller > files. It's just the longest C file of CPython: 14,849 lines. What problem are you trying to solve? -- Eric.

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Chris Jerdonek
On Thu, Oct 4, 2012 at 4:31 PM, Benjamin Peterson wrote: > 2012/10/4 Victor Stinner : >>> I am not siding with either side of the change yet, but an additional >>> argument against is that history may become less convenient to >>> navigate and track (e.g. hg annotate may lose information depending

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Victor Stinner
2012/10/5 Benjamin Peterson : > 2012/10/4 Victor Stinner : >> If new files are created using "hg cp unicodeobject.c >> unicode/newfile.c", the history is kept. > > Yes, but you can only create one file that way. You can create as many files as you want. Try: --- hg cp unicodeobject.c unicode2.c h

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Benjamin Peterson
2012/10/4 Victor Stinner : >> I am not siding with either side of the change yet, but an additional >> argument against is that history may become less convenient to >> navigate and track (e.g. hg annotate may lose information depending on >> how the split is done). > > If new files are created usi

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Victor Stinner
> I am not siding with either side of the change yet, but an additional > argument against is that history may become less convenient to > navigate and track (e.g. hg annotate may lose information depending on > how the split is done). If new files are created using "hg cp unicodeobject.c unicode/

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Chris Jerdonek
On Thu, Oct 4, 2012 at 1:30 PM, Victor Stinner wrote: > I would like to split the huge unicodeobject.c file into smaller > files. It's just the longest C file of CPython: 14,849 lines. > ... > I only see one argument against such refactoring: it will be harder to > backport/forwardport bugfixes.

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Benjamin Peterson
2012/10/4 Victor Stinner : > 2012/10/4 Benjamin Peterson : >> 2012/10/4 Victor Stinner : >>> I only see one argument against such refactoring: it will be harder to >>> backport/forwardport bugfixes. >> >> I imagine it could also prevent inlining of hot paths. > > It depends how the code is compiled

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Victor Stinner
2012/10/4 Benjamin Peterson : > 2012/10/4 Victor Stinner : >> I only see one argument against such refactoring: it will be harder to >> backport/forwardport bugfixes. > > I imagine it could also prevent inlining of hot paths. It depends how the code is compiled. The stringlib is split into many .

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread martin
Quoting Victor Stinner: I only see one argument against such refactoring: it will be harder to backport/forwardport bugfixes. I'm opposed for a different reason: I think it will be *harder* to maintain. The amount of code will not be reduced, but now you also need to guess what file some p

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Benjamin Peterson
2012/10/4 Victor Stinner : > I only see one argument against such refactoring: it will be harder to > backport/forwardport bugfixes. I imagine it could also prevent inlining of hot paths. -- Regards, Benjamin

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-04 Thread Andrew Svetlov
I like the idea. From my perspective, it is better to use a subdirectory, for the sake of easy finding with grep. On Thu, Oct 4, 2012 at 11:30 PM, Victor Stinner wrote: > Hi, > > I would like to split the huge unicodeobject.c file into smaller > files. It's just the longest C file of CPython: 14,849 lines.