what to do with multiple BOMs
Channeling unicode text experts and xml people:

I have an xml entity with initial bytes ff fe ff fe, which the file command says is UTF-16, little-endian text.

I agree, but what should be done about the additional BOM?

A test output made many years ago seems to keep the extra BOM. The xml context is

xml file 014.xml

]>
&e;

the entity file 014.ent is bombomdata

b'\xff\xfe\xff\xfed\x00a\x00t\x00a\x00'

The old saved test output of processing is

b'\xef\xbb\xbfdata'

which seems to imply that the extra BOM in the entity has been kept and processed into a different BOM meaning utf8.

I think the test file is wrong and that multiple BOM chars in the entity should have been removed.

Am I right?
--
Robin Becker
--
https://mail.python.org/mailman/listinfo/python-list
Re: on perhaps unloading modules?
Chris Angelico writes:

> On Tue, Aug 17, 2021 at 4:02 AM Greg Ewing wrote:
>> The second best way would be to not use import_module, but to
>> exec() the student's code. That way you don't create an entry in
>> sys.modules and don't have to worry about somehow unloading the
>> module.
>
> I would agree with this. If you need to mess around with modules and
> you don't want them to be cached, avoid the normal "import" mechanism,
> and just exec yourself a module's worth of code.

Sounds like a plan.  Busy, haven't been able to try it out.  But I will.
Soon.  Thank you!
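A minimal sketch of the exec() approach suggested above, assuming the student's code is available as a string. The helper name `load_student_code` and the module name `student_submission` are made up for illustration. Note that exec() does not sandbox anything, so untrusted code still runs with full privileges.

```python
import sys
import types


def load_student_code(source, name="student_submission"):
    """Run `source` inside a fresh module object without registering it in sys.modules."""
    module = types.ModuleType(name)
    # compile() first so that tracebacks show a recognisable pseudo-filename
    code = compile(source, f"<{name}>", "exec")
    exec(code, module.__dict__)
    return module


# Hypothetical submission, standing in for a file read from disk
student_source = """
def sort_numbers(data):
    return sorted(data)
"""

module = load_student_code(student_source)
result = module.sort_numbers([3, 2, 1, 3, 1, 9])
cached = "student_submission" in sys.modules  # stays False: nothing to unload
```

Each call builds a new module object, so loading the next submission never sees state left behind by the previous one.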
Re: on perhaps unloading modules?
Martin Di Paola writes:

> This may not answer your question but it may provide an alternative
> solution.
>
> I had the same challenge as you a year ago, so maybe my solution will
> work for you too.
>
> Imagine that you have a Markdown file that *documents* the expected
> results.
>
> This is the final exam, good luck!
>
> First I'm going to load your code (the student's code):
>
> ```python
> import student
> ```
>
> Let's see if you programmed a sort algorithm correctly
>
> ```python
> data = [3, 2, 1, 3, 1, 9]
> student.sort_numbers(data)
> [1, 1, 2, 3, 3, 9]
> ```
>
> Let's now see if you can choose the correct answer:
>
> ```python
> t = ["foo", "bar", "baz"]
> student.question1(t)
> "baz"
> ```
>
> Now you can run the snippets of code with:
>
>   byexample -l python the_markdown_file.md
>
> What byexample does is to run the Python code, capture the output and
> compare it with the expected result.
>
> In the above example "student.sort_numbers" must return the list
> sorted. That output is compared by byexample with the list written
> below.
>
> Advantages? Each byexample run is independent of the other and the
> snippets of code are executed in a separate Python process. byexample
> takes care of the IPC.
>
> I don't know the details of your questions so I'm not sure if byexample
> will be the tool for you. In my case I evaluate my students giving them
> the Markdown and asking them to code the functions so they return the
> expected values.

Currently procedures in one question are used in another question.
Nevertheless, perhaps I could (in other tests) design something
different.  Although, to be honest, I would rather not have to use
something like Markdown because that means more syntax for students.

> Depending on how many students you have you may consider
> complementing this with INGInious. It is designed to run students'
> assignments assuming nothing on the untrusted code.
>
> Links:
>
> https://byexamples.github.io/byexample/
> https://docs.inginious.org/en/v0.7/

INGInious looks pretty interesting.  Thank you!
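For comparison, the same "run the snippet and compare with the expected output" idea can be sketched with the standard library's doctest module. This is not byexample: it runs in the current process rather than a separate one, and the `sort_numbers` function below is only a stand-in for a student's submission.

```python
import doctest


def sort_numbers(data):
    """Stand-in for a student's submission (hypothetical)."""
    return sorted(data)


# Expected results written as an interactive session
EXAM = """
>>> data = [3, 2, 1, 3, 1, 9]
>>> sort_numbers(data)
[1, 1, 2, 3, 3, 9]
"""

parser = doctest.DocTestParser()
# globs supplies the names visible to the examples
test = parser.get_doctest(EXAM, {"sort_numbers": sort_numbers}, "exam", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)  # TestResults(failed=..., attempted=...)
```

A non-zero `results.failed` would mean at least one snippet produced output different from what the exam text expects.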
Re: some problems for an introductory python test
Chris Angelico writes:

> On Tue, Aug 17, 2021 at 3:51 AM Hope Rouselle wrote:
>>
>> Chris Angelico writes:
>> >> Wow, I kinda feel the same as you here.  I think this justifies
>> >> perhaps using a hardware solution.  (Crazy idea?!  Lol.)
>> >
>> > uhhh Yes. Very crazy idea. Can't imagine why anyone would ever
>> > think about doing that.
>>
>> Lol.  Really?  I mean a certain panic button.  You know the GNU Emacs.
>> It has this queue with the implications you mentioned --- as much as it
>> can.  (It must of course get the messages from the system, otherwise it
>> can't do anything about it.)  And it has the panic button C-g.  The
>> keyboard has one of the highest precedences in hardware interrupts,
>> doesn't it?  A certain very important system could have a panic button
>> that invokes a certain debugger, say, for a crisis-moment.
>>
>> But then this could be a lousy engineering strategy.  I am not an
>> expert at all in any of this.  But I'm surprised with your quick
>> dismissal. :-)
>>
>> > Certainly nobody in his right mind would have WatchCat listening on
>> > the serial port's Ring Indicator interrupt, and then grab a paperclip
>> > to bridge the DTR and RI pins on an otherwise-unoccupied serial port
>> > on the back of the PC. (The DTR pin was kept high by the PC, and
>> > could therefore be used as an open power pin to bring the RI high.)
>>
>> Why not?  Misuse of hardware?  Too precious of a resource?
>>
>> > If you're curious, it's pins 4 and 9 - diagonally up and in from the
>> > short corner.
>> > http://www.usconverters.com/index.php?main_page=page&id=61&chapter=0
>>
>> You know your pins!  That's impressive.  I thought the OS itself could
>> use something like that.  The fact that they never do...  Says
>> something, doesn't it?  But it's not too obvious to me.
>>
>> > And of COURSE nobody would ever take an old serial mouse, take the
>> > ball out of it, and turn it into a foot-controlled signal... although
>> > that wasn't for WatchCat, that was for clipboard management between
>> > my app and a Windows accounting package that we used. But that's a
>> > separate story.
>>
>> Lol.  I feel you're saying you would. :-)
>
> This was all a figure of speech, and the denials were all tongue in
> cheek. Not only am I saying we would, but we *did*. All of the above.

Cool! :-)

> The Ring Indicator trick was one of the best, since we had very little
> other use for serial ports, and it didn't significantly impact the
> system during good times, but was always reliable when things went
> wrong.
>
> (And when I posted it, I could visualize the port and knew which pins
> to bridge, but had to go look up a pinout to be able to say their pin
> numbers and descriptions.)

Nice!

>> I heard of Python for the first time in the 90s.  I worked at an ISP.
>> Only one guy was really programming there, Allaire ColdFusion.  But,
>> oddly enough, we used to say we would ``write a script in Python'' when
>> we meant to say we were going out for a smoke.  I think that was
>> precisely because nobody knew what ``Python'' really was.  I never
>> expected it to be a great language.  I imagined it was something like
>> Tcl.  (Lol, no offense at all towards Tcl.)
>
> Haha, that's a weird idiom!

Clueless people --- from the Rio de Janeiro area in Brazil. :-)  It was
effectively just an in-joke.

> Funny you should mention Tcl.
>
> https://docs.python.org/3/library/tkinter.html

Cool!  Speaking of GUIs and Python, that Google software called Backup
and Sync (which I think is about to be obsoleted by Google Drive) is
written in Python --- it feels a bit heavy.  The GUI too seems a bit
slow sometimes.  Haven't tried their ``Google Drive'' as a replacement
yet.
Re: what to do with multiple BOMs
On 2021-08-19 14:07, Robin Becker wrote:
> Channeling unicode text experts and xml people:
>
> I have xml entity with initial bytes ff fe ff fe which the file command
> says is UTF-16, little-endian text.
>
> I agree, but what should be done about the additional BOM.
>
> A test output made many years ago seems to keep the extra BOM. The xml
> context is
>
> xml file 014.xml
>
> ]>
> &e;
>
> the entity file 014.ent is bombomdata
>
> b'\xff\xfe\xff\xfed\x00a\x00t\x00a\x00'
>
> The old saved test output of processing is
>
> b'\xef\xbb\xbfdata'
>
> which seems to imply that the extra BOM in the entity has been kept and
> processed into a different BOM meaning utf8.
>
> I think the test file is wrong and that multiple BOM chars in the entity
> should have been removed.
>
> Am I right?

The use of a BOM b'\xef\xbb\xbf' at the start of a UTF-8 file is a Windows thing. It's not used on non-Windows systems.

Putting it in the middle, e.g. b'\xef\xbb\xbfdata', just looks wrong. It looks like the contents of a UTF-8 file, with a BOM because it originated on a Windows system, were read in without stripping the BOM first.
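A small sketch of what "stripping the BOM first" looks like with Python's built-in codecs. The variable names are only illustrative; the point is that the plain utf-8 codec keeps the BOM as a character, while utf-8-sig drops it, but only when it is the very first thing in the data.

```python
import codecs

payload = codecs.BOM_UTF8 + b"data"       # b'\xef\xbb\xbfdata'

# The plain utf-8 codec keeps the BOM as the character U+FEFF
plain = payload.decode("utf-8")           # '\ufeffdata'

# utf-8-sig removes a BOM found at the very start of the input
stripped = payload.decode("utf-8-sig")    # 'data'

# A BOM that is not at the start is left alone even by utf-8-sig
embedded = (b"x" + payload).decode("utf-8-sig")   # 'x\ufeffdata'
```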
Re: what to do with multiple BOMs
By the rules of Unicode, that character, if not the very first character of the file, should be treated as a “zero-width non-breaking space”; it is NOT a BOM character there.

Its presence in the files is almost certainly an error, caused by broken software or by software processing files in a manner that it wasn’t designed for.

> On Aug 19, 2021, at 1:48 PM, Robin Becker wrote:
>
> Channeling unicode text experts and xml people:
>
> I have xml entity with initial bytes ff fe ff fe which the file command
> says is UTF-16, little-endian text.
>
> I agree, but what should be done about the additional BOM.
>
> A test output made many years ago seems to keep the extra BOM. The xml
> context is
>
> xml file 014.xml
>
> ]>
> &e;
>
> the entity file 014.ent is bombomdata
>
> b'\xff\xfe\xff\xfed\x00a\x00t\x00a\x00'
>
> The old saved test output of processing is
>
> b'\xef\xbb\xbfdata'
>
> which seems to imply that the extra BOM in the entity has been kept and
> processed into a different BOM meaning utf8.
>
> I think the test file is wrong and that multiple BOM chars in the entity
> should have been removed.
>
> Am I right?
> --
> Robin Becker
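The bytes quoted above can be pushed through Python's codecs to see where the saved test output may have come from. This is only a sketch of the codec behaviour, not of what the XML processor in question actually does: the utf-16 codec consumes the first BOM to pick the byte order, and the second ff fe then decodes as an ordinary U+FEFF character.

```python
entity_bytes = b"\xff\xfe\xff\xfed\x00a\x00t\x00a\x00"

# The first BOM selects little-endian and is dropped; the second one
# survives as the character U+FEFF (zero-width non-breaking space)
text = entity_bytes.decode("utf-16")      # '\ufeffdata'

# Re-encoding that character as UTF-8 gives the three bytes ef bb bf,
# which happen to look like a UTF-8 BOM
reencoded = text.encode("utf-8")          # b'\xef\xbb\xbfdata'

# If the extra character is unwanted, it has to be removed explicitly
cleaned = text.lstrip("\ufeff")           # 'data'
```

So the old output b'\xef\xbb\xbfdata' is consistent with the second U+FEFF having been kept as a character and written out in UTF-8, rather than with a new BOM having been added.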
Re: ANN: Dogelog Runtime, Prolog to the Moon (2021)
Woa! The JavaScript JIT compiler is quite impressive. I have now
ported the Dogelog runtime to Python as well, so that I can compare
JavaScript and Python, and tested without clause indexing:

between(L,H,L) :- L =< H.
between(L,H,X) :- L < H, Y is L+1, between(Y,H,X).

setup :- between(1,255,N), M is N//2, assertz(edge(M,N)), fail.
setup :- edge(M,N), assertz(edge2(N,M)), fail.
setup.

anc(X,Y) :- edge(X, Y).
anc(X,Y) :- edge(X, Z), anc(Z, Y).

anc2(X,Y) :- edge2(Y, X).
anc2(X,Y) :- edge2(Y, Z), anc2(X, Z).

:- setup.
:- time((between(1,10,_), anc2(0,255), fail; true)).
:- time((between(1,10,_), anc(0,255), fail; true)).

The results are:

/* Python 3.10.0rc1 */
% Wall 188 ms, trim 0 ms
% Wall 5537 ms, trim 0 ms

/* JavaScript Chrome 92.0.4515.159 */
% Wall 5 ms, trim 0 ms
% Wall 147 ms, trim 0 ms
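To see what the two queries explore, here is a rough Python rendering of the edge facts and the anc/anc2 clauses as generators. It is only a sketch of the search space, not of how the Dogelog runtime executes Prolog; the helper names and the `stats` call counter are made up for illustration.

```python
def build_edges(limit=255):
    """Mirror setup/0: edge(N//2, N) and edge2(N, N//2) for N in 1..limit."""
    children = {}   # edge(M, N): M -> [N, ...]
    parents = {}    # edge2(N, M): N -> [M, ...]
    for n in range(1, limit + 1):
        m = n // 2
        children.setdefault(m, []).append(n)
        parents.setdefault(n, []).append(m)
    return children, parents


def anc(children, x, y, stats):
    """Yield once per proof of anc(x, y), searching downwards from x."""
    stats["calls"] += 1
    for z in children.get(x, ()):       # anc(X,Y) :- edge(X,Y).
        if z == y:
            yield (x, y)
    for z in children.get(x, ()):       # anc(X,Y) :- edge(X,Z), anc(Z,Y).
        yield from anc(children, z, y, stats)


def anc2(parents, x, y, stats):
    """Yield once per proof of anc2(x, y), searching upwards from y."""
    stats["calls"] += 1
    for z in parents.get(y, ()):        # anc2(X,Y) :- edge2(Y,X).
        if z == x:
            yield (x, y)
    for z in parents.get(y, ()):        # anc2(X,Y) :- edge2(Y,Z), anc2(X,Z).
        yield from anc2(parents, x, z, stats)


children, parents = build_edges()
down, up = {"calls": 0}, {"calls": 0}
proofs_down = sum(1 for _ in anc(children, 0, 255, down))   # 1 proof, 256 calls
proofs_up = sum(1 for _ in anc2(parents, 0, 255, up))       # 1 proof, 9 calls
```

In this sketch the downward search visits every node of the tree while the upward search only follows one chain of parents, which would be one plausible reading of why the anc2 timing is so much smaller than the anc timing on both runtimes.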
Re: ANN: Dogelog Runtime, Prolog to the Moon (2021)
That's a factor of 37.8 faster! I tested a variant of the Albufeira
instructions Prolog VM, aka ZIP, which was also the inspiration for
SWI-Prolog.

Open Source:

The Python Version of the Dogelog Runtime
https://github.com/jburse/dogelog-moon/tree/main/devel/runtimepy

The Python Test Harness
https://gist.github.com/jburse/bf6c01c7524f2611d606cb88983da9d6#file-test-py

Mostowski Collapse wrote:
> Woa! The JavaScript JIT compiler is quite impressive. I now
> ported Dogelog runtime to Python as well, so that I can compare
> JavaScript and Python, and tested without clause indexing:
>
> between(L,H,L) :- L =< H.
> between(L,H,X) :- L < H, Y is L+1, between(Y,H,X).
>
> setup :- between(1,255,N), M is N//2, assertz(edge(M,N)), fail.
> setup :- edge(M,N), assertz(edge2(N,M)), fail.
> setup.
>
> anc(X,Y) :- edge(X, Y).
> anc(X,Y) :- edge(X, Z), anc(Z, Y).
>
> anc2(X,Y) :- edge2(Y, X).
> anc2(X,Y) :- edge2(Y, Z), anc2(X, Z).
>
> :- setup.
> :- time((between(1,10,_), anc2(0,255), fail; true)).
> :- time((between(1,10,_), anc(0,255), fail; true)).
>
> The results are:
>
> /* Python 3.10.0rc1 */
> % Wall 188 ms, trim 0 ms
> % Wall 5537 ms, trim 0 ms
>
> /* JavaScript Chrome 92.0.4515.159 */
> % Wall 5 ms, trim 0 ms
> % Wall 147 ms, trim 0 ms
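For reference, the speed ratio can be recomputed from the four wall times posted in the thread. The dictionary layout is only for illustration; the numbers are the ones quoted above.

```python
# Wall times in milliseconds, as posted
timings_ms = {
    "anc2": {"python": 188, "javascript": 5},
    "anc": {"python": 5537, "javascript": 147},
}

# How many times longer the Python port took for each query
ratios = {
    query: t["python"] / t["javascript"]
    for query, t in timings_ms.items()
}

for query, ratio in ratios.items():
    print(f"{query}: {ratio:.1f}x")
```

With these figures both queries come out between 37 and 38, close to, though not exactly, the 37.8 quoted; the post does not say how that figure was derived.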
