Re: Fridays for Functions (Was: PilCon 2020)

Alexander Burger Fri, 03 Jul 2020 12:09:24 -0700

Hi Domingo,

> I used these two Lua scripts to get/generate them:
> 
> ====
> 
> --local html = io.open("roseta-code.html"):read("*a")
> local base_url = "http://rosettacode.org";
> local html = io.popen("curl " .. base_url ..
> "/wiki/Category:PicoLisp"):read("*a")
> 
> html = html:gsub("<h2>Pages in category \"PicoLisp\"</h2>.-</div></div></div><div class=\"printfooter\">",
>     function(m)
>         return m:gsub("(/wiki[^\"]+)(\" title=)" ,
>             function(link, title)
>                 return base_url .. link .. "#PicoLisp" .. title
>             end)
>     end)
> 
> print(html)
> 
> ====
> 
> ====
> 
> local html = io.open("roseta-code.html"):read("*a")
> --local base_url = "http://rosettacode.org";
> --local html = io.popen("curl " .. base_url .. "/wiki/Category:PicoLisp"):read("*a")
> 
> local code_links = html:match("<h2>Pages in category \"PicoLisp\"</h2>.-</div></div></div><div class=\"printfooter\">")
> 
> for link, title in code_links:gmatch("href=\"([^\"]+).->([^<]+)") do
>     print( string.format("-{^{%s %s}}", link, title))
> end
> 
> ====



Thanks a lot!

For completeness, here is the same in PicoLisp (if I understand your scripts
correctly; I don't know Lua):


   (in '(curl "-s" "http://rosettacode.org/wiki/Category:PicoLisp")
      (from "Pages in category \"PicoLisp\"")
      (while (== "<a href=\"" (from "<a href=\"" "printfooter"))
         (prin "-{\^{http://rosettacode.org" (till "\"") "#PicoLisp ")
         (from "title=\"")
         (prinl (till "\"") "}}") ) )



As you see, there is no need for regexps, and no need for the overhead of
slurping the whole page into memory. Just simple stream processing.
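
The same 'from' / 'till' pattern can be shown in isolation. The following is
only a minimal sketch: the function name 'hrefs' is hypothetical (it is not
part of the scripts above), and it assumes that every link of interest is
written as href="..." with double quotes. It reads from whatever input channel
is currently open and collects each href value into a list.

```picolisp
# Hypothetical helper: collect all href="..." values from the current
# input channel, one character stream pass, no regexps.
(de hrefs ()
   (make
      (while (from "href=\"")        # Skip input up to the next 'href="'
         (link (till "\"" T)) ) ) )  # Read up to the closing quote, packed
```

It could be called as (in "page.html" (hrefs)) for a saved file ("page.html"
is a placeholder name), or inside the 'curl' pipe shown above. 'from' returns
NIL at end of input, which ends the loop, and 'till' leaves the closing quote
unread, so the next 'from' simply continues from there.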

☺/ A!ex
