Crusier wrote:
> Hi Python Tutors,
>
> I am currently able to strip down to the string I want. However, I
> have problems with the JSON script and I am not sure how to slice it
> into a dictionary.
>
> import urllib
> import json
> import requests
>
> from bs4 import BeautifulSoup
>
>
> url =
Hey Crusier (and others...)
For your site...
As Alan mentioned, it's a mix of html/jscript/etc.
So you'll need to (or at least should) extract just the
json/struct that you need, and then go from there. I speak from
experience, as I've had to handle a number of sites that are
essentially jus
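
A minimal sketch of that "extract just the json" step, using only the standard library. The page text, the variable name `quote` and the helper `extract_json_after` are all made up for illustration; a real page will need its own marker.

```python
import json
import re

# Hypothetical page source: HTML with a JSON object embedded in a script.
PAGE = """
<html><body>
<script>
var quote = {"code": "0018", "name": "Example Ltd", "last": 12.5};
render(quote);
</script>
</body></html>
"""


def extract_json_after(text, marker):
    """Return the first JSON value that follows `marker` in `text`."""
    match = re.search(re.escape(marker) + r"\s*", text)
    if match is None:
        raise ValueError("marker not found: %r" % marker)
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever trails it (here the ";" and the rest of the script).
    obj, _end = json.JSONDecoder().raw_decode(text, match.end())
    return obj


data = extract_json_after(PAGE, "var quote =")
print(data["name"], data["last"])
```

Once the struct is a plain dict, there is nothing left to "slice"; ordinary key lookups do the job. This only works when the embedded text is strict JSON, which is an assumption about the page.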
On 13/12/15 07:44, Crusier wrote:
> Dear All,
>
> I am trying to scrape the following website, however, I have
> encountered some problems. As you can see, I am not really familiar
> with regex and I hope you can give me some pointers to how to solve
> this problem.
I'm not sure why you mention re
On 12Oct2015 21:21, Crusier wrote:
I am using Python 3.4. I am trying to do some web scraping at this moment.
I got stuck because there is an IndexError: list index out of range if I
put stock_code = (18). My desired output is that the script is able to
detect and print out the recent price whether i
Hello, I have personally found this tutorial to be helpful. Check it out:
https://www.youtube.com/watch?v=3xQTJi2tqgk Thank you.
On Tuesday, September 29, 2015 12:05 PM, Joel Goldstick
wrote:
On Tue, Sep 29, 2015 at 11:47 AM, Crusier wrote:
> Hi
>
> I have recently finished rea
>> Hi
>>
>> I have recently finished reading "Starting out with Python" and I
>> really want to do some web scraping. Please kindly advise where I can
>> get more information about BeautifulSoup. It seems that Documentation
>> is too hard for me.
>>
>> Furthermore, I have tried to scrape this site b
Crusier wrote:
> I have recently finished reading "Starting out with Python" and I
> really want to do some web scraping. Please kindly advise where I can
> get more information about BeautifulSoup. It seems that Documentation
> is too hard for me.
If you tell us what you don't understand and wh
On Tue, Sep 29, 2015 at 11:47 AM, Crusier wrote:
> Hi
>
> I have recently finished reading "Starting out with Python" and I
> really want to do some web scraping. Please kindly advise where I can
> get more information about BeautifulSoup. It seems that Documentation
> is too hard for me.
>
> Fur
Thanks, urlparse.urljoin did the trick.
Akash, the problem with directly prefixing the url to the link is that the url
most of the time contains not just the page address but also parameters and
fragments.
Andreas Kostyrka <[EMAIL PROTECTED]> wrote: * Akash [061129 20:54]:
> On 11/30/06, Shitiz Ba
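
A short sketch of the difference between prefixing and joining. In Python 3 the function lives in `urllib.parse` (the thread's `urlparse.urljoin` is the Python 2 name); the base url and the hrefs below are invented for the example.

```python
from urllib.parse import urljoin  # Python 2 spelled this urlparse.urljoin

# Hypothetical page address, including a query string and a fragment.
base = "http://example.com/docs/index.html?page=2#top"

# Hypothetical href values as they might come out of the <a> tags.
hrefs = ["intro.html", "../img/logo.png", "/about", "http://other.example/x"]

# Plain prefixing keeps the query and fragment and gives a broken url.
naive = base + hrefs[0]

# urljoin resolves each href against the base the way a browser would.
absolute = [urljoin(base, h) for h in hrefs]

print(naive)
for url in absolute:
    print(url)
```

Links that are already absolute pass through unchanged, so the same call can be applied to every href on the page.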
* Akash <[EMAIL PROTECTED]> [061129 20:54]:
> On 11/30/06, Shitiz Bansal <[EMAIL PROTECTED]> wrote:
> > I am using beautiful soup for extracting links from a web page.
> > Most pages use relative links in their pages which is causing a problem. Is
> > there any library to extract complete links or
On 11/30/06, Shitiz Bansal <[EMAIL PROTECTED]> wrote:
> I am using beautiful soup for extracting links from a web page.
> Most pages use relative links in their pages which is causing a problem. Is
> there any library to extract complete links or do i have to parse this
> myself?
>
Beautiful Soup
Bob Tanner wrote:
> Kent Johnson wrote:
>
>
>>>Is there a way to insert a node with Beautiful Soup?
>>
>>BS doesn't really seem to be set up to support this. The Tags in a soup
>>are kept in a linked
>
>
> What would be the appropriate technology to use?
You might also email the author of BS and
Bob Tanner wrote:
> Kent Johnson wrote:
>
>
>>>Is there a way to insert a node with Beautiful Soup?
>>
>>BS doesn't really seem to be set up to support this. The Tags in a soup
>>are kept in a linked
>
>
> What would be the appropriate technology to use?
Fredrik Lundh's elementtidy uses the Tidy
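
For what it's worth, here is a rough sketch of inserting a node with the ElementTree API from the standard library. It assumes the markup is already well formed (for example after a tidy pass); the document string is made up, and this does not use elementtidy itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical, already well-formed markup.
doc = "<html><body><p>first</p><p>third</p></body></html>"

root = ET.fromstring(doc)
body = root.find("body")

# Build the new node and place it between the two existing paragraphs.
new_p = ET.Element("p")
new_p.text = "second"
body.insert(1, new_p)  # index 1 = after the first child

result = ET.tostring(root, encoding="unicode")
print(result)
```

Children of an element behave like a list, so `insert`, `append` and `remove` are enough for most tree edits.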
Kent Johnson wrote:
>> Is there a way to insert a node with Beautiful Soup?
>
> BS doesn't really seem to be set up to support this. The Tags in a soup
> are kept in a linked
What would be the appropriate technology to use?
I tried the xml modules, but they fail on the parsing of the html.
--
B
Bob Tanner wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> Is there a way to insert a node with Beautiful Soup?
BS doesn't really seem to be set up to support this. The Tags in a soup are
kept in a linked list by their next attribute so you will have to find the
right Tag, break t
On 10/4/05, Andrew P <[EMAIL PROTECTED]> wrote:
Oops, Paul is probably right. I thought urllib2 opened local
files in the absence of an identifier like "http://". Bad assumption on my part. I remembered that
behavior from somewhere else, maybe urllib.
The following function could be useful here
Oops, Paul is probably right. I thought urllib2 opened local
files in the absence of an identifier like "http://". Bad assumption on my part. I remembered that
behavior from somewhere else, maybe urllib.
That path beginning with "\\C:\\" could still bite you, however. Good luck,
Andrew
_
With error messages like that, the interesting bits are usually at the end:
OSError: [Errno 2] No such file or directory:
'\\C:\\Python24\\FRE_word_list.htm
That should read "C:\\Python24\\FRE_word_list.htm".
I use UNIX-style paths, which work fine for me under Windows, so it
would just be "/Py
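
A small sketch of the path point, using `pathlib` (which arrived long after Python 2.4, so treat this as a present-day illustration rather than what the thread used). `PureWindowsPath` only manipulates the text of the path, so this runs on any platform.

```python
from pathlib import PureWindowsPath

# The same file written with backslashes and with forward slashes.
backslashes = PureWindowsPath(r"C:\Python24\FRE_word_list.htm")
forward = PureWindowsPath("C:/Python24/FRE_word_list.htm")

# Windows-style paths treat both separators the same way.
same = backslashes == forward

# as_posix() gives the forward-slash spelling of the path.
unix_style = backslashes.as_posix()

print(same)
print(unix_style)
print(backslashes.drive, backslashes.name)
```

Note that the drive letter comes first; the leading "\\" in front of "C:" in the error message is what makes the original path wrong.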
How did you change it to look at the file on your PC?
You appear to have told urllib2 to use "FRE_word_list.htm". It cannot
find that online, so it tried to look for it on your local disk at
'\\C:\\Python24\\FRE_word_list.htm
I would suggest that you either put your local html on a web server
and send
Hi Danny,
> If you have a moment, do you mind doing this on your system?
>
Here you go:
>>> import types
>>> print types.StringTypes
(<type 'str'>, <type 'unicode'>)
>>> import sys
>>> print sys.version
2.3.4 (#2, May 29 2004, 03:31:27)
[GCC 3.3.3 (Debian 20040417)]
>>> print type(u'hello') in types.StringTypes
True
>>>sys
> Here you go:
>
> >>> import types
> >>> print types.StringTypes
> (<type 'str'>, <type 'unicode'>)
> >>> import sys
> >>> print sys.version
> 2.3.4 (#2, May 29 2004, 03:31:27)
> [GCC 3.3.3 (Debian 20040417)]
> >>> print type(u'hello') in types.StringTypes
> True
> >>>sys.getdefaultencoding()
> 'ascii'
[CCing Leonard Richa
grouchy wrote:
> Hi,
>
> I'm having bang-my-head-against-a-wall moments trying to figure all of this
> out.
>
from BeautifulSoup import BeautifulSoup
>>>
file = urllib.urlopen("http://www.google.com/search?q=beautifulsoup")
file = file.read().decode("utf-8")
soup = BeautifulSoup
On Thu, 25 Aug 2005, grouchy wrote:
> >>>file = urllib.urlopen("http://www.google.com/search?q=beautifulsoup")
> >>>file = file.read().decode("utf-8")
> >>>soup = BeautifulSoup(file)
> >>>results = soup('p','g')
> >>> x = results[1].a.renderContents()
> >>> type(x)
>
> >>> print x
> Matt Croy