date:20190614

[Tutor] Download audios & videos using web scraping from news website or facebook

2019-06-14 Thread Sijin John

Hello Sir/Madam,
I am trying to download audio and video using web scraping from a news website
(e.g. https://www.bbc.com/news/video_and_audio/headlines) or Facebook, and I
couldn't. So in a real scenario, is it really possible to download audio/video
using Python code?

Thanks & Regards 
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor

Re: [Tutor] Download audios & videos using web scraping from news website or facebook

2019-06-14 Thread Alan Gauld via Tutor

On 14/06/2019 07:35, Sijin John wrote:
> I am trying to download audio and video using web scraping from a news website
> (e.g. https://www.bbc.com/news/video_and_audio/headlines) or Facebook, and I
> couldn't.
> So in a real scenario, is it really possible to download audio/video using
> Python code?

Of course, just as it's possible to do it in any other language. In fact
there are several specialist libraries available to make the task easier.

It may not be legal however and the web site may have taken steps to
prevent you from succeeding or at least make it very difficult.
But that has nothing to do with Python, it would be just as difficult
in any language.

So, if you are having difficulty the problem likely lies with
1) your code and how you are using the tools.
2) the website you are scraping having anti-scraping measures in place.

But since you haven't shown us any code we can't really
comment or make any suggestions.
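
For the simplest case, where a page exposes a direct link to a media file, the download itself needs nothing beyond the standard library. The following is only a minimal sketch under that assumption: the download() helper and the example URL in the comment are hypothetical, and it will not help with sites that stream through a player or require a login.

```python
import shutil
import urllib.request


def download(url, destination, chunk_size=64 * 1024):
    """Stream the resource at url into the local file destination."""
    # Copy in chunks so a large video never has to fit in memory at once.
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out, chunk_size)
    return destination


# Hypothetical usage; the URL is a placeholder, not a real media file:
# download("https://example.com/media/clip.mp4", "clip.mp4")
```

Whether a given site permits this is a separate question from whether the code runs, so check the site's terms before pointing it at real content.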

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


Re: [Tutor] Download audios & videos using web scraping from news website or facebook

2019-06-14 Thread Steven D'Aprano

On Fri, Jun 14, 2019 at 11:35:53AM +0500, Sijin John wrote:

> I am trying to download audio and video using web scraping from a news
> website (e.g. https://www.bbc.com/news/video_and_audio/headlines) or
> Facebook, and I couldn't. So in a real scenario, is it really possible to
> download audio/video using Python code?

Please don't mistake "I don't know how to do this" for "this cannot be 
done".

https://youtube-dl.org/

Scraping websites, especially scraping them for videos, can be *very* 
complex.
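
One way to lean on youtube-dl from Python without touching its internals is to call its command-line program. This is a rough sketch, assuming the youtube-dl executable is installed and on the PATH; build_command() and fetch() are hypothetical helper names, and the page URL is whatever you pass in.

```python
import subprocess


def build_command(url, audio_only=False, output_template="%(title)s.%(ext)s"):
    """Build the youtube-dl command line for one page URL."""
    command = ["youtube-dl", "-o", output_template]
    if audio_only:
        # Keep only the audio track and convert it to mp3.
        command += ["--extract-audio", "--audio-format", "mp3"]
    command.append(url)
    return command


def fetch(url, **options):
    """Run youtube-dl; raises CalledProcessError if the download fails."""
    return subprocess.run(build_command(url, **options), check=True)
```

Calling fetch() on a page URL should leave a file named after the video title in the current directory, provided youtube-dl supports that site.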


-- 
Steven

[Tutor] os.is_file and os.is_dir missing from CPython 3.8.0b?

2019-06-14 Thread Tom Hale


I'm trying to use os.is_dir, but I'm not finding it or os.is_file.

What am I missing here?

Python 3.8.0b1 (tags/v3.8.0b1:3b5deb01, Jun 13 2019, 22:28:20)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
:>>> import os
:>>> print(os.__dict__.keys())
dict_keys(['__name__', '__doc__', '__package__', '__loader__', 
'__spec__', '__file__', '__cached__', '__builtins__', 'abc', 'sys', 
'st', '__all__', '_exists', '_get_exports_list', 'name', 'linesep', 
'stat', 'access', 'ttyname', 'chdir', 'chmod', 'fchmod', 'chown', 
'fchown', 'lchown', 'chroot', 'ctermid', 'getcwd', 'getcwdb', 'link', 
'listdir', 'lstat', 'mkdir', 'nice', 'getpriority', 'setpriority', 
'posix_spawn', 'posix_spawnp', 'readlink', 'copy_file_range', 'rename', 
'replace', 'rmdir', 'symlink', 'system', 'umask', 'uname', 'unlink', 
'remove', 'utime', 'times', 'execv', 'execve', 'fork', 
'register_at_fork', 'sched_get_priority_max', 'sched_get_priority_min', 
'sched_getparam', 'sched_getscheduler', 'sched_rr_get_interval', 
'sched_setparam', 'sched_setscheduler', 'sched_yield', 
'sched_setaffinity', 'sched_getaffinity', 'openpty', 'forkpty', 
'getegid', 'geteuid', 'getgid', 'getgrouplist', 'getgroups', 'getpid', 
'getpgrp', 'getppid', 'getuid', 'getlogin', 'kill', 'killpg', 'setuid', 
'seteuid', 'setreuid', 'setgid', 'setegid', 'setregid', 'setgroups', 
'initgroups', 'getpgid', 'setpgrp', 'wait', 'wait3', 'wait4', 'waitid', 
'waitpid', 'getsid', 'setsid', 'setpgid', 'tcgetpgrp', 'tcsetpgrp', 
'open', 'close', 'closerange', 'device_encoding', 'dup', 'dup2', 
'lockf', 'lseek', 'read', 'readv', 'pread', 'preadv', 'write', 'writev', 
'pwrite', 'pwritev', 'sendfile', 'fstat', 'isatty', 'pipe', 'pipe2', 
'mkfifo', 'mknod', 'major', 'minor', 'makedev', 'ftruncate', 'truncate', 
'posix_fallocate', 'posix_fadvise', 'putenv', 'unsetenv', 'strerror', 
'fchdir', 'fsync', 'sync', 'fdatasync', 'WCOREDUMP', 'WIFCONTINUED', 
'WIFSTOPPED', 'WIFSIGNALED', 'WIFEXITED', 'WEXITSTATUS', 'WTERMSIG', 
'WSTOPSIG', 'fstatvfs', 'statvfs', 'confstr', 'sysconf', 'fpathconf', 
'pathconf', 'abort', 'getloadavg', 'urandom', 'setresuid', 'setresgid', 
'getresuid', 'getresgid', 'getxattr', 'setxattr', 'removexattr', 
'listxattr', 'get_terminal_size', 'cpu_count', 'get_inheritable', 
'set_inheritable', 'get_blocking', 'set_blocking', 'scandir', 'fspath', 
'getrandom', 'memfd_create', 'environ', 'F_OK', 'R_OK', 'W_OK', 'X_OK', 
'NGROUPS_MAX', 'TMP_MAX', 'WCONTINUED', 'WNOHANG', 'WUNTRACED', 
'O_RDONLY', 'O_WRONLY', 'O_RDWR', 'O_NDELAY', 'O_NONBLOCK', 'O_APPEND', 
'O_DSYNC', 'O_RSYNC', 'O_SYNC', 'O_NOCTTY', 'O_CREAT', 'O_EXCL', 
'O_TRUNC', 'O_LARGEFILE', 'O_PATH', 'O_TMPFILE', 'PRIO_PROCESS', 
'PRIO_PGRP', 'PRIO_USER', 'O_CLOEXEC', 'O_ACCMODE', 'SEEK_HOLE', 
'SEEK_DATA', 'O_ASYNC', 'O_DIRECT', 'O_DIRECTORY', 'O_NOFOLLOW', 
'O_NOATIME', 'EX_OK', 'EX_USAGE', 'EX_DATAERR', 'EX_NOINPUT', 
'EX_NOUSER', 'EX_NOHOST', 'EX_UNAVAILABLE', 'EX_SOFTWARE', 'EX_OSERR', 
'EX_OSFILE', 'EX_CANTCREAT', 'EX_IOERR', 'EX_TEMPFAIL', 'EX_PROTOCOL', 
'EX_NOPERM', 'EX_CONFIG', 'ST_RDONLY', 'ST_NOSUID', 'ST_NODEV', 
'ST_NOEXEC', 'ST_SYNCHRONOUS', 'ST_MANDLOCK', 'ST_WRITE', 'ST_APPEND', 
'ST_NOATIME', 'ST_NODIRATIME', 'ST_RELATIME', 'POSIX_FADV_NORMAL', 
'POSIX_FADV_SEQUENTIAL', 'POSIX_FADV_RANDOM', 'POSIX_FADV_NOREUSE', 
'POSIX_FADV_WILLNEED', 'POSIX_FADV_DONTNEED', 'P_PID', 'P_PGID', 
'P_ALL', 'WEXITED', 'WNOWAIT', 'WSTOPPED', 'CLD_EXITED', 'CLD_DUMPED', 
'CLD_TRAPPED', 'CLD_CONTINUED', 'F_LOCK', 'F_TLOCK', 'F_ULOCK', 
'F_TEST', 'RWF_DSYNC', 'RWF_HIPRI', 'RWF_SYNC', 'RWF_NOWAIT', 
'POSIX_SPAWN_OPEN', 'POSIX_SPAWN_CLOSE', 'POSIX_SPAWN_DUP2', 
'SCHED_OTHER', 'SCHED_FIFO', 'SCHED_RR', 'SCHED_BATCH', 'SCHED_IDLE', 
'SCHED_RESET_ON_FORK', 'XATTR_CREATE', 'XATTR_REPLACE', 
'XATTR_SIZE_MAX', 'RTLD_LAZY', 'RTLD_NOW', 'RTLD_GLOBAL', 'RTLD_LOCAL', 
'RTLD_NODELETE', 'RTLD_NOLOAD', 'RTLD_DEEPBIND', 'GRND_RANDOM', 
'GRND_NONBLOCK', 'MFD_CLOEXEC', 'MFD_ALLOW_SEALING', 'MFD_HUGETLB', 
'MFD_HUGE_SHIFT', 'MFD_HUGE_MASK', 'MFD_HUGE_64KB', 'MFD_HUGE_512KB', 
'MFD_HUGE_1MB', 'MFD_HUGE_2MB', 'MFD_HUGE_8MB', 'MFD_HUGE_16MB', 
'MFD_HUGE_32MB', 'MFD_HUGE_256MB', 'MFD_HUGE_512MB', 'MFD_HUGE_1GB', 
'MFD_HUGE_2GB', 'MFD_HUGE_16GB', 'pathconf_names', 'confstr_names', 
'sysconf_names', 'error', 'waitid_result', 'stat_result', 
'statvfs_result', 'sched_param', 'times_result', 'uname_result', 
'terminal_size', 'DirEntry', '_exit', 'path', 'curdir', 'pardir', 'sep', 
'pathsep', 'defpath', 'extsep', 'altsep', 'devnull', 'supports_dir_fd', 
'supports_effective_ids', 'supports_fd', 'supports_follow_symlinks', 
'SEEK_SET', 'SEEK_CUR', 'SEEK_END', 'makedirs', 'removedirs', 'renames', 
'walk', 'fwalk', '_fwalk', 'execl', 'execle', 'execlp', 'execlpe', 
'execvp', 'execvpe', '_execvpe', 'get_exec_path', 'MutableMapping', 
'_Environ', '_putenv', '_unsetenv', 'getenv', 'supports_bytes_environ', 
'environb', 'getenvb', 'fsencode', 'fsdecode', 'P_WAIT', 'P_NOWAIT', 
'P_NOWAITO', '_spawnvef', 'spawnv', 'spawnve

Re: [Tutor] os.is_file and os.is_dir missing from CPython 3.8.0b?

2019-06-14 Thread Alan Gauld via Tutor

On 14/06/2019 15:53, Tom Hale wrote:

> I'm trying to use os.is_dir, but I'm not finding it or os.is_file.

I've never heard of these functions, but I'm still on v3.6, never
having found a reason to upgrade. So I assume...

> Python 3.8.0b1 (tags/v3.8.0b1:3b5deb01, Jun 13 2019, 22:28:20)

...these are new introductions in 3.8?

If so, how do they differ from the os.path.isfile()
and os.path.isdir() functions that already exist?
Could you use those as alternatives?
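
A quick check of that suggestion: the os module has no is_dir or is_file attribute, while os.path.isdir() and os.path.isfile() take a path and answer the same question. The temporary directory and the file name example.txt are made up for the illustration.

```python
import os
import tempfile

# Neither name exists at module level in os.
print(hasattr(os, "is_dir"), hasattr(os, "is_file"))

with tempfile.TemporaryDirectory() as folder:
    file_path = os.path.join(folder, "example.txt")
    with open(file_path, "w") as handle:
        handle.write("hello")

    folder_is_dir = os.path.isdir(folder)      # the directory itself
    file_is_file = os.path.isfile(file_path)   # the file inside it
    file_is_dir = os.path.isdir(file_path)     # a file is not a directory

print(folder_is_dir, file_is_file, file_is_dir)
```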

As to why the new functions aren't showing up, I've no
idea, sorry.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


Re: [Tutor] os.is_file and os.is_dir missing from CPython 3.8.0b?

2019-06-14 Thread Peter Otten

Tom Hale wrote:

> I'm trying to use os.is_dir, but I'm not finding it or os.is_file.
> 
> What am I missing here?

Scroll up a bit in the documentation:

https://docs.python.org/3.8/library/os.html#os.DirEntry

Both is_file() and is_dir() are methods of the DirEntry object.

See also

https://docs.python.org/3.8/library/os.path.html#os.path.isfile
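
To make that concrete, here is a small sketch of where is_file() and is_dir() actually live: on the DirEntry objects that os.scandir() yields. The classify() function and the names a.txt and sub are invented for the example.

```python
import os
import tempfile


def classify(folder):
    """Split the entries of folder into sorted file names and directory names."""
    files, dirs = [], []
    with os.scandir(folder) as entries:
        for entry in entries:
            # entry is an os.DirEntry; the methods are called on it, not on os.
            if entry.is_dir():
                dirs.append(entry.name)
            elif entry.is_file():
                files.append(entry.name)
    return sorted(files), sorted(dirs)


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    with open(os.path.join(root, "a.txt"), "w") as handle:
        handle.write("x")
    files, dirs = classify(root)

print(files, dirs)
```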


Re: [Tutor] Download audios & videos using web scraping from news website or facebook

2019-06-14 Thread Mats Wichmann

On 6/14/19 12:35 AM, Sijin John wrote:
> Hello Sir/Madam,
> I am trying to download audio and video using web scraping from a news website
> (e.g. https://www.bbc.com/news/video_and_audio/headlines) or Facebook, and I
> couldn't. So in a real scenario, is it really possible to download audio/video
> using Python code?

As others have pointed out, remove the "using python code" from this
question to be more accurate.

Modern media-serving websites often try quite hard to keep you from
grabbing the media objects except on their terms.  This often means
they are delivered in a streaming manner through a (non-open) player
app, and there is never a file at all that you're allowed to access
directly.  And there's often some wrapping which ends up delivering
advertising, because that's probably how the site monetizes its
content.  If you're expected to access it, there's usually an API for
that, which you would use rather than scraping.

Obviously people figure out ways, as the youtube downloader shows.


[Tutor] deleting elements out of a list.

2019-06-14 Thread mhysnm1964

All,

 

I am not sure how to tackle this issue. I am using Windows 10 and Python 3.6
from ActiveState.

I have a list of x elements. Some of the elements have similar words in
them. For example:

Dog food Pal
Dog Food Pal qx1323
Cat food kitty
Absolute cleaning inv123
Absolute Domestic cleaning inv 222
Absolute d
Fitness first 02/19
Fitness first

I wish to remove duplicates. I could use collections.Counter, but that
fails because the strings are not unique; only some of the words are. My
thinking below is only rough pseudocode, as I am not sure how to do this,
wish to learn, and am not sure how to do it without causing traceback
errors. I want to delete the matching pattern from the list of strings.
Below is my attempt and I hope this makes sense.

description = load_files()  # returns a list

for text in description:
    words = text.split()
    for i in range(1, len(words) + 1):
        # Build the phrase from the first i words of this entry.
        word = ' '.join(words[:i])
        print(word)
        answer = input('Keep word? ')
        if answer == 'n':
            continue
        for j, v in enumerate(description):
            if word in v:
                # Removes from the same list the outer loop is walking.
                description.pop(j)

The initial issue I see with the above is that popping an element from the
description list while looping over it will cause an error. If I copy the
description list into a new list and use the new list for the outer loop, I
will receive multiple occurrences of the same text. This could be addressed
by an if test, but I am wondering if there is a better method. Second code
example:

description = load_files()  # returns a list

# I have not verified that this is the right syntax for the copy method.
search_txt = description.copy()

for text in search_txt:
    words = text.split()
    for i in range(1, len(words) + 1):
        word = ' '.join(words[:i])
        print(word)
        answer = input('Keep word (ynq)? ')
        if answer == 'n':
            continue
        elif answer == 'q':
            break
        for j, v in enumerate(description):
            if word in v:
                description.pop(j)

Any improvements?

 

Sean 
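
One common way around the popping problem is to build a new list rather than remove items from the list being looped over. The sketch below is an illustration, not a full answer to the post: prefixes() and remove_matching() are hypothetical names, the interactive prompt is left out, and matching case-insensitively is an assumption made because "Dog food Pal" and "Dog Food Pal qx1323" differ only in case.

```python
def prefixes(text):
    """Yield the leading phrases of text: first word, first two words, and so on."""
    words = text.split()
    for count in range(1, len(words) + 1):
        yield " ".join(words[:count])


def remove_matching(descriptions, phrase):
    """Return a new list without the entries that contain phrase (case-insensitive)."""
    needle = phrase.lower()
    return [text for text in descriptions if needle not in text.lower()]


descriptions = [
    "Dog food Pal",
    "Dog Food Pal qx1323",
    "Cat food kitty",
    "Absolute cleaning inv123",
    "Absolute Domestic cleaning inv 222",
    "Absolute d",
    "Fitness first 02/19",
    "Fitness first",
]

candidates = list(prefixes(descriptions[0]))
remaining = remove_matching(descriptions, "Dog food")
print(candidates)
print(remaining)
```

Because remove_matching() returns a fresh list, the original list is never modified while something is iterating over it, so no entries get skipped.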


[Tutor] Differences between while and for

2019-06-14 Thread mhysnm1964

All,

 

In C, Perl and other languages, while uses only a conditional test and for
uses an iteration. In Python, while and for seem to be the same and I cannot
see the difference. Python does not have an until (do-while) loop, where the
test is done at the end of the loop, permitting one pass through the loop
block. Am I correct, or is there a difference, and if so what is it?

Why doesn't Python have an until statement?
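
For reference, a short sketch of how the three loop shapes are usually written in Python. In Python, for walks the items of an iterable, while repeats as long as a condition holds, and the test-at-the-end loop is commonly emulated with while True plus break. The function names are made up for the illustration.

```python
def countdown_for(start):
    """for: visit each item that an iterable (here a range) produces."""
    seen = []
    for n in range(start, 0, -1):
        seen.append(n)
    return seen


def countdown_while(start):
    """while: test first, so the body may run zero times."""
    seen = []
    n = start
    while n > 0:
        seen.append(n)
        n -= 1
    return seen


def countdown_until(start):
    """Emulated do/until: the body runs once before the test is made."""
    seen = []
    n = start
    while True:
        seen.append(n)
        n -= 1
        if n <= 0:
            break
    return seen


print(countdown_for(3), countdown_while(0), countdown_until(0))
```

With a start of 0 the while version never enters its body, whereas the emulated until version still runs it once, which is the behaviour the question describes.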

 

