[Tutor] Download audios & videos using web scraping from news website or facebook
Hello Sir/Madam,

I am trying to download audio and video files by web scraping from a news website (e.g. https://www.bbc.com/news/video_and_audio/headlines) or Facebook, and I couldn't. In a real-world scenario, is it really possible to download audio/video using Python code?

Thanks & Regards
___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Download audios & videos using web scraping from news website or facebook
On 14/06/2019 07:35, Sijin John wrote:
> I am trying to Download audios & videos using web scraping from news website
> (eg: https://www.bbc.com/news/video_and_audio/headlines) or Facebook & I
> could't.
> So in real scenario is it really possible to download audios/videos using
> python code ?

Of course, just as it's possible to do it in any other language. In fact there are several specialist libraries available to make the task easier.

It may not be legal, however, and the web site may have taken steps to prevent you from succeeding, or at least to make it very difficult. But that has nothing to do with Python; it would be just as difficult in any language.

So, if you are having difficulty, the problem likely lies with:

1) your code and how you are using the tools.
2) the website you are scraping having anti-scraping measures in place.

But since you haven't shown us any code we can't really comment or make any suggestions.

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos
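For the simplest case, where a site exposes a plain media file at a known URL and its terms allow you to fetch it, the download itself needs only the standard library. The following is a minimal sketch under that assumption; the function name `download_file` and its parameters are illustrative, and it will not work against streaming players or pages that hide the file behind scripts.

```python
import shutil
import urllib.request


def download_file(url, destination, chunk_size=64 * 1024):
    """Stream the resource at `url` into the local file `destination`."""
    # urlopen() returns a file-like response object. copyfileobj() reads
    # it chunk by chunk, so a large video is never held in memory at once.
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out, chunk_size)
    return destination
```

A call would look like `download_file(media_url, "clip.mp4")`, where `media_url` is a direct link you are permitted to fetch. Finding that link on a real site is usually the hard part.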
Re: [Tutor] Download audios & videos using web scraping from news website or facebook
On Fri, Jun 14, 2019 at 11:35:53AM +0500, Sijin John wrote:
> I am trying to Download audios & videos using web scraping from news
> website (eg: https://www.bbc.com/news/video_and_audio/headlines) or
> Facebook & I could't. So in real scenario is it really possible to
> download audios/videos using python code ?

Please don't mistake "I don't know how to do this" for "this cannot be done".

https://youtube-dl.org/

Scraping websites, especially scraping them for videos, can be *very* complex.

--
Steven
[Tutor] os.is_file and os.is_dir missing from CPython 3.8.0b?
I'm trying to use os.is_dir, but I'm not finding it or os.is_file. What am I missing here? Python 3.8.0b1 (tags/v3.8.0b1:3b5deb01, Jun 13 2019, 22:28:20) [GCC 8.3.0] on linux Type "help", "copyright", "credits" or "license" for more information. :>>> import os :>>> print(os.__dict__.keys()) dict_keys(['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__file__', '__cached__', '__builtins__', 'abc', 'sys', 'st', '__all__', '_exists', '_get_exports_list', 'name', 'linesep', 'stat', 'access', 'ttyname', 'chdir', 'chmod', 'fchmod', 'chown', 'fchown', 'lchown', 'chroot', 'ctermid', 'getcwd', 'getcwdb', 'link', 'listdir', 'lstat', 'mkdir', 'nice', 'getpriority', 'setpriority', 'posix_spawn', 'posix_spawnp', 'readlink', 'copy_file_range', 'rename', 'replace', 'rmdir', 'symlink', 'system', 'umask', 'uname', 'unlink', 'remove', 'utime', 'times', 'execv', 'execve', 'fork', 'register_at_fork', 'sched_get_priority_max', 'sched_get_priority_min', 'sched_getparam', 'sched_getscheduler', 'sched_rr_get_interval', 'sched_setparam', 'sched_setscheduler', 'sched_yield', 'sched_setaffinity', 'sched_getaffinity', 'openpty', 'forkpty', 'getegid', 'geteuid', 'getgid', 'getgrouplist', 'getgroups', 'getpid', 'getpgrp', 'getppid', 'getuid', 'getlogin', 'kill', 'killpg', 'setuid', 'seteuid', 'setreuid', 'setgid', 'setegid', 'setregid', 'setgroups', 'initgroups', 'getpgid', 'setpgrp', 'wait', 'wait3', 'wait4', 'waitid', 'waitpid', 'getsid', 'setsid', 'setpgid', 'tcgetpgrp', 'tcsetpgrp', 'open', 'close', 'closerange', 'device_encoding', 'dup', 'dup2', 'lockf', 'lseek', 'read', 'readv', 'pread', 'preadv', 'write', 'writev', 'pwrite', 'pwritev', 'sendfile', 'fstat', 'isatty', 'pipe', 'pipe2', 'mkfifo', 'mknod', 'major', 'minor', 'makedev', 'ftruncate', 'truncate', 'posix_fallocate', 'posix_fadvise', 'putenv', 'unsetenv', 'strerror', 'fchdir', 'fsync', 'sync', 'fdatasync', 'WCOREDUMP', 'WIFCONTINUED', 'WIFSTOPPED', 'WIFSIGNALED', 'WIFEXITED', 'WEXITSTATUS', 'WTERMSIG', 'WSTOPSIG', 
'fstatvfs', 'statvfs', 'confstr', 'sysconf', 'fpathconf', 'pathconf', 'abort', 'getloadavg', 'urandom', 'setresuid', 'setresgid', 'getresuid', 'getresgid', 'getxattr', 'setxattr', 'removexattr', 'listxattr', 'get_terminal_size', 'cpu_count', 'get_inheritable', 'set_inheritable', 'get_blocking', 'set_blocking', 'scandir', 'fspath', 'getrandom', 'memfd_create', 'environ', 'F_OK', 'R_OK', 'W_OK', 'X_OK', 'NGROUPS_MAX', 'TMP_MAX', 'WCONTINUED', 'WNOHANG', 'WUNTRACED', 'O_RDONLY', 'O_WRONLY', 'O_RDWR', 'O_NDELAY', 'O_NONBLOCK', 'O_APPEND', 'O_DSYNC', 'O_RSYNC', 'O_SYNC', 'O_NOCTTY', 'O_CREAT', 'O_EXCL', 'O_TRUNC', 'O_LARGEFILE', 'O_PATH', 'O_TMPFILE', 'PRIO_PROCESS', 'PRIO_PGRP', 'PRIO_USER', 'O_CLOEXEC', 'O_ACCMODE', 'SEEK_HOLE', 'SEEK_DATA', 'O_ASYNC', 'O_DIRECT', 'O_DIRECTORY', 'O_NOFOLLOW', 'O_NOATIME', 'EX_OK', 'EX_USAGE', 'EX_DATAERR', 'EX_NOINPUT', 'EX_NOUSER', 'EX_NOHOST', 'EX_UNAVAILABLE', 'EX_SOFTWARE', 'EX_OSERR', 'EX_OSFILE', 'EX_CANTCREAT', 'EX_IOERR', 'EX_TEMPFAIL', 'EX_PROTOCOL', 'EX_NOPERM', 'EX_CONFIG', 'ST_RDONLY', 'ST_NOSUID', 'ST_NODEV', 'ST_NOEXEC', 'ST_SYNCHRONOUS', 'ST_MANDLOCK', 'ST_WRITE', 'ST_APPEND', 'ST_NOATIME', 'ST_NODIRATIME', 'ST_RELATIME', 'POSIX_FADV_NORMAL', 'POSIX_FADV_SEQUENTIAL', 'POSIX_FADV_RANDOM', 'POSIX_FADV_NOREUSE', 'POSIX_FADV_WILLNEED', 'POSIX_FADV_DONTNEED', 'P_PID', 'P_PGID', 'P_ALL', 'WEXITED', 'WNOWAIT', 'WSTOPPED', 'CLD_EXITED', 'CLD_DUMPED', 'CLD_TRAPPED', 'CLD_CONTINUED', 'F_LOCK', 'F_TLOCK', 'F_ULOCK', 'F_TEST', 'RWF_DSYNC', 'RWF_HIPRI', 'RWF_SYNC', 'RWF_NOWAIT', 'POSIX_SPAWN_OPEN', 'POSIX_SPAWN_CLOSE', 'POSIX_SPAWN_DUP2', 'SCHED_OTHER', 'SCHED_FIFO', 'SCHED_RR', 'SCHED_BATCH', 'SCHED_IDLE', 'SCHED_RESET_ON_FORK', 'XATTR_CREATE', 'XATTR_REPLACE', 'XATTR_SIZE_MAX', 'RTLD_LAZY', 'RTLD_NOW', 'RTLD_GLOBAL', 'RTLD_LOCAL', 'RTLD_NODELETE', 'RTLD_NOLOAD', 'RTLD_DEEPBIND', 'GRND_RANDOM', 'GRND_NONBLOCK', 'MFD_CLOEXEC', 'MFD_ALLOW_SEALING', 'MFD_HUGETLB', 'MFD_HUGE_SHIFT', 'MFD_HUGE_MASK', 'MFD_HUGE_64KB', 'MFD_HUGE_512KB', 
'MFD_HUGE_1MB', 'MFD_HUGE_2MB', 'MFD_HUGE_8MB', 'MFD_HUGE_16MB', 'MFD_HUGE_32MB', 'MFD_HUGE_256MB', 'MFD_HUGE_512MB', 'MFD_HUGE_1GB', 'MFD_HUGE_2GB', 'MFD_HUGE_16GB', 'pathconf_names', 'confstr_names', 'sysconf_names', 'error', 'waitid_result', 'stat_result', 'statvfs_result', 'sched_param', 'times_result', 'uname_result', 'terminal_size', 'DirEntry', '_exit', 'path', 'curdir', 'pardir', 'sep', 'pathsep', 'defpath', 'extsep', 'altsep', 'devnull', 'supports_dir_fd', 'supports_effective_ids', 'supports_fd', 'supports_follow_symlinks', 'SEEK_SET', 'SEEK_CUR', 'SEEK_END', 'makedirs', 'removedirs', 'renames', 'walk', 'fwalk', '_fwalk', 'execl', 'execle', 'execlp', 'execlpe', 'execvp', 'execvpe', '_execvpe', 'get_exec_path', 'MutableMapping', '_Environ', '_putenv', '_unsetenv', 'getenv', 'supports_bytes_environ', 'environb', 'getenvb', 'fsencode', 'fsdecode', 'P_WAIT', 'P_NOWAIT', 'P_NOWAITO', '_spawnvef', 'spawnv', 'spawnve
Re: [Tutor] os.is_file and os.is_dir missing from CPython 3.8.0b?
On 14/06/2019 15:53, Tom Hale wrote:
> I'm trying to use os.is_dir, but I'm not finding it or os.is_file.

I've never heard of these functions, but I'm still on v3.6, never having found a reason to upgrade. So I assume...

> Python 3.8.0b1 (tags/v3.8.0b1:3b5deb01, Jun 13 2019, 22:28:20)

...these are new introductions in 3.8? If so, how do they differ from the os.path.isfile() and os.path.isdir() functions that already exist? Could you use those as alternatives?

As to why the new functions aren't showing up, I've no idea, sorry.

--
Alan G
Re: [Tutor] os.is_file and os.is_dir missing from CPython 3.8.0b?
Tom Hale wrote:
> I'm trying to use os.is_dir, but I'm not finding it or os.is_file.
>
> What am I missing here?

Scroll up a bit in the documentation:

https://docs.python.org/3.8/library/os.html#os.DirEntry

Both is_file() and is_dir() are methods of the DirEntry object.

See also

https://docs.python.org/3.8/library/os.path.html#os.path.isfile
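To make the distinction concrete, here is a small sketch of how the DirEntry methods are reached through os.scandir(). The helper name `classify_entries` is made up for illustration; only os.scandir(), DirEntry.is_file(), DirEntry.is_dir() and DirEntry.name come from the os documentation.

```python
import os


def classify_entries(directory):
    """Return (files, dirs): sorted names of the entries in `directory`."""
    files, dirs = [], []
    # os.scandir() yields os.DirEntry objects. is_file() and is_dir()
    # are methods on each entry, not functions in the os module itself.
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                files.append(entry.name)
            elif entry.is_dir():
                dirs.append(entry.name)
    return sorted(files), sorted(dirs)
```

If you only have a path string rather than a DirEntry, os.path.isfile(path) and os.path.isdir(path) answer the same questions.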
Re: [Tutor] Download audios & videos using web scraping from news website or facebook
On 6/14/19 12:35 AM, Sijin John wrote:
> Hello Sir/Mam,
> I am trying to Download audios & videos using web scraping from news website
> (eg: https://www.bbc.com/news/video_and_audio/headlines) or Facebook & I
> could't. So in real scenario is it really possible to download audios/videos
> using python code ?

As others have pointed out, the question is more accurate if you remove "using python code" from it.

Modern media-serving websites often try quite hard to keep you from grabbing the media objects except on their terms. This often means they are delivered in a streaming manner through a (non-open) player app, and there never is a file at all that you're allowed to access directly. And there's often some wrapping which ends up delivering advertising, because that's probably how the site monetizes its content.

If you're expected to access it, there's usually an API for that, which you would use rather than scraping.

Obviously people figure out ways, as the youtube downloader shows.
[Tutor] deleting elements out of a list.
All,

I am not sure how to tackle this issue. I am using Windows 10 and Python 3.6 from ActiveState.

I have a list of x number of elements. Some of the elements have similar words in them. For example:

Dog food Pal
Dog Food Pal qx1323
Cat food kitty
Absolute cleaning inv123
Absolute Domestic cleaning inv 222
Absolute d
Fitness first 02/19
Fitness first

I wish to remove duplicates. I could use collections.Counter, but this fails because the strings are not unique; only some of the words are. My thinking below is only rough pseudocode, as I am not sure how to do this without causing traceback errors, and I wish to learn. I want to delete the strings that match a pattern from the list. Below is my attempt and I hope this makes sense.

    description = load_files()  # returns a list
    for text in description:
        words = text.split()
        # try each leading run of words as a candidate pattern
        for i in range(1, len(words) + 1):
            word = ' '.join(words[:i])
            print(word)
            answer = input('Keep word? ')
            if answer == 'n':
                continue
            for j, v in enumerate(description):
                if word in v:
                    description.pop(j)

The initial issue I see with the above is that popping an element from the description list while looping over it will cause an error. If I copy the description list into a new list and use the new list for the outer loop, I will receive multiple occurrences of the same text. This could be addressed by an if test, but I am wondering if there is a better method. Second code example:

    description = load_files()  # returns a list
    search_txt = description.copy()  # I have not verified whether this is the right syntax for the copy method.
    for text in search_txt:
        words = text.split()
        for i in range(1, len(words) + 1):
            word = ' '.join(words[:i])
            print(word)
            answer = input('Keep word (ynq)? ')
            if answer == 'n':
                continue
            elif answer == 'q':
                break
            for j, v in enumerate(description):
                if word in v:
                    description.pop(j)

Any improvements?

Sean
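One common way around the problem described above is to build a new list instead of popping from the list being looped over. The sketch below assumes a case-insensitive substring match is what is wanted; the name `remove_matching` and the sample data are illustrative, not taken from Sean's load_files().

```python
def remove_matching(items, pattern):
    """Return a new list without the items that contain `pattern`."""
    needle = pattern.casefold()
    # Building a new list avoids popping from a list while looping over
    # it, which makes the loop skip the element after each removal.
    return [item for item in items if needle not in item.casefold()]


descriptions = [
    "Dog food Pal",
    "Dog Food Pal qx1323",
    "Cat food kitty",
    "Fitness first 02/19",
    "Fitness first",
]
remaining = remove_matching(descriptions, "dog food")
print(remaining)
```

The original list is left untouched, so the outer loop can keep iterating over it safely while the filtered result is stored under another name.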
[Tutor] Differences between while and for
All,

In C, Perl and other languages, while only uses a conditional statement and for uses an iteration. In Python, while and for seem to be the same and I cannot see the difference. Python does not have an until (do while) loop, where the test is done at the end of the loop, permitting at least one pass through the loop block.

Am I correct, or is there a difference, and if so what is it? Why doesn't Python have an until statement?
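As a rough illustration of the question: in Python, for walks the items of an iterable, while repeats as long as a condition holds, and a test-at-the-end loop is usually written as `while True` with a `break`. The function names below are made up for the example.

```python
def collect_with_for(values):
    """for: visit each item of an iterable, no condition to maintain."""
    seen = []
    for value in values:
        seen.append(value)
    return seen


def halve_with_while(number):
    """while: the condition is tested before every pass, so the body
    may run zero times."""
    steps = []
    while number > 1:
        number //= 2
        steps.append(number)
    return steps


def halve_at_least_once(number):
    """Emulated do/while: the body runs first, the test comes last."""
    steps = []
    while True:
        number //= 2
        steps.append(number)
        if number <= 1:
            break
    return steps
```

With an input of 1, halve_with_while() never enters its loop, while halve_at_least_once() still makes one pass. That single guaranteed pass is the behaviour an until or do-while statement would provide.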