[julia-users] Re: Parallel file access

Zachary Roth Mon, 17 Oct 2016 10:08:20 -0700

Thanks for the responses.

Ralph, thank you again.  I very much appreciate your "humble offering". 
 I'll take a further look into your gist.


Steven, I'm happy to use the right tool for the job...so long as I have an 
idea of what it is.  Would you care to offer more insights or suggestions 
for the ill-informed (such as myself)?

---Zachary



On Sunday, October 16, 2016 at 7:51:19 AM UTC-4, Steven Sagaert wrote:
>
> That's because SQLite isn't a multi-user DB server but a single-user 
> embedded (desktop) DB. Use the right tool for the job.
>
> On Saturday, October 15, 2016 at 7:02:58 PM UTC+2, Ralph Smith wrote:
>>
>> How are the processes supposed to interact with the database?  Without 
>> extra synchronization logic, SQLite.jl gives (occasionally)
>> ERROR: LoadError: On worker 2:
>> SQLite.SQLiteException("database is locked")
>> which on the face of it suggests that all workers are using the same 
>> connection, although I opened the DB separately in each process.
>> (I think we should get "busy" instead of "locked", but then still have no 
>> good way to test for this and wait for a wake-up signal.)
>> So we seem to be at least as badly off as the original post, except with 
>> DB calls instead of simple writes.
>>
>> We shouldn't have to stand up a separate multithreaded DB server just for 
>> this. Would you be kind enough to give us an example of simple (i.e. not 
>> client-server) multiprocess DB access in Julia?
>>
>> On Saturday, October 15, 2016 at 9:40:17 AM UTC-4, Steven Sagaert wrote:
>>>
>>> It still surprises me how in the scientific computing field people still 
>>> refuse to learn about databases and then replicate database functionality 
>>> in files in a complicated and probably buggy way. HDF5  is one example, 
>>> there are many others. If you want to do fancy search (i.e. speedup search 
>>> via indices) or do things like parallel writes/concurrency you REALLY 
>>> should use databases. That's what they were invented for decades ago. 
>>> Nowadays there's a bigger choice than ever: Relational or non-relational 
>>> (NOSQL), single host or distributed, web interface or not,  disk-based or 
>>> in-memory,... There really is no excuse anymore not to use a database if 
>>> you want to go beyond just reading in a bunch of data in one go in memory.
>>>
>>> On Monday, October 10, 2016 at 5:09:39 PM UTC+2, Zachary Roth wrote:
>>>>
>>>> Hi, everyone,
>>>>
>>>> I'm trying to save to a single file from multiple worker processes, but 
>>>> don't know of a nice way to coordinate this.  When I don't coordinate, 
>>>> saving works fine much of the time.  But I sometimes get errors with 
>>>> reading/writing of files, which I'm assuming is happening because multiple 
>>>> processes are trying to use the same file simultaneously.
>>>>
>>>> I tried to coordinate this with a queue/channel of `Condition`s managed 
>>>> by a task running in process 1, but this isn't working for me.  I've tried 
>>>> to simplify this to track down the problem.  At least part of the issue 
>>>> seems to be writing to the channel from process 2.  Specifically, when I 
>>>> `put!` something onto a channel (or `push!` onto an array) from process 2, 
>>>> the channel/array is still empty back on process 1.  I feel like I'm 
>>>> missing something simple.  Is there an easier way to go about coordinating 
>>>> multiple processes that are trying to access the same file?  If not, does 
>>>> anyone have any tips?
>>>>
>>>> Thanks for any help you can offer.
>>>>
>>>> Cheers,
>>>> ---Zachary
>>>>
>>>
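
On the original question about `put!` from process 2 leaving the channel empty on process 1: a plain `Channel` (or array) is not shared between processes, so what a worker modifies is at best its own copy. A `RemoteChannel` is a handle to a single channel that lives on one process. The following is a minimal sketch of that idea, not code posted in this thread. It assumes a current Julia in which these functions come from the `Distributed` standard library (in the 0.5 era they were in Base). The names `produce_line`, `write_all`, `run_demo`, the worker count, and the temporary output file are all hypothetical.

```julia
using Distributed

# Assumption: start two local workers if none are running yet.
nprocs() == 1 && addprocs(2)
@everywhere using Distributed

# Hypothetical worker function: build one line of output and send it
# to the channel that lives on process 1.
@everywhere function produce_line(ch, i)
    put!(ch, "item $i from worker $(myid())")
    return nothing
end

# Single writer: the only code that ever touches the file.
function write_all(ch, path, n)
    open(path, "w") do io
        for _ in 1:n
            println(io, take!(ch))
        end
    end
end

function run_demo(n)
    # The RemoteChannel wraps a Channel owned by process 1; workers hold
    # only a reference, so put! on a worker is visible to take! on process 1.
    results = RemoteChannel(() -> Channel{String}(32), 1)
    path = joinpath(mktempdir(), "out.txt")   # hypothetical output file
    writer = @async write_all(results, path, n)
    pmap(i -> produce_line(results, i), 1:n)  # fan the work out to workers
    wait(writer)                              # all n lines have been written
    return readlines(path)
end

lines = run_demo(8)
```

With this layout no locking is needed for the file itself, because only the writer task on process 1 opens it; the workers never see the file, only the channel. The order of lines in the file depends on which worker finishes first.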
