I think the requirement, as stated, was that old versions will be kept,
which is consistent with Cassandra and the LSM data model - it would avoid
the need to compact the actual chunked blob data.
Throughput mostly comes down to adequately provisioning your cluster.
-- Jack Krupansky
On Wed,
The answer to this question depends very much on the throughput, desired
latency and access patterns (R/W or R/O). In general, what I have seen
work for high-throughput environments is to either use a distributed file
system like Ceph/Gluster or an object store like S3 and keep the pointer
in the database.
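A minimal sketch of that "bytes in the object store, pointer in Cassandra"
pattern, using the DataStax Python driver; the keyspace, table and column
names are made up for illustration:

    # Sketch only: the blob bytes live in S3/Ceph/Gluster; Cassandra keeps
    # the pointer plus a little metadata. Names below are hypothetical.
    from cassandra.cluster import Cluster

    session = Cluster(['127.0.0.1']).connect('blobstore')  # keyspace assumed to exist

    # object_url is the pointer into the external store, e.g. s3://bucket/key
    session.execute("""
        CREATE TABLE IF NOT EXISTS blob_index (
            blob_id      text PRIMARY KEY,
            content_type text,
            size_bytes   bigint,
            object_url   text
        )""")

    insert = session.prepare(
        "INSERT INTO blob_index (blob_id, content_type, size_bytes, object_url) "
        "VALUES (?, ?, ?, ?)")
    session.execute(insert, ("site/logo.svg", "image/svg+xml", 5120,
                             "s3://some-bucket/site/logo.svg"))

Reads first hit the index row, then fetch the bytes from the object store;
Cassandra only ever sees small rows.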
There's also the 'support' issue... C* is hard enough as it is... maybe you
can bring in another system like ES or HDFS, but the more you bring in, the
more your complexity REALLY goes through the roof.
Better to keep things simple.
I really like the chunking idea for C*... seems like an easy way to
On Tue, Jan 19, 2016 at 2:07 PM, Richard L. Burton III wrote:
> I would ask: why do this over, say, HDFS, S3, etc.? It seems like this
> problem has been solved with other solutions that are specifically
> designed for blob storage.
>
HDFS's default block size is 64 MB. If you are storing objects smaller
Just adding one more item to the discussion. I believe this was announced
on the list some time ago. I haven't tried it out yet, just pointing it out
since it's on the OP's topic:
http://pithos.io/
It's a Cassandra-backed object store using an S3 API.
On Tue, Jan 19, 2016 at 2:07 PM, Richard L. Burton III wrote:
I would ask: why do this over, say, HDFS, S3, etc.? It seems like this
problem has been solved with other solutions that are specifically designed
for blob storage.
On Tue, Jan 19, 2016 at 4:23 PM, wrote:
> I recently started noodling with this concept and built a working blob
> storage service using n
I recently started noodling with this concept and built a working blob
storage service using node.js and C*. I set up a basic web server using
Express where you could POST binary files to the server; they would get
chunked and assigned to a user and bucket, in the spirit of S3.
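Not the poster's actual code, but a small sketch (in Python rather than
node.js) of what that upload path might look like, with an assumed table
keyed by owner/bucket/name and clustered by chunk index:

    # Hypothetical S3-flavoured layout: a blob is addressed by
    # (owner, bucket, name) and its bytes are spread over chunk rows.
    from cassandra.cluster import Cluster

    CHUNK_SIZE = 64 * 1024  # arbitrary choice for the sketch

    session = Cluster(['127.0.0.1']).connect('blobstore')  # keyspace assumed to exist
    session.execute("""
        CREATE TABLE IF NOT EXISTS blob_chunks (
            owner     text,
            bucket    text,
            name      text,
            chunk_idx int,
            data      blob,
            PRIMARY KEY ((owner, bucket, name), chunk_idx)
        )""")
    insert = session.prepare(
        "INSERT INTO blob_chunks (owner, bucket, name, chunk_idx, data) "
        "VALUES (?, ?, ?, ?, ?)")

    def put_blob(owner, bucket, name, payload):
        """Split the POSTed body (bytes) into fixed-size chunks and write them."""
        for i in range(0, len(payload), CHUNK_SIZE):
            session.execute(insert, (owner, bucket, name, i // CHUNK_SIZE,
                                     payload[i:i + CHUNK_SIZE]))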
Lots of interesting feedback... I like the idea of chunking the IO into
pages... it would require more thinking, but I could even do Cassandra async
IO and async HTTP to serve the data and then use HTTP chunks for each
range.
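A rough sketch of the "HTTP chunks for each range" part: for a requested
byte range, only the chunk rows that overlap it need to be fetched. The
chunk table layout and the 64 KB chunk size are assumptions, not something
from the thread:

    # Sketch: serve an HTTP byte range by reading only the overlapping chunks.
    # Assumes a table: chunks(blob_id text, chunk_idx int, data blob,
    #                         PRIMARY KEY (blob_id, chunk_idx))
    from cassandra.cluster import Cluster

    CHUNK_SIZE = 64 * 1024  # must match whatever the writer used

    session = Cluster(['127.0.0.1']).connect('blobstore')
    select = session.prepare(
        "SELECT chunk_idx, data FROM chunks "
        "WHERE blob_id = ? AND chunk_idx >= ? AND chunk_idx <= ?")

    def read_range(blob_id, start, end):
        """Return bytes [start, end] inclusive, as in an HTTP Range header."""
        first, last = start // CHUNK_SIZE, end // CHUNK_SIZE
        buf = bytearray()
        for row in session.execute(select, (blob_id, first, last)):
            buf.extend(row.data)  # rows arrive in chunk_idx order
        offset = start - first * CHUNK_SIZE  # trim the partial edge chunks
        return bytes(buf[offset:offset + (end - start + 1)])

The async variant would just swap execute() for execute_async() and stitch
the futures together.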
On Tue, Jan 19, 2016 at 10:47 AM, Robert Coli wrote:
> On Mon, Jan 18, 2016 at 6:52 PM, Kevin Burton wrote:
On Mon, Jan 18, 2016 at 6:52 PM, Kevin Burton wrote:
> Internally we have the need for a blob store for web content. It's MOSTLY
> key/value based, but we'd like to have lookups by coarse-grained tags.
>
I know you know how to operate and scale MySQL, so I suggest MogileFS for
the actual blob
On Mon, Jan 18, 2016 at 8:52 PM, Kevin Burton wrote:
> Internally we have the need for a blob store for web content. It's MOSTLY
> key/value based, but we'd like to have lookups by coarse-grained tags.
>
> This needs to store normal web content like HTML, CSS, JPEG, SVG, etc.
>
> Highly doubt
There is also an excellent tutorial video done by Patrick McFadin and Aaron
Morton on the subject of data modeling for storing images in Cassandra:
http://youtu.be/gk-B75xgFUg
I guess it can be adapted to store binary objects other than images.
On Tue, Jan 19, 2016 at 6:37 AM, Jack Krupansky wrote:
Chunk the blobs and store them in a separate table from the metadata.
Here's an old attempt at a chunked object store, for reference:
https://github.com/Netflix/astyanax/wiki/Chunked-Object-Store
Picking an appropriate chunk size may be key (or not). Somewhere between 8K
and 512K, I would guess,
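To make the "separate table from the metadata" idea concrete, here is one
way it could look (a sketch with invented names, not the Astyanax layout;
64 KB is just a placeholder inside the suggested 8K-512K window):

    # Sketch: metadata and chunk data live in separate tables.
    # All names are illustrative; the 'blobstore' keyspace is assumed to exist.
    from cassandra.cluster import Cluster

    CHUNK_SIZE = 64 * 1024  # somewhere in the suggested 8K-512K range

    session = Cluster(['127.0.0.1']).connect('blobstore')
    session.execute("""
        CREATE TABLE IF NOT EXISTS blob_meta (
            blob_id      text PRIMARY KEY,
            content_type text,
            size_bytes   bigint,
            chunk_count  int
        )""")
    session.execute("""
        CREATE TABLE IF NOT EXISTS blob_data (
            blob_id   text,
            chunk_idx int,
            data      blob,
            PRIMARY KEY (blob_id, chunk_idx)
        )""")

    def get_blob(blob_id):
        """Read the metadata row, then reassemble the chunks in clustering order."""
        meta = session.execute(
            "SELECT content_type, size_bytes FROM blob_meta WHERE blob_id = %s",
            (blob_id,)).one()
        if meta is None:
            return None
        chunks = session.execute(
            "SELECT data FROM blob_data WHERE blob_id = %s", (blob_id,))
        return meta.content_type, b"".join(row.data for row in chunks)

Keeping the metadata row small means listing or inspecting a blob never has
to touch its chunk partition.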
Internally we have the need for a blob store for web content. It's MOSTLY
key/value based, but we'd like to have lookups by coarse-grained tags.
This needs to store normal web content like HTML, CSS, JPEG, SVG, etc.
Highly doubt that anything over 5MB would need to be stored.
We also need the
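For the coarse-grained tag lookups mentioned above, one common approach is
to denormalize into a lookup table keyed by tag (names here are made up;
a secondary index would be the other obvious option):

    # Sketch: denormalized tag -> blob lookup, written alongside the blob.
    # Table and column names are hypothetical.
    from cassandra.cluster import Cluster

    session = Cluster(['127.0.0.1']).connect('blobstore')
    session.execute("""
        CREATE TABLE IF NOT EXISTS blobs_by_tag (
            tag     text,
            blob_id text,
            PRIMARY KEY (tag, blob_id)
        )""")
    tag_insert = session.prepare(
        "INSERT INTO blobs_by_tag (tag, blob_id) VALUES (?, ?)")

    def tag_blob(blob_id, tags):
        """Record each coarse-grained tag so blobs can be listed per tag."""
        for tag in tags:
            session.execute(tag_insert, (tag, blob_id))

    def blobs_for_tag(tag):
        """List blob ids carrying a given tag (one partition per tag)."""
        return [row.blob_id for row in session.execute(
            "SELECT blob_id FROM blobs_by_tag WHERE tag = %s", (tag,))]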