Re: Seeking advice on Schema and Caching

2011-11-15 Thread Aditya Narayan
Regarding the first option that you suggested through composite columns, can I store the username & id both in the column name and keep the column valueless? Will I be able to retrieve both the username and id from the composite col name ? Thanks a lot On Wed, Nov 16, 2011 at 10:56 AM, Ad
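To make the question concrete, here is a minimal in-memory sketch of a row whose column names are (username, id) pairs and whose values are empty. It is plain Python, not a Cassandra client; the `CompositeRow` class and its method names are hypothetical. It only illustrates that when both parts live in the column name, both come back from a slice over the names.

```python
import bisect


class CompositeRow:
    """One wide row whose column names are (username, user_id) tuples.

    Column values are left empty: all information is carried by the name.
    This is an illustrative stand-in, not a real Cassandra API.
    """

    def __init__(self):
        self._names = []  # kept sorted, like column names within a row

    def insert(self, username, user_id):
        name = (username, user_id)
        i = bisect.bisect_left(self._names, name)
        if i == len(self._names) or self._names[i] != name:
            self._names.insert(i, name)

    def slice_by_username_prefix(self, prefix):
        # Jump to the first column name >= (prefix,) and read forward
        # until the username part no longer matches the prefix.
        start = bisect.bisect_left(self._names, (prefix,))
        result = []
        for name in self._names[start:]:
            if not name[0].startswith(prefix):
                break
            result.append(name)
        return result


row = CompositeRow()
row.insert("marcos", 12)
row.insert("maria", 7)
row.insert("marcos", 3)
row.insert("adam", 1)
print(row.slice_by_username_prefix("mar"))
```

Each returned tuple carries both the username and the id, so nothing needs to be stored in the column value.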

Re: Seeking advice on Schema and Caching

2011-11-15 Thread Aditya Narayan
Got the first option that you suggested. However, in the second one, are you suggesting to use, e.g., key='Marcos' & store cols, for all users of that name, containing the userId inside that row? That way it would have to read multiple rows while the user is doing a single search. On Wed, Nov 16, 2011

Re: Seeking advice on Schema and Caching

2011-11-15 Thread Aditya Narayan
onsidered using > apache solr - you could then include just the row keys pointing back > to Cassandra where the actual data is. > > Solr seems quite capable of performing google like searches and is fast. > > > > Cheers > Ben > > On 16/11/2011, at 1:50 AM, Adity

Re: Seeking advice on Schema and Caching

2011-11-15 Thread Aditya Narayan
Any insights on this ? On Tue, Nov 15, 2011 at 9:40 PM, Quintero wrote: > > > Aditya Narayan wrote: > > >Hi > > > >I need to add 'search users' functionality to my application. (The trigger > >for fetching searched items(like google instant search

Seeking advice on Schema and Caching

2011-11-15 Thread Aditya Narayan
Hi I need to add 'search users' functionality to my application. (The trigger for fetching searched items (like Google instant search) fires when 3 letters have been typed in.) For this, I make a CF with String type keys. Each such key is made of the first 3 letters of a user's name. Thus all names
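The scheme described above can be sketched roughly as follows. This is an in-memory illustration only, under the assumption that every name sharing the same first 3 letters is stored under one row key; the class and function names are hypothetical, and lowercasing the key is an added assumption, not something stated in the post.

```python
from collections import defaultdict

PREFIX_LEN = 3  # the search is triggered once 3 letters have been typed


def row_key(name):
    """Row key = first 3 letters of the name (lowercasing is an assumption)."""
    return name[:PREFIX_LEN].lower()


class UserSearchIndex:
    def __init__(self):
        # row key -> {user_id: full name}; stands in for one wide row per prefix
        self._rows = defaultdict(dict)

    def add_user(self, user_id, name):
        self._rows[row_key(name)][user_id] = name

    def search(self, typed):
        # Nothing is fetched until the minimum number of letters is typed.
        if len(typed) < PREFIX_LEN:
            return []
        row = self._rows.get(row_key(typed), {})
        wanted = typed.lower()
        matches = [(name, uid) for uid, name in row.items()
                   if name.lower().startswith(wanted)]
        return sorted(matches)


index = UserSearchIndex()
index.add_user(1, "Marcos")
index.add_user(2, "Maria")
index.add_user(3, "Mark")
index.add_user(4, "Adam")
print(index.search("mar"))
print(index.search("Mari"))
```

Every search reads exactly one row, and letters typed after the third only narrow the result on the client side.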

Re: Store profile pics of users in Cassandra or file system ?

2011-11-11 Thread Aditya Narayan
just forgot to add the paper link if this is useful at all: To BLOB or Not To BLOB: Large Object Storage in a Database or a Filesystem <http://research.microsoft.com/apps/pubs/default.aspx?id=64525> On Sat, Nov 12, 2011 at 12:34 AM, Aditya Narayan wrote: > Would it be recommended to

Store profile pics of users in Cassandra or file system ?

2011-11-11 Thread Aditya Narayan
Would it be recommended to store the profile pics of users on an application in Cassandra ? Or would the file system be a better way to go? I came across an interesting paper which advocates storing in the DB for blobs sized up to 1 MB. I was planning to store the image bytes in the same row that contained

Re: Concatenating ids with extension to keep multiple rows related to an entity in a single CF

2011-11-03 Thread Aditya Narayan
10:11 AM, Tyler Hobbs wrote: > On Thu, Nov 3, 2011 at 3:48 PM, Aditya Narayan wrote: > >> I am concatenating two Integer ids through bitwise operations(as >> described below) to create a single primary key of type long. I wanted to >> know if this is a good practice. This w

Concatenating ids with extension to keep multiple rows related to an entity in a single CF

2011-11-03 Thread Aditya Narayan
I am concatenating two Integer ids through bitwise operations(as described below) to create a single primary key of type long. I wanted to know if this is a good practice. This would help me in keeping multiple rows of an entity in a single column family by appending different extensions to the en
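A minimal sketch of this kind of key packing, assuming both ids are non-negative and that the entity id goes in the high 32 bits with the extension in the low 32 bits (the post does not say which half holds which). The function names are hypothetical. The entity id is capped at 31 bits here so the result would also fit in a signed 64-bit long.

```python
from typing import Tuple

LOW_MASK = 0xFFFFFFFF        # low 32 bits hold the extension
MAX_ENTITY_ID = 0x7FFFFFFF   # keeps the packed key within a signed 64-bit range


def pack_key(entity_id: int, extension: int) -> int:
    """Combine two integer ids into one 64-bit key: [entity_id | extension]."""
    if not 0 <= entity_id <= MAX_ENTITY_ID:
        raise ValueError("entity_id must be in 0..2**31-1")
    if not 0 <= extension <= LOW_MASK:
        raise ValueError("extension must be in 0..2**32-1")
    return (entity_id << 32) | extension


def unpack_key(key: int) -> Tuple[int, int]:
    """Recover (entity_id, extension) from a packed key."""
    return key >> 32, key & LOW_MASK


overview_key = pack_key(7, 1)   # e.g. extension 1 for one kind of row
details_key = pack_key(7, 2)    # e.g. extension 2 for another kind of row
print(unpack_key(overview_key), unpack_key(details_key))
```

All rows of one entity share the same high 32 bits, so the row keys for an entity can be computed from its id and the known extensions without any lookup.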

Re: Cassandra Cluster Admin - phpMyAdmin for Cassandra

2011-11-01 Thread Aditya Narayan
Yes, that would be a pretty nice feature to see! On Mon, Oct 31, 2011 at 10:45 PM, Ertio Lew wrote: > Thanks so much SebWajam for this great piece of work! > > Is there a way to set a data type for displaying the column names/ values > of a CF ? It seems that your project always uses String Seri

Re: Programmatically allow only one out of two types of rows in a CF to enter the CACHE

2011-10-29 Thread Aditya Narayan
> > On Sat, Oct 29, 2011 at 2:21 PM, Mohit Anchlia > wrote: > > On Sat, Oct 29, 2011 at 11:23 AM, Aditya Narayan > wrote: > >> @Mohit: > >> I have stated the example scenarios in my first post under this heading. > >> Also I have stated above why I

Re: Programmatically allow only one out of two types of rows in a CF to enter the CACHE

2011-10-29 Thread Aditya Narayan
do you mean exactly by "indexing some of the higher levels of data" ? Thank you guys! > Anthony > > > On 28/10/2011, at 21:42 PM, Aditya Narayan wrote: > > > I need to keep the data of some entities in a single CF but split in two > rows for each entity. On

Re: Programmatically allow only one out of two types of rows in a CF to enter the CACHE

2011-10-29 Thread Aditya Narayan
..so that I can retrieve them through a single query. For reading cols from two CFs you need two queries, right ? On Sat, Oct 29, 2011 at 9:53 PM, Mohit Anchlia wrote: > Why not use 2 CFs? > > On Fri, Oct 28, 2011 at 9:42 PM, Aditya Narayan wrote: > > I need to keep t

Programmatically allow only one out of two types of rows in a CF to enter the CACHE

2011-10-28 Thread Aditya Narayan
I need to keep the data of some entities in a single CF but split in two rows for each entity. One row contains overview information for the entity & another row contains detailed information about the entity. I want to keep both rows in a single CF so they may be retrieved in a single query whe

Re: Storing counters in the standard column families along with non-counter columns ?

2011-07-14 Thread Aditya Narayan
Thanks Aaron & Chris, I appreciate your help. With a dedicated CF for counters, in addition to the issue pointed out by Chris, the major drawback I see is that I can't read the counters together with the row's regular columns *in a single query*, which is widely required by my application. My use case is like storin

Re: Storing counters in the standard column families along with non-counter columns ?

2011-07-11 Thread Aditya Narayan
Oops, that's really disheartening and it could seriously impact our plans for going live in the near future. Without this facility I guess counters currently have very little usefulness. On Mon, Jul 11, 2011 at 8:16 PM, Chris Burroughs wrote: > On 07/10/2011 01:09 PM, Aditya Naray

Re: Storing counters in the standard column families along with non-counter columns ?

2011-07-10 Thread Aditya Narayan
ff, where as normal CF simply just add or > replace. > > > On Sun, Jul 10, 2011 at 10:39 PM, Aditya Narayan wrote: > >> Thanks for info. >> >> Is there any target version in near future for which this has been >> promised ? >> >> >> On Sun, Jul

Re: Storing counters in the standard column families along with non-counter columns ?

2011-07-10 Thread Aditya Narayan
ved ... > > https://issues.apache.org/jira/browse/CASSANDRA-2614 > > -sd > > On Sun, Jul 10, 2011 at 5:04 PM, Aditya Narayan wrote: > > Is it now possible to store counters in the standard column families > along > > with non counter type columns ? How to achieve this ? >

Storing counters in the standard column families along with non-counter columns ?

2011-07-10 Thread Aditya Narayan
Is it now possible to store counters in the standard column families along with non counter type columns ? How to achieve this ?

Re: Design for 'Most viewed Discussions' in a forum

2011-05-18 Thread Aditya Narayan
1 > > Then you just query in the following way : > > MGET <http://redis.io/commands/mget> topics:*:timestampN > > * is the wildcard, you order by viewcount and you have what you are asking > for ! > This is a simplified version of what you should do but personally I r

Re: Design for 'Most viewed Discussions' in a forum

2011-05-18 Thread Aditya Narayan
help minimize several versions of the same column in the row parts in different SSTables. On Wed, May 18, 2011 at 11:04 PM, Aditya Narayan wrote: > * > For a discussions forum, I need to show a page of most viewed discussions. > > For implementing this, I maintain a count

Design for 'Most viewed Discussions' in a forum

2011-05-18 Thread Aditya Narayan
For a discussions forum, I need to show a page of most viewed discussions. For implementing this, I maintain a count of views of a discussion & when this view count passes a certain threshold limit, the discussion Id is added to a row of most viewed discussions. Thi
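The threshold idea above can be sketched in a few lines. This is an in-memory illustration, not a Cassandra schema: the class name, the threshold value, and the choice to treat "passes the threshold" as "reaches or exceeds it" are all assumptions made for the example.

```python
VIEW_THRESHOLD = 100  # hypothetical value; the post does not give one


class MostViewedTracker:
    def __init__(self, threshold=VIEW_THRESHOLD):
        self.threshold = threshold
        self.view_counts = {}   # discussion id -> total views
        self.most_viewed = {}   # the "most viewed" row: discussion id -> views

    def record_view(self, discussion_id):
        count = self.view_counts.get(discussion_id, 0) + 1
        self.view_counts[discussion_id] = count
        # Once the count reaches the threshold, the discussion is added to
        # (or refreshed in) the most-viewed row.
        if count >= self.threshold:
            self.most_viewed[discussion_id] = count
        return count

    def top(self, n):
        # Highest view count first; ties broken by discussion id.
        ranked = sorted(self.most_viewed.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:n]


tracker = MostViewedTracker(threshold=3)
for discussion_id, views in (("a", 3), ("b", 2), ("c", 5)):
    for _ in range(views):
        tracker.record_view(discussion_id)
print(tracker.top(10))
```

Only discussions that crossed the threshold ever enter the most-viewed row, so that row stays much smaller than the full set of discussions.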

Re: Does the memtable replace the old version of column with the new overwriting version or is it just a simple append ?

2011-03-08 Thread Aditya Narayan
d the reconciliation of that happens > during read (read repair). This is why reads are slower than writes because > conflict resolution happens during read. > > Hope this answers the question! > > Thanks, > -Naren > > On Tue, Mar 8, 2011 at 10:44 PM, Aditya Narayan wrote:

Does the memtable replace the old version of column with the new overwriting version or is it just a simple append ?

2011-03-08 Thread Aditya Narayan
, since Cassandra will have to read so many versions of the same column. If this is just replacement with old column then I guess read will be much better since it needs to see just single existing version of column. Thanks Aditya Narayan

Re: Splitting the data of a single blog into 2 CFs (to implement effective caching) according to views.

2011-03-08 Thread Aditya Narayan
CF2 as well (use a batch_mutation > through whatever client you have). So when serving the second page you only > need to read one row from CF2. > > > Aaron > > On 8/03/2011, at 8:13 PM, Norman Maurer wrote: > > Yeah this make sense as far as I can tell. > > > Bye,

Splitting the data of a single blog into 2 CFs (to implement effective caching) according to views.

2011-03-07 Thread Aditya Narayan
My application displays a list of several blogs' overview data (like blogTitle/ nameOfBlogger/ shortDescription for each blog) on the 1st page (much like Digg's newsfeed) and when the user selects a particular blog to see, the application takes him to that specific blog's full pag

Re: What would be a good strategy for Storing the large text contents like blog posts in Cassandra.

2011-03-06 Thread Aditya Narayan
> try one and be prepared to change. > > Note that counters are only in the 0.8 trunk and are still under development, > they are not going to be released for a couple of months. > > Your per column data size is nothing to be concerned about. > > Hope that helps. > Aaron >

What would be a good strategy for Storing the large text contents like blog posts in Cassandra.

2011-03-06 Thread Aditya Narayan
What would be a good strategy to store large text content (blog posts of around 1500-3000 characters) in Cassandra? I need to store these blog posts along with their metadata like bloggerId, blogTags. I am planning to store this data in a single row, giving each attribute a single column. So

Re: Splitting a single row into multiple

2011-02-23 Thread Aditya Narayan
so a > single row read gets what you need. > > Aaron > > On 24/02/2011, at 5:59 AM, Aditya Narayan wrote: > >> Does it make any difference if I split a row, that needs to be >> accessed together, into two or three rows and then read those multiple >> rows

Splitting a single row into multiple

2011-02-23 Thread Aditya Narayan
Does it make any difference if I split a row, that needs to be accessed together, into two or three rows and then read those multiple rows? (Assume the keys of all the three rows are known to me programmatically since I split columns by certain categories). Would the performance be any better if a

Re: Confused about get_slice SliceRange behavior with bloom filter

2011-02-14 Thread Aditya Narayan
Thanks for the clarifications.. On Mon, Feb 14, 2011 at 6:13 PM, Sylvain Lebresne wrote: > On Mon, Feb 14, 2011 at 11:27 AM, Aditya Narayan wrote: > >> Thanks Sylvain, >> >> I guess I might have misunderstood the meaning of column_index_size_in_kb, >> My previou

Re: Confused about get_slice SliceRange behavior with bloom filter

2011-02-14 Thread Aditya Narayan
to be sequential on disk). So if the columns you ask for are > really randomly distributed, then yes, the biggest the row is, the biggest > the chance is to have to hit many blocks and the biggest the chance is for > these block to be far apart on disk. > > -- > Sylvain > > On

Re: Confused about get_slice SliceRange behavior with bloom filter

2011-02-13 Thread Aditya Narayan
Jonathan, If I ask for around 150-200 columns (totally random, not sequential) from a very wide row that contains a million or more columns, is the read performance of the SliceQuery operation affected by, or does it "depend on the length of the row"? (For my use case, I would use the

Re: Merging the rows of two column families(with similar attributes) into one ??

2011-02-12 Thread Aditya Narayan
Any comments/viewpoints on this? --On Sat, Feb 12, 2011 at 5:05 PM, Aditya Narayan wrote: What if the caching requirements and sorting needs of two kinds of data are very similar, is it preferable to go with a single CF in those cases ? Regards Aditya > > >>> On Sat, Fe

Re: Merging the rows of two column families(with similar attributes) into one ??

2011-02-12 Thread Aditya Narayan
What if the caching requirements and sorting needs of two kinds of data are very similar, is it preferable to go with a single CF in those cases ? Regards Aditya >>> On Sat, Feb 5, 2011 at 10:43 AM, Tyler Hobbs wrote: >> >> I read somewhere that more no of column families is not a good

Re: Calculating the size of rows in KBs

2011-02-10 Thread Aditya Narayan
t may then be partially or fully read from > disk during subsequent reads or compactions. > > On disk format described here may help > http://wiki.apache.org/cassandra/ArchitectureSSTable > > Hope that helps > Aaron > On 10/02/2011, at 11:56 PM, Aditya Narayan wrote: >

Calculating the size of rows in KBs

2011-02-10 Thread Aditya Narayan
How can I get or calculate the size of rows/columns? What are the memory overheads for each column/row?

Re: Does variation in no of columns in rows over the column family has any performance impact ?

2011-02-07 Thread Aditya Narayan
Thanks for the detailed explanation Peter! Definitely cleared my doubts ! On Mon, Feb 7, 2011 at 1:52 PM, Peter Schuller wrote: >> Does huge variation in no. of columns in rows, over the column family >> has *any* impact on the performance ? >> >> Can I have like just 100 columns in some rows a

Does variation in no of columns in rows over the column family has any performance impact ?

2011-02-06 Thread Aditya Narayan
Does a huge variation in the no. of columns per row across a column family have *any* impact on performance ? Can I have just 100 columns in some rows and hundreds of thousands of columns in another set of rows, without any downsides ?

Re: Sorting in time order without using TimeUUID type column names

2011-02-04 Thread Aditya Narayan
need for the user name if this is in a row just for the user. > > Hope that helps. > Aaron > > On 4 Feb 2011, at 01:32, Aditya Narayan wrote: > >> If I use : : : >> as key pattern for the rows of reminders, then I am storing the key, >> just as it is, a

Re: Using Cassandra to store files

2011-02-04 Thread Aditya Narayan
Yes, definitely a database for mapping, of course! On Fri, Feb 4, 2011 at 11:17 PM, buddhasystem wrote: > > Even when storage is in NFS, Cassandra can still be quite useful as a file > catalog. Your physical storage can change, move etc. Therefore, it's a good > idea to provide mapping of logical n

Re: Using Cassandra to store files

2011-02-04 Thread Aditya Narayan
I am also looking at possible solutions to store PDFs & Word documents. But why won't you store them in the filesystem instead of a database, unless your files are very small, in which case it would be recommended to use a database? -Aditya On Fri, Feb 4, 2011 at 5:30 PM, Daniel Doubleday wrote

Column Sorting of integer names

2011-02-04 Thread Aditya Narayan
Is there any way to sort columns named as integers in descending order ? Regards -Aditya

Re: Sorting in time order without using TimeUUID type column names

2011-02-03 Thread Aditya Narayan
perhaps not aware of ? On Thu, Feb 3, 2011 at 5:43 PM, Sylvain Lebresne wrote: > On Thu, Feb 3, 2011 at 11:27 AM, Aditya Narayan wrote: >> >> Hey all, >> >> I want to store some columns that are reminders to the users on my >> application, in time sorted order

Sorting in time order without using TimeUUID type column names

2011-02-03 Thread Aditya Narayan
timeline in the order of their due time.) Basically I am trying to avoid 16 bytes long timeUUID first because they are too long and the above defined key pattern is guaranteeing me a unique key/Id for the reminder row always. Thanks Aditya Narayan

Re: Schema Design Question : Supercolumn family or just a Standard column family with columns containing serialized aggregate data?

2011-02-03 Thread Aditya Narayan
Thanks Tyler! On Thu, Feb 3, 2011 at 12:06 PM, Tyler Hobbs wrote: > On Wed, Feb 2, 2011 at 3:27 PM, Aditya Narayan wrote: >> >> Can I have some more feedback about my schema perhaps somewhat more >> criticisive/harsh ? > > It sounds reasonable to me. > > Since

Re: Schema Design Question : Supercolumn family or just a Standard column family with columns containing serialized aggregate data?

2011-02-02 Thread Aditya Narayan
Can I have some more feedback about my schema, perhaps somewhat more critical/harsh ? Thanks again, Aditya Narayan On Wed, Feb 2, 2011 at 10:27 PM, Aditya Narayan wrote: > @Bill > Thank you Bill! > > @Cassandra users > Can others also leave their suggestions and comments

Re: Schema Design Question : Supercolumn family or just a Standard column family with columns containing serialized aggregate data?

2011-02-02 Thread Aditya Narayan
standard type column family. Thanks -Aditya Narayan On Wed, Feb 2, 2011 at 10:11 PM, William R Speirs wrote: > I did not understand before... sorry. > > Again, depending upon how many reminders you have for a single user, this > could be a long/wide row. Again, it really comes down

Re: Schema Design Question : Supercolumn family or just a Standard column family with columns containing serialized aggregate data?

2011-02-02 Thread Aditya Narayan
ers) for a standard SQL/relational model, then it's > probably too much for a single row. > > I'm not familiar with the TTL functionality of Cassandra... sorry cannot > help/comment there, still learning :-) > > Yea, my $0.02 is that this is an effective way to lev

Re: Schema Design Question : Supercolumn family or just a Standard column family with columns containing serialized aggregate data?

2011-02-02 Thread Aditya Narayan
ws of the reminder details would be picked up.." Is supercolumn a preferable choice for this ? Can there be a better schema than this ? -Aditya Narayan On Wed, Feb 2, 2011 at 8:54 PM, William R Speirs wrote: > To reiterate, so I know we're both on the same page, your schema would be

Re: Schema Design Question : Supercolumn family or just a Standard column family with columns containing serialized aggregate data?

2011-02-02 Thread Aditya Narayan
of tags associated with particular reminder. All tags set at once during first write. The no of tags(subcolumns) will be around 8 maximum. Any comments, suggestions and feedback on the schema design are requested.. Thanks Aditya Narayan On Wed, Feb 2, 2011 at 7:49 PM, Aditya Narayan wrote: >

Schema Design Question : Supercolumn family or just a Standard column family with columns containing serialized aggregate data?

2011-02-02 Thread Aditya Narayan
or just a standard column family containing "all the subcolumns data serialized in single column(s) " ? Thanks Aditya Narayan