Anyway, what I realised during Eric's presentation was that he had already included some features from the next Cassandra release, 0.7. First of all, every time he showed configuration, it was an excerpt from a cassandra.yaml file. For instance, this snippet from his timeseries example:
new yaml configuration in Cassandra
# conf/cassandra.yaml
keyspaces:
  - name: Sites
    column_families:
      - name: Stats
        compare_with: LongType
Apparently, as of version 0.7, the cassandra.yaml file replaces the old XML configuration file (storage-conf.xml). I have not really come into contact with YAML; I believe it is common in the Ruby world.
Another very cool feature is the addition of secondary indexes to Cassandra. In previous versions, Cassandra did not have indexes out of the box. To mimic the behavior of a secondary index, you could create another Column Family (I believe that is what it is called). This new Column Family would be sorted differently and contain a key pointing to the "original" entry. As an example, imagine having a Column Family that stores addresses. To be able to search by city, you could create another Column Family called "byCity" with two properties, "city" and "address key". Every time you insert or update an address, your code has to update the byCity Column Family as well.
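To make the hand-rolled index a bit more concrete, here is a minimal sketch in Python. It uses plain dictionaries as stand-ins for the two Column Families and does not talk to Cassandra at all; the names addresses, by_city, put_address and find_by_city are my own and purely illustrative.

```python
# Toy stand-ins for two Column Families; plain dicts, not a Cassandra client.
addresses = {}   # address key -> {"street": ..., "city": ...}
by_city = {}     # city -> set of address keys (the hand-maintained index)


def put_address(key, street, city):
    """Insert or update an address and keep by_city in sync."""
    old = addresses.get(key)
    if old is not None and old["city"] != city:
        # The city changed: drop the stale index entry first.
        by_city[old["city"]].discard(key)
        if not by_city[old["city"]]:
            del by_city[old["city"]]
    addresses[key] = {"street": street, "city": city}
    by_city.setdefault(city, set()).add(key)


def find_by_city(city):
    """Look up addresses via the index instead of scanning every row."""
    return [addresses[k] for k in sorted(by_city.get(city, ()))]


put_address("a1", "Main St 1", "Berlin")
put_address("a2", "High St 9", "Hamburg")
put_address("a1", "Elbe Rd 3", "Hamburg")  # update moves a1 to Hamburg
```

The annoying part is the update case: the application has to remember to remove the old index entry, otherwise the index slowly fills up with stale keys.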
It looks like Cassandra will do this for you from version 0.7 on. There are two new per-column settings called index_name and index_type. If I understood Eric correctly, adding these to your configuration makes Cassandra create an inverted index for you, which can be used as a secondary access path. I think this is a very nice, yet largely undocumented, feature. I have no clue when version 0.7 is going to be released, but I hope it will be very soon, because we are only weeks away from starting a very big Cassandra project at my company.
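As I understand it, the idea is that you only declare which columns are indexed and the store does the bookkeeping on every write. The following Python sketch illustrates that idea with an in-memory toy; it is not how Cassandra implements it, and IndexedStore, put and get_by are hypothetical names I made up for the illustration.

```python
class IndexedStore:
    """Toy row store that maintains inverted indexes for declared columns."""

    def __init__(self, indexed_columns):
        self.rows = {}
        # column name -> {column value -> set of row keys}
        self.indexes = {col: {} for col in indexed_columns}

    def put(self, key, columns):
        """Replace the row and update every declared index automatically."""
        old = self.rows.get(key, {})
        for col, index in self.indexes.items():
            if col in old:
                # Remove the entry that pointed at the previous value.
                index.get(old[col], set()).discard(key)
            if col in columns:
                index.setdefault(columns[col], set()).add(key)
        self.rows[key] = dict(columns)

    def get_by(self, col, value):
        """Return the sorted row keys whose indexed column equals value."""
        return sorted(self.indexes[col].get(value, ()))


store = IndexedStore(["city"])
store.put("a1", {"street": "Main St 1", "city": "Berlin"})
store.put("a2", {"street": "High St 9", "city": "Berlin"})
store.put("a1", {"street": "Elbe Rd 3", "city": "Hamburg"})
```

The caller never touches the index directly; declaring "city" as indexed up front is the rough equivalent of putting the index settings on a column in the configuration.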