Reading the bit about using write-optimised data stores: at what point do write performance requirements make you consider log-structured storage over something based on B-trees? Are there any numbers (write throughput, etc.) that can be used to reason about this?
This is a good question. There probably isn't a hard throughput number at which you should start considering one alternative over the other. I would instead suggest thinking in terms of the access patterns on your data.
If writes in your application outnumber reads by orders of magnitude, log-structured storage may be a natural fit for your use case, since the write amplification of B-trees (rewriting whole pages for small updates, node splits, rebalancing) may become a performance bottleneck.
For a transactional system where reads and writes are comparable, a B-tree-based solution may be the better fit, since random read performance is generally better for B-trees (thanks to a more structured on-disk layout, where each key lives in exactly one place).
Finally, empirical analysis is probably the best way to judge which solution suits your use case, since the background operations in either system (for example, compaction in log-structured stores) may affect your specific usage patterns in different ways.
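To make the write-amplification point a bit more concrete, here is a rough back-of-the-envelope model, not a benchmark. Everything in it is an assumption made up for illustration: the page size, the record size, the idea that every small update rewrites one full page in the paged store, and the naive "rewrite all live records every N writes" compaction in the log store. Real engines buffer and batch writes, so treat the output as a way to reason about the trade-off, not as a prediction for any particular database.

```python
PAGE_SIZE = 4096    # assumed page size in bytes (hypothetical)
RECORD_SIZE = 128   # assumed record size in bytes (hypothetical)


def paged_store_bytes_written(num_updates: int) -> int:
    """Toy in-place paged store: each small update rewrites one full page."""
    return num_updates * PAGE_SIZE


def log_store_bytes_written(keys, compact_every: int) -> int:
    """Toy append-only log: each update appends one record, and every
    `compact_every` updates a naive compaction rewrites all live records."""
    live = set()
    written = 0
    for i, key in enumerate(keys, start=1):
        live.add(key)
        written += RECORD_SIZE               # the append itself
        if i % compact_every == 0:
            written += len(live) * RECORD_SIZE  # compaction rewrite
    return written


def write_amplification(physical_bytes: int, num_updates: int) -> float:
    """Physical bytes written divided by logical bytes the application wrote."""
    return physical_bytes / (num_updates * RECORD_SIZE)


if __name__ == "__main__":
    # Hypothetical workload: 10,000 updates spread over 100 distinct keys.
    keys = [i % 100 for i in range(10_000)]
    paged = paged_store_bytes_written(len(keys))
    logged = log_store_bytes_written(keys, compact_every=1_000)
    print("paged store WA:", write_amplification(paged, len(keys)))
    print("log store WA:  ", write_amplification(logged, len(keys)))
```

Changing `compact_every`, the number of distinct keys, or `RECORD_SIZE` shifts the ratio considerably, which is exactly why measuring with your own workload is more useful than a fixed throughput threshold.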