MongoDB - Handling large amounts of denormalized read model updates in CQRS
I'm designing a CQRS event-sourced system (not my first). The read models are denormalized and stored in a read-optimized document database (MongoDB). Nothing special. For one particular read model, the document contains the user ID and a potentially large array of the groups that user is a member of:
{ "userid": 1, "username": "aaron", "groups": [ { "groupid": 1, "name": "group 1" }, { "groupid": 2, "name": "group 2" } ] }
There are tens of thousands of users who are members of a single group (just one example: imagine a group that every staff member belongs to).
Keep in mind the reason I'm using CQRS in the first place: I need to scale reads (or rather, handle reads differently, given the need to avoid lots of joins), and I'm expecting a significant volume of writes. That isn't the only reason I'm using CQRS and event sourcing, but it is one major catalyst.
Now, the problem I have is that when someone updates a group name (which I predict will happen quite frequently), the read model needs updating. That means a single user modification to a single piece of data is going to cause tens of thousands of updates in the read store.
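To make the fan-out concrete, here is a minimal sketch of what the denormalizer's handler for a group-rename event might look like, assuming pymongo and MongoDB 3.6+ arrayFilters; the client setup, collection name, and handler signature are my own illustrative assumptions, not part of the original design:

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    users = client["readstore"]["users"]

    def on_group_renamed(group_id: int, new_name: str) -> None:
        # One group-rename event fans out to every user document that embeds
        # the group -- potentially tens of thousands of documents.
        result = users.update_many(
            {"groups.groupId": group_id},
            {"$set": {"groups.$[g].name": new_name}},
            array_filters=[{"g.groupId": group_id}],
        )
        print(f"matched {result.matched_count} user documents")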
I'm aware of techniques I can apply to handle dispatching the update and to avoid temporal coupling, but I'm concerned about the sheer number of documents updated per single user modification.
I've read several answers to this exact type of question, and they suggest I either need to strike a balance or stop worrying about mass updates. IMO, neither is an option. There is no balance to be had in this type of read model (any re-modelling of the document would still have the group name appearing just as many times, no matter how it's re-modelled), and accepting mass quantities of updates is counter-productive to the idea of a super-fast read store, which would come under severe load from the constant updates being queued up. What would happen is that the denormalizing process would become a bottleneck, the queue would grow over time (until there was a respite from users updating group names), and reading would become slow as a side-effect.
Before anyone jumps on me and asks whether I know this bottleneck will occur, the answer is "it should, but I can't be sure". However, based on how many changes are made in the existing system I'm replacing, and keeping in mind that this is not the only type of model in the document database that will require updating, I have pretty good cause to be concerned. That said, there are several other read models which - while they may not see the same number of updates - nevertheless add to the write load on the read store. And the read store can only take so many writes.
I can think of two solutions (one dumb, one not so dumb):
The first option: store a version in each document, and don't update the read model when the event occurs. When a read occurs for a particular document, check its staleness, and if the version is stale (due to a preceding command), apply the latest changes to the document before storing and returning it. However, my instinct tells me that every document is going to be updated regardless, and that this just adds overhead to every read. I also have no idea how the versioning would work.
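For what it's worth, one way the versioning could work is a check-on-read (read-repair) scheme. The sketch below assumes a separate groups collection whose version counter the denormalizer bumps on each rename; all collection and field names here are hypothetical:

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    users = client["readstore"]["users"]
    groups = client["readstore"]["groups"]

    def get_user(user_id: int) -> dict:
        user = users.find_one({"userId": user_id})
        for membership in user["groups"]:
            current = groups.find_one({"_id": membership["groupId"]})
            if membership.get("version", 0) < current["version"]:
                # Stale: repair the embedded copy before returning it. Note
                # this is extra work on *every* read, which is the concern.
                membership["name"] = current["name"]
                membership["version"] = current["version"]
                users.update_one(
                    {"userId": user_id, "groups.groupId": membership["groupId"]},
                    {"$set": {"groups.$.name": current["name"],
                              "groups.$.version": current["version"]}},
                )
        return user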
The second option: use a relational read model and have a single join. That seems the sensible option: I'd update a single row in the groups table and be done. But reads wouldn't be as fast, and it feels a bit inferior to the pure SELECT * FROM tablename approach.
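As a sketch of that shape (using stdlib sqlite3 purely for illustration; the schema is my assumption), a rename becomes a single-row update, at the cost of one join on read:

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE groups (group_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE user_groups (user_id INTEGER, group_id INTEGER,
                                  PRIMARY KEY (user_id, group_id));
    """)
    db.execute("INSERT INTO groups VALUES (?, ?)", (2, "Group 2"))
    db.execute("INSERT INTO user_groups VALUES (?, ?)", (1, 2))

    # Renaming a group touches one row instead of tens of thousands of documents.
    db.execute("UPDATE groups SET name = ? WHERE group_id = ?", ("New name", 2))

    # Reads are no longer a pure "SELECT * FROM tablename", but the join is cheap.
    rows = db.execute("""
        SELECT g.group_id, g.name
        FROM user_groups ug JOIN groups g ON g.group_id = ug.group_id
        WHERE ug.user_id = ?
    """, (1,)).fetchall()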
My question:
Are there standard techniques for combating this type of issue? Or is the second option I offered the best I can hope for?
I would have thought this type of problem occurs all the time in CQRS event-sourced systems, since denormalized data needs to be kept in sync, but there seems to be a lack of discussion about it in the community, which leads me to believe I'm either missing an obvious solution or my read model needs improvement.
Answer:
I think that when you expect one user to be a member of tens of thousands of groups, the model you have chosen is wrong. You need to remove the list of groups from the user document and move closer to a relational model, keeping only the group IDs. Imagine the groups needing more attributes than just a name - you would face the same issue again. And again.
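In MongoDB terms, that suggestion might look like the sketch below: the user document stores only group IDs, group details live in their own collection, and reads join the two with $lookup at query time. Collection and field names are hypothetical, not from the answer itself.

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    users = client["readstore"]["users"]     # e.g. {"userId": 1, "groupIds": [1, 2]}
    groups = client["readstore"]["groups"]   # e.g. {"_id": 1, "name": "Group 1"}

    # A group rename is now a single-document write...
    groups.update_one({"_id": 2}, {"$set": {"name": "New name"}})

    # ...and reads pull the group details in at query time.
    pipeline = [
        {"$match": {"userId": 1}},
        {"$lookup": {
            "from": "groups",
            "localField": "groupIds",
            "foreignField": "_id",
            "as": "groups",
        }},
    ]
    user_with_groups = next(users.aggregate(pipeline), None)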