php - Which database for dealing with very large result-sets?
I am working on a PHP application (pre-release).
Background
We have a table in our MySQL database which is expected to grow extremely large - it would not be unusual for a single user to own 250,000 rows in this table. Each row in the table is given an amount and a date, among other things.
Furthermore, this particular table is read from (and written to) on the majority of pages. Given that each row has a date, I'm using GROUP BY date to minimise the size of the result-set given by MySQL - rows contained in the same year can then be seen as just one total.
However, a typical page will still have a result-set of between 1,000-3,000 results. There are also places where many SUM()s are performed, totalling many tens - if not hundreds - of thousands of rows.
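To illustrate the idea, the grouping I mean is roughly of this shape (the table and column names here are simplified placeholders, not our real schema):

    <?php
    // Placeholder schema: transactions(user_id, amount, date).
    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
    $userId = 123;

    // Group by year so rows in the same year come back as a single total.
    $stmt = $pdo->prepare(
        'SELECT YEAR(`date`) AS yr, SUM(amount) AS total
           FROM transactions
          WHERE user_id = :uid
          GROUP BY YEAR(`date`)'
    );
    $stmt->execute(['uid' => $userId]);
    $yearlyTotals = $stmt->fetchAll(PDO::FETCH_ASSOC);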
Trying MySQL
On a usual page, MySQL was taking around 600-900ms. Using LIMIT and offsets wasn't helping performance, the data has been heavily normalised, and it doesn't seem that further normalisation would help.
To make matters worse, there are parts of the application that require the retrieval of 10,000-15,000 rows from the database. The results are then used in a calculation in PHP and formatted accordingly. Given this, the performance of MySQL wasn't acceptable.
Trying MongoDB
I have converted the table to MongoDB, and its speed is faster - it takes around 250ms to retrieve 2,000 documents. However, the $group command in the aggregation pipeline - needed to aggregate the fields depending on which year they fall in - slows things down. Unfortunately, keeping a running total and updating it whenever a document is removed/updated/inserted is out of the question, because although we can use a yearly total for some parts of the app, in other parts the calculations require each amount that falls on a specific date.
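For reference, the pipeline is roughly of this shape - written here against the mongodb/mongodb library, with simplified collection/field names, and assuming the date is stored as a BSON date:

    <?php
    require 'vendor/autoload.php';

    $client = new MongoDB\Client('mongodb://localhost:27017');
    $collection = $client->app->transactions;
    $userId = 123;

    $cursor = $collection->aggregate([
        ['$match' => ['user_id' => $userId]],
        // Group each document by the year of its date and sum the amounts.
        ['$group' => [
            '_id'   => ['$year' => '$date'],
            'total' => ['$sum' => '$amount'],
        ]],
    ]);

    foreach ($cursor as $doc) {
        // $doc['_id'] is the year, $doc['total'] is that year's total.
    }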
I've considered Redis, although I think the complexity of the data is beyond what Redis was designed for.
The final straw
On top of all of this, speed is important. So performance is right up there in terms of priorities.
Questions:
- What is the best way to store data which is frequently read/written and rapidly growing, with the knowledge that most queries will retrieve a very large result-set?
- Is there a solution to this problem? I'm totally open to suggestions.
I'm a little stuck at the moment, as I haven't been able to retrieve such a large result-set in an acceptable amount of time. It seems most datastores are great for small retrieval sizes - even on large amounts of data - but I haven't been able to find anything on retrieving large amounts of data from a larger table/collection.
I only read the first two lines, but you are using aggregation (GROUP BY) and expecting it to be done in realtime?
I am not new to the internals of databases, so this is not to undermine you - I am just trying to help.
The GROUP operator in both MySQL and MongoDB is in-memory. In other words, it takes whatever data structure you provide, whether an index or a document (row), and it will go through each row/document, taking the field and grouping it up.
This means that you can speed it up in both MySQL and MongoDB by making sure you are using an index for the grouping, but still this only goes so far, even with housing the index in your direct working set in MongoDB (memory).
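For example, something along these lines (the table/collection and field names are just assumptions taken from your description):

    <?php
    require 'vendor/autoload.php';

    // Index the fields you filter and group on so the grouping can walk an
    // index instead of scanning every row/document.
    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
    $pdo->exec('ALTER TABLE transactions ADD INDEX idx_user_date (user_id, `date`)');

    $client = new MongoDB\Client('mongodb://localhost:27017');
    $client->app->transactions->createIndex(['user_id' => 1, 'date' => 1]);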
In fact, using LIMIT with an OFFSET could be slowing things down even further, frankly, since after writing out the set MySQL needs to query again to answer you.
Once it is done, MySQL will write out the result set (memory and IO being used here), while MongoDB will reply inline if you have not set $out, the maximum size of the inline output being 16MB (the maximum size of a document).
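As a sketch, sidestepping that inline limit is just a matter of adding $out as the last stage of the pipeline (collection and field names here are assumptions):

    <?php
    require 'vendor/autoload.php';

    $client = new MongoDB\Client('mongodb://localhost:27017');

    $client->app->transactions->aggregate([
        ['$group' => [
            '_id'   => ['user' => '$user_id', 'year' => ['$year' => '$date']],
            'total' => ['$sum' => '$amount'],
        ]],
        // Write the grouped output to its own collection instead of an inline
        // reply, so the 16MB limit on the inline result no longer applies.
        ['$out' => 'yearly_totals'],
    ]);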
The final point to take away here is: aggregation is horrible.
There is no silver bullet that will save you here; some databases will attempt to boast about their speed etc. etc., but the fact is most big aggregators use something called "pre-aggregated reports". You can find a quick introduction within the MongoDB documentation: http://docs.mongodb.org/ecosystem/use-cases/pre-aggregated-reports/
This means that you put the effort of aggregating and grouping onto some other process which can do it well enough, allowing the reading thread, the one that needs to be realtime, to do its thang in realtime.
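A rough sketch of what that could look like for your amounts (collection and field names are made up; you would still keep the raw documents for the per-date calculations you mentioned):

    <?php
    require 'vendor/autoload.php';

    $client = new MongoDB\Client('mongodb://localhost:27017');
    $db = $client->app;

    // Write path: every time an amount is recorded, bump a tiny per-user,
    // per-year counter document with an upserted $inc.
    function recordAmount($db, int $userId, MongoDB\BSON\UTCDateTime $date, float $amount): void
    {
        // Keep the raw document as before (still available for per-date maths).
        $db->transactions->insertOne([
            'user_id' => $userId,
            'date'    => $date,
            'amount'  => $amount,
        ]);

        // Maintain the pre-aggregated yearly total.
        $year = (int) $date->toDateTime()->format('Y');
        $db->totals_by_year->updateOne(
            ['user_id' => $userId, 'year' => $year],
            ['$inc' => ['total' => $amount]],
            ['upsert' => true]
        );
    }

    // Read path: the realtime page just reads the pre-aggregated documents,
    // no $group needed at request time.
    $totals = $db->totals_by_year->find(['user_id' => 123]);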