python - App Engine Query Offset - Not for Paging - EDIT: With Memcache?


I have an application with a list of items that users page through. I have handled paging through an index field (I needed it for other things anyway, so I figured why not).

My issue is that I want to implement a "goto" feature: a user can skip directly to an item instead of paging through them using the provided navigation buttons (next, previous). For instance, they can enter 1000 in the "goto" box and have the 1000th item displayed. There is a disconnect between the nth item and the index - the index is guaranteed to be in order but is not guaranteed to be sequential, so I can't filter on index. I thought about using the offset parameter of fetch, but I remember being told, when I first started programming with App Engine, not to use it due to performance issues.

Would offset be the best way to go here, or is there a better way? Also, what are the costs associated with it: does it just take longer to return results, or does it also count towards datastore reads/small operations?

Edit: I don't mean this in a bad way, but in order to stave off the people who will tell me to use cursors... :-) I handle paging in a way that is more useful to me than if I were to use cursors. Thank you in advance for your concern. Additionally, I thought I'd spell out a bit of what I'm trying to do in code:

q = Item.all()
# Order highest index first; this is how the client handles items
q = q.order('-index')
# count is determined automatically: at least 25 and not greater than 300
items = q.fetch(limit=count, offset=i)

Edit 2: Based on the comments, I've decided to try storing the items in memcache and doing all of the filtering, ordering, offsets, etc. in memory. Items are grouped by category, each of which can hold up to 1500 items, and I would store each category in memcache under its own key. The issue I can think of is that each item can, in the worst case, be 2KB in size. It's not likely that a category would have anywhere near 1500 items in it, or that an item would reach the worst-case size, but if it did, it would exceed the 1MB memcache value limit. Any suggestions on how to handle that? Also, there are around 10 categories; would storing all of them in memcache cause it to flush more often? And finally, is it worth using offset when I fetch entities, or is memcache the better solution (items are accessed quite frequently, in small groups of 25-30)?
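One way to stay under the 1MB per-value limit is to split a category's item list across several memcache keys. Below is a minimal sketch of the packing step; the `chunk_items` helper and the `cat:<name>:<chunk>` key scheme are hypothetical names of mine, not App Engine APIs, and real code would also store a small index entry recording the chunk count so a read can `get_multi` all the chunk keys at once.

```python
import pickle

# Stay safely under memcache's ~1MB per-value cap (assumed headroom).
MAX_VALUE_BYTES = 1000000

def chunk_items(items, max_bytes=MAX_VALUE_BYTES):
    """Greedily pack items into chunks whose pickled size stays under max_bytes."""
    chunks, current = [], []
    for item in items:
        current.append(item)
        # If adding this item pushed the chunk over the limit, close the
        # previous chunk and start a new one with this item. A single
        # oversized item still gets its own chunk rather than looping forever.
        if len(pickle.dumps(current)) > max_bytes and len(current) > 1:
            current.pop()
            chunks.append(current)
            current = [item]
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be stored under a key like `'cat:%s:%d' % (category, chunk_number)`. The trade-off is that reading a category becomes a multi-get, but no single value can blow the 1MB cap.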

Edit 3: I do have a sequential way of referencing the items. Each item has an id that uniquely identifies it across categories; index is a way of ordering the items within a category non-sequentially; and num is sequential, but isn't stored on the item (every time I pull the items out of memcache I order by index, iterate through the list of items, and assign each item a num given the current number of iterations). I guess that's a convoluted way of saying:

for i in range(len(items)):
    items[i]['num'] = i
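Once the list is ordered and numbered in memory, the "goto" feature itself is just a slice of that list. A minimal sketch, where `assign_nums` and `goto_page` are hypothetical helper names of mine:

```python
def assign_nums(items):
    """Assign each item a sequential num matching its position in the list."""
    for num, item in enumerate(items):
        item['num'] = num
    return items

def goto_page(items, target, count=25):
    """Return the page of `count` items starting at sequential position `target`."""
    # Clamp the target so out-of-range input still returns a valid page.
    start = max(0, min(target, len(items) - 1))
    return items[start:start + count]
```

This is the O(o + c) memory read from the chart further down: slicing is cheap once the whole category is resident, and the datastore is only touched when the cache misses.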

Edit 4: The Item model:

class Item(db.Model):
    item_id = db.IntegerProperty()
    index = db.IntegerProperty()
    # I used a StringProperty instead of a ReferenceProperty because I'm a cheapo with memory
    category = db.StringProperty()

I kept num out of the model because of the cost associated with keeping it sequential on adds and removes. Instead, I use index to maintain the (non-sequential) order of the items, and every time the list of dicts representing the items of a specific category is pulled out of the datastore, I run through them and add a sequential num to each item. num is only for the client (read: browser), since the UI is entirely dynamic (all AJAX; no page reloads whatsoever) and I cache every item sent to the browser in JavaScript. Server-side I don't need a sequential ordering of the items; there are functions on the client side that need it, and the server is fine with the non-sequential index.

The main crux of the question seems to have turned into whether I should keep my current model, i.e. storing the items of a category in memcache, or go back to retrieving the items directly from the datastore. The items are requested a lot (I don't have an exact figure or even an estimate of how many times per second, but it should be many items per second). I know there's no way to precisely determine how long items will stay in memcache before getting evicted, but can I assume it won't be happening every few minutes? Because if evictions are that rare, I feel the best way to go is memcache, unless I'm missing something. Oh, and one last edit before I steal all of SO's disk space ;)

Edit 5 (no more edits...): A chart of my calculations of time complexity when using memcache plus the datastore, versus the datastore alone. (I left out the time complexity of the datastore itself because I'm not sure what it is; it's too late to go read the Bigtable paper again to try and figure it out, so I'll assume it's the same as operations on a hashtable.) These are best cases. For the memcache solution, the worst case adds n datastore reads, since all the items in the category must be read into memcache. The chart leaves anything not having to do with storing or retrieving the data (i.e. sorts, filters) out of the equation for both solutions. In the memcache solution, num is not stored in the datastore. In the datastore solution it is, which is why there is a cost associated with add and remove (updating num for each item).

n ds = number of datastore operations
w    = write
r    = read
n    = number of items in the category (for add and remove, the number before the operation is performed)
c    = count of items read
o    = offset

+-------------------------------+------------------------------------+
|            memcache           |             datastore              |
+----------+--------------------+----------+-------------------------+
| reads    | O(o + c)           | reads    | c ds r                  |
+----------+--------------------+----------+-------------------------+
| reads w/ | O(o + c)           | reads w/ | o + c ds r              |
| offset   |                    | offset   |                         |
+----------+--------------------+----------+-------------------------+
| adds     | 1 ds w + O(n)      | adds     | 1 + n ds w & n - 1 ds r |
+----------+--------------------+----------+-------------------------+
| removes  | 1 ds rw + O(o + n) | removes  | n - o ds wr             |
+----------+--------------------+----------+-------------------------+
| edits    | 1 ds rw + O(o)     | edits    | 1 ds rw                 |
+----------+--------------------+----------+-------------------------+

So the question is: does the worse time complexity of the memcache solution outweigh the potentially greater number of ds operations that come with the datastore solution? It would, unless memcache eviction causes more ds operations in the memcache solution than in the datastore solution (because each time items are evicted from memcache, it takes n ds r to repopulate it). I'm assuming reads happen far more often than writes in my application's case, once the initial data loading is done.
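The break-even point in that trade-off can be sketched with simple arithmetic. The helper names and the per-request eviction-rate framing below are my own; the read-cost figures follow the chart above (a fetch at offset o is charged roughly o + c reads/small operations, and a cache miss costs n reads to repopulate the category):

```python
def memcache_reads_per_request(n, evictions_per_request):
    """Expected ds reads per request: a hit costs 0, a miss costs n to repopulate."""
    return evictions_per_request * n

def datastore_reads_per_request(c, o=0):
    """Fetching c items at offset o is charged roughly o + c reads/small ops."""
    return o + c
```

With n = 1500 items per category and pages of c = 25, the memcache scheme comes out ahead whenever evictions happen less often than once every 60 requests (1500 / 60 = 25), and any nonzero offset tilts the comparison further toward memcache.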

Answer (updated for edit 4):

Your Item model looks reasonable; the biggest issue is how you manage the sequential index. I'm still hesitant to rely on memcache in the way you describe, because a cache eviction dramatically slows read operations (which are common and user-facing) unless you have the datastore backing the current state of the data.

So, feel free to continue storing the items in memcache. However, on inserts or deletes, make sure to update num in the datastore as well. (If you have the entire set of items in memcache, no read ops are required: update the items in memcache and write them to the datastore simultaneously.)

The worst case scenario is still as I described before your 4th edit. Inserting an element is 1 read + 1 write. Removing an element is n reads + n writes, where n is the number of items in the category. Looking up an item is 1 read. Each of these scenarios assumes memcache is empty.

If you use offset instead, each insert is 1 write, and removing an element is 1 write. But reading an element becomes n reads, where n is the sequential index of the item you're retrieving. If you're using memcache but aren't backing the value of num in the datastore, you'll fall back to this scenario on every cache miss.

In most cases, reads are far more common than writes, so maintaining num in the datastore is far more efficient.
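A minimal sketch of what maintaining num on insert and remove looks like. The plain list here stands in for the memcached copy of a category; in a real app, each item touched by the renumbering loop would also be written back to the datastore in the same pass (the n-write cost described above). The helper names are mine:

```python
def insert_item(items, pos, item):
    """Insert at sequential position pos and renumber everything after it."""
    items.insert(pos, item)
    for num in range(pos, len(items)):
        items[num]['num'] = num  # each of these is also a datastore write
    return items

def remove_item(items, pos):
    """Remove the item at sequential position pos and close the gap in num."""
    items.pop(pos)
    for num in range(pos, len(items)):
        items[num]['num'] = num
    return items
```

With num kept contiguous like this, a "goto" against the datastore becomes a single filtered query on num rather than an offset scan, which is where the read savings come from.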

an addendum:

Cloud SQL is an option if your data size isn't too large. SQL is generally better at sequential queries like the one you're trying to do, at the cost of scaling poorly to very large data sets.

The per-use pricing is relatively cheap if you suspect you'll have minimal usage.

