tickstore query slowly #69

Closed
zoe0316 opened this issue Dec 24, 2015 · 5 comments
@zoe0316

zoe0316 commented Dec 24, 2015

Arctic claims it can query millions of rows per second per client, but when our team tried it, we only got thousands of rows per second. Here is the code; has anyone hit the same problem, or am I using it the wrong way?

    @property
    def arctic(self):
        if not self._arctic:
            log.info("init arctic")
            mongo_conn = MongoDB()
            self._arctic = Arctic(mongo_host=mongo_conn.client)
            library = self._arctic.list_libraries()
            if self.tick_db not in library:
                self._arctic.initialize_library(self.tick_db, lib_type=arctic.TICK_STORE)
            if self.bar_db not in library:
                self._arctic.initialize_library(self.bar_db, lib_type=arctic.TICK_STORE)
        return self._arctic

...
# res is a dict of tick data
index = self.int_to_date(tick_time)
data = pd.DataFrame(res, [index])
self.arctic[self.tick_db].write(symbol, data)

...

>>> now = time.time(); ac['tick'].read('IF1601', date_range=dr); print(time.time() - now)
Output:
[4021 rows x 26 columns]
3.56284999847

thanks.

@femtotrader
Contributor

For a performance comparison with "pure" pymongo, see:

In [234]: %time df_retrieved = pd.DataFrame(list(db.ticks.find()))
CPU times: user 39.6 s, sys: 27.1 s, total: 1min 6s
Wall time: 1min 21s

In [236]: df_retrieved
Out[236]:
             Ask      Bid   Spread  Volume                       _id
0        0.88922  0.88796  0.00126       1  567c324fcc9915206eb18cc8
1        0.88914  0.88805  0.00109       1  567c324fcc9915206eb18cc9
2        0.88910  0.88809  0.00101       1  567c324fcc9915206eb18cca
3        0.88908  0.88811  0.00097       1  567c324fcc9915206eb18ccb
4        0.88887  0.88808  0.00079       1  567c324fcc9915206eb18ccc
...          ...      ...      ...     ...                       ...
1913358  0.87589  0.87525  0.00064       1  567c32b1cc9915206ecebed6
1913359  0.87589  0.87527  0.00062       1  567c32b1cc9915206ecebed7
1913360  0.87588  0.87531  0.00057       1  567c32b1cc9915206ecebed8
1913361  0.87574  0.87531  0.00043       1  567c32b1cc9915206ecebed9
1913362  0.87574  0.87531  0.00043       1  567c32b1cc9915206ecebeda

[1913363 rows x 5 columns]

@cityhunterok

We should use it to store many ticks of data in one record as a pandas DataFrame, right?

@femtotrader
Contributor

Let's use same file for benchmarking https://drive.google.com/file/d/0B8iUtWjZOTqla3ZZTC1FS0pkZXc/view?usp=sharing

see also pydata/pandas-datareader#153

I wonder if they (the Man AHL Arctic dev team) shouldn't use Monary instead of pymongo:
https://github.com/ksuarz/monary https://monary.readthedocs.org/

Read this https://pypi.python.org/pypi/Monary/0.4.0.post2

It is possible to get (much) more speed from the query if we bypass the PyMongo
driver. To demonstrate this, I've developed *monary*, a simple C library and
accompanying Python wrapper which make use of MongoDB C driver. 

see https://bitbucket.org/djcbeach/monary/issues/19/use-pandas-series-dataframe-and-panel-with

@jamesblackburn
Contributor

I think there's quite a lot of overlap between what Monary does and Arctic.

Monary makes it fast to marshall primitive types (numpy int, floats, etc) into and out of MongoDB. We do something similar, except we do compression and batching on the client side. A lot of the win (in network and disk I/O terms) comes from financial data being highly compressible. Because we batch in the client, we end up performing few pymongo operations relative to the number of ticks/rows.

For profiling perhaps try: %prun in ipython
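Outside IPython, the same idea can be approximated with the stdlib `cProfile`/`pstats` modules. This is a hypothetical sketch: `per_row_insert` is a stand-in for a loop of single-row Arctic writes, not a function from the thread.

```python
import cProfile
import io
import pstats

def per_row_insert(n):
    # Stand-in for a loop of single-row Arctic write() calls;
    # replace the body with the real per-row operation being profiled.
    acc = []
    for i in range(n):
        acc.append(i)
    return len(acc)

prof = cProfile.Profile()
prof.enable()
count = per_row_insert(50000)
prof.disable()

# Print only the lines of the report that mention our function.
out = io.StringIO()
pstats.Stats(prof, stream=out).sort_stats("cumulative").print_stats("per_row_insert")
report = out.getvalue()
print(report)
```

In IPython, `%prun per_row_insert(50000)` gives the same breakdown interactively; a profile dominated by per-call overhead rather than I/O is a strong hint that batching will help.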

@zoe0316
Author

zoe0316 commented Dec 31, 2015

Thanks for your comments. I made a mistake: I should not insert single rows into Arctic, but write in batches instead. Happy new year. XD
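The fix described here, accumulating ticks and writing them in one batch rather than one DataFrame row per `write()` call, might look like the following sketch. The tick fields, timestamps, and `build_batch` helper are illustrative assumptions, not code from the thread.

```python
import pandas as pd

def build_batch(ticks):
    """Build one DataFrame from a list of (timestamp_string, fields_dict) ticks."""
    index = pd.to_datetime([t for t, _ in ticks])
    return pd.DataFrame([fields for _, fields in ticks], index=index)

# Accumulate ticks in memory first...
batch = build_batch([
    ("2015-12-24 09:30:00", {"bid": 100.1, "ask": 100.3}),
    ("2015-12-24 09:30:01", {"bid": 100.2, "ask": 100.4}),
    ("2015-12-24 09:30:02", {"bid": 100.2, "ask": 100.5}),
])

# ...then issue a single write for the whole batch instead of one per row,
# e.g. (using the names from the issue code above):
# self.arctic[self.tick_db].write(symbol, batch)
print(batch.shape)
```

Batching is what lets Arctic compress and amortize the per-operation pymongo overhead that James describes above.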

@zoe0316 zoe0316 closed this as completed Dec 31, 2015